How I Do Voice Cloning in Other Languages with Tortoise TTS - Dataset and Tokenizer

  • Published: 20 Sep 2024
  • Links referenced in the video:
    Github - github.com/Jar...
    Karpathy's tokenizer video - • Let's build the GPT To...
    Timestamps:
    0:40 - Explaining the process
    1:23 - yt-dlp script
    3:30 - Transcription script with WhisperX
    7:20 - Merge folders after transcription
    8:30 - Resampling to 22 kHz
    13:25 - Uploaded scripts :)!
    14:03 - Making a tokenizer in another language
    16:30 - What is the tokenizer for?
    21:05 - Quick explanation on tortoise cleaners
    Hardware for my PC:
    Graphics Card - amzn.to/3pcREux
    CPU - amzn.to/43O66Ir
    Cooler - amzn.to/3p98TwX
    RAM - amzn.to/3NBAsIq
    SSD Storage - amzn.to/42NgMFR
    Power Supply (PSU) - amzn.to/430bIhy
    PC Case - amzn.to/447499T
    Motherboard - amzn.to/3CziMXI
    Alternative prebuilds to my PC:
    Corsair Vengeance i7400 - amzn.to/3p64r22
    MSI MPG Velox - amzn.to/42MnJHl
    Cheapest recommended PC:
    Cyberpower 3060 - amzn.to/3XjtZoP
    Come join The Learning Journey!
    Discord - / discord
    Github - github.com/Jar...
    TikTok - / jarodsjourney
    If you found anything helpful, please consider supporting me and the content I am trying to produce!
    www.buymeacoff...
  • Science
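
For readers wondering what "making a tokenizer" (14:03) involves, here is a minimal byte-pair-encoding sketch in the spirit of the Karpathy video linked above. It is not the script from the video or the repository: the function names (`train_bpe`, `encode`, `decode`) are made up for illustration, and it works on raw UTF-8 bytes, which may differ from how the Tortoise tokenizer is actually built.

```python
from collections import Counter


def merge(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merges over the UTF-8 bytes of `text`."""
    ids = list(text.encode("utf-8"))
    merges = {}  # (left_id, right_id) -> new_id, kept in learned order
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:  # nothing repeats any more, so stop early
            break
        new_id = 256 + len(merges)  # ids 0..255 are the raw bytes
        merges[pair] = new_id
        ids = merge(ids, pair, new_id)
    return merges


def encode(text, merges):
    """Tokenize `text` by replaying the learned merges in order."""
    ids = list(text.encode("utf-8"))
    for pair, new_id in merges.items():
        ids = merge(ids, pair, new_id)
    return ids


def decode(ids, merges):
    """Map token ids back to text by expanding each merge into its bytes."""
    vocab = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges.items():
        vocab[new_id] = vocab[left] + vocab[right]
    return b"".join(vocab[i] for i in ids).decode("utf-8")
```

Training on text in the target language lets frequent character sequences of that language become single tokens, so the same sentence encodes to fewer ids than with merges learned from English text.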

Comments • 17

  • @thienmytvho3096 6 months ago +1

    Will this be on GitHub later? Also, I appreciate your effort in making these kinds of videos. Keep up the good work.

  • @radioketnoi 1 month ago

    I don't understand where I went wrong. I'm training on Vietnamese. I used about 1 hour of my own voice for training and created a tokenizer with your Python file for Vietnamese ("vi"). Then I tested it with a sentence that was already in the audio samples. The output sounded like my voice, but it was meaningless, not Vietnamese at all. Please tell me where I went wrong.

  • @bomar920 6 months ago +1

    Thanks 🙏 you deserve more subscribers.

    • @9-volt247 6 months ago

      I deserve more subscribers, too, not only Jarods.

  • @diogenes848 6 months ago +1

    Is there some place we can just download some working voices? I don't need a "specific" voice, just something as polished as possible. I'm wondering if I can use this for "higher" quality TTS to listen to documents or ebooks. The processing time seems like it will make that impossible regardless, but I'd like to have some reasonable voices in the can just to play with. I've tried making a couple of voices. They work, but they're not great. I just want to download a polished sample voice if possible.

  • @shovonjamali7854 6 months ago

    Wow! Outstanding! Can you please tell me whether the playlists you used for training came from a single speaker or several?

  • @soorenapars 6 months ago

    Nice explanation

  • @SAnsAN091190 6 months ago

    Jarod, what do you think: if there is a Hugging Face dataset that contains audio tracks and transcripts for them, is it possible to use such a dataset with this project without having to extract the audio from it?
    P.S. A very useful video, especially the part about how english_cleaners breaks non-English languages) I'm going to bolt on a Slavic tokenizer))
    P.P.S. I'm looking forward to the second part of the preparation!)))
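
To illustrate the point about english_cleaners, here is a small sketch that assumes the breakage comes from an ASCII-conversion step. `ascii_fold` is a hypothetical stand-in written with the standard library, not Tortoise's actual cleaner: it drops anything that cannot be reduced to ASCII, whereas a transliterating cleaner would instead turn the text into Latin letters. Either way the original characters never reach the tokenizer. `basic_cleaner` is an equally hypothetical language-neutral alternative.

```python
import unicodedata


def ascii_fold(text):
    """Hypothetical English-style cleaning step: keep only what survives as ASCII."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Combining marks and non-Latin letters have no ASCII form and are dropped.
    return decomposed.encode("ascii", "ignore").decode("ascii")


def basic_cleaner(text):
    """Hypothetical language-neutral cleaner: lowercase and collapse whitespace."""
    return " ".join(text.lower().split())


samples = ["Tiếng Việt", "привет"]
for sample in samples:
    print(repr(sample), "->", repr(ascii_fold(sample)), "vs", repr(basic_cleaner(sample)))
```

Vietnamese loses its tone marks and Cyrillic text disappears entirely under `ascii_fold`, while `basic_cleaner` leaves the characters intact for a tokenizer trained on that language.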

    • @SAnsAN091190 6 months ago

      I'm also thinking about verifying the transcripts of the audio tracks in such datasets, since I've seen for myself that in some cases the transcript and the audio don't match (sometimes people mess around and record unrelated sounds). The idea is to exclude tracks that mostly don't match what the transcript says.
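
One way to sketch the check described in this comment, assuming each clip has already been run through a speech recognizer such as WhisperX: compare the dataset transcript with the recognizer's output and keep only clips that mostly agree. The row layout, the function names, and the 0.8 threshold are illustrative assumptions, not part of the video's scripts.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(text.lower().split())


def similarity(reference, hypothesis):
    """Character-level similarity in [0, 1] between two transcripts."""
    return SequenceMatcher(None, normalize(reference), normalize(hypothesis)).ratio()


def filter_matching(rows, threshold=0.8):
    """Return the ids of clips whose dataset text mostly matches the ASR text.

    `rows` is an iterable of (clip_id, dataset_text, asr_text) tuples.
    """
    return [
        clip_id
        for clip_id, dataset_text, asr_text in rows
        if similarity(dataset_text, asr_text) >= threshold
    ]
```

The threshold needs tuning per language and recognizer, since recognition errors lower the score even for correctly labeled clips.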

  • @ssix9448 5 months ago

    Hi sir!
    You're doing a great job with TTS. Are you planning to release the Hindi TTS model?

    • @Jarods_Journey 5 months ago

      Won't be releasing the model, but the code to train it will all be available.

  • @codemaster911 6 months ago

    Thank you! What dataset length do you recommend for a high-quality result?

  • @michikoangelineoey980 6 months ago

    Do we need to make a new tokenizer if the language only uses Latin characters?

  • @sukhpalsukh3511 6 months ago

    Suppose I have Hindi language audios with transcription I manually created or use any script,

  • @exelyugure 2 months ago +1

    You changed the code to the point that many things are broken. For now, it's unusable.

  • @novatft5597 6 months ago

    Can you share the code you used in this video?

  • @sukhpalsukh3511 6 months ago

    Appreciate your work, but it's complicated to understand. Could you please explain with just simple examples?