The subtitles within the first minute are fantabulous, Hollywood industry level... pretty creative! Thumbs up and a salute to Nat Lamir.
😂🙏
This is some great work. Thanks for sharing your progress through example.
I was looking forward to using fine-tuning to come up with some variations of the same voice, plus some kind of program to know when to switch or vary the outputs, similar to real human speech. This yelling voice works amazingly well and I look forward to what else you come up with.
Thanks for your enthusiasm, Jonas! Your idea about dynamically switching outputs is intriguing and could lead to even more realistic AI speech synthesis.
Dude, this was one of the funniest videos I have seen in a while. Comedy gold.
Glad you enjoyed it! I aim to make learning about AI both informative and entertaining.
@@Natlamir Yes, but this was on another level. Besides the knowledge, teaching, and entertainment to keep the audience engaged, the voices chosen, the script, the comedic timing, the responses, and the editing were truly entertaining. Dare I say distracting from its academic purpose. Obviously you have a great sense of humor. It took me 30 minutes to watch it in its entirety because my laughter was drowning out the content, causing me to seek back every few seconds so I could absorb what you were trying to convey. I had to download it so that when I'm ready to tackle this endeavor it's backed up, in case the YT AI revisits and flags something as inappropriate by some off chance.
@@mitchmorgan6359 haha! That is awesome! Thank you! 🙏 I would like to do this kind of thing more often in future videos.
I'm totally going to set up Farnsworth's voice from Futurama with this.
That's an awesome idea! Farnsworth's voice would be perfect for this kind of project. Good luck with it!
Something I just found out by trying this: if you want to fine-tune a medium model, the quality of the model under cell "4. Settings" needs to match the quality of the base model, i.e., medium model = medium quality, high model = high quality.
Great observation! Thanks for sharing this tip about matching model quality settings. It'll definitely help others avoid potential issues.
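A minimal sketch of that check, assuming hypothetical variable names and the "-medium"/"-high" filename convention (both are my assumptions, not the notebook's actual code):

```python
# Sanity check (assumed names): make sure the quality picked in cell "4. Settings"
# matches the quality baked into the base checkpoint you are fine-tuning from.
model_quality = "medium"                      # value chosen in cell "4. Settings"
base_checkpoint = "en_US-lessac-medium.ckpt"  # pretrained checkpoint being fine-tuned

if model_quality not in base_checkpoint:
    raise ValueError(
        f"Quality setting '{model_quality}' does not match base checkpoint "
        f"'{base_checkpoint}'. Mismatched architectures will fail to load or train badly."
    )
print("Quality settings match.")
```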
This voice is just fantastic 🎉
Thank you! I'm glad you like the voice.
Sir, there is only the pretrained ckpt file but no last ckpt file. I can't find the file?
What is the folder structure of your Google Drive? Is there a lightning_logs folder?
If you followed this up with a video on installing it locally, that would be very nice.
I will need to look into that at some point; I think some components may require Linux, however.
Extremely useful, thanks for the upload, but why are you yelling haha! nah, this was epic and funny.. and it's a great example of the power of ML.. Thanks again
Haha, thanks! I'm glad you found it both useful and entertaining. That's exactly what I was going for!
Great video!
I am only getting the onnx.json file generated at the end though, not the voice model too... Any idea why?!
Interesting. I plan to revisit this in the future when I may need to retrain my original voice.
@@Natlamir From having a dig around, it looks like it's an issue that has occurred before... Hopefully it will be rectified soon!
@@SvenSvenson-y6s Thanks for looking into it. I appreciate your effort in investigating the issue. 🙏
It's so deeply frustrating. I think the only tutorial you've made that has worked straight up is the basic Piper install, and that's just an exe.
I get all the way through to the inference colab (which you say is optional, but isn't?) and the colab won't load the ckpt. It says it has, but then the final step says no voice is loaded.
I'm sorry you're having difficulties. The inference colab is meant to be optional, but it's useful for testing. For the ckpt loading issue, double-check your file paths and ensure the checkpoint file is in the correct location. I would like to revisit this at some point to see if anything has changed with the process.
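As a rough debugging sketch (the paths below are assumptions, substitute the ones from your own training run), you can confirm from a Colab cell that the checkpoint and its config are actually where the inference notebook expects them:

```python
import os

# Assumed Drive layout; adjust to match your own training output folder.
ckpt_path = "/content/drive/MyDrive/piper/my-voice/lightning_logs/version_0/checkpoints/last.ckpt"
config_path = "/content/drive/MyDrive/piper/my-voice/config.json"

for path in (ckpt_path, config_path):
    if os.path.isfile(path):
        print(f"OK       {path} ({os.path.getsize(path) / 1e6:.1f} MB)")
    else:
        print(f"MISSING  {path} - re-check the Drive mount and folder names")
```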
Super helpful video 👌🏾👌🏾👏🏾👏🏾👏🏾👏🏾
Thank you! I'm happy you found it helpful.
Great! How do I retrain the model again, on new audio clips?
Yeah, I fine-tuned an existing model and I'm not sure how to continue training.
To retrain on new audio clips, you'll need to prepare your new dataset, adjust the training configuration, and run the training process again. There might be a way to continue where you left off while also adding more audio; you might need to check the official documentation.
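For anyone comfortable going outside the notebook, here is a hedged sketch of what continuing a run might look like with piper_train after preprocessing the new clips into the same dataset directory. The flags follow the style of Piper's training docs but may differ between versions, so treat the names, paths, and values as assumptions and verify against the official TRAINING.md:

```python
import subprocess

# Resume from the last checkpoint on the expanded dataset
# (all paths and hyperparameters below are placeholders).
cmd = [
    "python3", "-m", "piper_train",
    "--dataset-dir", "/content/drive/MyDrive/piper/my-voice",
    "--resume_from_checkpoint", "/content/drive/MyDrive/piper/my-voice/last.ckpt",
    "--quality", "medium",
    "--accelerator", "gpu",
    "--devices", "1",
    "--batch-size", "16",
    "--max_epochs", "6000",
    "--checkpoint-epochs", "1",
    "--precision", "32",
]
subprocess.run(cmd, check=True)
```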
Exactly what software do we need to have installed on our PC already?
If using Google Colab, you shouldn't need any software because it will run on Google's T4 GPU.
I am unable to get the last ckpt in the Drive folder. Can you please mention clearly where it is? I have 20 wav files, but it is my first time. Thank you
There should be a file in the lightning_logs/version_0/checkpoints folder.
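If the version number is different on your run, a quick way to hunt for the newest checkpoint from a Colab cell is sketched below; the base folder is an assumption, so point it at your own output directory:

```python
import glob
import os

base = "/content/drive/MyDrive/piper/my-voice"  # assumed output folder
pattern = os.path.join(base, "lightning_logs", "version_*", "checkpoints", "*.ckpt")
ckpts = sorted(glob.glob(pattern), key=os.path.getmtime)

if ckpts:
    print("Newest checkpoint:", ckpts[-1])
else:
    print("No checkpoints found - training may not have reached a save point yet.")
```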
Great video 😀👍
🙏
How do I do it locally, without any colab?
It can be done with WSL on Windows.
@@Natlamir Is there an Alex Jones onnx already?
@@_zproxy 😂 Let me know if you find one
The content of this channel is wonderful. Thank you. Is it possible to get pre-trained files and download them from the Internet? I don't think I'll be able to train voices like you do; it's hard for me. I am looking for the Arabic language. Please help. For your information, the Arabic voice in the Piper program is weak and does not read texts well.
Thank you for your kind words. While pre-trained Arabic models might be available online, their quality can vary. For better results, you might want to consider training a model with high-quality Arabic voice data.
@@Natlamir Welcome back. It's been a long time since I wrote the previous comment. I've downloaded many tools to generate voice, and Coqui was the best for pronouncing Arabic. I'm now looking for a tool to convert an image to a speaking video with lip sync.
I have an 8 GB graphics card, is that OK? GeForce RTX, 8 GB.
If you are running locally, that might work; you might have to adjust some parameters like batch size.
Do you have a download link for this old man model?
I don't think I have this one anymore, unfortunately; I will have to try to find it.
Can we use Persian language in the program?
Unfortunately, it doesn't look like Persian is available on that Hugging Face page with the models.
Old voice, hooray 🎉
🙏
Nice xD
Thanks! Glad you enjoyed it.
can we control the speed of the voice ?
Yes, there is an open issue for that; I will be researching it and then will work to implement it at some point: github.com/natlamir/PiperUI/issues/3
You can do that with the inference (CKPT) or (ONNX) notebooks through the interface.
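For context, Piper voices are VITS-based and the inference notebooks typically expose a length-scale style control: larger values stretch the audio (slower speech) and smaller values compress it (faster). A tiny sketch of picking a value to enter in the notebook's settings cell, where the preset numbers are assumptions to experiment with rather than official values:

```python
# Assumed presets for a length_scale-style speed control; 1.0 is the model default.
SPEED_PRESETS = {
    "fast": 0.8,    # shorter phoneme durations -> faster speech
    "normal": 1.0,
    "slow": 1.3,    # longer phoneme durations -> slower speech
}

def pick_length_scale(speed: str) -> float:
    """Return a length-scale value to enter in the inference notebook."""
    return SPEED_PRESETS.get(speed, 1.0)

print(pick_length_scale("slow"))
```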
hahaha great
🙏
Is a Bengali voice available?
I don't have specific information about Bengali voice models. You might need to train one yourself or search for pre-trained Bengali models online.
I can't listen to this voice, it makes me anxious
Thanks for your feedback. I understand the voice may not be for everyone. You might prefer watching with subtitles or reading the video transcript instead.
Everything is fine, but it does not work in Indian languages like Hindi. Sad to see the partiality of the developer.
Unfortunately, there is no Hindi voice, just Nepali.