The subtitles within the first minute are fantabulous, Hollywood industry level... pretty creative! Thumbs up and a salute to Nat Lamir.
😂🙏
This is some great work. Thanks for sharing your progress through example.
I was looking forward to using fine-tuning to come up with some variations of the same voice, plus some kind of program to know when to switch or vary the outputs, similar to real human speech. This yelling voice works amazingly well and I look forward to what else you come up with.
Thanks for your enthusiasm, Jonas! Your idea about dynamically switching outputs is intriguing and could lead to even more realistic AI speech synthesis.
Dude, this was one of the funniest videos I have seen in a while. Comedy gold.
Glad you enjoyed it! I aim to make learning about AI both informative and entertaining.
@@Natlamir Yes, but this was on another level. Besides the knowledge, teaching, and entertainment to keep the audience engaged, the voices chosen, the script, the comedic timing, the responses, and the editing were truly entertaining. Dare I say distracting from its academic purpose. Obviously you have a great sense of humor. It took me 30 minutes to watch it in its entirety because my laughter was drowning out the content, causing me to seek back every few seconds so I could absorb what you were trying to convey. I had to download it so that when I'm ready to tackle this endeavor it's backed up, in case the YT AI revisits and flags something as inappropriate by some off chance.
@@mitchmorgan6359 haha! That is awesome! Thank you! 🙏 I would like to do this kind of thing more often in future videos.
I'm totally going to set up Farnsworth's voice from Futurama with this.
That's an awesome idea! Farnsworth's voice would be perfect for this kind of project. Good luck with it!
Something I just found out by trying this: if you want to fine-tune a medium model, the quality of the model under cell "4. Settings" needs to match the quality of the base model, i.e., medium model = medium quality, high model = high quality.
Great observation! Thanks for sharing this tip about matching model quality settings. It'll definitely help others avoid potential issues.
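A minimal sketch of that check, assuming hypothetical variable names and the "-medium"/"-high" filename convention (both are my assumptions, not the notebook's actual code):

```python
# Sanity check (assumed names): make sure the quality picked in cell "4. Settings"
# matches the quality baked into the base checkpoint you are fine-tuning from.
model_quality = "medium"                      # value chosen in cell "4. Settings"
base_checkpoint = "en_US-lessac-medium.ckpt"  # pretrained checkpoint being fine-tuned

if model_quality not in base_checkpoint:
    raise ValueError(
        f"Quality setting '{model_quality}' does not match base checkpoint "
        f"'{base_checkpoint}'. Mismatched architectures will fail to load or train badly."
    )
print("Quality settings match.")
```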
This voice is just fantastic 🎉
Thank you! I'm glad you like the voice.
Sir, there is only the pretrained ckpt file but no last ckpt file. I can't find the file?
What is the folder structure of your Google Drive? Is there a lightning_logs folder?
If you followed this up with a video on installing it locally, that would be very nice.
I will need to look into that at some point; I think some components may require Linux, however.
Extremely useful, thanks for the upload, but why are you yelling haha! nah, this was epic and funny.. and it's a great example of the power of ML.. Thanks again
Haha, thanks! I'm glad you found it both useful and entertaining. That's exactly what I was going for!
Great video!
I am only getting the onnx.json file generated at the end though, not the voice model too... Any idea why?!
Interesting. I plan to revisit this in the future when I may need to retrain my original voice.
@@Natlamir From having a dig around, it looks like it's an issue that has occurred before... Hopefully it will be rectified soon!
@@SvenSvenson-y6s Thanks for looking into it. I appreciate your effort in investigating the issue. 🙏
It's so deeply frustrating. I think the only tutorial you've made that has worked straight up is the basic Piper install, and that's just an exe.
I get all the way through to the inference colab (which you say is optional, but isn't?) and the colab won't load the ckpt. It says it has, but then the final step says no voice is loaded.
I'm sorry you're having difficulties. The inference colab is meant to be optional, but it's useful for testing. For the ckpt loading issue, double-check your file paths and ensure the checkpoint file is in the correct location. I would like to revisit this at some point to see if anything has changed with the process.
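As a rough debugging sketch (the paths below are assumptions, substitute the ones from your own training run), you can confirm from a Colab cell that the checkpoint and its config are actually where the inference notebook expects them:

```python
import os

# Assumed Drive layout; adjust to match your own training output folder.
ckpt_path = "/content/drive/MyDrive/piper/my-voice/lightning_logs/version_0/checkpoints/last.ckpt"
config_path = "/content/drive/MyDrive/piper/my-voice/config.json"

for path in (ckpt_path, config_path):
    if os.path.isfile(path):
        print(f"OK       {path} ({os.path.getsize(path) / 1e6:.1f} MB)")
    else:
        print(f"MISSING  {path} - re-check the Drive mount and folder names")
```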
Super helpful video 👌🏾👌🏾👏🏾👏🏾👏🏾👏🏾
Thank you! I'm happy you found it helpful.
Great! How do I retrain the model again, on new audio clips?
Yeah, I fine-tuned an existing model and I'm not sure how to continue training.
To retrain on new audio clips, you'll need to prepare your new dataset, adjust the training configuration, and run the training process again. There might be a way to continue where you left off while also adding more audio; you might need to check the official documentation.
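For anyone comfortable going outside the notebook, here is a hedged sketch of what continuing a run might look like with piper_train after preprocessing the new clips into the same dataset directory. The flags follow the style of Piper's training docs but may differ between versions, so treat the names, paths, and values as assumptions and verify against the official TRAINING.md:

```python
import subprocess

# Resume from the last checkpoint on the expanded dataset
# (all paths and hyperparameters below are placeholders).
cmd = [
    "python3", "-m", "piper_train",
    "--dataset-dir", "/content/drive/MyDrive/piper/my-voice",
    "--resume_from_checkpoint", "/content/drive/MyDrive/piper/my-voice/last.ckpt",
    "--quality", "medium",
    "--accelerator", "gpu",
    "--devices", "1",
    "--batch-size", "16",
    "--max_epochs", "6000",
    "--checkpoint-epochs", "1",
    "--precision", "32",
]
subprocess.run(cmd, check=True)
```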
Exactly what software do we need to have installed on our PC already?
If using Google Colab, you shouldn't need any software because it will run on Google's T4 GPU.
I am unable to get the last ckpt in the Drive folder. Can you please mention clearly where it is? I have 20 wav files, but it is my first time. Thank you
There should be a file in the lightning_logs/version_0/checkpoints folder.
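If the version number is different on your run, a quick way to hunt for the newest checkpoint from a Colab cell is sketched below; the base folder is an assumption, so point it at your own output directory:

```python
import glob
import os

base = "/content/drive/MyDrive/piper/my-voice"  # assumed output folder
pattern = os.path.join(base, "lightning_logs", "version_*", "checkpoints", "*.ckpt")
ckpts = sorted(glob.glob(pattern), key=os.path.getmtime)

if ckpts:
    print("Newest checkpoint:", ckpts[-1])
else:
    print("No checkpoints found - training may not have reached a save point yet.")
```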
Great video 😀👍
🙏
How do I do it locally, without any colab?
It can be done with WSL on Windows.
@@Natlamir Is there an Alex Jones onnx already?
@@_zproxy 😂 Let me know if you find one
The content of this channel is wonderful. Thank you. Is it possible to get pre-trained files and download them from the Internet? I don't think I'll be able to train voices like you do; it's hard for me. I am looking for the Arabic language. Please help. For your information, the Arabic voice in the Piper program is weak and does not read texts well.
Thank you for your kind words. While pre-trained Arabic models might be available online, their quality can vary. For better results, you might want to consider training a model with high-quality Arabic voice data.
@@Natlamir Welcome back. It's been a long time since I wrote the previous comment. I've downloaded many tools to generate voice, and Coqui was the best for pronouncing Arabic. I'm now looking for a tool to convert an image to a speaking video with lip sync.
I have an 8 GB graphics card, is that OK? GeForce RTX, 8 GB.
If you are running locally, that might work; you might have to adjust some parameters like batch size.
Do you have a download link for this old man model?
I don't think I have this one anymore, unfortunately; I will have to try to find it.
Can we use Persian language in the program?
Unfortunately, it doesn't look like Persian is available on that Hugging Face page with the models.
Old voice, hooray 🎉
🙏
Nice xD
Thanks! Glad you enjoyed it.
can we control the speed of the voice ?
Yes, there is an open issue for that; I will be researching it and then will work to implement it at some point: github.com/natlamir/PiperUI/issues/3
You can do that with the inference (CKPT) or (ONNX) notebooks through the interface.
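For context, Piper voices are VITS-based and the inference notebooks typically expose a length-scale style control: larger values stretch the audio (slower speech) and smaller values compress it (faster). A tiny sketch of picking a value to enter in the notebook's settings cell, where the preset numbers are assumptions to experiment with rather than official values:

```python
# Assumed presets for a length_scale-style speed control; 1.0 is the model default.
SPEED_PRESETS = {
    "fast": 0.8,    # shorter phoneme durations -> faster speech
    "normal": 1.0,
    "slow": 1.3,    # longer phoneme durations -> slower speech
}

def pick_length_scale(speed: str) -> float:
    """Return a length-scale value to enter in the inference notebook."""
    return SPEED_PRESETS.get(speed, 1.0)

print(pick_length_scale("slow"))
```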
hahaha great
🙏
Is a Bengali voice available?
I don't have specific information about Bengali voice models. You might need to train one yourself or search for pre-trained Bengali models online.
I can't listen to this voice, it makes me anxious
Thanks for your feedback. I understand the voice may not be for everyone. You might prefer watching with subtitles or reading the video transcript instead.
Everything is fine, but it does not work in Indian languages like Hindi. Sad to see the partiality of the developer.
Unfortunately, there is no Hindi voice, just Nepali.