I believe the review, demo and installation in that order are the future of video reviews. Good on you!🥇
thanks
This is an excellent video on these three TTS, thank you for your work JV.
Just the video I was looking for. Your eyes can be a bit brighter now, I've subscribed!
They are indeed. Thanks!
a tortoise docker file would be awesome!
copy that, I've got an introduction to Docker video on my todo list and my current plan is to use tortoise as a demo project for it.
I wonder if there's any way to run the game voice thing through the CLI. I'm bulk-translating all the voice lines in an existing game to learn a new language, but there are thousands of lines, so I don't want to copy-paste each one by hand and then download and name them... Also, my GPU is from like 2013.
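For the bulk case described above, a rough sketch of a batch loop might look like the following. It assumes the voice lines have been exported to a two-column CSV (line ID, text), which is a made-up layout, and it takes the actual text-to-speech call as a `synthesize` function that you supply yourself. The names `safe_name`, `read_lines` and `batch_synthesize` are hypothetical helpers, not part of any TTS project.

```python
import csv
import re
from pathlib import Path
from typing import Callable, Iterator, List, Tuple


def safe_name(line_id: str) -> str:
    """Turn a line ID into a filesystem-friendly file stem."""
    return re.sub(r"[^A-Za-z0-9_-]+", "_", line_id).strip("_") or "line"


def read_lines(csv_path: Path) -> Iterator[Tuple[str, str]]:
    """Yield (line_id, text) pairs from a two-column CSV (assumed layout)."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            # Skip malformed rows and rows with empty text.
            if len(row) >= 2 and row[1].strip():
                yield row[0], row[1].strip()


def batch_synthesize(
    csv_path: Path,
    out_dir: Path,
    synthesize: Callable[[str, Path], None],
) -> List[Path]:
    """Call synthesize(text, wav_path) once per line and name files by line ID."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for line_id, text in read_lines(csv_path):
        target = out_dir / f"{safe_name(line_id)}.wav"
        if target.exists():
            # Already rendered, so an interrupted run can be resumed.
            continue
        synthesize(text, target)
        written.append(target)
    return written
```

Skipping files that already exist matters on an older GPU, where a run over thousands of lines may need to be stopped and restarted several times.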
Good channel, look forward to future content.
thanks for the encouragement :)
Thank you very very much. You helped me a lot. This is what I was looking for.
You can use Conda to easily set up an environment for running Tortoise with Python 3.9.
Yeah, that's what I did. I found that it was still a bit of work to get all the dependencies happy though. I've also found that for a bunch of folks new to coding, Docker is more accessible than a conda environment.
Tortoise sounds great but I was NOT ready to hear the "deniro" model sound British instead of Bronx LOL
Tortoise is good but very slow. Is the reason for this that it starts over from the voice training set every time? You mentioned the ability to save an intermediate vector of the voice. Could you cover that in a video, along with whether it improves the speed? Thanks.
Yeah, its name is apt. I've seen a few derivative projects kicking around that claim significant speedups. I'm waiting until I find one with good voice cloning, multilingual abilities and a license that allows commercial use before doing the follow-up.
Hi there! Great video. I came across it when I was looking for a script for Linux that uses Mimic3 to pronounce highlighted text, similar to a built-in feature in macOS. I've been trying to make it work for a few weeks now but so far without success. I would appreciate it if you could create a video on this. Thank you!
Copy that, I'll give it a think. What part of the project are you finding the hardest? Getting the text, turning it into speech, or something else?
@LearnCodeWithJV Truth be told, I have a script but it doesn't work. I've been trying to make it work with xbindkeys. Plus, I found another script online, but it's for eSpeak. I haven't managed to modify it, so I would appreciate your help in this matter. In my opinion, it's a very useful macOS feature, which I'd like to have on Linux with a decent voice. eSpeak is too robotic and outdated.
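A minimal sketch of the "speak highlighted text" idea, in case it helps narrow down which part is failing. It assumes an X11 session with `xclip` installed for reading the highlighted text. The TTS and player commands are passed in as parameters, because the exact command line for Mimic3 (or Piper) is an assumption here: the sketch expects a command that reads text on stdin and writes WAV audio to stdout, which you would need to confirm against your install. `clean_selection`, `get_selection` and `speak` are hypothetical helper names.

```python
import subprocess
from typing import List


def clean_selection(raw: str) -> str:
    """Collapse line breaks and repeated spaces so the text reads as one passage."""
    return " ".join(raw.split())


def get_selection() -> str:
    """Read the X11 primary selection (the currently highlighted text) via xclip."""
    result = subprocess.run(
        ["xclip", "-o", "-selection", "primary"],
        capture_output=True,
        text=True,
        check=True,
    )
    return clean_selection(result.stdout)


def speak(text: str, tts_cmd: List[str], player_cmd: List[str]) -> None:
    """Pipe text into the TTS command and pipe its audio output into the player.

    Assumes tts_cmd reads text from stdin and writes WAV data to stdout.
    """
    if not text:
        return  # nothing highlighted, nothing to say
    tts = subprocess.Popen(tts_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    player = subprocess.Popen(player_cmd, stdin=tts.stdout)
    tts.stdout.close()  # let the player see end-of-file when the TTS process exits
    tts.stdin.write(text.encode("utf-8"))
    tts.stdin.close()
    player.wait()
    tts.wait()
```

Saved as a script with a final line such as `speak(get_selection(), ["mimic3"], ["aplay", "-q"])`, it could be bound to a hotkey in xbindkeys. Running `get_selection()` and `speak()` separately from a terminal first should show whether the problem is in grabbing the text, producing the audio, or the key binding.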
What I would find amazing would be to find out how to build stuff like Tortoise myself. I'm in my 4th semester, but we haven't done much machine learning and I have no clue how to make advanced stuff like Tortoise TTS.
You might want to reach out to the person who created Tortoise. I've seen them comment on GitHub that they were thinking of publishing their training methodology.
What GPU do you have? As a reference for just how hefty the last one is.
RTX 3060, so pretty lightweight.
@LearnCodeWithJV I have the 4060 Ti, would that heavy one run well on mine?
Cool cool video. I've been having a wild time trying to get tortoise-tts to work on the GPU. It works on the CPU but is very, very slow.
It was suggested that I look into PyTorch GPU tutorials. Looks like my conda environment wasn't allowing it, but now I'm able to run CUDA and use the GPU for things. Now I'm trying to get back into tortoise-tts, perhaps with a fork install, in hopes I can get some practical use out of it. My GPU is tiny, a 1660, but it's better than the CPU. Looks like quite a few folks have noticed outdated information. Any additional advice appreciated.
I was finding it slow going on my 3060, so I can imagine what it felt like on a 1660, and it would be really tough to get useful work out of a CPU. When I build a Dockerfile for the project I'll play around a bit more to see if there are any obvious performance tweaks that might be useful.
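For anyone hitting the same "conda environment only uses the CPU" problem, a quick check like the one below can confirm what PyTorch actually sees before blaming Tortoise itself. It is only a diagnostic sketch: `describe_device` is a hypothetical helper name, and it reports on whichever Python environment you run it from.

```python
def describe_device() -> str:
    """Report whether PyTorch can see a CUDA GPU in the current environment."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if not torch.cuda.is_available():
        # Often means a CPU-only PyTorch build or a driver/toolkit mismatch.
        return f"PyTorch {torch.__version__} found, but CUDA is not available (CPU only)"
    props = torch.cuda.get_device_properties(0)
    vram_gb = props.total_memory / 1024**3
    return f"CUDA GPU: {props.name} with {vram_gb:.1f} GB VRAM"


print(describe_device())
```

Run it inside the same conda environment (or container) that Tortoise runs in, since a different environment can have a different PyTorch build.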
So Tortoise TTS is great, but when using a model trained by someone else (so I know from its output that it works), 2 out of 8 sentences were spoken in a male voice even though it was a Melina (female) voice.
If you have ANY clue why that would happen... I would love to know. This is all way beyond me, I'm a mere end user and not a programmer.
I don't have any insights into that particular issue, sorry.
You could update the description to say that Mimic3 has "moved" to Piper, which is still being improved by the same author.
I wasn't aware of that, thanks for sharing. Updating the description now.
I would love a Dockerfile if you have one. Thanks for the video. If you are into training text-to-image and text-to-speech models, voice cloning, and fine-tuning for specific use cases (tuning an LLM to write stories in a genre, or to review data and report in a specific way), I for one would love to see how you approach it. Thanks again.
I've got it on my list and plan to create one at some stage if no one else does.
I would like to second that request. I've been wanting to try Tortoise for a while, but the Python dependency issues have kept me from mustering the energy to do so. @LearnCodeWithJV
I just went to look at doing this and turns out someone added one last week - github.com/neonbjb/tortoise-tts#docker
Looks like development pace has picked up on the repo in the last few months.
I don't see an Advanced Editor in the menu on the Coqui website.
I remember it only showing up some of the time and being disabled when a voice was rendering. I have a half memory of it only working for built-in voices, but I can't recall.
any opinion on Tacotron 2?
You must also include Bark and VALL-E.
Yes, they would be good to include if I make a follow up video.
If possible, please compare LLMs for voice assistants and also show how to fine-tune them.
copy that, I've got it on my list
Great video
Thanks!
What requirements are they using?
which project were you referring to?
Tortoise is the best, its intonation sounds like a real human, however it's quite slow =]]
yeah, it definitely lives up to its namesake
what about voice cloning ?
It's on the list, stay tuned.
Well F**k me lol...
Coqui is shutting down.
Thank you for all your support! ❤
Coqui-ai shut down in the meantime. The website UI I mean, not the repo.
yeah, it's a shame they couldn't make a go of it