Free Speech: Reviewing Coqui-ai, Mycroft Mimic3 and Tortoise TTS Libraries

Learn Code With JV

20K views


Published: 18 Oct 2024

Comments • 52

@ForTheEraOfLove 7 months ago
I believe the review, demo and installation, in that order, are the future of video reviews. Good on you!🥇
@LearnCodeWithJV 6 months ago
thanks
@timemasheen5031 4 months ago
This is an excellent video on these three TTS libraries, thank you for your work JV.
@chrisBruner 1 year ago
Just the video I was looking for. Your eyes can be a bit brighter now, I've subscribed!
@LearnCodeWithJV 1 year ago
They are indeed. Thanks!
@adamabbassi 1 year ago ⁺⁸
A Tortoise Dockerfile would be awesome!
@LearnCodeWithJV 1 year ago
Copy that. I've got an introduction to Docker video on my to-do list, and my current plan is to use Tortoise as the demo project for it.
@someuser4166 4 months ago
I wonder if there's any way to run the game voice thing through a CLI. I'm bulk translating all the voice lines in an already existing game to learn a new language, but there are thousands of lines, so I don't want to copy and paste each one by hand, then download and name them... Also, my GPU is from like 2013.
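For the bulk case described above, a minimal Python sketch of the bookkeeping side might look like the following. It only plans output file names and loops over the lines; the actual speech synthesis is passed in as a callable, because the exact command or API depends on which TTS library is used. The names `slugify`, `plan_batch`, `run_batch` and the `synthesize` callable are hypothetical and are not part of Coqui, Mimic3 or Tortoise.

```python
import re
from pathlib import Path
from typing import Callable, Iterable


def slugify(text: str, max_len: int = 40) -> str:
    """Turn a voice line into a short, filesystem-safe name fragment."""
    slug = re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")
    # Lines with no ASCII letters or digits fall back to a generic name.
    return slug[:max_len] or "line"


def plan_batch(lines: Iterable[str], out_dir: str) -> list[tuple[str, Path]]:
    """Pair every non-empty line with a numbered .wav path.

    The number is the line's position in the input, so blank lines
    leave gaps and file names still match the source line numbers.
    """
    jobs = []
    for index, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue
        jobs.append((text, Path(out_dir) / f"{index:05d}_{slugify(text)}.wav"))
    return jobs


def run_batch(jobs, synthesize: Callable[[str, Path], None]) -> int:
    """Call synthesize(text, path) for each job whose file does not exist yet.

    `synthesize` is an assumption: wrap whichever TTS tool you use in it.
    Returns the number of files produced in this run.
    """
    done = 0
    for text, path in jobs:
        if path.exists():
            continue  # lets an interrupted run resume where it stopped
        path.parent.mkdir(parents=True, exist_ok=True)
        synthesize(text, path)
        done += 1
    return done
```

Skipping files that already exist matters on older hardware, where a run over thousands of lines is likely to be interrupted and restarted.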
@nathandscott1 1 year ago ⁺¹
Good channel, look forward to future content.
@LearnCodeWithJV 1 year ago
thanks for the encouragement :)
@WeilianLi-h6q 7 months ago
Thank you very, very much. You helped me a lot. This is what I was looking for.
@JesseJuup 1 year ago ⁺²
You can use Conda to easily set up an environment for running Tortoise with Python 3.9.
@LearnCodeWithJV 1 year ago ⁺²
Yeah, that's what I did - I found that it was still a bit of work to get all the dependencies happy though. I've also found that for a bunch of folks new to coding, Docker is more accessible than a conda environment.
@giacomosiiii 7 months ago
Tortoise sounds great but I was NOT ready to hear the "deniro" model sound British instead of Bronx LOL
@stevecato 1 year ago
Tortoise is good but very slow. Is the reason for this that it starts over from the voice training set every time? You mentioned the ability to save an intermediate vector of the voice - could you cover that in a video, and whether it improves the speed? Thanks.
@LearnCodeWithJV 1 year ago
Yeah, its name is apt. I've seen a few derivative projects kicking around which are claiming significant speedups. I'm waiting until I find one with good voice cloning and multilingual abilities and a commercially available license to do the follow-up.
@obeyoutube 1 year ago ⁺³
Hi there! Great video. I came across it when I was looking for a script for Linux that uses Mimic3 to pronounce highlighted text, similar to a built-in feature in macOS. I've been trying to make it work for a few weeks now but so far without success. I would appreciate it if you could create a video on this. Thank you!
@LearnCodeWithJV 1 year ago ⁺²
Copy that, I'll give it a think - what part of the project are you finding the hardest? Getting the text, turning it into speech, or something else?
@obeyoutube 1 year ago ⁺¹
@LearnCodeWithJV Truth be told, I have a script but it doesn't work. I've been trying to make it work with xbindkeys. Plus, I found another script online, but for eSpeak. I haven't managed to modify it, so I would appreciate your help in this matter. In my opinion, it's a very useful macOS feature, which I'd like to have on Linux with a decent voice. eSpeak is too robotic and outdated.
@SumriseHD 1 year ago ⁺¹
What I would find amazing would be to find out how to build stuff like Tortoise myself. I'm in my 4th semester, but we didn't do a lot of machine learning and I have no clue how to make advanced stuff like Tortoise TTS.
@LearnCodeWithJV 1 year ago ⁺¹
You might want to reach out to the person who created Tortoise. I've seen them comment on GitHub that they were thinking of publishing their training methodology.
@MistakingManx 6 months ago
What GPU do you have? As a reference for just how hefty the last one is.
@LearnCodeWithJV 6 months ago
RTX 3060, so pretty lightweight
@MistakingManx 6 months ago
@LearnCodeWithJV I have the 4060 Ti, would that heavy one run well on mine?
@ROBBIEP 1 year ago ⁺²
Cool cool video. I've been having a wild time trying to get tortoise-tts to work on the GPU. It works on CPU but very, very slowly.
I was advised to look into PyTorch GPU tutorials. It looks like my conda environment wasn't allowing it, but now I'm able to run CUDA and use the GPU for things. Now I'm trying to get back into Tortoise TTS, perhaps with a fork install, in hopes I can get some practicality from it. My GPU is tiny, a 1660, but it's better than the CPU. It looks like quite a few folks have noticed outdated information. Any additional advice appreciated.
@LearnCodeWithJV 1 year ago
I was finding it slow going on my 3060, so I can imagine what it felt like on a 1660, and on CPU it would be really tough to get useful work from it. When I build a Dockerfile for the project I'll play around a bit more to see if there are any obvious performance tweaks that might be useful.
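For anyone hitting the same "conda environment wasn't allowing it" problem, a quick sanity check is to ask PyTorch directly whether it can see a CUDA device before blaming the TTS library. The sketch below is only an illustration: the helper name `describe_torch_device` is made up for this example, and it assumes nothing beyond PyTorch's standard `torch.cuda.is_available()` and `torch.cuda.get_device_name()` calls.

```python
import importlib.util


def describe_torch_device() -> str:
    """Report which device PyTorch would be able to use in this environment."""
    # Avoid an ImportError if PyTorch is missing from the active environment.
    if importlib.util.find_spec("torch") is None:
        return "torch-not-installed"

    import torch

    if torch.cuda.is_available():
        # Device 0 is the first GPU visible to this process.
        return f"cuda:{torch.cuda.get_device_name(0)}"
    # A CPU-only PyTorch build or a driver mismatch both end up here.
    return "cpu"


if __name__ == "__main__":
    print(describe_torch_device())
```

If this prints `cpu` on a machine with an NVIDIA card, the usual suspects are a CPU-only PyTorch build in the environment or a driver mismatch, rather than anything specific to Tortoise.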
@LucidFirAI 1 year ago
So Tortoise TTS is great, but when using a model trained by someone else (so I know from their output that it works), 2 out of 8 sentences were spoken in a male voice, even though it was a Melina (female) voice.
If you have ANY clue why that would happen... I would love to know. This is all way beyond me, I'm a mere end user and not a programmer.
@LearnCodeWithJV 1 year ago
I don't have any insights into that particular issue, sorry.
@gfhdlsk 1 year ago
You could add an update that Mimic3 has "moved" to Piper, which is still being improved by the same author.
@LearnCodeWithJV 1 year ago
I wasn't aware of that - thanks for sharing. Updating the description now.
@clunescrossingangus2219 1 year ago
I would love a Dockerfile if you have one. Thanks for the video. If you are into the training of text-to-image and text-to-speech models, cloning of voices, and fine-tuning for use cases (tuning an LLM to write stories in a genre, or to review data and report in a specific way), I for one would love to see how you approach it. Thanks again.
@LearnCodeWithJV 1 year ago
I've got it on my list and plan to create one at some stage if no one else does.
@morganandreason 1 year ago ⁺¹
@LearnCodeWithJV I would like to second that request - I've been wanting to try Tortoise for a while, but the Python dependency issues have been keeping me from mustering the energy to do so.
@LearnCodeWithJV 1 year ago ⁺¹
I just went to look at doing this and it turns out someone added one last week - github.com/neonbjb/tortoise-tts#docker
Looks like the development pace has picked up on the repo in the last few months.
@jamiediromero5016 1 year ago
I see no Advanced Editor in the menu on the Coqui website.
@LearnCodeWithJV 1 year ago
I remember it only showing up some of the time and being disabled when a voice was rendering. I have a half memory of it only working for built-in voices, but I can't recall.
@mdsalahuddin2841 9 months ago
Any opinion on Tacotron 2?
@blender_wiki 1 year ago
You must also include Bark and VALL-E.
@LearnCodeWithJV 1 year ago
Yes, they would be good to include if I make a follow-up video.
@DevasheeshMishra 1 year ago
If possible, please compare LLMs for voice assistants and also show how to fine-tune them.
@LearnCodeWithJV 1 year ago ⁺²
copy that, I've got it on my list
@Yeeeeeehaw 1 year ago
Great video
@LearnCodeWithJV 1 year ago
Thanks!
@abiramyk3514 1 year ago
What requirements are they using?
@LearnCodeWithJV 1 year ago
Which project were you referring to?
@sinhnguyen4155 1 year ago
Tortoise is the best, its intonation sounds like a real human, however it's quite slow =]]
@LearnCodeWithJV 1 year ago
yeah, it definitely lives up to its namesake
@youssefgamal7461 1 year ago
What about voice cloning?
@LearnCodeWithJV 1 year ago ⁺¹
It's on the list, stay tuned.
@werthersoriginal 3 months ago
Well F**k me lol...
Coqui is shutting down.
Thank you for all your support! ❤
@pablomax9376 9 months ago
Coqui-ai shut down in the meantime. The website UI I mean, not the repo.
@LearnCodeWithJV 8 months ago
yeah, it's a shame they couldn't make a go of it
