Free Speech: Reviewing Coqui-ai, Mycroft Mimic3 and Tortoise TTS Libraries

  • Published: 18 Oct 2024

Comments • 52

  • @ForTheEraOfLove
    @ForTheEraOfLove 7 months ago

    I believe the review, demo and installation in that order are the future of video reviews. Good on you!🥇

  • @timemasheen5031
    @timemasheen5031 4 months ago

    This is an excellent video on these three TTS libraries. Thank you for your work, JV.

  • @chrisBruner
    @chrisBruner 1 year ago

    Just the video I was looking for. Your eyes can be a bit brighter now, I've subscribed!

  • @adamabbassi
    @adamabbassi 1 year ago +8

    A Tortoise Dockerfile would be awesome!

    • @LearnCodeWithJV
      @LearnCodeWithJV  1 year ago

      Copy that. I've got an introduction-to-Docker video on my to-do list, and my current plan is to use Tortoise as a demo project for it.

  • @someuser4166
    @someuser4166 4 months ago

    I wonder if there's any way to run the game voice thing through the CLI. I'm bulk translating all the voice lines in an existing game to learn a new language, but there are thousands of lines, so I don't want to copy-paste each one by hand and then download and name them... Also, my GPU is from around 2013.

  • @nathandscott1
    @nathandscott1 1 year ago +1

    Good channel, look forward to future content.

  • @WeilianLi-h6q
    @WeilianLi-h6q 7 months ago

    Thank you very very much. You helped me a lot. This is what I was looking for.

  • @JesseJuup
    @JesseJuup 1 year ago +2

    You can use Conda to easily set up an environment for running Tortoise with Python 3.9.

    • @LearnCodeWithJV
      @LearnCodeWithJV  1 year ago +2

      Yeah, that's what I did. I found that it was still a bit of work to get all the dependencies happy, though. I've also found that for a bunch of folks new to coding, Docker is more accessible than a conda environment.

  • @giacomosiiii
    @giacomosiiii 7 months ago

    Tortoise sounds great but I was NOT ready to hear the "deniro" model sound British instead of Bronx LOL

  • @stevecato
    @stevecato 1 year ago

    Tortoise is good but very slow. Is that because it starts over from the voice training set every time? You mentioned the ability to save an intermediate vector of the voice. Could you cover that in a video, including whether it improves the speed? Thanks.

    • @LearnCodeWithJV
      @LearnCodeWithJV  1 year ago

      Yeah, its name is apt. I've seen a few derivative projects kicking around that claim significant speedups. I'm waiting until I find one with good voice cloning, multilingual abilities and a license that allows commercial use before doing the follow-up.

  • @obeyoutube
    @obeyoutube 1 year ago +3

    Hi there! Great video. I came across it when I was looking for a Linux script that uses Mimic3 to read highlighted text aloud, similar to the built-in feature in macOS. I've been trying to make it work for a few weeks now, but so far without success. I would appreciate it if you could create a video on this. Thank you!

    • @LearnCodeWithJV
      @LearnCodeWithJV  1 year ago +2

      Copy that, I'll give it a think. What part of the project are you finding the hardest? Getting the text, turning it into speech, or something else?

    • @obeyoutube
      @obeyoutube 1 year ago +1

      @LearnCodeWithJV Truth be told, I have a script but it doesn't work. I've been trying to make it work with xbindkeys. I also found another script online, but it's for eSpeak and I haven't managed to modify it, so I would appreciate your help with this. In my opinion, it's a very useful macOS feature, and I'd like to have it on Linux with a decent voice. eSpeak is too robotic and outdated.

  • @SumriseHD
    @SumriseHD 1 year ago +1

    What I would find amazing is learning how to build something like Tortoise myself. I'm a 4th-semester student, but we haven't covered much machine learning and I have no clue how to make advanced stuff like Tortoise TTS.

    • @LearnCodeWithJV
      @LearnCodeWithJV  1 year ago +1

      You might want to reach out to the person who created Tortoise. I've seen them comment on GitHub that they were thinking of publishing their training methodology.

  • @MistakingManx
    @MistakingManx 6 months ago

    What GPU do you have? I'm asking as a reference for just how hefty the last one is.

    • @LearnCodeWithJV
      @LearnCodeWithJV  6 months ago

      RTX 3060, so pretty lightweight.

    • @MistakingManx
      @MistakingManx 6 months ago

      @LearnCodeWithJV I have the 4060 Ti. Would that heavy one run well on mine?

  • @ROBBIEP
    @ROBBIEP 1 year ago +2

    Cool, cool video. I've been having a wild time trying to get tortoise-tts to work on the GPU. It works on CPU, but very, very slowly.
    I was advised to look into PyTorch GPU tutorials. It looks like my conda environment wasn't allowing it, but now I'm able to run CUDA and use the GPU for things. Now I'm trying to get back into Tortoise TTS, perhaps with a fork install, in the hope that I can get some practical use out of it. My GPU is tiny, a 1660, but it's better than the CPU. It looks like quite a few folks have noticed outdated information. Any additional advice is appreciated.

    • @LearnCodeWithJV
      @LearnCodeWithJV  1 year ago

      I was finding it slow going on my 3060, so I can imagine what it felt like on a 1660, and on CPU it would be really tough to get useful work out of it. When I build a Dockerfile for the project I'll play around a bit more to see if there are any obvious performance tweaks that might be useful.

  • @LucidFirAI
    @LucidFirAI 1 year ago

    Tortoise TTS is great, but when using a model trained by someone else (so I know from its output that it works), 2 out of 8 sentences were spoken in a male voice, even though it was a Melina (female) voice.
    If you have ANY clue why that would happen, I would love to know. This is all way beyond me; I'm a mere end user and not a programmer.

    • @LearnCodeWithJV
      @LearnCodeWithJV  1 year ago

      I don't have any insights into that particular issue, sorry.

  • @gfhdlsk
    @gfhdlsk 1 year ago

    You could add an update that Mimic3 has "moved" to Piper, which is still being improved by the same author.

    • @LearnCodeWithJV
      @LearnCodeWithJV  1 year ago

      I wasn't aware of that - thanks for sharing. Updating the description now.

  • @clunescrossingangus2219
    @clunescrossingangus2219 1 year ago

    I would love a Dockerfile if you have one. Thanks for the video. If you are into training text-to-image and text-to-speech models, voice cloning, and fine-tuning for specific use cases (tuning an LLM to write stories in a genre, or to review data and report in a specific way), I for one would love to see how you approach it. Thanks again.

    • @LearnCodeWithJV
      @LearnCodeWithJV  1 year ago

      I've got it on my list and plan to create one at some stage if no one else does.

    • @morganandreason
      @morganandreason 1 year ago +1

      I would like to second that request - I've been wanting to try Tortoise for a while, but the Python dependency issues have kept me from mustering the energy to do so. @LearnCodeWithJV

    • @LearnCodeWithJV
      @LearnCodeWithJV  1 year ago +1

      I just went to look at doing this and it turns out someone added one last week - github.com/neonbjb/tortoise-tts#docker
      Looks like the development pace on the repo has picked up in the last few months.

  • @jamiediromero5016
    @jamiediromero5016 1 year ago

    I don't see an Advanced Editor in the menu on the Coqui website.

    • @LearnCodeWithJV
      @LearnCodeWithJV  1 year ago

      I remember it only showing up some of the time and being disabled when a voice was rendering. I have a half memory of it only working for built-in voices, but I can't recall for sure.

  • @mdsalahuddin2841
    @mdsalahuddin2841 9 months ago

    Any opinion on Tacotron 2?

  • @blender_wiki
    @blender_wiki 1 year ago

    You must also include Bark and VALL-E.

    • @LearnCodeWithJV
      @LearnCodeWithJV  1 year ago

      Yes, they would be good to include if I make a follow-up video.

  • @DevasheeshMishra
    @DevasheeshMishra 1 year ago

    If possible, please compare LLMs for voice assistants and also show how to fine-tune them.

  • @Yeeeeeehaw
    @Yeeeeeehaw 1 year ago

    Great video

  • @abiramyk3514
    @abiramyk3514 1 year ago

    What requirements are they using?

  • @sinhnguyen4155
    @sinhnguyen4155 1 year ago

    Tortoise is the best; its intonation sounds like a real human. However, it's quite slow =]]

  • @youssefgamal7461
    @youssefgamal7461 1 year ago

    What about voice cloning?

  • @werthersoriginal
    @werthersoriginal 3 months ago

    Well F**k me lol...
    Coqui is shutting down.
    Thank you for all your support! ❤

  • @pablomax9376
    @pablomax9376 9 months ago

    Coqui-ai shut down in the meantime. The website UI I mean, not the repo.

    • @LearnCodeWithJV
      @LearnCodeWithJV  8 months ago

      Yeah, it's a shame they couldn't make a go of it.