Applio vs. Kits: Multilingual TTS (and lip sync Face Swap!)

  • Published: Jun 9, 2024
  • Download Kits.ai free: tinyurl.com/bob-doyle
    Download Applio: applio.org
    Download FaceFusion via Pinokio app: pinokio.computer
    There are several services that allow you to translate your voice into other languages.
    There are fewer services that allow you to use that translated audio to drive lip movement in videos.
    Today we compare the multilingual text-to-speech abilities of two platforms we’ve covered on the channel several times, Applio and Kits, and then we’ll use that audio to drive a lip-sync video WITH face swap (much like I do at the end of all my videos) using FaceFusion 2.5.
    NONE of this is perfect, but it’s a glimpse into what will clearly only get better and more accessible to people as time goes on.
    00:00 Welcome to the Multilingual AI Wonderland!
    00:12 Diving Into Text-to-Speech Technologies
    01:02 Comparing Applio and Kits: A Voice Cloning Showdown
    02:24 Tuning for the Perfect Pitch: A Voice Cloning Deep Dive
    05:06 Breaking Language Barriers: Multilingual Voice Cloning
    08:44 The Magic of Lip Sync with FaceFusion
    12:56 Troubleshooting and Perfecting the Lip Sync Process
    14:26 Final Thoughts and Future Adventures in AI
    Previous video reference:
    Kits: • Change the Singing Voi...
    Applio: • FREE Text to Speech wi...
    FaceFusion: • Can We Really Achieve ... (updated video coming)
    👍 LIKE If you found this video valuable. 🙂
    🥰 SHARE If you know someone who might enjoy this video.
    ⏬ DOWNLOAD or ADD This video to your PLAYLIST for easy access later.
    💬 COMMENT Your thoughts and questions are welcome!
    📝 SUBSCRIBE / @bobdoylemedia
    That's what keeps me going!
    🔔 And make sure to hit the NOTIFICATION BELL to stay updated! 🔔
    🌎EXPLORE
    bobdoylemedia.com
    🗓️ MY HISTORY
    🧠 meetbobdoyle.com
    🤫 Featured Law Of Attraction and Neuroplasticity expert in the book and film, “The Secret”. • The Secret
    🎙️ Voice Over Artist 30+ years
    📻 Broadcaster / Actor / Creative
    ❓ASK ME
    Got a Question? bobdoyle@me.com
    📌 FOLLOW ME
    Facebook: / bobdoylemedia
    Instagram: / bobdoylemedia
    🪙SUPPORT
    If you want to support me, the best thing to do is to share the content… sharing is caring!
    If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
    www.thebobdoyleshow.com/tip
    Learn more from Bob Doyle Media: www.bobdoylemedia.com
  • Science

Comments • 39

  • @bygimenez
    @bygimenez 10 days ago +1

    Hello, Applio developer here. It is a great achievement for us that a programme developed by four young people is compared to a service backed by millions in investment. In our first year of development, we have reached goals that some companies do not achieve.
    We have many more goals to achieve and many more products to create, always free and open source for the community.
    Thanks for your review Bob! 💚🚀

  • @Mimic_217
    @Mimic_217 1 month ago +2

    Thanks again for all these videos, they've been great for my channel.

  • @saas_money
    @saas_money 23 days ago +2

    Thanks so much for your work bob❤😊

  • @NeedaNewAlias
    @NeedaNewAlias 1 month ago +1

    For German, the Applio voice sounded more synthetic, and the Kits voice had a bit of an English accent but was not as synthetic. Thx for your great work!

    • @BobDoyleMedia
      @BobDoyleMedia 1 month ago +1

      I feel like that was where I forgot to change the language to German, so it would have been holding onto whatever the setting was before. That was my bad...

  • @Edbrad
    @Edbrad 1 month ago +1

    Text to speech in Udio is literally the best text to speech. It’s the most expressive. You should focus on testing Udio PURELY on spoken word. What’s great is it can go from spoken word to musical, it’s extremely expressive. After playing with this for a while I’m convinced it has some understanding of the meaning of what text is there. Which is incredible. Like it can’t just be random.
    It may however be less consistent if you need that and you may have to work with it to get the exact thing you want. But I wouldn’t ever use a different one now.
    Who’d have thought a music AI can do spoken word, standup, arguments, lectures, etc.?
    There's so much creative, detailed stuff you can do here. We’ve not even touched the surface here!

    • @BobDoyleMedia
      @BobDoyleMedia 1 month ago +1

      My hesitation with getting too invested in creating content I'd use commercially with Udio is the legal aspect of ownership and copyright. I've been watching a fair number of videos about the legal aspects and the fine print in the Udio user agreement.
      But I agree that Udio creates wonderful TTS results. Comedy routines, musicals with different singers and speech - you know all this. It's amazing.

    • @hikmetmertdincer6816
      @hikmetmertdincer6816 1 month ago

      Isn't Udio for music?

    • @BobDoyleMedia
      @BobDoyleMedia 1 month ago

      @hikmetmertdincer6816 Yes, it’s for music. It just happens to also do this.

    • @NeedaNewAlias
      @NeedaNewAlias 1 month ago

      @BobDoyleMedia Try comparing this to ElevenLabs TTS.

  • @RockyBMusic
    @RockyBMusic 1 month ago +2

    In my opinion, the German translated version from Kits sounded more natural.
    The Applio version was not bad and was understandable, but it sounded like the artificial overdubs you find on many YouTube videos.
    Thanks for the comparison.
    Greetings
    Martin

  • @alexanderdichiara4874
    @alexanderdichiara4874 1 month ago +4

    I'm German. The Applio version was better. The word "Ort" sounds American. Thank you for your great videos!

    • @BobDoyleMedia
      @BobDoyleMedia 1 month ago

      Thanks for taking the time to leave your feedback!

  • @ludoviclebleu
    @ludoviclebleu 1 month ago

    This is a great job, very useful. Thank you!
    Have you tried other techs and how they compare and/or integrate with these?
    Tortoise, OpenVoice, Piper, RVC... and ultimately compare the result of an open source workflow with the exact texts and voice on Eleven Labs.

    • @BobDoyleMedia
      @BobDoyleMedia 1 month ago

      Applio is basically RVC. It just puts everything together in a nice free package. I've played with Tortoise a lot, but it's been a while. Still not happy with the quality of the clones. Don't know about Piper - will check that out, and haven't looked at OpenVoice either, at least not that I remember.
      And I've done a LOT with ElevenLabs. My video on that actually jumpstarted this channel a good bit. I'm just always on the lookout for open source alternatives.
      It would be worth doing a side by side with 11 labs... and there is also Play.ht (I think they just modified their name a bit), which I've also done a video on.

  • @jmichaelingram
    @jmichaelingram 1 month ago

    Can't wait for you to look at 11 labs song creation tool that is apparently in beta

    • @BobDoyleMedia
      @BobDoyleMedia 1 month ago

      I just reached out to them to try to get access. I'm all over it!

  • @frankieAllan-vr6qn
    @frankieAllan-vr6qn 27 days ago

    Bob, I'm 79, have a 24-track home studio, and had a few covers with a London publisher, Rondor, in the early 80s. I also recorded an Elvis tribute song at Rockfield Studios in 1977. Why I'm telling you all this is that the questions I sometimes ask you might seem to come from a dozo!! Although I've been in the music business for quite some time, I've been lazy about taking in information, which is much worse now at 79. You are a great presenter who doesn't rush things, and it's great for an oldie like me. I did manage to work out Relay and want to know: do you do a tutorial for ACE Studio? Thanks, Frankie

    • @BobDoyleMedia
      @BobDoyleMedia 27 days ago

      I do. Here's one: ruclips.net/video/7oY8pFhPoK4/видео.html

    • @frankieAllan-vr6qn
      @frankieAllan-vr6qn 26 days ago

      @BobDoyleMedia Excellent, thanks.

  • @idontknowmorenames
    @idontknowmorenames 15 days ago

    Dear Bob, whenever I let Applio speak German, it has a very American accent. But your test sounds very German. After watching your video multiple times, I still can't find the reason. I make sure that the TTS voice is a German one, but it still has an accent. Does the model have to be German? That would make sense, but as you used your own... idk. Do you have any idea? Best regards, Anton

    • @BobDoyleMedia
      @BobDoyleMedia 15 days ago

      Ironically, as I remember, I think I actually forgot to change the voice to German for the TTS, didn’t I? So I think it was actually working with the Japanese voice or the English voice. I’d have to go back and look closely.

  • @khajask8113
    @khajask8113 19 days ago

    Can Applio generate Hindi audio?

  • @FSK2
    @FSK2 1 month ago

    This whole video was created using FaceFusion to dub the song.
    ruclips.net/video/8VVmWHvQ_sA/видео.html

  • @MIKEWHOSCO
    @MIKEWHOSCO 1 month ago

    "And in seconds you'll see the face is replaced." I didn't see any "face swap". It just looked like you with a subtle mustache added.

    • @BobDoyleMedia
      @BobDoyleMedia 1 month ago

      If you need to see more dramatic examples: ruclips.net/video/PwA14GOX1mI/видео.htmlsi=sCRZ6tyw4i2BGrCg

  • @StringerBell
    @StringerBell 1 month ago

    Sadly, the wav2lip result looks terrible and isn't usable in any real-life production (just like in the other videos I've searched for about it). It's very sad that we still don't have a reliable audio-to-lip-sync technology available for use today.

  • @BStudioT
    @BStudioT 1 month ago

    In German, Applio sounds better, but still very much like the usual TTS voices.

  • @EricLefebvrePhotography
    @EricLefebvrePhotography 1 month ago

    French Canadian here... the French pronunciation on both systems is a bit off. Kits is MARGINALLY better.

    • @BobDoyleMedia
      @BobDoyleMedia 1 month ago

      Thank you!

    • @ludoviclebleu
      @ludoviclebleu 1 month ago +1

      The last French example with the lip sync sounds very good. It has a slight American accent, but not as strong as the one in the GPT app. Eleven Labs still sounds better, but this is very good.
      Maybe the accent is due to its training?

    • @BobDoyleMedia
      @BobDoyleMedia 1 month ago

      @ludoviclebleu If you're talking about the French example I ended up using for the last video, I realized I ALSO forgot to change the language to French in Kits... so that could explain the dialect problem. I think it was still set on Japanese or something.

    • @ludoviclebleu
      @ludoviclebleu 1 month ago

      LOL, ok.
      That would make sense, but it's still an American accent, not Japanese. Strange.

  • @ElaraArale
    @ElaraArale 1 month ago

    Do Portuguese