The ONLY FREE AI Voice Text-to-Speech YOU NEED!!! (Bark AI Full Tutorial)

  • Published: 24 Jul 2023
  • Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects. The model can also produce nonverbal communications like laughing, sighing and crying.
    ©️ Bark is now licensed under the MIT License, meaning it's now available for commercial use!
    ⚡ 2x speed-up on GPU. 10x speed-up on CPU. We also added an option for a smaller version of Bark, which offers additional speed-up with the trade-off of slightly lower quality.
    Suno Bark AI Official Repo - github.com/suno-ai/bark
    Bark AI Google Colab - colab.research.google.com/dri...
    Bark AI Speaker Prompts - suno-ai.notion.site/8b8e8749e...
    ❤️ If you want to support the channel ❤️
    Support here:
    Patreon - / 1littlecoder
    Ko-Fi - ko-fi.com/1littlecoder

Comments • 94

  • @ceyhunakar1450
    @ceyhunakar1450 1 year ago +13

    I was just watching your previous video on the same topic when I stumbled upon this new video about the open-source text-to-speech tool and its usage in Colab. That's amazing! You're doing a great job with your content. Keep up the good work!

  • @IceMetalPunk
    @IceMetalPunk 11 months ago +8

    Bark is probably the best text-to-audio AI around when it works, but it's also super unstable, which means it only works without falling apart like 20% of the time. I hope it improves with time; Bark 2 or Bark 3 might be a winner.

  • @dodgewagen
    @dodgewagen 1 year ago +1

    Another good video, thank you!
    The Spanish version sounds too robotic, but I guess it will get better and better over time.

  • @RevanthMatha
    @RevanthMatha 1 year ago +1

    I find these kinds of tutorials very helpful, thanks.

  • @PUV_Zero_One
    @PUV_Zero_One 6 months ago

    Thank you for providing this video.

  • @hp24256
    @hp24256 1 year ago +4

    Great video! Can you show us how to use it for longform audio generation? Is there a colab for it, or a way to simply append the code in the provided colab? Help appreciated!

  • @ojikutu
    @ojikutu 1 year ago

    Thank you

  • @d3mist0clesgee12
    @d3mist0clesgee12 1 year ago +1

    Great stuff, thanks

  • @carlovonterragon
    @carlovonterragon 1 year ago +1

    Do you know if you can "update" the models to make them better, or train your own voice for Bark?

  • @jayjeckel
    @jayjeckel 5 months ago +1

    There is some confusion about licensing at the start of this video. The source code of Bark is now licensed under MIT and can be used in commercial software. This has nothing at all to do with how you can or can't use output generated by Bark. At least in the US, AI generated content is automatically public domain content and that means anyone can do anything they want with it, regardless of how the source code of the generating software is licensed.

    • @jamad-y7m
      @jamad-y7m 3 months ago

      Where do you get that idea? If you wrote the text, then it is copyrighted by you. You can't copyright the voice recording of it (I'm not sure there has been any ruling on this yet besides with MidJourney), but what you wrote is yours.

    • @jayjeckel
      @jayjeckel 3 months ago

      @@jamad-y7m I'm not sure what you're disagreeing with; any script you wrote would be copyrightable, but AI-generated audio of that script wouldn't be. Neither of those has anything to do with Bark having an MIT license.

  • @user-dn9bp6ky1c
    @user-dn9bp6ky1c 8 months ago +1

    @1littlecoder, are there any tweaks to use BARK for real-time text-to-speech scenarios?

  • @DevasheeshMishra
    @DevasheeshMishra 1 year ago +3

    Please review a TTS system that is good for real-time speech generation from text. I also have an idea: could we implement a pipeline for speech generation like in ChatGPT? That would benefit assistant-like systems.

  • @tishinpadilla100
    @tishinpadilla100 7 months ago

    Thank you. Well done and thorough. I appreciate that you used a notebook.

  • @MarceloLimaXP
    @MarceloLimaXP 1 year ago

    This is the beginning of something that will be great. 😉

  • @CapitanMegaa
    @CapitanMegaa 1 month ago

    Man, this was the perfect TTS, but the 14-second limit and the long wait to process the TTS make it horrible ;( I hope there's a way to fix it.

  • @nic-ori
    @nic-ori 1 year ago +2

    Thanks.

  • @TerminatorSAW2k
    @TerminatorSAW2k 11 months ago

    Could you please make a live-streaming tutorial, so that I can use the tool from this video for a live-streaming voice?

  • @BOSS_1417
    @BOSS_1417 1 year ago

    I've got a GTX 1650 Max-Q and a Ryzen 9. Can I use it, or is there no point?

  • @srikantdhondi
    @srikantdhondi 11 months ago

    How do I clone a voice using Suno Bark?

  • @tuncelcel
    @tuncelcel 11 months ago +3

    This text-to-voice tool is limited, right? It only lets me record a few seconds, but I want to turn my PDF into an audiobook. Please answer me, friend. Great channel.

  • @QEDAGI
    @QEDAGI 1 year ago +3

    Bark's idea of laughter is kinda maniacal :|
    (the 1st example, that is)

    • @1littlecoder
      @1littlecoder 1 year ago +3

      I edited out the video clip where I said it's villainous. Maniacal is probably a better word :D

    • @QEDAGI
      @QEDAGI 1 year ago +1

      @@1littlecoder Next time, don't hold back. Let's call it as it is. If any intelligence, artificial or otherwise, gets offended, then that's on them.

  • @JustMyRU
    @JustMyRU 11 months ago

    Is it possible to use this program together with a model I made with RVC?

  • @testingtime7780
    @testingtime7780 5 months ago

    Works for me, but there is no AMD GPU support, and the CPU takes quite a while. Very stable so far, but it has a 13-14 second limit :(

  • @mouadboss8888
    @mouadboss8888 1 year ago +1

    Can't I use it for a long text? Only 13 seconds?

  • @thisisaname4868
    @thisisaname4868 11 months ago

    How do I do this on an AMD graphics card? And can it clone voices in other languages, or just English?

  • @ananthrajan9097
    @ananthrajan9097 10 months ago +1

    I was not able to create lengthy audio (more than 40 words). It shows an error: "WARNING:bark.generation:warning, text too long, lopping of last 66.7%". Is there any solution for this?
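
The length limit reported in several comments here (roughly 13-14 seconds per generation, with the "text too long" warning above) is commonly worked around by splitting the text into short chunks, generating each chunk separately, and joining the audio. The following is only a minimal sketch of that idea, not the video author's or Suno's method: `split_into_chunks`, `synthesize_long`, the 30-word cap, and the pause length are all hypothetical choices for illustration. The `generate` argument stands in for any text-to-audio function, for example Bark's `generate_audio` called with a fixed `history_prompt` so the voice stays the same across chunks.

```python
import re
from typing import Callable, List, Sequence


def split_into_chunks(text: str, max_words: int = 30) -> List[str]:
    """Split text at sentence ends, then pack sentences into chunks of at most max_words words.

    max_words=30 is an arbitrary assumption, not a documented Bark limit.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current: List[str] = []
    for sentence in sentences:
        words = sentence.split()
        # A single over-long sentence is cut into fixed-size word windows.
        while len(words) > max_words:
            if current:
                chunks.append(" ".join(current))
                current = []
            chunks.append(" ".join(words[:max_words]))
            words = words[max_words:]
        # Start a new chunk when this sentence would overflow the current one.
        if len(current) + len(words) > max_words:
            chunks.append(" ".join(current))
            current = []
        current.extend(words)
    if current:
        chunks.append(" ".join(current))
    return chunks


def synthesize_long(
    text: str,
    generate: Callable[[str], Sequence[float]],
    max_words: int = 30,
    pause_samples: int = 0,
) -> List[float]:
    """Generate audio chunk by chunk and join it, adding pause_samples of silence after each chunk."""
    audio: List[float] = []
    for chunk in split_into_chunks(text, max_words):
        audio.extend(generate(chunk))
        audio.extend([0.0] * pause_samples)
    return audio


# Hypothetical usage with Bark installed (not run here):
# from bark import generate_audio
# audio = synthesize_long(
#     long_text,
#     lambda t: generate_audio(t, history_prompt="v2/en_speaker_6"),
#     pause_samples=6000,  # assumed: about 0.25 s of silence at 24 kHz
# )
```

The audio is kept in a plain Python list to keep the sketch dependency-free; with NumPy arrays, collecting the pieces and calling `np.concatenate` once would be faster.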

  • @dayanithi0012
    @dayanithi0012 1 year ago

    Bro, I am getting "generate_audio() is not defined". Help me sort it out.

  • @SkillTests4All
    @SkillTests4All 7 months ago

    Can you explain step by step targeting a layman?

  • @greendsnow
    @greendsnow 1 year ago

    So the voices are finally NOT randomized in Suno? Normally you'd get all sorts of different voices. Is it always the selected voice now?

    • @MautozTech
      @MautozTech 11 months ago

      Tortoise TTS is way better. I've tried both

  • @iswin-2861
    @iswin-2861 11 months ago

    Do you know if it's possible to import our own model?

  • @nandu18157
    @nandu18157 3 months ago

    Bro, the problem is that we can't give it longer text; when we do, it gives an error.

  • @forwardatom
    @forwardatom 1 year ago

    Great tutorial, thanks

  • @thedoctor5478
    @thedoctor5478 1 year ago +1

    I can never get it to behave the way you would want for production. The voices keep changing, there are hallucinations, and inference is not real-time (which is what I would need it for). I just tried it again on a 4090 with the same result.

    • @MautozTech
      @MautozTech 11 months ago

      Tortoise TTS is way better. I've tried both. But it's not very fast and takes 15 GB of GPU memory on the standard preset.

    • @thedoctor5478
      @thedoctor5478 11 months ago

      @@MautozTech yeah but can you get it to keep the same voice and produce long outputs?

    • @MautozTech
      @MautozTech 11 months ago

      @@thedoctor5478 I will test long outputs later, but the voice stays the same. I can send you an example; my email is in the channel description.

  • @MichaelScharf
    @MichaelScharf 1 year ago +1

    Why not [MAN] as described in the docs?

  • @michaelforde4037
    @michaelforde4037 7 months ago

    How can we change the rate in Colab?

  • @yashsrivastava677
    @yashsrivastava677 1 year ago

    Please make a video on the best speech-to-text AI as well.

  • @tharana5167
    @tharana5167 1 year ago

    Can I use this for free in YouTube videos that have ads?

  • @tiberiusvetus9113
    @tiberiusvetus9113 1 year ago +3

    Not good enough yet. I need a TTS that runs in real time on my phone without an internet connection.

    • @benshums
      @benshums 1 year ago

      I guess we are just chillin' for the next one to three years. You and me. Chillin.

  • @teriyala
    @teriyala 10 months ago

    Bro, I am Rohan. I'm in my final year studying CSE. Please give me a project title or idea.

  • @benshums
    @benshums 1 year ago +1

    Can Bark run convincingly on an Android phone? Has anyone tried to test this yet?

    • @hewerton_
      @hewerton_ 7 months ago

      Yes, with Google Colab.

  • @SMIK370
    @SMIK370 1 year ago

    Can I use this on a phone?

  • @alex_smallet
    @alex_smallet 1 year ago +3

    Did you try the NeMo TTS models? Specifically, Tacotron2Model and HifiGanModel seem to work much better and faster than Bark.

  • @developpeur4713
    @developpeur4713 8 months ago

    How can I add the Kurdish language for text-to-speech?

  • @InspiringVibesVideos
    @InspiringVibesVideos 1 year ago

    Hi, thanks for sharing. Can you also do it offline, I mean by just using VS Code?

  • @veliea5160
    @veliea5160 11 months ago

    Why does Bark generate audio so slowly? It takes about 3 minutes to generate on my machine. Is this a normal duration?

    • @1littlecoder
      @1littlecoder 11 months ago

      It depends on the configuration of your machine and the length of the audio.

  • @nandu18157
    @nandu18157 3 months ago

    Why am I getting an error?

  • @superman3id
    @superman3id 9 months ago

    It's limited; it only creates 13 seconds of audio.

  • @interspacer4277
    @interspacer4277 1 year ago +2

    As much as I like Bark, I still prefer MS/Azure TTS for a nice balance of quality and speed. It's near realtime on a local machine, and even faster than Eleven Labs.
    Bark is okay, but it pretty much needs to run on a dedi machine tweaked for speed, and ideally with streamed/cache response to make it viable for conversational-NLP.

    • @blisphul8084
      @blisphul8084 1 year ago +1

      You can run Azure neural/AI voices locally? I'm not seeing anything about that.

    • @interspacer4277
      @interspacer4277 1 year ago

      @@blisphul8084
      I've run MS voices locally before, but obviously the connected versions are newer/better (and some require subscriptions and some don't). It really depends on your needs. Like any model, if it's small enough, you can run it.
      Windows itself has embedded TTS, but it's unclear how extensible it is offline.

    • @kklaopo
      @kklaopo 9 months ago

      @@interspacer4277 I hope Microsoft can improve their Azure AI voices. I don't know if it's just me, but I do think their voices are falling behind ElevenLabs and other companies' products.

  • @rageshantony2182
    @rageshantony2182 1 year ago

    Bad at voice cloning. It doesn't have built-in support for voice cloning. There are some extensions for that, but they are not good.

  • @BrackiesAi
    @BrackiesAi 10 months ago

    Is there an API for curl and HTTP requests?

    • @1littlecoder
      @1littlecoder 10 months ago

      elevenlabs.io/?via=1lc for now!

  • @Drugvigil
    @Drugvigil 8 months ago

    Bro, how do I download the generated audio from Google Colab to my local machine?

    • @1littlecoder
      @1littlecoder 8 months ago

      If you right-click there, you'll see a "Save as" option.

    • @Drugvigil
      @Drugvigil 8 months ago

      @@1littlecoder Yes, I managed it. Thanks, man.
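
Besides the right-click route described in this thread, the generated samples can also be written to a WAV file and downloaded as a file. The sketch below is one possible way to do that with only the Python standard library; `save_wav` is a hypothetical helper, and it assumes the audio is mono floating-point samples in the range [-1, 1]. The default of 24000 Hz is also an assumption; Bark exports a `SAMPLE_RATE` constant that should be passed in instead.

```python
import struct
import wave
from typing import Iterable


def save_wav(path: str, samples: Iterable[float], sample_rate: int = 24000) -> int:
    """Write mono float samples in [-1, 1] as 16-bit PCM WAV; return the number of frames written."""
    frames = bytearray()
    for sample in samples:
        # Clip out-of-range values so they do not overflow the 16-bit range.
        clipped = max(-1.0, min(1.0, float(sample)))
        frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16-bit samples
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
    return len(frames) // 2


# Hypothetical usage in a Colab notebook (not run here):
# from bark import SAMPLE_RATE, generate_audio
# save_wav("bark.wav", generate_audio("Hello from Bark."), SAMPLE_RATE)
# from google.colab import files
# files.download("bark.wav")
```

The per-sample loop is slow for long clips; with NumPy and SciPy available, `scipy.io.wavfile.write` does the same job in one call.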

  • @ThomasTomiczek
    @ThomasTomiczek 1 year ago

    That is - no quantized version?

  • @limpopo171
    @limpopo171 1 year ago +1

    The quality is not usable... Personally, I would never put this voice in a video...

  • @swannschilling474
    @swannschilling474 1 year ago

    Bark is cool, but sadly kinda slow! 😅

  • @siriyakcr
    @siriyakcr 1 year ago

    Can we use it for CV?

    • @1littlecoder
      @1littlecoder 1 year ago

      Do you mean Computer Vision?

    • @siriyakcr
      @siriyakcr 1 year ago

      @@1littlecoder curriculum vitae 😁

  • @JLSXMK8
    @JLSXMK8 1 year ago

    12:32 Translation: “Your colleague thinks that your German is extremely bad. But I suppose your English isn’t terrible!”

  • @topcca
    @topcca 1 year ago +20

    It's not better than Tortoise TTS.

    • @gabluz
      @gabluz 1 year ago +6

      Also, there's tortoise-tts-fast, but Bark is very, VERY promising.

    • @topcca
      @topcca 1 year ago +5

      @@gabluz Currently, their voice cloning and TTS are far inferior to Tortoise's, but let's see if they improve them.

    • @gabluz
      @gabluz 1 year ago +1

      @@topcca Bark is also incredibly slow. That's disappointing.

    • @topcca
      @topcca 1 year ago +2

      @@gabluz Yes, but for a reason. There is always a balance between quality and quantity. If you want good-quality results, it will take more time.

  • @fractalarbitrage
    @fractalarbitrage 1 year ago

    kept saying Bart too lol

    • @1littlecoder
      @1littlecoder 1 year ago +1

      Honestly, I don't know why. Maybe it's because I'm using Bard a lot, or I don't know. Stupid mistake.

    • @fractalarbitrage
      @fractalarbitrage 1 year ago

      @@1littlecoder You're good, brother. I meant that I kept saying "Bart" when reading "Bark", haha.

    • @1littlecoder
      @1littlecoder 1 year ago

      @@fractalarbitrage Oh I did that too. Edited it out in many places.

  • @__________________________6910

    Sometimes the voices are bad.

  • @dislive1
    @dislive1 5 months ago

    Very bad tutorial