CLONE ANY AI Voices for FREE LOCALLY in 1 CLICK! JUST INSANE!
- Published: Mar 11, 2024
- The game-changing AI voice cloning tool RVC is THE BEST open-source voice cloning tool EVER! With RVC, you can clone ANY AI voice with just 10 minutes of audio for perfect results, running LOCALLY on your own computer for FREE! In this video, I'll show you how to install the RVC WebUI on your computer in 1 CLICK! Plus, I'll show you how to easily clone any voice, how to convert any audio with the cloned voice model, and how to use TTS with RVC locally so you can start having fun right now!
Have you managed to install the RVC WebUI? Let me know in the comments!
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
SOCIAL MEDIA LINKS!
✨ Support my work on Patreon: / aitrepreneur
⚔️ Join the Discord server: bit.ly/aitdiscord
🧠 My Second Channel THE MAKER LAIR: bit.ly/themakerlair
📧 Business Contact: theaitrepreneur@gmail.com
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
✨ PATREON LINK: / aitrepreneur
RVC: github.com/RVC-Project/Retrie...
FFMPEG: huggingface.co/lj1995/VoiceCo...
Launchers: huggingface.co/datasets/Aitre...
Coqui TTS: • CREATE UNCENSORED AI V...
Sillytavern: • INSTALL BEST UNCENSORE...
python -c "import torch; print(torch.__version__)"
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
►► My PC & Favorite Gear:
i9-12900K: amzn.to/3L03tLG
RTX 3090 Gigabyte Vision OC : amzn.to/40ANaue
SAMSUNG 980 PRO SSD 2TB PCIe NVMe: amzn.to/3oBR0WO
Kingston FURY Beast 64GB 3200MHz DDR4 : amzn.to/3osdZ6z
iCUE 4000X - White: amzn.to/40y9BAk
ASRock Z690 DDR4 : amzn.to/3Amcxph
Corsair RM850 - White : amzn.to/3NbXlm2
Corsair iCUE SP120 : amzn.to/43WR9nW
Noctua NH-D15 chromax.Black : amzn.to/3H7qQSa
EDUP PCIe WiFi 6E Card Bluetooth : amzn.to/40t5Lsk
Recording Gear:
Rode PodMic : amzn.to/43ZvYlm
Rode AI-1 USB Audio Interface : amzn.to/3N6ybFk
Rode WS2 Microphone Pop Filter : amzn.to/3oIo9Qw
Elgato Wave Mic Arm : amzn.to/3LosH7D
Stagg XLR Cable - Black - 6M : amzn.to/3L5Fuue
FetHead Microphone Preamp : amzn.to/41TWQ4o
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Special thanks to Royal Emperor:
- Totoro
- TNSEE
- RG
- Judy Godvliet
- Gluthoric
- Jay
Thank you so much for your support on Patreon! You are truly a glory to behold! Your generosity is immense, and it means the world to me. Thank you for helping me keep the lights on and the content flowing. Thank you very much!
#rvc #voicecloning #voiceclone #textgeneration #aivoices
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
WATCH MY MOST POPULAR VIDEOS:
RECOMMENDED WATCHING - All LLM & ChatGPT Video:
►► • CHATGPT
RECOMMENDED WATCHING - My "Tutorial" Playlist:
►► bit.ly/TuTPlaylist
Disclosure: Bear in mind that some of the links in this post are affiliate links and if you go through them to make a purchase I will earn a commission. Keep in mind that I link these companies and their products because of their quality and not because of the commission I receive from your purchases. The decision is yours, and whether or not you decide to buy something is completely up to you.
HELLO HUMANS! Thank you for watching & do NOT forget to LIKE and SUBSCRIBE For More Ai Updates. Thx
thanks
i thought "so vits svc" was better
can it be used on chrome os?
Is this possible to do on a phone too, or only on PC? I'd love to try this
I love how your accent seems to come thru every AI voice xD
FYI community: the transpose option our AI overlord used in this tutorial actually changes the key (the notes) the voice is on, which matters if you use this for music (which is awesome). So if you want to go from Johnny Cash to Michael Jackson and still match the key or note range of the music, you must transpose by a full 12 steps, either up or down. Simple explanation: there are 12 notes (A to G plus the sharps and flats between them), and each number on transpose moves the voice up or down one of those notes, relative to the one it's on. Simply put, for speaking, do what sounds right, high to low; for singing, pay closer attention, because you will be limited to a full 12 or -12 to stay in tune. (It will also match the key or notes of the recording, not the model.) Hope that helps some folks.
Autotune.
@@moamber1 It's even better, you can train your voice and just replace on your favorite song, or duet even. Finally everyone can hear what you hear in the shower/car.
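On the transpose tip above, a minimal sketch of the arithmetic, assuming the transpose slider counts semitones in standard 12-tone equal temperament (that matches the description in the comment, but it is an assumption here, and the function names are made up for illustration). Each semitone multiplies the pitch by the twelfth root of 2, so 12 steps is exactly one octave and lands on the same note names.

```python
def transpose_ratio(semitones: int) -> float:
    """Frequency ratio for a shift of `semitones` in 12-tone equal temperament."""
    return 2.0 ** (semitones / 12)


def stays_in_key(semitones: int) -> bool:
    """True when the shift is a whole number of octaves (same note names)."""
    return semitones % 12 == 0


if __name__ == "__main__":
    for steps in (-12, 0, 5, 12):
        print(steps, round(transpose_ratio(steps), 3), stays_in_key(steps))
```

So +12 doubles the pitch and -12 halves it, while something like +5 moves the voice into a different key than the backing track.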
As I hoped for early this morning 😃
Thank you!
Hello. Is using a local RVC fork for training custom voice models better than using Google Colab?
You should remake this video, because the start of the manual installation guide is just wrong.
The command python -c "import torch; print(torch.__version__)" checks the version of torch that is already installed at the global level, because you have not activated the Python environment yet. Some might have it and some might not. And the thing you said about checking for cu117 or cu118 is also wrong, because it doesn't really matter: you are running things in an environment, which is basically its own little container that has nothing installed but the basic version of Python itself, and you install packages into it that only that project can use. While the CUDA version might be important in some projects, whether it be cu116, cu117 or cu118 for torch, that whole beginning section of the manual install guide is just wrong.
And if you do not believe me, try running python -c "import torch; print(torch.__version__)" after you activate the new environment.
Oh and anyone who is running into fairseq-issues: Use python 3.10.11
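The point above about global versus environment installs can be checked directly. This is a rough sketch using only the standard library; the helper names are made up for illustration. Run it once before and once after activating the environment and compare the output.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def module_available(name: str) -> bool:
    # find_spec returns None when a top-level module cannot be found
    # by the interpreter that is running this script.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside venv:", in_virtualenv())
    print("torch importable:", module_available("torch"))
```

In a freshly created environment, "torch importable" should come out False until torch is installed into that environment, which is the same situation the ModuleNotFoundError in the comments describes.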
THANK YOUUUUUUU i was trying for hourss omg u saved me
it also keeps saying 'python' is not recognized as an internal or external command,
operable program or batch file.
@@carlajimenez1482 nvm it didnt work :((
I was really excited for this!!! Thank you!!
If you can it would be nice if you could chop your video up into chapters. That way patrons can just skip the install section and go back to it only if necessary.
Can you continue training an existing model or would you have to create a new one a train for longer?
1 click 25 minutes video, god bless you
RVC is great, i was struggling with making decent voices but i learned that as long as you train a TTS model to the style of talking you want RVC does the rest. It can take a while to get the hang of what you need for good data so when you are just testing it out don't train it too much to avoid wasting hours on something unusable.
I tried to train on my mother's voice and then generate using speech from a TTS, but her style and accent doesn't come through at all. I only have 11 minutes of audio to train on, but that doesn't seem to be enough.
Have you tried TortoiseTTS?
Tortoise imo gets people pretty good. You just have to watch what seed you are using when you generate audio. Some will sound more like your mom than others.
I'm curious if this is as good as Tortoise.@@redleader7988
Hey Man, do you know any solution less manual? I found some that use edge tts -> RVC but it doesn't have API or cmd support
i was just looking for exactly this last night. woooot. thank you
This is amazing! I never knew this program existed! Thank you so much!!!!!
I'm really enjoying AllTalk right now. Amazing quality and it still uses the tts2 model but has a really easy to use UI that helps you create awesome voice models. And you can use it inside Silly Tavern like the others.
I tried it but couldn't get as much quality compared to rvc though, but still nice
@@Aitrepreneur RVC does sound good but since I use it for SillyTavern, Alltalk is super easy and the difference isn't enough for me to switch. But I do enjoy these videos because I'm always learning something new.
@@Aitrepreneur Did you try the AllTalk finetune option? After finetuning merge the best samples .wavs together and put in voices folder and mess with temperature and repeat penalty sliders. Sounds exactly like my Elevenlabs v2 cloned voice now.
If i already installed Coqui TTSv2, do you think i still need to install this?
Does the manual installation that you shown work for AMD as well?
Does this work on a normal laptop? What are the requirements? If I sing and want to clone using other voices, aka masking my own voice, is that possible? Thanks
I have a voice sample. I only want to clone the narrative style of this voice. In other words, I don't want to maintain either the sound of that voice or the original language of that voice. What AI tool can I use to do this?
A lot of steps mentioned in the video are just wrong, there's a lot of important information missing, and there's way too many errors and hurdles, likely related to newer versions of all the components... Making this an absolute coding nightmare to successfully install.
I can't imagine even 10% of people successfully installing everything when newer versions come out, since most of the video relies on specific versions of tools to work alongside each other, the most important being python and all the different modules and dependencies which bring their own whole bag of issues upon issues upon issues.
For one, some of the instructions are in the wrong order: the python -c "import torch; print(torch.__version__)" check is at the wrong point in the video, since we haven't even installed torch or any of the components yet, so how can we check the version? Result: Error: No module named torch.
The GitHub page says to install version 117 at the moment, this video says 118, and installing it without a specific version installs a CPU version that causes even more errors down the line... a whole bunch of confusion, searching and errors follow, which will undoubtedly get worse when updates to anything come out.
Secondly, the "pip install -r requirements.txt" command throws "Failed building wheels for pyworld, fairseq" errors... requiring 15GB worth of Visual Studio SDK stuff to be installed. People who already have Visual Studio or these dependencies installed won't notice things like these, but they're important to mention in a "guide" like this. Same with correctly installing Python, making sure to set the PATH variable during installation.
And I haven't even gotten to the actual WebUI stuff yet.
So all in all, at the moment, there's a TON of extra googling, tinkering and changing steps required to even get past the "basic" steps. And even that only works if you STRICTLY follow the (already outdated) versions of python+torch+SDKs+other libraries.
This video will desperately need to be remade or the tools better developed for the majority of people to even get into it.
You are completely right, I tried to install it and found lots of errors that I had to fix manually by looking for answers online, and even then I'm stuck on a missing torch_python.dll even though I tried to follow all the steps.
I feel like Aitrepreneur assumed everyone has all dependencies installed with the correct versions and made a quick video assuming that.
He needs to more carefully try to follow his own steps on a freshly installed PC and troubleshoot all errors that appear and then make a video with all the fixes already done. Because he missed a lot of steps you need to do, he glossed over a lot of stuff.
I got the model and the live conversion working by ditching Python environments and installing everything directly into my Python installation
True... fairseq was a pain
Can you achieve something similar to Heygen? Mainly lip syncing?
I've tried using others but without success
How do I recognize the rvc model in xtts-webui? Can you make a video tutorial?
Thank you, it worked! Long process, but it was worth it!!
It doesn't matter if the voice sample you acquire is mono or stereo, correct?
@aitrepreneur That's an amazing technology. Could it be used to convert speech from microphone on the fly and send that output to a virtual microphone?
yes :)
Is this now word to text to word conversion? It's quite grating on the edges. I would want to try the W-okada sound converter. Anyone have any experience?🤔
Is there any way of controlling everything through an API?
What is that website at the end with the rvc voices listed @24:38?
you might have missed it, but he said it @ 18:00 (18:08)
What about different languages? Like donor voice is spanish and subject voice is english?
24:41 Where can I find the models trained by the community?
Does your patreon have a one-click for an AMD GPU PC?
It does yes
@@Aitrepreneur does it work for mac (intel) too?
hello, you didn't explain what is your new executables (root folder install) so ppl should double check
I am having some issues installing this, when I enter 'python -c "import torch; print(torch.__version__)"' I get = Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'torch'
I already have Stable diffusion and the cascade preview installed so I think it shouldn't have a problem?
Is there a fix to help it find torch or install it if its missing?
having same issue, working on figuring it out.
@@RealShinpin it won't work on python 3.11.x and 3.12.x. You will need 3.10.x
@@TheBlasterspewpew is there a way to run different instances of python on your pc? for different programs?
@@RealShinpin Yes. Check Poetry or Anaconda.
@@TheBlasterspewpew How uninformed you guys are? If there is ModuleNotFoundError it means there was no torch installed. It needs to be installed for that checkup to show. Torch works in 3.11.x and 3.12.x. Even the test AItrepreneur does in the very start to check the version is literally just a check to see if the torch is installed or not and the version you are on, not the nonsense his spouting.
3:15 what should i do if this is what comes up? Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'torch'
same
@@ItsOk-mq9ex could you explain in a bit more detail? or link resources? sorry im kind of uneducated with this stuff
@@the_stray_cat type "pip install torch" when the problem comes up, in the cmd
@@suchaprettyhouse thank you! ill try it when i can
@@suchaprettyhouse by chance are you in the discord? might be easier if i could screen share? getting another error.
ERROR | When I open the go-web.bat file, a CMD window opens and, after loading a few lines.... it closes 😔
same here...
sameee here...
Question: this model only clones tone but not style, doesn’t it?
what's your opinion on the tts webui?
Great stuff! Anything for Mac users? Just asking... 🤔😃
I can't get it to work from a clean windows install, so many dependencies missing even after installing Python and Git
Can we use these models in TTS-generation-WebUI?
Hello, I cannot find any "go-web" file for the AMD version with the old solution. Same thing in the .rar file, before unzipping it. Can you help me please? 1:59
same here. I did all the steps, and in the end, by replacing the go-web.bat file, it loads the Windows command prompt with the line C:\RVC\Retrieval-based-Voice-Conversion-WebUI>CALL env\Scripts\activate and closes quickly. Please help me, Aitrepreneur... I spent a lot of time following the steps in the tutorial :(
and how do we get 10 minutes of someones voice? id like to clone a certian actors voice, how do i rip that from a video?
Does it work with other languages as well? Spanish, Polish, Japanese, French, German, Italian, you know?
I would also like to know
well...this works with voice. The datasets on which the model is getting trained on, are ultimately sound waves. So it doesn't really matter what language is in concern. As long as the trained language and the target language are the same, it should work great!
@@0xdutta Thx for the reply! I think i've asked that a month ago because I was wondering wether I could use that to do voice acting/dubbing e.g. for games or visual novels
Is it possible to have a trained voice turned into a TTS engine so it can read PDF, and books ?
I am also looking for something like this
Same! learning this is a process but really we are ahead of the curve most are not learning. Check back here with answers!
Hi, I got this message:
D:\RVC_2\Retrieval-based-Voice-Conversion-WebUI>python -c "import torch; print(torch.__version__)"
2.2.1+cpu
Is this correct since it is not a torch version like you have in the video?
@@Niffelheim Looks like it, but why?
@@Niffelheim This is a part of the video I don't understand at all. At this point in the vid he has us check which torch version is installed, but why would any torch version be installed in the first place?
Same issue, stuck on CPU and not in CUDA
@@michaelbishop813 I think the idea of checking the installed version was just so people don't mess up an existing installation, and he forgot about the people who did not have anything installed. But this also makes no sense, since we use an environment anyway.
I had the same problem and now the program is not running
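For anyone puzzled by the 2.2.1+cpu output in this thread: the part after the "+" names the build. Below is a small sketch that reads that suffix, with made-up helper names; the "2.2.1+cu118" string is a hypothetical example built from the cu118 tag in the description, not output anyone posted. It only inspects the string. Some torch builds carry no suffix at all, in which case calling torch.cuda.is_available() inside the environment is the more direct check.

```python
def build_suffix(version: str) -> str:
    """Return the part after '+' in a torch version string, or '' if absent."""
    _, _, local = version.partition("+")
    return local


def looks_like_cuda_build(version: str) -> bool:
    # CUDA wheels carry a tag such as 'cu118'; CPU-only wheels carry 'cpu'.
    return build_suffix(version).startswith("cu")


if __name__ == "__main__":
    for v in ("2.2.1+cpu", "2.2.1+cu118"):  # second value is hypothetical
        print(v, "->", "CUDA build" if looks_like_cuda_build(v) else "not a CUDA build")
```

A "+cpu" result means the CPU-only wheel was installed, which fits the reports above of being "stuck on CPU and not in CUDA".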
Is it possible install this on mac m1?
Seems my earlier comment was deleted for some reason, but while trying to download models i get an error message saying that there is no "request". I have no clue what that means, or how to try and fix it?
I'm also having the same problem :(
@@Ayralis type Python with the capital P
pip install requests
did you run the "pip install -r requirements.txt" line?
What's the minimum requirement of VRAM?
you are my best YouTuber ❤️
Am I missing something as I don't see where to get the installer.
this is not well explained... we need a specific python version and a specific cuda toolkit version (which you don't mention), otherwise it won't run at all even if you do all the steps perfectly
but it's not though, that's why we create the virtual env, even if you don't have cuda pytorch installed
@@Aitrepreneur i mean even when i have cuda and pytorch + virtual env created, did all steps, still not working
ok soo i had to install older version of python (3.10.11) thanks anyway@@Aitrepreneur
@@TREXYT Thank you for sharing your experience. I have installed the same version and everything is working fine.
so does it works on amd gpu?
Can RVC make someone sing, even if they weren't singing at all in the training data?
It says python hasn't been found whenever i use this "python -c "import torch; print(torch.__version__)" command
You need to install Python so that it is on your PATH. Although if it gives you ModuleNotFoundError, just move on to the step where you install torch, torchaudio and torchvision, because he's giving wrong information in that bit anyway: the check he does just shows whether you have torch installed and which version.
As far as I know there is no TTS for AMD cards, is there a way for me to play with TTS on my all AMD system ?
same, hoping it works anyway, testing it because why not
it works with amd
Well I'll be damned, Perfect !
Thanks a bunch
What did he say at 20:25? It sounds like: Uba-Wuga Tech Generation UI?
can it be installed on windows 10? im lost when you mention python, I'm not tech savvy, sorry for asking a lot of questions
Yes, it works on 10/11. You just need to install Git for Windows and an old version of Python; 3.10.6 works for all his videos. It's just not available on Python's main site, but you can find it with a web search. Then just follow the video and use the links in his "show more" under the vid.
Sorry, at this point the video is missing a lot of info on what needs to be installed to get this running. For example, I get the message that I need to install "Microsoft Visual C++ 14.0" and "Microsoft C++ Build Tools" to get this running. I installed them but still get the same message. And I get a lot of other error messages as well. I doubt more than 5% get this working.
I had all the same errors as you...after downloading the C++ build tools and installing the C++ package in "Workloads" I was able to get rid of those errors about building wheels....but then I got to the point where I try to open the webui with go-web.bat and it simply doesn't work....the command prompt window pops up but then it flashes a bunch of lines of text and closes super fast and no webui opens...haven't been able to catch what comes up in the box
This was helpful for me, thank you. I installed what sounded like what I needed but I did not get that workload file. @@whitedragon1337
what is workloads? :-) @@whitedragon1337
Yeah I get this too, I recorded it to lookback, the error is this:
ValueError: mutable default for field common is not allowed: use default_factory@@whitedragon1337
consider yourself lucky. I'm having to build Python 3.8 from source on a debian machine -_-
Does it work with voices in languages other than English?
You don't need to close the python environment?
We need to do voice-to-voice translation locally; we will dub games and TV series in our own languages.
Could you make a tutorial how can I use the my trained voice with a free TTS software, to read out loud a longer textual file
I show how to do that in the video...
my PC is stuck on python -m venv env. Nothing happens after I type this and press enter, any ideas?
same, I have same issue
Does this work for Macbook Pro? If not, can you make a tutorial for Mac 🙏🏽
Can we do it without gpu
how to do with it the tts2?
Why do we change the bat files?
Anyone know how to fix the 'NoneType' object has no attribute 'tobytes' error? Cheers
I found this error too, and over 40 people asking this question around, no one gave a working answer.
go web bat file is not launching any ui for me and I followed all the steps!
Try downloading the .7z file mentioned in the beginning of the video and extract it, then copy the "Runtime" and "Assets" folders to your RVC folder that you're running from. This worked for me, hope this helps.
@@magneticanimalism7419 i need rvc audio cloning installer.bat ..... how can i get it????
delete python and reinstall python 3.10.11 the newer versions break this but i just reinstalled it from scratch and it works so long as you have the old version of py
you couldnt mention the database before the training?
ModuleNotFoundError: No module named 'distutils'
Any idea how to fix this?
You most likely have python 3.12 installed, you need to remove it and reinstall version 3.9. The easiest way to do it is through the Windows Store
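The distutils error above fits the Python version: distutils was removed from the standard library in Python 3.12. Here is a quick sketch for checking an interpreter against what people in these comments report working. The REPORTED_OK set is an assumption taken from the comments (3.9 and 3.10.x), not an official requirement of RVC, and the function names are made up.

```python
import sys

# Assumption: versions commenters here report working; not an official list.
REPORTED_OK = {(3, 9), (3, 10)}


def reported_compatible(major: int, minor: int) -> bool:
    return (major, minor) in REPORTED_OK


def ships_distutils(major: int, minor: int) -> bool:
    # distutils was removed from the standard library in Python 3.12.
    return (major, minor) < (3, 12)


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    print(f"Python {major}.{minor}")
    print("reported working in the comments:", reported_compatible(major, minor))
    print("ships distutils:", ships_distutils(major, minor))
```

Running it with the same python that the RVC folder uses shows whether the interpreter is one of the versions people had luck with.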
I did all the steps, and in the end, by replacing the go-web.bat file, it loads the Windows command prompt with the line C:\RVC\Retrieval-based-Voice-Conversion-WebUI>CALL env\Scripts\activate and closes quickly. Please help me, Aitrepreneur... I spent a lot of time following the steps in the tutorial :(
does it work on mac?
does this work on Ubuntu?
if you are just looking to follow the tutorial and need a voice sample. Russel Brand puts up like a 30 min video of him talking to the camera like once or twice a day on his youtube channel . And its pretty uninterrupted.
supported languages?
Do you have any tutorials for text to speech?
Lmao, this is what I asked for in ths community voice!
can i use this voice in a chatbot? if so how?
You can actually ONLY click the "One Click Training" button at the bottom of the page because it basically repeats the functionality of the first 2 buttons. Definitely the way to go because it creates the necessary index file. Just a little time saver.
yes that what I use :)
I'm looking forward to when we can do this real time. finally my villians in my table top games can have a proper evil accent
Actually you can already do that. Last time there was an additional app inside RVC root that can load the model and use it in real time. There are other factors at play for this workflow and It has its drawbacks because of latency but it’s possible.
With a card with 24 GB of VRAM you basically can reach realtime speeds using the Okada voice changer tool. With 8 GBs I can get decent realtime conversion at roughly half a second to a second delay. Quality will generally be slightly worse than RVC, but with a good enough card and model it shouldnt be noticeable.
Is possible on mac
This is the second video in a row I can't follow because I get errors like no modules named torch and cannot find path.
Uninstall everything, install the latest version of python, learn to manage environments with conda
For Mac ?
I had the cmd go-web prompt that was downloaded close quickly. But then I edited it in vscode, and then added a pause at the end so it didn't immediately close. I saw that the file used env instead of venv like I did. Plus I also did *pip install -r requirements.txt* and after this it worked, this was the fastest I got one of these ai things going. At 250 epochs and a batch size of 20 per gpu my 4080 Super finished in 30 mins, so I might have to try out 500 epochs for better quality, its still pretty good af tho.
edit: Plus I forgot to mention I also had an issue with my audio until I converted it from mp3 to wav, like why tf doesn't it work for mp3 what is this
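For the mp3 issue mentioned above, converting to wav first can be scripted. This is a rough sketch that assumes ffmpeg is installed and on PATH (the description links an FFMPEG download); the helper names are made up, and why mp3 input failed in that case is not something this explains.

```python
import shutil
import subprocess
from pathlib import Path


def wav_path(src: str) -> str:
    # Same file name, with the extension swapped to .wav.
    return str(Path(src).with_suffix(".wav"))


def build_ffmpeg_cmd(src: str, dst: str) -> list:
    # -y overwrites dst if it already exists; ffmpeg picks the WAV
    # format from the .wav extension of the output file.
    return ["ffmpeg", "-y", "-i", src, dst]


def convert_to_wav(src: str) -> str:
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    dst = wav_path(src)
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True)
    return dst
```

Calling convert_to_wav("voice.mp3") would write voice.wav next to the original, which can then be used as training or input audio.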
Where are the files saved by default?
Did everything exactly as mentioned, and not open webui... py 3.10.6
I just get a error message:
D:\RVC_2\Retrieval-based-Voice-Conversion-WebUI>python -c "import torch; print(torch.__version__)"
Traceback (most recent call last):
File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'torch'
do you have torch installed? try uninstalling it and reinstalling using pip if you do
also, I don't know how this is supposed to work at all. We never installed pytorch, so why do we need to check what version we have installed, when we never installed any version?
@@ibis8566 was there an install torch step in the video? ;-) I guess he missed this part. Thanks for the quick help
@@Niffelheim This is later. First I have to find out what to put there, but I only got an error message, so I don't have anything to put there. I also don't get why "K" thinks that torch is already installed
@@Niffelheim D:\RVC_2\Retrieval-based-Voice-Conversion-WebUI>python -c "import torch; print(torch.__version__)"
2.2.1+cpu
D:\RVC_2\Retrieval-based-Voice-Conversion-WebUI>python -m venv env
D:\RVC_2\Retrieval-based-Voice-Conversion-WebUI>env\Scripts\activate
(env) D:\RVC_2\Retrieval-based-Voice-Conversion-WebUI>
I don't like installing multiple versions of python and all this voice stuff is like 3.10.12
thats what virtual environments are for
I keep different tools on different shelves in my garage, and I keep different tools in different conda environments on my PC.
@@markdaga1711 i dont need 8 hammers
@@markdaga1711 do you know why we need to install the same torch in the environment?
They got Conda for that
Please make a video about Diffusion Light AI, it create HDRI from a single Image. Thank you!!
Wow❤
Hey, if you have RTX 3080 10GB use 10 as your Batch size it’s way more quicker than using 20-40 batch sizes trust me 26m of voice with 100Epochs less than 30m, when trying try to lower your shared GPU memory vram as much as possible
Hi, your Discord link isn't working!
Thanks, I just updated it
Is there really no TTS engine that you could load a RVC model into?
XTTS-RVC-UI or tortoise-tts
@@avalsmayoreh in webui or silly tav?
can you do text to speech from polish politic and make him say something what you want in Polish language?
keep getting errors. note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for fairseq
Failed to build pyworld fairseq
ERROR: Could not build wheels for pyworld, fairseq, which is required to install pyproject.toml-based projects
Remove everything, then install Visual Studio; make sure to select Python and the one below it (I think it says something like CMake). Install those and then follow the steps he does.
Has anyone ever tried using the basic SillyTavern/oobabooga TTS and then running it through the w-okada voice-changer?
Because that way you could get any voice you wanted instead of the normal default TTS voices.
Im just curious what would happen, it could obviously become a complete disaster, but i still wanna know 🤣
Whats wrong with links???
cAN U RUN IN A WEBSITE???
The RVC link is not here?
same can't find it