So much respect to everyone who is making this work... the amount of problems I'm running into is insane, haha. I hardly know where to start after hours of being at this.
For those of you looking for the "weights" folder in the main RVC directory, as of RVC1006, it's inside the "assets" folder.
Nothing is placed here after training a model though. Do I manually copy the D_*.pth or G_*.pth over from logs, or something?
If I try that and click Refresh Voice List and Index Path, the new model appears in the Inferencing Voice list, but when I select it I just see a red 'Error' all over the UI: i.imgur.com/QNQUpmq.png
@@pingusmcdingus5 In my case, the .pth file is placed there automatically if training finished successfully without any errors. If that's not the case for you, there might be something wrong in the middle of the process. You might want to try retracing the steps or redoing it from scratch.
The one thing I did differently from this video is that my training audio is not split up into multiple short .wav files; I combined everything into a single 20-minute file. I've compared both the cut and uncut audio, and the result is much better with the uncut 20-minute file.
Thx for advice
thanks! :)
@@raykrislianggi Please, how did you combine your wav files into one? Thank you
@8:03 If your process fails when you try to process the input data, one possible explanation is that the path to your folder includes a space. That is what hung up my first couple of attempts. Make sure your file path doesn't include any spaces for easiest handling.
thank you! I took the spaces out of the song name and it worked for me!
Thank you m8
Still not working for me. It says :(
['trainset preprocess_pipeline_print.py', 'C:\\RVC-beta-0528\\RVC- beta0717\\voice\\me', '40000', '12', 'C:\\RVC-beta-0528\\RVC-
beta0717/logs/me', 'False']
C:\RVC-beta-0528\RVC-beta0717\voice\me/myself.m4a->Suc.
end preprocess C:\RVC-beta-0528\RVC-beta0717\voice\me/myself.m4a->Suc. end preprocess
Finally a tutorial that even I can understand. It's so stupid that most of the tutorials are made as if everyone were that tech savvy. Thank you so much.
Appreciate it 🤟🤟
I wish I could say the same. I'm just a singer. I want a program that installs, I hit the .exe file, it opens, I put the source files in and voilà, new voice. I don't know why that should be so hard.
@@Jarods_Journey I can't stress enough how important it is to absolutely tell people that the training process will take a long time. I thought my progress was just stuck but no, it's just taking a long while!
@@linuxtuxvolds5917 I will wait as long as it takes if it means I get to sound like a voice I really enjoy!
@@Jarods_Journey it says "No supported Nvidia GPU found, use CPU instead" but it still opened
12:42
1. Open file explorer to the folder that has a file whose path you want
2. Press Alt+D
3. Press End
4. Type a backslash \
5. Start typing the name of the file, look for the autocomplete with the correct name, press down arrow until the correct file is highlighted
6. Press Ctrl+C
Or you could Shift-Rightclick to unhide "Copy As Path" option
@@Optimus97 I prefer to use the mouse as little as possible
@@RobertJene I can kinda see what you're saying, especially with the delay of the context menu in Windows 10/11.
@@fluffsquirrel any keyboard sequence you do will save time not reaching for the mouse
@@RobertJene I think this is generally true, although the less sequences the better, if possible.
For those who would like to know about the harmony bit at 5:11:
Harmony is when there's more than one note being sung at the same time.
It's kinda like chords, but for vocals.
HP5 helps with separating harmony, but the voice will be less clear compared to HP2.
The newer RVC2 also has dereverb & deecho which I also highly recommend using to make the vocal separation even more clear for songs where the voice has a lot of reverb / echo.
I'd say just mess around with it a bit and choose to your liking depending on the song.
Anyways, have a nice day :D
Thanks for sharing ! Small tip: using cut/paste instead of copy/paste allows moving the folder instantaneously ;)
And saves HD space.
Thank you bro, this was the best tutorial so far on how to train the voices. So many other tutorials were not clear on how to set everything up and also get the index file, which was something I had trouble with for the longest time.
Quick tip: at 13:03 you can Shift + right-click, and another menu pops up where you can click on "Copy as path".
I'm going crazy with Jarod's channel 😂
I'm so far off the cliff with it that I've started rewatching old videos 😂
11:23 The .pth file doesn't appear for me; I have no clue what could be wrong.
@13:28 You pick a v2 index file; my drop-down box only has 3 different v1 files to choose from? It doesn't seem to create an index file when I train my voice.
Nice one, keep up the good work. Your instructions are very clear and helpful compared to others. 👍 ✨
Thank you sooo much, all the other tutorials were so confusing and this was simple and fast. I encountered some problems while running the RVC command prompt since I don't have a GPU, but I installed CUDA and Python and that fixed it. It's like you need to know programming and stuff nowadays, but this tutorial was easy, fast and simple. Keep up the good work.
Hello, thanks for the video. It piqued my curiosity and now I want to try RVC myself. Unfortunately, I'm running an AMD GPU (6800 XT), but upon checking the releases, an option for AMD users is present in updated0814v2. My problem now is that when I try to follow your steps, RVC does not detect my GPU. For example, at step 2b as in 8:21, the options to select a GPU are not present. The option to input a GPU index is there and I've tried putting in "0", "1", "2" and "0-1-2", but when pressing one-click training it says: "NO GPU DETECTED: falling back to CPU - this may take a while". Do you know a way for it to detect my GPU?
I'm not too sure unfortunately, you might have to check their GitHub issues area to see if anyone else is running into it.
Hi, did you find any solution for that?
8:11 It will not process data. It starts but then stops and there is an orange line around the output. Any reason why?
9:33 When I train embeddings for Stable Diffusion (image generation), I have it save an embedding file every 50 steps so I can check their loss and strength with scripts and test a few.
I've been finding with these speech models that the intermediate saves don't really perform better than the final model, so I just save the last one to save space. I haven't found one yet that has been overtrained.
Damn, as a complete beginner, coming to this channel to have it explained like this was really... interesting...
I have some questions. When I download other people's voice models, there is a file called something like traint.index that you have to use. The same goes for total_fea. I have also seen that there are .pth files in the log folder itself.
These should go into the logs folder underneath the "experiment" or "speaker" name that you want to use. So if the name is john, john.pth goes into weights, and the index goes into logs/, where you have to create a john directory and place the index inside.
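To make that layout concrete, here is a minimal sketch (not from the video) that copies a downloaded model into place. The paths and the speaker name are made up, and it assumes the default RVC folder layout; on newer builds the weights folder lives under assets\weights instead.

```python
from pathlib import Path
import shutil

rvc_root = Path(r"C:\RVC-beta")               # assumption: wherever RVC was extracted
speaker = "john"                               # the experiment/speaker name
downloads = Path(r"C:\Downloads\john_model")   # hypothetical folder holding john.pth + *.index

# The .pth model file goes into the weights folder
weights_dir = rvc_root / "weights"             # newer builds: rvc_root / "assets" / "weights"
weights_dir.mkdir(parents=True, exist_ok=True)
shutil.copy(downloads / f"{speaker}.pth", weights_dir)

# The .index file(s) go into logs/<speaker>/, creating the folder if needed
log_dir = rvc_root / "logs" / speaker
log_dir.mkdir(parents=True, exist_ok=True)
for idx in downloads.glob("*.index"):
    shutil.copy(idx, log_dir)
```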
@@Jarods_Journey But I mean the traint.index. And why is there a model in the log folder, and a detail file?
RVC is best for me thanks Jarod take care
Well convinced after the preview. Hope you can make a video on text-to-speech with your own voice soon. 😊
I have successfully trained a voice, but there is no index file in the voice name folder; the .pth file is in the weights folder. What to do... nice video.
There is a little mistake in this video which I want to point out: after you finish preprocessing and feature extraction, you say to click on One-Click Training. This is unnecessary because that button will do the preprocessing and feature extraction AGAIN, which you already did before. So when that's done, click on "Train Model" instead.
Thanks for the great tutorial! I found a couple of things that might be helpful to others. For extracting the archive I use the official 7-Zip software; it's free and open source and will save you some hassle. The next thing is regarding the batch size. I have a 3090 Ti, which has 24GB of VRAM, and I find a value of 32 uses 21.7GB of the VRAM and leaves a bit for OS-related stuff. You don't want to go overboard with a batch size of 40, or the GPU will start swapping to system RAM and significantly slow down training. Even if you have fast RAM, it's still an I/O cycle between the GPU and system RAM that you can avoid. I recommend looking at Task Manager or using a tool like nvidia-smi to check the GPU VRAM use, and experimenting with the batch size to find the best value for your card in order to get much faster training.
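Not from the video, but if you want to check headroom the way this comment suggests without opening Task Manager, a small PyTorch sketch can report free VRAM before you pick a batch size (assumes a CUDA build of PyTorch, which the RVC install already ships with):

```python
import torch

if torch.cuda.is_available():
    free_bytes, total_bytes = torch.cuda.mem_get_info(0)   # query GPU 0
    gib = 1024 ** 3
    print(f"VRAM: {free_bytes / gib:.1f} GiB free of {total_bytes / gib:.1f} GiB total")
    # Per the comment above: pick the largest batch size that still leaves
    # some headroom (roughly 1-2 GiB) so training never spills into system RAM.
else:
    print("No CUDA GPU visible to PyTorch")
```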
Finally, a tutorial that doesn't fly 5 miles over my head
If you can read this:
.pth files go in the folder "weights"
.index and others go to "logs" under the voice name ex: Logs\EthanWinters
If you want to move the folder faster, just rename the top folder, then cut and paste the lower one into the top level. When you cut and paste the contents, Explorer knows to only MOVE the folder, so there's no copy wait.
Can someone plz fix this error? Jarod, plz help. Error: ValueError: invalid literal for int() with base 10: 'voice'
I get this error when I do Process Data.
It's a step 2a error: it happens when I put my local URL into the path folder.
9:14 So after trying this by myself, I found out that if you select "No" at "Whether to save only the latest ckpt file", your disk may fill up after a while if you don't have much space and train many models.
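If you did save every checkpoint, a throwaway sketch like this can show how much disk the intermediate G_/D_ files in one experiment's logs folder are using (the path is hypothetical, based on the default RVC layout):

```python
from pathlib import Path

log_dir = Path(r"C:\RVC-beta\logs\my_model")        # hypothetical experiment folder
ckpts = sorted(log_dir.glob("[GD]_*.pth"))           # intermediate generator/discriminator saves
total_bytes = sum(p.stat().st_size for p in ckpts)
print(f"{len(ckpts)} checkpoint files, {total_bytes / 1024**3:.2f} GiB total")
for p in ckpts:
    print(f"  {p.name}: {p.stat().st_size / 1024**2:.0f} MiB")
```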
Hi, thanks for your video. Are there already some pre-trained models for RVC? Also, is there a reason you prefer to train locally rather than on Colab?
I'm not sure about fully pre-trained models, you'll have to take a look around the internet to see. Colab is a nightmare to work with for debugging, etc., and unless you wrote the code, trying to debug it isn't that fun. If I can work locally, I much prefer it, and my hardware allows for it.
If I don't have any problems but I want to keep training my model, I just do the same thing that you said at 10:50 but increase the epochs, right?
Correct :)!
What is that song playing @2:15? Sounds like it was created on a 32-bit SNES synth.
The artist is しゃろう (sharou) with the song here: ruclips.net/video/JAC2KCbbvmc/видео.html
@@Jarods_Journey thank you
I'm going to try that voice cloning though, sounds interesting. Especially if I can modify the voice a bit and not make it exactly like the artist's.
@@guytisdale Yeah, if you pitch-shift it or do your own filtering, you could get a completely new voice.
@@Jarods_Journey awesome
Great tutorial! Unfortunately, I seem to be having a problem with step 2a: My attempt to process the data was unsuccessful, and the output message came up blank! What am I doing wrong?
This happened to me too but it worked when I removed spaces from my experiment name. If it's not that then idk
I received this:
"Unfortunately, there is no compatible GPU available to support your training."
How do I solve this problem?
For those who are having trouble choosing where the download goes: you can right-click it and choose "Save link as".
I'm having a lot of trouble... opening the go-web file doesn't show the language option, then it prints lots of stuff and at the end says to press any button to continue. After I do that it closes, and when I go to localhost:7897 it says I can't reach the page.
Thanks mate, all of the other tutorials I looked up were too complicated. A month ago I watched your so-vits-svc fork tutorial too. You are one of the best teachers in the world; I can understand your videos perfectly and my native language isn't even English!
Since I'm relatively new to this, how would you use RVC for just cloning a voice? Do I just leave out the parts in model inference about the pitch and music-related things?
That was very nice to follow along, thanks!
Any interest in showcasing Bark AI? I think it's a pretty interesting way of doing TTS, but I don't think it's very well explained in many places, or a lot is left out, which kinda confused me, especially when it comes to getting decent results. I do think the prompting idea is really intriguing though.
My quick experience with Bark is that it's still in very early stages, excited to see where it goes though! I might have to do a more thorough test of it, but Tortoise TTS is by far the most promising and easiest to use.
@@Jarods_Journey That's definitely true. Tortoise is incredible! Really hope Bark will update or get some cool successors with a similar but more stable approach. Making it generate laughs, sighs, etc. is spooky and very fun.
@@krysidian I'm definitely interested in the laughing part. That's one additional touch that is lacking in AI voices, and when that gets fleshed out, things are gonna get interesting xD!
Am I missing something? Did he go over how to add the newly converted vocals back to the instrumental?
Great job! I have a question for you... How much audio do you recommend for generating the model, and is it a problem if the audio has some background sound?
10 minutes or more of high-quality audio. You need to split the background from the audio samples; you can check my latest video on that.
This absolute legend amongst men
Thanks for your very good tutorial Jarod.
I still have a question.
What do the values "loss_disc", "loss_gen", "loss_fm", "loss_mel" and "loss_kl" mean when training? Which values are indicating a good trained model? Are lower values better?
A downward slope on the graph is better, i.e. lower values. You wanna look at total loss and preferably train until that's as low as possible.
10:23 After completing the training process, I received an error message saying that the specified file or directory could not be found. Specifically, the error stated that the file named "trained" (or a similar file) could not be located. What did I do wrong?
Edit: It's ok now, I figured it out.
what did you do?
@@jofejofeson9932 I reinstalled the whole file again and repeated the process, and then it was fine.
Thanks a lot for the video! One question: 40kHz is a pretty unusual sample rate, so I want to use 48kHz (which now also seems to work with v2). Also, I slice up the training vocals manually with a DAW (Cubase) into up-to-10-second snippets. Do I have to export the snippets at 48 kHz already from the DAW, or would the usual 44.1 kHz be alright and only the output (the resulting file) would be in 48 kHz?
It'll be fine, I believe RVC already resamples your audio to the correct SR using ffmpeg. I actually haven't verified this, but since it handles my datasets when using either 40k or 48k, it doesn't really matter :)
@@Jarods_Journey Thanks for your fast reply! So there's a tiny bit of hope that if you feed it 48kHz already, it might skip the resampling, which could probably result in higher quality output 🙂.
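If you'd rather not rely on RVC's internal resampling, here is a minimal sketch that pre-converts DAW exports to 48 kHz with ffmpeg (it assumes ffmpeg is on your PATH; the folder names are made up):

```python
import subprocess
from pathlib import Path

src_dir = Path(r"C:\datasets\vocals_44k1")   # hypothetical 44.1 kHz exports from the DAW
dst_dir = Path(r"C:\datasets\vocals_48k")
dst_dir.mkdir(parents=True, exist_ok=True)

for wav in src_dir.glob("*.wav"):
    out = dst_dir / wav.name
    # -ar sets the output sample rate; -y overwrites an existing output file
    subprocess.run(["ffmpeg", "-y", "-i", str(wav), "-ar", "48000", str(out)], check=True)
```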
While making a dataset, if I am taking vocals from a singer, do I need to keep the key of the vocals the same? Or can I add multiple audios from different songs to train the model of a particular singer?
As long as it's the same singer, you can add as many songs from them as you like.
@@Jarods_Journey thank you for clarifying 🙏🏻
Hi, thanks for the tutorial. I got stuck at the training process. I received a message saying this:
RuntimeError: The expanded size of the tensor (12800) must match the existing size (4040) at non-singleton dimension 1. Target sizes: [1, 12800]. Tensor sizes: [4040]
Before I got this message, I was getting the "CUDA out of memory" error, even though I have 32GB of RAM. I cut the audio samples into smaller bits under 10 seconds, and now I have the expanded size of the tensor error. What did I do wrong?
same issue
It means that if it finishes it's going to take up too much space, so just turn the batch size down to fix it.
CUDA memory is VRAM. It's different from regular RAM.
I don't see a pitch extraction algorithm; everything else is there. Any solutions?
I'm having problems at the 10:01 part, it says:
"FileNotFoundError: [WinError 3] The Sytem cannot find the specified path C://Users//User//Downloads/RVC-beta-v2-0528/logs/Character/3_feature256"
It's frustrating, honestly.
This means there was an issue with the feature extraction step and it didn't finish before trying to train. This step can take anywhere up to an hour to complete depending on system specs and sample size.
Write 0 in the GPU indexes.
@@denblindedjaligator5300 Yeah, I did this some time after I made this comment; it just stops as if nothing happened.
To be honest, I'm just training the models in Colab and making the audios with the local version; I'm fine with that.
Have you tried the realtime voice changing? I've been trying to get that working but had some issues. I think it's an svc fork though.
Have not gotten to try that yet on either repo unfortunately :/
Thank you very much! My Inferencing voice list is empty. Where do I put the downloaded voice models?
And epochs: is it worth using 1000 epochs instead of 200 to increase the quality?
I believe downloaded voice models should go into the weights folder, as long as they're from RVC. As for epochs, if you get good results at 200, I don't see much reason to go to 1k. If you have enough voice samples, 200 should be relatively good. I would listen to them per 100 epochs and see what you think is best (as it's always dependent on your data and how much of it you have)
Thank you for this video!!
When I try to train I get this error:
sr = int(sys.argv[2])
ValueError: invalid literal for int() with base 10: 'Yona\\Desktop\\RVC-beta\\RVC-beta-v2-0528\\voice\\Me'
Do you know what I'm doing wrong?
RVC: Invalid Literal or File Not Found error
For the voice/me folder, is it just audio recordings of my own voice? If so, how many do I need to include and what length? Thanks in advance, you're a massive help dude.
Yup, as shown, make sure the folder contains all of the audio files without subfolders. Then just use that folder's path and you should be fine.
Followed the guide but ended up with completely different files, and I have no clue how to install the damn program.
At the 02:00 mark I couldn't follow along anymore.
Thanks for your video. There are a few other videos on the subject and I find that yours is better explained; nevertheless, I still have to deal with several errors. First I had "Cuda out of memory", so I lowered the batch size to the minimum; now I have another error, which is: "RuntimeError: GET was unable to find an engine to execute this computation". My audio samples are a bit long (a few minutes) and they are in 32-bit float at 44.1kHz, but I only have 4 samples...
Should I divide them into several parts? Thanks in advance.
Edit v1: I tried many times, also cutting into different parts and reducing the size, and I still get the RuntimeError even with 2 small samples (16-bit, 44.1kHz) of less than 10 seconds… I don't understand.
Edit v2: Also, I wonder if you know how to do text-to-speech with this tool?
You might have to reinstall, or make sure the CUDA version being installed is compatible with your GPU.
I'm having the same issue.
At 13:39, when I need to hit convert, it just says error. Could that be because I have a 4gb gpu? If so, is there anything I can do to make it work?
Ah, depends. If it says cuda out of memory then yeah, you'll need smaller batch sizes, shorter data, or a larger GPU. I actually don't remember if I responded to you somewhere else though lol
Thx for this good tutorial.
Unfortunately I had an error after 2 s and I don't understand what I did wrong.
if data.dtype in [np.float64, np.float32, np.float16]:
AttributeError: 'NoneType' object has no attribute 'dtype'
Another commenter had this issue but I haven't encountered it yet and haven't found a way to reproduce it. You might be able to find others who are looking to get this issue resolved here:
github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues?q=is%3Aissue+AttributeError%3A+%27NoneType%27+object+has+no+attribute+%27dtype%27+is%3Aopen
Could be related to the training process, trying to find files, etc
@@Jarods_Journey I applied the advice from your good short "File Not Found: feature_768", and I managed to avoid this error now. Thx a lot. I started following your channel in the last few days and your subjects are very interesting. Great job.
3:52 My MacBook M2 cannot open it with the Terminal app. Can you help me check this? Thank you so much.
Why the "Unfortunately, there is no compatible GPU available to support your training." appeared in "GPU Information"?
Thank you for taking the time to make this tutorial! It was so easy to follow. :) Could I ask you to make a comment or tutorial on how to re-train a previously trained voice? I can't find that information anywhere.
Let me know if this was what you were thinking about: ruclips.net/user/shortseO0gvi_RXTc?feature=share
@@Jarods_Journey That's exactly what I was looking for, tysm!
4:13 I can't seem to get into the localhost page. Also, is localhost necessary to make the custom vocal models? I haven't really gone through the whole video, more or less skimmed it just to see how to get the custom voice models. -_-
-_- To get the localhost page, you'll need to start it via the Python script.
It worked. Thanks so much!
When I click Process Data it doesn't fill in anything, and cmd comes back with "ValueError: invalid literal for int() with base 10: 'Changer'". What do I do?
Remove spaces from your path or folder name; this is what causes the issue.
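To see why a space causes the "invalid literal for int()" error: the preprocessing script receives its settings as positional command-line arguments, so an unquoted space in the dataset path shifts everything one slot and a path fragment lands where the sample rate should be. A toy illustration of the failure mode (not RVC's actual code):

```python
# The script is effectively launched like:
#   python preprocess.py C:\RVC\my voice 40000 12
# The space splits "my voice" into two arguments, so argv looks like this:
argv = ["preprocess.py", r"C:\RVC\my", "voice", "40000", "12"]

try:
    sr = int(argv[2])                 # the script expects the sample rate here
except ValueError as err:
    print(err)                        # invalid literal for int() with base 10: 'voice'
```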
I got something like this:
result = torch._C._nn.leaky_relu(input, negative_slope)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 10.00 MiB (GPU 0; 6.00 GiB total capacity; 5.27 GiB already allocated; 0 bytes free; 5.31 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
What does it mean? How can I fix it?
Man, same, I'm facing this same problem and I have no idea how to fix it.
same
It means that if it finishes it's going to take up too much space, so just turn the batch size down to fix it.
@@gabrielmorgan3369 I turned it down to literally 4, bro, and it still ain't working.
Same, help please!
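For what it's worth, the environment variable that the error message itself mentions can be set before RVC launches; combined with a lower batch size and shorter clips it sometimes helps with fragmentation, though it is not a guaranteed fix. A small sketch:

```python
import os

# Must be set before torch is imported (e.g. at the very top of the launch
# script), or as a system-wide environment variable before running go-web.bat.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# The other levers mentioned in this thread: lower the training batch size
# and keep each dataset clip short, so less audio sits in VRAM at once.
```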
What does 3:01, saying "then what I did was just move all of these from the zip folder into the folders that I installed with VS Code", mean? Where can I find the folder?
That was just an insert for people who want to install it manually. I don't really recommend you do it this way, just download the .7zip file that is shown in the vid in the releases area and that includes everything you need.
Here is another comment I left for another user who wanted to do this:
"So when you do git clone etc., the raw repo doesn't have any pytorch files inside of pretrained, pretrained_v2, uvr5_weights, or hubert_base. So what you have to do is move those models into those folders (doesn't really matter where you get them from, but I just took them from the zip folder because it was easier than downloading each individually.)
For hubert_base, it literally just sits in the parent directory of the cloned repo. The other ones go into their respective folders."
@@Jarods_Journey Thanks for your video and answering.
I failed to get the "myself" .pth file in the weights folder after one-click training. Should I restart the process at 7:19, and if so, do I need to delete certain files?
Correct me if I'm wrong but I think this is the error? I'm not familiar with coding.
RuntimeError: Calculated padded input size per channel: (2). Kernel size: (3). Kernel size can't be greater than actual input size
98_1.wav-contains nan
9_2.wav-contains nan
all-feature-done
I would rerun the preprocess again for all of your data and then try again, but check this out here: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues/484
I have a weaker GPU (GTX 1660 Ti) and it's taking about half an hour for each epoch. I put the settings to match the recommended starting settings (at 9:13). Is this normal? Thanks
I can't find the .pth file...
13:40 I'm getting an error when I try to convert my audio, does anyone have an idea why? I have the vocals in a wave file and I have the correct location, as well as my inferencing voice.
3:00 Sorry, I don't know what you did there. If you extracted the archive, aren't the folders already where they are supposed to be?
If you want to install it with python, you run those lines and then you'll still need to download all of the pretrains from hugging face. Because I don't like downloading things 1-by-1 as I couldn't find a download entire folder option on hugging face (I'm a hugging face noob), I just downloaded the zip and moved them over lol.
@@Jarods_Journey Yeah, same here when getting the updated ControlNet models for Stable Diffusion; there's over a dozen, all of them gigs of data.
I can't figure out how to batch download them.
But that one part of the tutorial is lost on me....
Tell me, do I just unzip the contents of the ZIP:
RVC-beta.7z
is that it? Or do I have to move its folders around?
So when you do git clone etc., the raw repo doesn't have any pytorch files inside of pretrained, pretrained_v2, uvr5_weights, or hubert_base. So what you have to do is move those models into those folders (doesn't really matter where you get them from, but I just took them from the zip folder because it was easier than downloading each individually.)
For hubert_base, it literally just sits in the parent directory of the cloned repo. The other ones go into their respective folders.
@@Jarods_Journey thanks, I copied this to my notes
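A quick, hedged way to sanity-check a manual (git clone) install: the sketch below just confirms the pretrain folders are populated and that hubert_base is sitting in the repo root. The repo path is an example.

```python
from pathlib import Path

repo = Path(r"C:\Retrieval-based-Voice-Conversion-WebUI")   # the cloned repo root

# Folders that should contain the model files copied over from the zip / Hugging Face
for folder in ("pretrained", "pretrained_v2", "uvr5_weights"):
    count = len(list((repo / folder).glob("*.pth")))
    print(f"{folder}: {count} .pth file(s)")

# hubert_base sits directly in the repo root, not inside a subfolder
print("hubert_base present:", any(repo.glob("hubert_base*")))
```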
How can I change the localhost site to English? (It opens in Turkish for me.)
Why the "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 4: invalid start byte" error appeared in cmd.exe after I clicked "One-click training"?
Can you do a video on how to update RVC to newer versions when applicable?
Is 1000 epochs overkill? Will it have diminishing returns compared to just keeping it up to 300? I really don't see a standard recommended epoch total anywhere, the answer varies. I usually use 500, but I honestly don't know if that's fine since I just use RVC for SillyTavern and haven't tried it just on itself yet, hence I don't know how to evaluate if the results are better or not .___.
Hey, I am getting "cuda out of memory. tried to allocate 20.00 mib (gpu 0; 4.00 gib total capacity; 2.88 gib already allocated; 0 bytes free; 2.90 gib reserved in total by pytorch) if reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. see documentation for memory management and pytorch_cuda_alloc_conf" as an error. Can you help me navigate through this?
Same error, did you find any fixes?
When I click on the .bat file it says it was not possible to find the specified path, so when I try to open localhost in the browser it gives an error...
7:10 Do these audio files that you train with have to be 10 seconds or shorter, like SVC and Tortoise need?
I haven't tested it, but I'm assuming that if not, you'll run out of VRAM, as it'll have to load the entire audio file into memory.
When I convert, I get an error.
I don't see any model.pth in weights folder...
For some reason it doesn't always create a .pth file, so just in case, click on Train Feature Index for the model you're creating and it might solve your issue.
Great one! Finally works!
"The expanded size of the tensor (12800) must match the existing size (0) at non-singleton dimension 1."
Anyone know a fix to this problem?
Yeah bro, I recently had this problem. It happens when you try to train the AI with data that isn't the data it originally started training on. To solve it, create a new AI by renaming the experiment, then load the new data you want to use at step 2a and process the data. Then go to step 2b and extract the features normally. Then go to the drive where you keep your RVC files, open the logs folder and look for the folder of the old AI, copy all the files that start with D and G, and paste them into the new folder for your new AI. Once that's done you can train as normal and everything should work.
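A rough sketch of the checkpoint-copying step described above, using made-up experiment names and the default logs location; steps 2a/2b for the new experiment still have to be run in the UI first:

```python
import shutil
from pathlib import Path

logs = Path(r"C:\RVC-beta\logs")      # assumption: default logs folder
old_exp = logs / "my_voice_v1"        # the experiment you already trained
new_exp = logs / "my_voice_v2"        # the new experiment, after Process data + Feature extraction

# Copy the generator (G_*) and discriminator (D_*) checkpoints so the new
# training run resumes from the old weights instead of starting over.
for ckpt in list(old_exp.glob("G_*.pth")) + list(old_exp.glob("D_*.pth")):
    shutil.copy(ckpt, new_exp)
    print("copied", ckpt.name)
```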
Thanks. I just have to make voice samples. I guess I am supposed to sing something, is that correct? The UVR software works great; I was able to stem Suno AI. I'm also looking at Jen Music AI and Lalal.ai, which uses celebrities. This was more intense than what I expected. I see that Mac has a download app. I just found an app on the Google app store. I will look through your other videos for more lessons. Thanks.
At Model inference when I try to convert, I get AttributeError: 'NoneType' object has no attribute 'dtype'
How do I fix this?
Check out this issue here: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues/529
As well, you might be able to find others that have had this issue on the main issues tab here: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues?q=is%3Aissue+is%3Aopen+nonetype
@@Jarods_Journey thanks so much!!!!
@@Rainbowgunsh were you able to fix this issue?
@@Jurian0 no :(
@@Jurian0 I fixed the issue. You need to add ".wav" to the end of your file path (if it's a wav). EXAMPLE: \instrument_Balling song.mp3_10.wav
Jarods Journey Why can't we just upload, for example, an existing split song file from inside the folder that is just the singing voice with no music? Also, why copy and paste the whole address? Please answer, because I don't usually get a response when I ask a simple question.
What are your thoughts on Applios RVC Fork? Ever consider making a video with it?
Hello, I would really appreciate if you help me with this issue I'm having. Every time I try to convert the music file into the vocals and instrumental (the process you start here 5:55), I always get this error message at the end. Can you please help me resolve this issue?
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Hmm, your graphics card might not be compatible with the version of PyTorch running; it might be too old. What GPU do you have?
@@Jarods_Journey I have an NVIDIA GeForce GTX 650
@@sigh7731 Ah gotcha, well the possible fix for this is a little too involved and I'm not even sure if it would work for RVC, so you may be out of luck. The only options you have are to run on CPU, upgrade your GPU, or run RVC via Google Colab.
Here is the article that references the out of date GPU issue: discuss.pytorch.org/t/solved-pytorch-no-longer-supports-this-gpu-because-it-is-too-old/15444
@@Jarods_Journey Ok, thank you!
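If you suspect the "GPU too old" case, here is a quick check of what PyTorch actually sees; the exact compute-capability cutoff depends on the PyTorch build, so treat the printed numbers as something to compare against the thread linked above:

```python
import torch

print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print("GPU:", torch.cuda.get_device_name(0))
    print(f"Compute capability: {major}.{minor}")
    # Recent prebuilt PyTorch wheels drop support for very old compute
    # capabilities; a card like a GTX 650 falls below that cutoff.
```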
Hello again! Do you know what the issue could be when the preprocess stops going in the middle? No error message or anything shows up, after successfully processing many of the vocal samples it just stops going.
Check the logs folder; it should have 0 and 1 folders, and if there are contents in there, it finished. I'm not sure it can stop in the middle; it would output an error.
@@Jarods_Journey Ty for taking the time to reply again! The logs folder did have 0 and 1 in it and the feature extraction worked, so I tried to train the model but it never progressed past step 1 (ie it never reached the epoch count). I've trained several models before with no problem, so I'm not sure what the issue is. I'll try to experiment a bit and see if I can figure out what the issue is, and if I figure it out I'll report back.
Hey man, thanks for the tutorial. I was wondering how to match the key of the instrumental to the output voice? I converted a male song to a female cover, but I don't know how to change the instrumental pitch to match with the female voice.
Thanks for the videos, they are fascinating.
Hello Jarod, after I put in the directory for the voice for voice training and clicked "process data", I got an error in cmd.exe: ValueError: invalid literal for int() with base 10:
I have a 3070 Ti.
same man
Delete any spaces in your path; this error usually comes from that.
@@Jarods_Journey What do you mean by path? I'm really a noob at this.
thanks for the reply
@@viking6985 check this out: ruclips.net/user/shortsUbPMhzZuE9I?feature=share
Thanks bro, generous sharing! One quick question: when we restore the previous model, how can we continue the training? Do we need to go through all of steps 1 to 3? Should we update the "Load pre-trained base model G path"?
Check this short to see if it answers your question!
ruclips.net/user/shortseO0gvi_RXTc?feature=share
@@Jarods_Journey thank you so much!!
6:27 Did I miss something? I use my own recorded voice (Harvard sentences) but I cannot train it in Gradio. {{Do I need to use the voice that I trained in so-vits-svc-fork?}} If yes, I need to download files from my folder "me", yeah? Or just use my recorded voice?
In the video, you have to put in the path to where all of your audio files are, in this case, your recorded voices. In the folder you point the path to, all your files need to be in there.
Hey Jarod, any instructions on how I could get this installed on my Mac?
You'll have to install it manually by cloning the GitHub repository and downloading the weights yourself. A guide on how you might go about it is here: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/main/docs/README.en.md but I recommend you don't go with Poetry and just stick with conda or venv.
Dude, you are amazing! Thanks for your great work!
Thank you for the video, it's really informative, but I have an issue: when training the voice it doesn't generate ".pth" files in the weights folder. Any way to fix that?
The checkpoints are under logs\[YourModelName]; however, if you copy them to assets\weights it won't load them properly, so ¯\_(ツ)_/¯.
For me the first epoch stays at 0% and does not continue. Why could this be? It gives messages like "max value is tensor(1.0717)".
Thanks for sharing. But now I've run into a problem: could I just use pretrained models instead of training models myself? On the RVC WebUI, I couldn't figure out how.
Stuck at 8:10; it's blank when I click Process Data. All my data files are .wav and short.
UPDATE: After converting the files (even though they were already wave), then putting in the path and clicking Process Data, it worked.
Darn, I was trying to train the voice but it seemed stuck on "Reducer buckets have been rebuilt in this iteration".
Hmm, I've never come across this, if you waited, did it ever finish through an epoch?
@@Jarods_Journey I left it an hour but nothing changed. Not sure what settings I should have tried. I have a Ryzen 7 5800X 8-core processor, a ROG Strix X570-F Gaming motherboard, and an RTX 3070 GPU?
@@Hestia3332 Ah, this should be able to run it. Try cutting your dataset down so that it only totals 10 minutes or less. You might also wanna check the GitHub issues tab to see if there are any open issues.
@@Jarods_Journey I've only just downloaded this software, so I'm not too sure what you mean by cutting the dataset?
Hi, I'm not getting the .pth file in the "weights" folder, so my "Inferencing voice" list remains empty with nothing listed. I don't know what could be wrong, as I've followed all the steps you listed in the video...
This means the model never finished. Either something happened during training or it never began, check your console to see if there were any errors.
@@Jarods_Journey It gives me no errors; I even give the program enough time, and by the end of every process it says that all is done.