AI Voice Cloning for Singing with RVC - Guide and Set-up

  • Published: 29 May 2023
  • Links referenced in the video:
    RVC Github - github.com/RVC-Project/Retrie...
    Curate and Record Data Samples - • Complete Guide: AI Voi...
    Download UVR - • Complete Guide: AI Voi...
    Come join The Learning Journey!
    Discord - / discord
    Github - github.com/JarodMica
    TikTok - / jarodsjourney
    If you found anything helpful, please consider supporting me and the content I am trying to produce!
    www.buymeacoffee.com/jarodsjo...
    Hardware for my PC:
    Graphics Card - amzn.to/3pcREux
    CPU - amzn.to/43O66Ir
    Cooler - amzn.to/3p98TwX
    RAM - amzn.to/3NBAsIq
    SSD Storage - amzn.to/42NgMFR
    Power Supply (PSU) - amzn.to/3NBAsIq
    PC Case - amzn.to/447499T
    Mother Board - amzn.to/3CziMXI
    Alternative prebuilds:
    Corsair Vengeance i7400 - amzn.to/3p64r22
    MSI MPG Velox - amzn.to/42MnJHl
    Cheapest and minimum specs recommended:
    Cyberpower 3060 - amzn.to/3XjtZoP
  • Science

Comments • 912

  • @TantuBeats
    @TantuBeats 9 months ago +47

    so much respect to everyone who is making this work.. the amount of problems I'm running into is insane, haha. I hardly know where to start after hours of being into this.

  • @wektorus
    @wektorus 1 year ago +201

    Finally a tutorial that even I can understand. It's so stupid that most tutorials are made as if everyone were that tech savvy. Thank you so much.

    • @Jarods_Journey
      @Jarods_Journey 1 year ago +26

      Appreciate it 🤟🤟

    • @smokinmoose2
      @smokinmoose2 1 year ago +17

      I wish i could say the same. I'm just a singer. I want a program that installs, I hit the .exe file, it opens, I put the source files in and voila, new voice. Don't know why that should be so hard.

    • @linuxtuxvolds5917
      @linuxtuxvolds5917 1 year ago +7

      @@Jarods_Journey I can't stress enough how important it is to absolutely tell people that the training process will take a long time. I thought my progress was just stuck but no, it's just taking a long while!

    • @LovelyNyx7
      @LovelyNyx7 1 year ago +3

      @@linuxtuxvolds5917 I will wait as long as it takes, if it means I get to sound like someone whose voice I really enjoy!

    • @paleguywithdonuts
      @paleguywithdonuts 11 months ago

      @@Jarods_Journey it says "No supported Nvidia GPU found, use CPU instead" but it still opened

  • @luqmanhaqim97
    @luqmanhaqim97 1 year ago +19

    Nice one, keep up the good work. Your instructions are very clear and helpful compared to others. 👍 ✨

  • @TheDailyMemesShow
    @TheDailyMemesShow 10 months ago

    I'm going crazy with Jarod's channel 😂
    I'm that off the cliff with it that I'm running into rewatching old videos😂

  • @raykrislianggi
    @raykrislianggi 8 months ago +19

    For those of you looking for the "weights" folder in the main RVC directory, as of RVC1006, it's inside the "assets" folder.

    • @pingusmcdingus5124
      @pingusmcdingus5124 7 months ago +1

      Nothing is placed here after training a model though. Do I manually copy the D_*.pth or G_*.pth over from logs, or something?
      If I try that and click Refresh Voice List and Index Path, the new model appears in the Inferencing Voice list, but when I select it I just see a red 'Error' all over the UI: i.imgur.com/QNQUpmq.png

    • @raykrislianggi
      @raykrislianggi 7 months ago +1

      @@pingusmcdingus5124 In my case, the .pth file is placed there automatically if it successfully finished the training without any errors. If it's not the case for you, there might be something wrong in the middle of the process. You might want to try retracing the steps or redo it from scratch.
      The one thing I did differently from this video is that my audio file for training is not split up into multiple short .wav files, but I just combine them into a single 20-minute file. I've compared both the cut and uncut audio and the result is much better with the uncut 20-minute audio.

    • @realon
      @realon 6 months ago

      Thx for advice

    • @ohheyvoid
      @ohheyvoid 4 months ago

      thanks! :)
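
The single-file approach described in this thread can be tried without a DAW. The following is a minimal sketch using Python's standard `wave` module. It assumes every clip is an uncompressed WAV file with the same channel count, sample width, and sample rate. The function name `concatenate_wavs` is hypothetical and not part of RVC.

```python
import wave


def concatenate_wavs(clip_paths, output_path):
    """Join WAV clips end to end into one file and return the frames written."""
    clip_paths = list(clip_paths)
    if not clip_paths:
        raise ValueError("need at least one clip to concatenate")

    expected_format = None
    total_frames = 0
    with wave.open(str(output_path), "wb") as out:
        for clip in clip_paths:
            with wave.open(str(clip), "rb") as src:
                clip_format = (
                    src.getnchannels(),
                    src.getsampwidth(),
                    src.getframerate(),
                )
                if expected_format is None:
                    # The first clip decides the format of the output file.
                    expected_format = clip_format
                    out.setnchannels(clip_format[0])
                    out.setsampwidth(clip_format[1])
                    out.setframerate(clip_format[2])
                elif clip_format != expected_format:
                    raise ValueError(
                        f"{clip} has format {clip_format}, expected {expected_format}"
                    )
                frame_count = src.getnframes()
                out.writeframes(src.readframes(frame_count))
                total_frames += frame_count
    return total_frames
```

Whether one long file actually trains better than many short clips is this commenter's observation, not something the sketch can confirm; it only makes the comparison easy to repeat.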

  • @stevecommand77
    @stevecommand77 1 year ago +6

    Well convinced after the preview. Hope you can have video on text to own vocal speech soon.😊

  • @paarthsingh
    @paarthsingh 1 year ago +2

    Can someone please help me fix this error? Jarod, please advise. Error: ValueError: invalid literal for int() with base 10: 'voice'
    I get this error when I click process data.
    It happens in step 2a, when I put my local folder path into the path field.

  • @solm8212
    @solm8212 5 months ago +1

    thank you sooo much, all the other tutorials were so confusing and this was simple and fast, encountered some problems while running the rvc command prompt since i dont have a gpu, but i installed cuda and python and that fixed it. its like now you need to know programming and stuff but this tutorial was easy, fast and simple. keep up the good work.

  • @JeanIbarz
    @JeanIbarz 7 months ago +3

    Thanks for sharing ! Small tip: using cut/paste instead of copy/paste allows moving the folder instantaneously ;)

  • @SplicerTv
    @SplicerTv 11 months ago +8

    Thanks for the great tutorial! I found a couple of things that might be helpful to others. For extracting the archive I use the official 7-Zip software; it's free and open source and will save you some hassle. The next thing is the batch size. I have a 3090 Ti, which has 24GB of VRAM, and I find a value of 32 uses 21.7GB of the VRAM and leaves a bit for OS-related stuff. You don't want to go overboard with a batch size of 40, or the GPU will start swapping to system RAM and significantly increase training time. Even with fast RAM, it's still an I/O cycle between GPU and system RAM that you can avoid. I recommend looking at Task Manager or using a tool like nvidia-smi to check the GPU VRAM use, and experimenting with batch size to find the best value for your card in order to get much faster training.
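
The VRAM check suggested above can be scripted. This is a minimal sketch, assuming an NVIDIA driver with `nvidia-smi` on the PATH; the helper names are hypothetical. It reads used and total memory in MiB so you can see how much headroom a given batch size leaves.

```python
import subprocess

# Real nvidia-smi query flags; values come back as plain MiB numbers.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_line(line):
    """Turn a line such as '21700, 24564' into (used_mib, total_mib)."""
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total


def headroom_fraction(used_mib, total_mib):
    """Fraction of VRAM still free, between 0.0 and 1.0."""
    return (total_mib - used_mib) / total_mib


def vram_usage():
    """Return one (used_mib, total_mib) tuple per GPU reported by nvidia-smi."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return [parse_memory_line(l) for l in result.stdout.splitlines() if l.strip()]
```

Calling `vram_usage()` while training runs, then lowering the batch size if the headroom is close to zero, follows the same trial-and-error process the comment describes.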

  • @ScorgeRudess
    @ScorgeRudess 10 months ago

    Dude, you are amazing! Thanks for your great work!

  • @arhythwrith
    @arhythwrith 11 months ago +40

    For those who would like to know about the harmony bit at 5:11:
    Harmony is when there's more than one note being sung at the same time.
    It's kinda like chords but for vocals.
    HP5 Helps with separating harmony but it will be less clear on the voice compared to HP2.
    The newer RVC2 also has dereverb & deecho which I also highly recommend using to make the vocal separation even more clear for songs where the voice has a lot of reverb / echo.
    I'd say just mess around with it a bit and choose to your liking depending on the song.
    Anyways have nice day :D

  • @the3fe245
    @the3fe245 11 months ago +3

    thanks mate, all of the other people i looked up as tutorials were too complicated, a month ago i viewed your so vits svc fork tutorial too, you are one of the best teachers in the world, i can understand your videos perfectly and my native language isnt even english!

  • @matthewedwards904
    @matthewedwards904 1 year ago +28

    @8:03 If your process fails when you try to process the input data, one possible explanation is that the path to your folder includes a space. That is what hung up my first couple of attempts. Make sure your file path doesn't include any spaces for easiest handling.

    • @Hestia3332
      @Hestia3332 11 months ago +1

      thank you! I took the spaces out of the song name and it worked for me!

    • @Primesky
      @Primesky 11 months ago

      Thank you m8

    • @ChaseEverything
      @ChaseEverything 9 months ago

      Still not working for me. It says :(
      ['trainset preprocess_pipeline_print.py', 'C:\\RVC-beta-0528\\RVC- beta0717\\voice\\me', '40000', '12', 'C:\\RVC-beta-0528\\RVC-
      beta0717/logs/me', 'False']
      C:\RVC-beta-0528\RVC-beta0717\voice\me/myself.m4a->Suc.
      end preprocess C:\RVC-beta-0528\RVC-beta0717\voice\me/myself.m4a->Suc. end preprocess

  • @Tarbard
    @Tarbard 1 year ago +1

    Thanks for the videos, they are fascinating.

  • @shaysilver203
    @shaysilver203 1 year ago +1

    Great one! Finally works!

  • @RobertJene
    @RobertJene 1 year ago +75

    12:42
    1. Open File Explorer to the folder that has the file whose path you want
    2. Press Alt+D
    3. Press End
    4. Type a backslash \
    5. Start typing the name of the file, look for the autocomplete with the correct name, press down arrow until the correct file is highlighted
    6. Press Ctrl+C

    • @Optimus97
      @Optimus97 1 year ago +8

      Or you could Shift-Rightclick to unhide "Copy As Path" option

    • @RobertJene
      @RobertJene 1 year ago +1

      @@Optimus97 I prefer to use the mouse as little as possible

    • @fluffsquirrel
      @fluffsquirrel 1 year ago

      @@RobertJene I can kinda see what you're saying, especially with the delay of the context menu in Windows 10/11.

    • @RobertJene
      @RobertJene 1 year ago +1

      @@fluffsquirrel any keyboard sequence you do will save time not reaching for the mouse

    • @fluffsquirrel
      @fluffsquirrel 1 year ago +1

      @@RobertJene I think this is generally true, although the less sequences the better, if possible.

  • @RobertJene
    @RobertJene 1 year ago +5

    9:33 when I train embeddings for stable diffusion (image generation) I have it save an embedding file every 50 steps so I can check the loss and strength of them with scripts and test a few

    • @Jarods_Journey
      @Jarods_Journey 1 year ago +2

      I've been finding with these speech models that the intermediary saves don't really exhibit abilities better than the final model, so I really just save the last one only in order to save space. I haven't found one yet that has been overtrained.

  • @obamabinbiden9762
    @obamabinbiden9762 1 year ago +1

    This worked perfectly. Thank you.

  • @LucasMarak
    @LucasMarak 1 year ago +1

    RVC is best for me thanks Jarod take care

  • @gabrielmorgan3369
    @gabrielmorgan3369 11 months ago +3

    For those who are having trouble choosing where the download goes you can right click it and choose save link as

  • @animeui_es
    @animeui_es 1 year ago +3

    Great job! I have a question for you: how many audio clips do you recommend for generating the model, and is it a problem if the clips have some background sound?

    • @Jarods_Journey
      @Jarods_Journey 1 year ago +1

      10 minutes or more of high quality audio. You need to split the background from the audio samples and can check my latest video on that

  • @321Engage28
    @321Engage28 1 year ago +1

    It worked. Thanks so much!

  • @Snackbarry
    @Snackbarry 2 months ago

    damn as a complete beginner coming to this channel to have it being explained like this was really..... interesting....

  • @Nangel2
    @Nangel2 1 year ago +7

    Thank you for taking the time to make this tutorial! It was so easy to follow. :) Could I ask you to make a comment or tutorial on how to re-train a previously trained voice? I can't find that information anywhere.

    • @Jarods_Journey
      @Jarods_Journey 1 year ago +1

      Let me know if this was what you were thinking about: ruclips.net/user/shortseO0gvi_RXTc?feature=share

    • @Nangel2
      @Nangel2 1 year ago

      @@Jarods_Journey That's exactly what I was looking for, tysm!

  • @nycdweller4287
    @nycdweller4287 1 year ago +4

    Hi, thanks for your video. Are there already some pre-trained models for RVC? Also, is there a reason you prefer to train locally rather than on Colab?

    • @Jarods_Journey
      @Jarods_Journey 1 year ago +4

      I'm not sure about fully pre-trained models, you'll have to take a look around the internet to see. Colab is a nightmare to work with for debugging, etc and unless you made the code, trying to debug it isn't that fun. If I can work locally, I much prefer it and my hardware allows for it.

  • @warsin8641
    @warsin8641 1 year ago +1

    This absolute legend amongst men

  • @darksydeflow
    @darksydeflow 1 year ago +1

    niiiice thank you for the video :D

  • @Cyborg11
    @Cyborg11 1 year ago +3

    Thanks for your very good tutorial Jarod.
    I still have a question.
    What do the values "loss_disc", "loss_gen", "loss_fm", "loss_mel" and "loss_kl" mean when training? Which values are indicating a good trained model? Are lower values better?

    • @Jarods_Journey
      @Jarods_Journey 1 year ago +1

      A downward slope on the graph is better, or lower values. You wanna look for total loss and train till that's as low as possible, preferably.
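
To make the "downward slope" idea concrete, here is a small sketch that fits a straight line to a list of loss values and returns its slope. It is plain least squares on numbers you have already collected; the function name is hypothetical, and how you extract the values from the training log is left open because the log format is not shown here.

```python
def loss_slope(losses):
    """Least-squares slope of loss against step index; negative means falling."""
    n = len(losses)
    if n < 2:
        raise ValueError("need at least two loss values to fit a slope")
    mean_x = (n - 1) / 2
    mean_y = sum(losses) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(losses))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator
```

A clearly negative slope over recent values suggests the model is still improving, while a slope near zero suggests it has flattened out. Listening to the output remains the final check.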

  • @hariom2580
    @hariom2580 7 months ago +4

    I have successfully trained a voice, but there is no index file in the voice name folder. The .pth file is there in the weights folder. What should I do? Nice video.

  • @DPIConnor
    @DPIConnor 1 year ago +1

    oh my god this is so awesome

  • @HyperbolicArachnid
    @HyperbolicArachnid 10 months ago +2

    Finally, a tutorial that doesn't fly 5 miles over my head

  • @krysidian
    @krysidian 1 year ago +7

    That was very nice to follow along, thanks!
    Any interest in showcasing Bark AI? I think it's a pretty interesting way of doing TTS, but I don't think it's explained very well in many places, and a lot is left out, which kinda confused me, especially when it comes to getting decent results. I do think the prompting idea is really intriguing though.

    • @Jarods_Journey
      @Jarods_Journey 1 year ago +1

      My quick experience with bark is that it's still in very early stages, excited to see where it goes though! I might have to do a more thorough test of it, but tortoise tts by far is the most promising and easiest to use

    • @krysidian
      @krysidian 1 year ago

      @@Jarods_Journey That's definitely true. Tortoise is incredible! Really hope bark will update or get some cool successors with a similar but more stable approach. Making it generate laughs, sighs, etc. is spooky and very fun.

    • @Jarods_Journey
      @Jarods_Journey 1 year ago

      @@krysidian I'm definitely interested in the laughing part. That's one additional touch to AI that is lacking in voices and when that gets fleshed out, things are gonna get interesting xD!

  • @joemmaama
    @joemmaama 1 year ago +3

    For the voice me folder, is it just audio recordings of my own voice? if so how many do i need to include and what length? Thanks In advance youre a massive help dude

    • @Jarods_Journey
      @Jarods_Journey 1 year ago +1

      Yup, as shown, make sure the folder contains all of the audio files without subfolders. Then just use that path for those and you should be fine

  • @andivax
    @andivax 1 year ago

    Thank you very much! My Inferencing voice list is empty. Where do I put the downloaded voice models?
    And epochs: is it worth using 1000 epochs instead of 200 to increase the quality?

    • @Jarods_Journey
      @Jarods_Journey 1 year ago

      I believe downloaded voice models should go into the weights folder, as long as they're from RVC. As for epochs, if you get good results at 200, I don't see much reason to go to 1k. If you have enough voice samples, 200 should be relatively good. I would listen to them per 100 epochs and see what you think is best (as it's always dependent on your data and how much of it you have)

  • @SirMato
    @SirMato 11 months ago

    bro tysm my brain could not process how to do that on my own

  • @michaelteuber7362
    @michaelteuber7362 11 months ago +3

    Thanks a lot for the video! One question: 40kHz is a pretty unusual sample rate, so I want to use 48kHz (which now also seems to work with v2). Also, I slice up the training vocals manually with a DAW (Cubase) into snippets of up to 10 seconds. Do I have to export the snippets from the DAW in 48 kHz already, or would the usual 44.1 kHz be alright, with only the output (the resulting file) being in 48 kHz?

    • @Jarods_Journey
      @Jarods_Journey 11 months ago +1

      It'll be fine, I believe RVC already resamples your audio to the correct SR using ffmpeg. I actually haven't verified this, but since it handles my datasets when using either 40k or 48k, it doesn't really seem to matter :)

    • @michaelteuber7362
      @michaelteuber7362 11 months ago +3

      @@Jarods_Journey Thanks for your fast reply! So there's a tiny bit of hope that if you feed it 48kHz already it might skip the resampling, which could result in higher-quality output 🙂.
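
If you would rather control the resampling yourself than rely on whatever RVC does internally (which is unverified in this thread), a clip can be converted before training. This is a minimal sketch, assuming `ffmpeg` is installed and on the PATH; `-ar` is ffmpeg's audio sample rate option and `-y` overwrites an existing output file. The helper names are hypothetical.

```python
import subprocess


def resample_command(src, dst, sample_rate=48000):
    """Build the ffmpeg argument list that converts src to the given sample rate."""
    return ["ffmpeg", "-y", "-i", str(src), "-ar", str(sample_rate), str(dst)]


def resample(src, dst, sample_rate=48000):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(resample_command(src, dst, sample_rate), check=True)
```

For example, `resample("snippet_44k.wav", "snippet_48k.wav")` would write a 48 kHz copy. Passing the arguments as a list also keeps file names with spaces intact.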

  • @321Engage28
    @321Engage28 1 year ago +4

    Great tutorial! Unfortunately, I seem to be having a problem with step 2a: My attempt to process the data was unsuccessful, and the output message came up blank! What am I doing wrong?

    • @Yumegipsu
      @Yumegipsu 1 year ago +1

      This happened to me too but it worked when I removed spaces from my experiment name. If it's not that then idk
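
Since spaces in the experiment name or the dataset path come up repeatedly in these comments, a quick pre-flight check can save a failed run. The sketch below is a hypothetical helper, not part of RVC; it only reports whitespace and makes no claim about other causes of a blank output.

```python
def find_whitespace_problems(experiment_name, dataset_path):
    """Return human-readable warnings for whitespace in the name or path."""
    problems = []
    if any(ch.isspace() for ch in experiment_name):
        problems.append(f"experiment name contains whitespace: {experiment_name!r}")
    if any(ch.isspace() for ch in str(dataset_path)):
        problems.append(f"dataset path contains whitespace: {str(dataset_path)!r}")
    return problems
```

An empty list means neither value contains spaces, tabs, or line breaks.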

  • @djsaquib
    @djsaquib 1 year ago

    While making a dataset, if I am taking vocals from a singer, do I have to keep the key of the vocals the same? Or can I add multiple audio files from different songs to train a model of a particular singer?

    • @Jarods_Journey
      @Jarods_Journey 1 year ago +1

      As long as its the same singer, you can add as many songs from them as you like

    • @djsaquib
      @djsaquib 1 year ago

      @@Jarods_Journey thank you for clarifying 🙏🏻

  • @M4rt1nX
    @M4rt1nX 1 year ago +1

    Those high notes though!!!
    We love local!!!!!!!!!!!!!!

    • @Jarods_Journey
      @Jarods_Journey 1 year ago

      Haha I wishhhhh xD. Local installation yields less issues, and is much easier to debug lol.

  • @denblindedjaligator5300
    @denblindedjaligator5300 1 year ago +5

    i have some questions. When I download other people's voice modules, there is a file called something like traint.index, it's a file you have to use. The same goes for total_fea. I have also seen that there are pth files in the log folder itself.

    • @Jarods_Journey
      @Jarods_Journey 1 year ago +2

      These should go into the log folder underneath the "experiment" or "speaker" name that you want to use. So if the name is john, the john.pth goes into weights and the index goes into the log/ where you have to create a john directory and place the index into.

    • @denblindedjaligator5300
      @denblindedjaligator5300 1 year ago +1

      @@Jarods_Journey But I mean the traint.index. And why is there a model in the log folder and a detail file?

  • @shep9194
    @shep9194 1 year ago +3

    Have you tried the realtime voice changing? Ive been trying to get that working but had some issues, i think its an svc fork though

    • @Jarods_Journey
      @Jarods_Journey 1 year ago

      Have not gotten to try that yet on either repos unfortunately :/

  • @androidgameplays4every13
    @androidgameplays4every13 1 year ago +1

    Thank you, thanks to your tutorial I finally succeed at creating my own models! even with only 4gb of memory in my gtx 1650 Super.

    • @Jarods_Journey
      @Jarods_Journey 1 year ago +1

      Awesome! They do say that it can work on smaller amounts of VRAM so glad that this worked!

    • @schoodst6095
      @schoodst6095 1 year ago

      how did you get it to work on low ram? mine is eating up 6gb really quick and shuts down cause it run out, do I lower batch size?

    • @titrecords2294
      @titrecords2294 1 year ago

      Mine keeps running out of memory how did you do it? Please help

    • @schoodst6095
      @schoodst6095 1 year ago

      @@titrecords2294 lower the batch size, like a lot

    • @MohamedAdel-kw4hx
      @MohamedAdel-kw4hx 11 months ago

      Thx , but I can't find pth file after training.

  • @AImusikindo
    @AImusikindo 1 year ago +1

    Thanks bro, from Australia

    • @Jarods_Journey
      @Jarods_Journey 1 year ago +1

      That's awesome, appreciate it!

    • @AImusikindo
      @AImusikindo 1 year ago

      @@Jarods_Journey i just made some cup of coffee for you lol

    • @Jarods_Journey
      @Jarods_Journey 1 year ago

      @@AImusikindo Haha thank you, each coffee keeps me going! 🤟🤟

  • @Retro-zn2jt
    @Retro-zn2jt 1 year ago +3

    Thanks for your video. There are a few other videos on the subject and I find that yours is better explained; nevertheless, I still have to deal with several errors. First I had "CUDA out of memory", so I lowered the batch size to the minimum. Now I have another error: "RuntimeError: GET was unable to find an engine to execute this computation". My audio samples are a bit long (a few minutes) and they are in 32-bit float at 44.1 kHz, but I only have 4 samples...
    Should I divide them into several parts? Thanks in advance.
    Edit 1: I tried many times, and also tried cutting them into different parts and reducing the size, and I still get the RuntimeError even with 2 small samples (16-bit, 44.1 kHz) of less than 10 seconds… I don't understand.
    Edit 2: Also, I wonder if you know how to do text-to-speech with this tool?

    • @Jarods_Journey
      @Jarods_Journey 1 year ago +1

      You might have to reinstall, or make sure the CUDA version being installed is compatible with your GPU

    • @necrovolo
      @necrovolo 8 months ago

      I'm having the same issue.

  • @lalalala99661
    @lalalala99661 7 months ago +2

    Quick tip at 13:03: you can Shift + right-click, then another menu pops up, and you can click on copy path in that menu

  • @ericleigh007
    @ericleigh007 7 months ago +2

    if you want to move the folder faster, just rename the top folder, then cut and paste the lower into the top-level. When you cut and paste the contents, explorer knows it only MOVES the folder, so no copy wait.

  • @EthanWinters176
    @EthanWinters176 11 months ago +4

    If you can read this:
    .pth files go in the folder "weights"
    .index and others go to "logs" under the voice name ex: Logs\EthanWinters
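
The file placement described above can be automated. This is a minimal sketch with a hypothetical helper, `install_model`, which copies `.pth` files into the weights folder and everything else into `logs/<voice name>`. The `weights_subdir` parameter is an assumption added because another commenter reports that, as of RVC1006, the weights folder sits inside `assets`.

```python
import shutil
from pathlib import Path


def install_model(download_dir, rvc_root, voice_name, weights_subdir="weights"):
    """Copy a downloaded voice model into an RVC folder; return the new paths."""
    download_dir = Path(download_dir)
    rvc_root = Path(rvc_root)
    weights_dir = rvc_root / weights_subdir
    logs_dir = rvc_root / "logs" / voice_name
    weights_dir.mkdir(parents=True, exist_ok=True)
    logs_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for item in sorted(download_dir.iterdir()):
        if not item.is_file():
            continue
        # .pth files are the model weights; .index and the rest go to logs.
        target_dir = weights_dir if item.suffix.lower() == ".pth" else logs_dir
        copied.append(Path(shutil.copy2(item, target_dir)))
    return copied
```

On a newer release you would call it with `weights_subdir="assets/weights"`. After copying, the refresh button in the WebUI should pick up the new voice.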

  • @CamelliaWings07
    @CamelliaWings07 11 months ago +2

    Thank you. This is the best explanation video I've ever seen on YT. Very clear ☺ I successfully made it because of your detailed content! (I failed many times before Kkkkk)

  • @djdocq8963
    @djdocq8963 8 months ago +2

    @13:28 you pick a v2 index file, but my drop-down box only has 3 different v1 files to choose from. It doesn't seem to create an index file when I train my voice.

  • @SNYCHANNEL
    @SNYCHANNEL 1 year ago +4

    Thank you for this video!!
    When i trying to train i get this error:
    sr = int(sys.argv[2])
    ValueError: invalid literal for int() with base 10: 'Yona\\Desktop\\RVC-beta\\RVC-beta-v2-0528\\voice\\Me'
    Do you know what I'm doing wrong?

    • @Jarods_Journey
      @Jarods_Journey 1 year ago

      RVC: Invalid Literal or File Not Found error
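
The traceback in this thread (`sr = int(sys.argv[2])` receiving a piece of a folder path) is consistent with a command line that was split on a space inside the path. The sketch below reproduces that effect. It assumes, without confirmation from the RVC source, that the command is assembled as one unquoted string; `simulated_argv` and `parse_sample_rate` are hypothetical names used only for illustration.

```python
def simulated_argv(script, input_dir, sample_rate):
    """Mimic an unquoted command line being split on whitespace (assumed behaviour)."""
    command = f"{script} {input_dir} {sample_rate}"  # no quoting around input_dir
    return command.split()


def parse_sample_rate(argv):
    """Same shape as the failing line in the traceback: int(sys.argv[2])."""
    return int(argv[2])
```

With `C:\RVC\voice\Me` the third argument is `40000` and parsing succeeds. With `C:\Users\Jane Doe\voice\Me` the path breaks in two, the third argument becomes `Doe\voice\Me`, and `int()` raises the same kind of `ValueError` reported above. That is why removing spaces from the path fixes it.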

  • @Beary_TheBear
    @Beary_TheBear 1 year ago +3

    Hi, thanks for the tutorial. I got stuck at the training process. I received a message saying this:
    RuntimeError: The expanded size of the tensor (12800) must match the existing size (4040) at non-singleton dimension 1. Target sizes: [1, 12800]. Tensor sizes: [4040]
    Before I got this message, I was getting the "Cuda out of memory", even though I have 32GB of RAM. I cut the audio samples into smaller bits under 10 seconds, and now I have the expanded size of the tensor error. What did I do wrong?

    • @gabrielmorgan3369
      @gabrielmorgan3369 11 months ago

      same issue

    • @gabrielmorgan3369
      @gabrielmorgan3369 11 months ago

      It means that if it finishes it's going to take up too much space, so just turn the batch size down to fix it

    • @MrSix-1
      @MrSix-1 5 months ago

      CUDA memory is VRAM. It's different from regular RAM.

  • @trubyart6193
    @trubyart6193 11 months ago +2

    im having a lot of trouble... opening the go-web file doesnt show the language option, and then has lots of stuff and at the end says to press any button to continue. After i do that it closes, and when i searched up the localhost:7897 it says i cant reach the page..

  • @welachutmelexcel
    @welachutmelexcel 8 months ago +2

    Since i’m relatively new to this, how would you use this rvc for just cloning a voice? Do I just leave out the parts in model inference about the pitch and music related things?

  • @scedolin
    @scedolin 1 year ago +4

    Thanks for this good tutorial.
    Unfortunately I got an error after 2 s and I don't understand what I did wrong.
    if data.dtype in [np.float64, np.float32, np.float16]:
    AttributeError: 'NoneType' object has no attribute 'dtype'

    • @Jarods_Journey
      @Jarods_Journey 1 year ago +1

      Another commenter had this issue but I haven't encountered it yet and haven't found a way to reproduce it. You might be able to find others who are looking to get this issue resolved here:
      github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues?q=is%3Aissue+AttributeError%3A+%27NoneType%27+object+has+no+attribute+%27dtype%27+is%3Aopen
      Could be related to the training process, trying to find files, etc

    • @scedolin
      @scedolin 1 year ago +1

      @@Jarods_Journey I applied the advice from your good short, File Not Found: feature_768, and I managed to avoid this error. Thanks a lot. I started following your channel in the last few days, and your subjects are very interesting. Great job.

  • @KennaLovesGouda
    @KennaLovesGouda 11 months ago +1

    8:11 it will not process data. It starts but then stops and there is an orange line around the output any reason why?

  • @VongolaChouko
    @VongolaChouko 8 months ago +1

    Is 1000 epochs overkill? Will it have diminishing returns compared to just keeping it up to 300? I really don't see a standard recommended epoch total anywhere, the answer varies. I usually use 500, but I honestly don't know if that's fine since I just use RVC for SillyTavern and haven't tried it just on itself yet, hence I don't know how to evaluate if the results are better or not .___.

  • @enricopileggi7909
    @enricopileggi7909 9 months ago +1

    How can I solve the error message " Unfortunately, there is no compatible GPU available to support your training" in step 2b? (My GPU is MX250). Thank you

  • @samphelps856
    @samphelps856 1 year ago

    Thank you

  • @fsForward
    @fsForward 3 months ago

    I *love* that he says "Don't trust me blindly", good! But now I do trust you blindly😂

  • @aatkins2002
    @aatkins2002 8 months ago

    The program after a while replaces certain fields, usually the big buttons and their output fields with "Error" and a popup appears in the top right saying "Connection Errored Out"
    The command console isn't reporting anything unusual, but when I tried to proceed like nothing was wrong, one click training didn't seem to react well.
    What's causing this "Connection Errored Out" message?

  • @Ulibert
    @Ulibert 10 months ago

    hello pls help I got stuck with this message ValueError: invalid literal for int() with base 10: 'Music'

    • @Jarods_Journey
      @Jarods_Journey 10 months ago

      Check to make sure there are no spaces in your path

  • @Winterbliss-sg7qg
    @Winterbliss-sg7qg 1 year ago

    Keep putting more tutorials!!!!

  • @LillianGreenHiLilly
    @LillianGreenHiLilly 11 months ago +1

    Jarods Journey Why cant we just upload for example an existing split song file from inside the folder that is just the singing voice with no music. Also Why copy and paste the whole address? Please answer, because I dont usually get a response when i ask a simple question.

  • @POPMAGStudios
    @POPMAGStudios 1 year ago

    i put all the settings and used one-click training and i got:
    "added_IVF985_Flat_nprobe_1_IbrahemHefny_v2.index
    All processes have been completed!"
    but i couldnt find the model from the "Inferencing voice" slider even though i found the voice model in the log file
    please help

  • @el-bicente
    @el-bicente 9 months ago

    Thank you for this great tutorial! I was wondering if there was any tool to separate the vocals when they are different singers, because I want to apply several models. I can get clean vocals with UVR5, but I don't know what to do next. I tried to use whisperX but I think it's not really suitable for singing and overlapping voices...

  • @EricNoneless
    @EricNoneless 7 months ago +1

    When I click in the bat file it says it was not possible to find the determined path so when I try to search the localhost on the web it gives an error...

  • @todhold2673
    @todhold2673 10 months ago +2

    Am i missing something? Did he go over how to add the newly converted vocals back to the instrumental?

  • @RuneLightLovely
    @RuneLightLovely 5 days ago +1

    Why the "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 4: invalid start byte" error appeared in cmd.exe after I clicked "One-click training"?
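
One general fact helps narrow this error down: the byte 0xC0 can never begin a valid UTF-8 sequence, so the data being read was not UTF-8 to start with. It is, for example, a single accented or Cyrillic letter in legacy Windows code pages. Whether the bytes come from a folder name, a file list, or console output in this case is not known. The sketch below reproduces the message and shows a fallback decode; the helper name and the `latin-1` fallback are assumptions for illustration, not RVC behaviour.

```python
def decode_with_fallback(raw, fallback="latin-1"):
    """Try UTF-8 first; on failure decode with a single-byte fallback encoding."""
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # latin-1 maps every byte to a character, so this step cannot fail.
        return raw.decode(fallback), fallback
```

For instance, `b"logs\xc0".decode("utf-8")` fails with "invalid start byte" at position 4, the same shape as the error quoted above. A practical thing to try is keeping the RVC folder, experiment name, and audio file names to plain ASCII characters.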

  • @AI_arab_world_maroc
    @AI_arab_world_maroc 1 year ago +1

    Hi Jarod, should it train less with V2 48k , what is the best combination to train a model when it comes to V1 , V2 , 40k 48k? Thank you

  • @TheBlueRage
    @TheBlueRage 3 months ago

    Thanks. I just have to make voice samples. I guess I am supposed to sing something. Is that correct? The UVR software works great. I was able to stem Suno AI. I'm also looking at Jen music AI and Lalal AI, which uses celebrities. This was more intense than I expected. I see that Mac has a download app. I just found an app on the Google app store. I will look through your other videos for more lessons. Thanks.

  • @Noor.alhayat
    @Noor.alhayat 1 year ago +1

    I received this:
    Unfortunately, there is no compatible GPU available to support your training.
    How can I solve this problem?

  • @givehead
    @givehead 9 months ago +1

    i don't see a pitch extraction algorithm, everything else is there. Any solutions?

  • @Liphisbus
    @Liphisbus 1 year ago

    Hello again, I deleted my other comment where I said I had problems with memory (I had 4GB of RAM but was facing problems using it).
    I asked a friend who has 8GB of RAM to train a model for me instead, and he hit the same CUDA memory error. Do you happen to know why this happens? Hope that's not an already answered question; if it is, I must be blind. Thanks again!

    • @Jarods_Journey
      @Jarods_Journey 1 year ago

      Your audio files are probably a bit too long. Check out this short here: ruclips.net/user/shortsuo6umRWVVTE?feature=share
      Hopefully that helps and let me know if you have any other questions!

  • @Jefersen
    @Jefersen 1 year ago

    Hello, thank you so much for the tutorial, everything worked fine except the last button: When i click one click training i get all this messages for every file: mp3_10.wav->Suc. and then it just stopps, nothing happens any more. any suggestions ?

  • @Xivlex
    @Xivlex 10 months ago +2

    Hello thanks for the video. It piqued my curiosity and now I want to try RVC myself. Unfortunately, I'm running an AMD GPU (6800xt) but upon checking the releases an option for AMD users is present in updated0814v2. My problem now is that when I try to follow your steps, RVC does not detect my GPU. For example, at step 2b as in 8:21, the options to select a GPU are not present. The option to input a GPU index is there and I've tried putting in "0", "1", "2" and "0-1-2" but when pressing one-click training it says: "NO GPU DETECTED: falling back to CPU - this may take a while" Do you know a way for it to detect my GPU?

    • @Jarods_Journey
      @Jarods_Journey 10 months ago

      I'm not too sure unfortunately, you might have to check on their githubs issue area to see if anyone else is running into it.

  • @tachankafreeman3442
    @tachankafreeman3442 10 months ago

    For me, for some reason it takes very long for the files to export, and when it's done, in my version a small window just appears for a glimpse and closes, as if it can't open. I don't know how to fix that.

  • @rae8379
    @rae8379 5 months ago

    Thanks for sharing. But now I run into a problem. Could I just use pretrained models instead of training models myself? But on RVC WebUI, I couldn't figure out how.

  • @ShyGun78
    @ShyGun78 10 months ago

    "I reach the final stages and move into the inference step after completing the training process. However, the model I have created doesn't appear in the Inferencing voice section and there are no options or indications - it's like a blank state. I have followed all the correct steps and the issue remains unresolved, which seems to be a problem that many individuals are encountering. Please help me out with this. :)"

  • @RuneLightLovely
    @RuneLightLovely 5 days ago +1

    Why does "Unfortunately, there is no compatible GPU available to support your training." appear in "GPU Information"?

  • @TheAimax
    @TheAimax 1 year ago

    Following your advice to use RVC, I have a question: if I stop training at 250 epochs and later want to continue to 500, should I set total training epochs to 250 (so they are added to the 250 I already have) or to 500? I know it may be a silly question, but it really came up for me. Thanks for the attention you give to each comment.

    • @Jarods_Journey
      @Jarods_Journey  1 year ago +1

      Gotta do 500. These models store checkpoints, so if you did 250 (assuming you didn't delete them), it'll start up from the checkpoint.
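
The answer above treats "total training epochs" as a target rather than an increment. A minimal sketch of that arithmetic follows; `epochs_to_run` is a hypothetical helper for illustration and is not taken from RVC's code.

```python
def epochs_to_run(total_epochs, last_checkpoint_epoch):
    """Return the epochs still to be trained when resuming from a checkpoint.

    total_epochs is the overall target; last_checkpoint_epoch is the last
    epoch already saved (0 when starting from scratch).
    """
    return range(last_checkpoint_epoch + 1, total_epochs + 1)


# Resuming after 250 epochs with a target of 500 leaves 250 epochs to run.
remaining = epochs_to_run(500, 250)

# Entering 250 again as the target leaves nothing to run.
nothing_left = epochs_to_run(250, 250)
```

Under this reading, setting the field to 250 a second time would finish immediately, which is why 500 is the value to enter.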

  • @Odyssey_ACNH
    @Odyssey_ACNH 1 year ago +2

    "The expanded size of the tensor (12800) must match the existing size (0) at non-singleton dimension 1."
    Anyone know a fix to this problem?

    • @SunPrime_Nexus
      @SunPrime_Nexus 10 months ago

      Yeah bro, I recently had this problem. It most likely happens because you are trying to train with data that is different from the data the model originally started training on. To solve it, create a new model by renaming the experiment, load the new data you want to use at step 2a, and process the data. Then go to step 2b and extract the features as normal. Next, go to the drive where you keep your RVC files, open logs, find the folder of the old model, copy all the files labeled D and G, and paste them into the folder of your new model. Once all of this is done, you can train as normal and everything should work.
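
The copy step described in this reply can also be scripted. This is a rough sketch only: the patterns `D_*.pth` and `G_*.pth` and the folder names in the usage note are assumptions about how the "D and G" files are named, so check your own logs folder first.

```python
import shutil
from pathlib import Path


def copy_checkpoints(old_dir, new_dir, patterns=("D_*.pth", "G_*.pth")):
    """Copy files matching the given patterns from old_dir into new_dir.

    The default patterns are an assumed naming scheme for the D and G
    checkpoint files. Returns the names of the files that were copied.
    """
    old_dir, new_dir = Path(old_dir), Path(new_dir)
    new_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in patterns:
        for src in sorted(old_dir.glob(pattern)):
            shutil.copy2(src, new_dir / src.name)
            copied.append(src.name)
    return copied
```

For example, `copy_checkpoints("logs/old_model", "logs/new_model")`, where both folder names are hypothetical.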

  • @BierGarten100
    @BierGarten100 6 months ago

    I don't have GPU training; I can only use the CPU. But when I do all the steps, the weights folder is empty and my .pth doesn't exist. :(

  • @looooool3145
    @looooool3145 4 months ago

    Hey man, thanks for the tutorial. I was wondering how to match the key of the instrumental to the output voice? I converted a male song to a female cover, but I don't know how to change the instrumental pitch to match with the female voice.

  • @DreamboyyHD
    @DreamboyyHD 1 year ago

    When I use Vocals/Accompaniment it always shows this message: "clean_empty_cache". The folder has only one mp3 in it. (I tried moving it to another drive and tried making a new one, and it still doesn't work.) Or did I do something wrong?

  • @philerasmus
    @philerasmus 8 months ago

    Excellent tutorial. Running the gui I have found that the inference does use the GPU but the Vocal extraction task just relies on CPU. Is there a solution? Thanks

  • @InfinityPiano
    @InfinityPiano 1 year ago

    What do I do when it just says CUDA out of memory?
    CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 6.00 GiB total capacity; 5.12 GiB already allocated; 0 bytes free; 5.28 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
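
The error text itself points at `max_split_size_mb` and `PYTORCH_CUDA_ALLOC_CONF`. A minimal sketch of setting that option from Python is below. The value 128 is an arbitrary illustration, and where this line would go in an RVC install is an assumption; it only has an effect if it runs before PyTorch initializes CUDA, and it addresses fragmentation rather than a GPU that is simply too small.

```python
import os

# Configure PyTorch's CUDA caching allocator via its documented environment
# variable. This must run before PyTorch initializes CUDA to take effect.
# 128 is an illustrative value, not a recommendation from the video.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
```

The same variable can be set in the shell before launching the web UI instead of in code.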

  • @luciovids9208
    @luciovids9208 10 months ago

    Thanks for the great tutorial Jarod. In my case it seems that "One click training" doesn't work very well. The epochs are created at a very slow pace (2.5 hours each) but if I click "Train Model" from the outset, then it works perfectly (40 seconds per epoch).

    • @Jarods_Journey
      @Jarods_Journey  10 months ago

      Interesting, well glad that "train model" works. That is what I use nowadays as well

  • @DragunnitumGaming
    @DragunnitumGaming 1 year ago +1

    Finally something I can understand! Thank you!! Holy hell, the other tutorials felt like I had to hack NASA just to make a parody AI song cover.

  • @gummywormee41
    @gummywormee41 10 months ago

    Hello! One issue I've been having is that it says that it cannot find an NVIDIA GPU and to use the CPU instead, but then it says that there's no GPU to support training. Do you know what would be a good solution to this?

  • @Tpizzle1313
    @Tpizzle1313 1 year ago

    Any solution for this error while extracting features?
    line 13, in
    version = sys.argv[6]
    IndexError: list index out of range
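
For context, `sys.argv[6]` raises this `IndexError` whenever the script is started with fewer than six command-line arguments after the script name. The sketch below shows that behaviour with a defensive lookup; `get_arg` and the fallback value are hypothetical and say nothing about what RVC's own script expects in that position.

```python
import sys


def get_arg(argv, index, default=None):
    """Return argv[index], or default when too few arguments were passed."""
    return argv[index] if len(argv) > index else default


# With only two entries, argv[6] does not exist, so the default is returned
# instead of raising IndexError.
short_argv = ["script.py", "some_value"]
version = get_arg(short_argv, 6, default="fallback")

# The same lookup against the real command line of the current process.
real_version = get_arg(sys.argv, 6)
```

If the caller and the script come from different releases, the argument counts can disagree, which is one plausible (unconfirmed) cause of the error above.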

  • @alperengulen6963
    @alperengulen6963 11 months ago +1

    How can I change the localhost site to English? (It opens in Turkish for me.)

  • @mrdeadmemes
    @mrdeadmemes 1 year ago

    I've managed to go through the entire data training process, but I get an error at the very end when it attempts to create the file in weights. It's an 'unexpected pos' error ("unexpected pos [long number] vs [long number]"). I'm not sure how to fix it.
    I trained for 5 epochs instead of 200, and it created a .pth file (which was found in the model inference section), so this only seems to happen at higher epoch values. I'm not sure why.

    • @Jarods_Journey
      @Jarods_Journey  1 year ago

      Hmm, I'm not too sure why this might be happening. I can imagine that maybe something got corrupted or messed up somewhere along the line, causing the position to be wrong.
      Have you tried training a new model with all new folders?

  • @zakizahid7375
    @zakizahid7375 1 year ago

    I'm currently stuck at the process stage, where it says ValueError: invalid literal for int() with base 10: 'changer\\song'. How do I fix this?
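
The value in that message looks like the tail of a folder path, which would fit a path containing a space being split into two arguments, so that a piece of the path lands where a number is expected. This is a guess, not a confirmed diagnosis. The sketch below reproduces the same error with a made-up path and a hypothetical argument layout.

```python
def parse_args(args):
    """Hypothetical argument layout: <input_dir> <sample_rate>."""
    input_dir = args[0]
    sample_rate = int(args[1])  # fails if a piece of the path lands here
    return input_dir, sample_rate


def try_parse(args):
    """Return the parsed tuple, or the error text if parsing fails."""
    try:
        return parse_args(args)
    except ValueError as exc:
        return str(exc)


# Splitting on whitespace breaks the made-up path "C:\voice changer\song"
# in two, so the second argument is 'changer\\song' instead of the number.
broken = try_parse(r"C:\voice changer\song 40000".split())

# Passing the path as one argument (or using a path without spaces) works.
fine = try_parse([r"C:\voice changer\song", "40000"])
```

If this is the cause, moving the dataset to a folder whose full path has no spaces would be the simplest thing to try.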

  • @denblindedjaligator5300
    @denblindedjaligator5300 1 year ago

    I have to set my GPU index to 0 and then it works. After training my model I cannot find it, only wav files. Can I send you my project folder?

  • @ultimamage3
    @ultimamage3 7 months ago +1

    Thank you for the video, it's really informative, but I have an issue: when training the voice it doesn't generate ".pth" files in the weights folder. Any way to fix that?

    • @pingusmcdingus5124
      @pingusmcdingus5124 7 months ago +1

      The checkpoints are under logs\[YourModelName]; however, if you copy them to assets\weights it won't load them properly, so ¯\_(ツ)_/¯.

  • @BillMill
    @BillMill 9 months ago

    On some machines selecting 40 batch size makes training extremely slow. It was my case too (running 4090). I reduced it to 30 and it started flying.

  • @VaibhavShewale
    @VaibhavShewale 10 months ago +1

    why this one is good

  • @ifoundrandomevents5240
    @ifoundrandomevents5240 1 year ago +5

    I got something like this:
    result = torch._C._nn.leaky_relu(input, negative_slope)
    torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 10.00 MiB (GPU 0; 6.00 GiB total capacity; 5.27 GiB already allocated; 0 bytes free; 5.31 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
    What does it mean? How can I fix it?

    • @frontified
      @frontified 1 year ago

      Man, same. I'm facing this same problem and I have no idea how to fix it.

    • @gabrielmorgan3369
      @gabrielmorgan3369 11 months ago

      same

    • @gabrielmorgan3369
      @gabrielmorgan3369 11 months ago +3

      It means that if it keeps going it will take up too much GPU memory, so just turn the batch size down to fix it.
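
The "turn the batch size down" advice can be pictured as a retry loop that halves the batch size each time memory runs out. This is a generic sketch, not how the RVC web UI behaves: `run_with_smaller_batches` and `train_fn` are hypothetical, and `MemoryError` stands in for the GPU error, so with PyTorch you would pass the CUDA out-of-memory exception class as `oom_error` instead.

```python
def run_with_smaller_batches(train_fn, batch_size, min_batch_size=1,
                             oom_error=MemoryError):
    """Call train_fn(batch_size), halving batch_size on each memory error.

    Returns (batch_size_that_worked, result). Re-raises oom_error if even
    min_batch_size does not fit.
    """
    while batch_size >= min_batch_size:
        try:
            return batch_size, train_fn(batch_size)
        except oom_error:
            batch_size //= 2  # try again with half the batch
    raise oom_error("out of memory even at the minimum batch size")
```

If even the smallest batch fails, as the next reply reports, batch size alone is not the limiting factor and something else (for example very long audio clips) is using the memory.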

    • @ifoundrandomevents5240
      @ifoundrandomevents5240 11 months ago

      @gabrielmorgan3369 I turned it down to literally 4 bro, it still ain't working.

    • @gabrilapin
      @gabrilapin 11 months ago

      Same, help please!

  • @MistahJ100
    @MistahJ100 11 months ago

    CAN SOMEONE PLEASE HELP ME, I am trying to train more audio, but when I try to get back into the Google Colab, it says it cannot find the model falss or some nonsense. I did not change anything on my drive; it's all where it should be, but I cannot get back in. I would like to train more voices and I don't understand why it won't work. This happens when I click on the web cell.