Complete Guide: AI Voice Training with So-Vits-SVC - Part 1: Google Collab

  • Published: Aug 23, 2024
  • Voice Recorder - github.com/Jar...
    Audiosplitter - github.com/Jar...
    so-vits-svc-fork - github.com/voi...
    UVR - github.com/Anj...
    Download Python - • How to Install Python,...
    Come join The Learning Journey!
    Discord - / discord
    Github - github.com/Jar...
    TikTok - / jarodsjourney
    If you found anything helpful, please consider supporting me and the content I am trying to produce!
    www.buymeacoff...
    Hardware for my PC:
    Graphics Card - amzn.to/3pcREux
    CPU - amzn.to/43O66Ir
    Cooler - amzn.to/3p98TwX
    RAM - amzn.to/3NBAsIq
    SSD Storage - amzn.to/42NgMFR
    Power Supply (PSU) - amzn.to/3NBAsIq
    PC Case - amzn.to/447499T
    Mother Board - amzn.to/3CziMXI
    Alternative prebuilds:
    Corsair Vengeance i7400 - amzn.to/3p64r22
    MSI MPG Velox - amzn.to/42MnJHl
    Cheapest and minimum specs recommended:
    Cyberpower 3060 - amzn.to/3XjtZoP

Comments • 358

  • @Jarods_Journey 1 year ago +11

    As of 07/18/23, there's an issue with the software:
    github.com/voicepaw/so-vits-svc-fork/issues/837

    • @Tstormer 1 year ago +1

      Building wheel for pyworld (pyproject.toml) ... error
      ERROR: Failed building wheel for pyworld
      Building wheel for tqdm-joblib (setup.py) ... done
      Created wheel for tqdm-joblib: filename=tqdm_joblib-0.0.3-py3-none-any.whl size=1631 sha256=c1e4112b1cc3303c89765fc01f30d8fa5c916c8ad143972ab87e9640c7a758ee
      Stored in directory: /root/.cache/pip/wheels/68/bc/33/47a70346aa7f8953ca185e3485dc2b5b5dc5f233b4d6c6e8f1
      Successfully built tqdm-joblib
      Failed to build pyworld
      ERROR: Could not build wheels for pyworld, which is required to install pyproject.toml-based projects

    • @ELQapso 11 months ago

      Is it fixed yet? Mine is not working on any version.

  • @TheChipMcDonald 1 year ago +29

    Your guide has been the best of what I've seen so far. What I'd like to see in a video:
    1) a little more detail about how you manage stopping/continuing training and managing Colab (given there seem to be limits on how much time you're allotted, and on idle time);
    2) what the files are (why is there a G and a D?) and the directory structure, relative to the dialogue in the GitHub script;
    3) how to migrate trained files to another Jupyter service, and what to save/download to back up trained data;
    4) *** how to set up locally *** - in methodical detail: what files you need from GitHub and where to put them, for a complete Python noob;
    5) what to look for in a graphics card: does it have to be NVIDIA; what are the minimums; do CPU/mobo memory matter; does CUDA/NVIDIA branding matter;
    6) prospects for so-vits on a Mac M1.
    You may or may not already know these things, but I'm guessing you may be motivated to research them for video purposes, so I'm maybe outsourcing researching to you. 🙂

    • @Jarods_Journey 1 year ago +7

      The local training of it is out on part 2 if you are looking to run some local stuff, but all good points! I'll be creating (hopefully soon) a Q&A of the most frequently asked questions etc for sovits so appreciate this 🤟

    • @hoihoi-san 1 year ago

      @Jarods_Journey If each of the Harvard Sentences is individually recorded to a separate .wav, will silences at the beginning and end of a recording significantly impact the training? Also, if the Harvard Sentences are recorded in blocks of ten and all of them are saved in a single folder, will the audiosegmenter process every recording in the folder?

  • @SirHolmes 1 year ago +4

    Love the way you explain complicated things with simple language in depth, thank you very much!

  • @Justin-ul4tt 1 year ago +2

    I'm literally stuck at the last step, "user trained model": it gives a name error, "vocals is not defined".

    • @SHARKNADO_2006 4 months ago +1

      same, don't know what to do about it.

  • @k.k9206 1 year ago +3

    You really covered all the bases here, This was a great video! I subbed.

  • @Jagan__7 1 year ago +1

    Bro this is just lit man !
    Love the content ❤!

  • @g0ldazu 1 year ago +7

    This guide was truly such a lifesaver, you explained each and every step so clearly and I really appreciate that you gave me a better idea of how everything works, too. Thank you so much. ⸜(。˃ ᵕ ˂ )⸝

  • @Zenmelo 1 year ago +2

    How do I fix "/bin/bash: svc: command not found" when I try to run the automatic preprocessing?

  • @BR-ud4jo 1 year ago +2

    Thx for this Jarod, really appreciate the detail. I've got two "issues" that occurred, but I'm still running the training, so we'll see how it turns out. 1. Step 6 preprocessing gave me this: "Preprocessing: 75% 73/97 [00:37

    • @Jarods_Journey 1 year ago +2

      Interesting... The first one is just a warning; I'm not actually sure what it indicates. If the code is training, then it's working 😅!
      The permissions part is odd, because it should be using your Google account. You can try copying the Colab notebook over to your own Google Drive and then running it from there to see if that resolves it.
      Lmk!

  • @RobertJene 6 months ago

    18:48 OK I see what log_interval does now, thanks.
    I watched the 2nd video first, because I'm not interested in Google Colab,
    but in the 2nd video you also change the eval_interval
    what does eval_interval do? Because you change it in the 2nd video
    28:09 eval_interval divided by amount of steps gets your evaluation model? what? and where do I find the number of steps?

  • @Jxki69 1 year ago +19

    i cringed at myself after i tried replacing my voice to some popular song 😭😂

    • @Jarods_Journey 1 year ago +7

      I feel ya 😅 it's actually a thing though that we tend to not like the sound of our own voice

  • @user-zl6wb9lb2e 1 year ago +2

    Error at "Automatic preprocessing": I get the error "/bin/bash: svc: command not found"!

  • @jmas679 11 months ago +2

    When I press train, it loads for like 6 seconds and then it stops and I don't think it's training at all...

  • @jimlapbap 1 year ago +1

    Thanks! As a musician with exactly zero experience coding, this was extremely helpful.
    I saved a copy of the collaboration sheet, and I also discovered that when I stopped training, and started the next day, it seemed to resume the "Epoch counting." Does that mean it's adding on and will result in better sounds?

    • @Jarods_Journey 1 year ago +1

      It should, but you'll wanna pay attention to the loss values on the graph. You might wanna check my video where I go over getting the best possible trained voice models

    • @jimlapbap 1 year ago

      @Jarods_Journey Thanks. I got about 3000 epochs (?) in before I ran out of the free Google Colab limits, and it still turned out pretty good (at least for what I'm going for), so thanks again

  • @Karagumruksporlu_Dursun 1 year ago +1

    Great video man, but I didn't get how to make voice samples of someone else's voice, for example a streamer's. Can you answer please?

  • @magickey8 11 months ago +1

    How do I work with more than one speaker or more than one model? I'm trying to keep the model I've already trained but start a new model or speaker and be able to select between the two and maybe add more later. Do I need to clear out any files from the first training before I begin the new one, etc.? I tried to add a second speaker but it seems that the "training" went way too fast to have actually worked and the vocals.out.wav is the voice from the first speaker.

  • @RedouaneAouameur 6 months ago

    Hey, great stuff. What would be, in your opinion, the best training models for cloning metal vocals? How can I contact you?

  • @SHARKNADO_2006 4 months ago

    30:30 I got so close but this just wouldn't work for me. I just kept getting "NameError: name 'vocals' is not defined" no matter what I renamed the file to.

  • @user-go1ii6ez6y 1 year ago +1

    Hi. Thanks for the cool guide) You are a nice teacher!

  • @leandrawalters5038 1 year ago +1

    So when you are at ruclips.net/video/xgvT7UnUTng/видео.html (29:42) and you have your audio file folder showing, mine never produced an audio file folder. Do you have any way I can fix this?
    "train": {
    "log_interval": 100,
    "eval_interval": 200,
    "seed": 1234,
    "epochs": 10000,
    Are my settings, thank you.

  • @austinallen4184 11 months ago +1

    Do you know of, or can you imagine, a way to have AI create an original vocal performance given an instrumental? It would just need more training data: say, 50 instrumentals, the 50 acapellas from those instrumentals before separation, and finally the voice you want to be singing, like what everyone is doing now. Does this sound remotely possible?

  • @luzconciencial 1 year ago +1

    30:45 I didn't understand this part about how to continue training the models, because the instructions talk about .wav files, while all the files I have from the saved checkpoints are python files in the logs folder.
    Also, I'm missing something: how can I use my trained voice model to cover any song I want? Pls man, help me with this.

  • @dysflexxia8204 1 year ago +1

    Thanks for the tutorial, it's really helpful! Whenever I select audiosegment it keeps creating an empty folder. Am I doing something wrong?

  • @ifoundrandomevents5240 1 year ago +1

    At how many epochs will I get the best outcome/quality for my AI voice? I have no idea about this and I don't wanna "over-train" it as you mentioned.

    • @Jarods_Journey 1 year ago +1

      It's impossible to say in advance for any given dataset; you just have to watch the loss go down on the TensorBoard graph. The lower the loss, the better the point to stop at. It just requires a lot of testing.

  • @whimblaster 11 months ago +1

    Hey I'm a newbie to all this stuff. My question is do you need to have 10GBs of VRAM to start and get in to this whole thing? I have a GTX 1070 Ti (8GB VRAM). Can I do all things needed to develop a fully compatible singing model of me?

    • @Jarods_Journey 11 months ago

      8 GB will work; just make sure your audio files are cut and split. I would recommend you use RVC instead of so-vits-svc though, it's a little easier to get going.

  • @SnakesPower3 3 months ago

    Can someone help? It worked perfectly before, but now when I try to use the audiosplitter, I press the first or second option, then I name the folder, and the command prompt just pops up for one second and then closes. Please help

  • @LookingForSomethingMissing 1 year ago +1

    Can I continue training my model after I stopped training it? I'm really bad at this stuff; could you enlighten me on how to solve this problem of mine?

    • @Jarods_Journey 1 year ago +1

      Yes, you have to set epochs higher and rerun all of the previous cells to continue training.
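
A minimal sketch of the "set epochs higher" step, assuming the config is a JSON file with a "train" section containing "epochs", like the settings quoted elsewhere in this thread. The helper name raise_epochs and the example path are assumptions for illustration; the reply above does not say where the notebook keeps its config.

```python
import json
from pathlib import Path


def raise_epochs(config_path, new_epochs):
    """Set train.epochs to new_epochs if that is higher; return the value now stored."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    train = config.setdefault("train", {})
    current = int(train.get("epochs", 0))
    if new_epochs > current:
        # Only ever raise the limit, so a typo cannot shorten an ongoing run.
        train["epochs"] = new_epochs
        path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return train.get("epochs", current)


# Hypothetical usage; the path is an assumption, adjust it to your own Drive layout:
# raise_epochs("drive/MyDrive/so-vits-svc-fork/configs/44k/config.json", 20000)
```

All other keys in the file are left untouched, so the remaining settings carry over to the resumed run.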

  • @iSparky24 1 year ago +1

    Hey I must say it's a very good informative video! I still have a question tho, when starting the training I get a 403 error, but it still continues the training, the actual question is how many Epochs? Should I wait till it hits 9999?

    • @Jarods_Journey 1 year ago +1

      There's no clear answer for epochs, but train for as many as time permits for your data and then try out how the model sounds at that point.

  • @Sam-cq9us 1 year ago +1

    As soon as I press enter after naming the output folder in the audiosplitter, I get an error message. What am I doing wrong?

  • @rohitgupta3333 1 year ago +1

    Thank you so much for creating this content. I had a question -- do you know if the results are generally good if the target is not a song but speech? So I could change the voice of someone's speech to the voice that I've trained on?

    • @aguilar2461 1 year ago +1

      Maybe you have already tried it out by now, but it should work just fine, since singing isn't a lot different from talking; think of it as if you were singing in a monotone voice.

  • @Verbalaesthet 1 year ago

    Thank you for the good tutorial. It worked.

  • @dyl6802 1 year ago

    This was AMAZING. Thank you.

  • @SHARKNADO_2006 4 months ago

    How long do I train it for? I don't want to overtrain it, but apparently 134 is under training it.

  • @yusufbenhassine 1 year ago +1

    Can you please tell me why I get this message when I run the training: "No dashboards are active for the current data set"? Probable causes:
    You haven’t written any data to your event files.
    TensorBoard can’t find your event files

    • @Jarods_Journey 1 year ago

      I've only seen this on initial start of the training. There's no data for the tensorboard to read until later. Once training gets up and going, if you refresh it, it should populate.
      Other than that, I'm not too sure

  • @RobertJene 1 year ago +1

    I played around with tortoise-tts, and I also used samples of 10 seconds or shorter.
    Instead of having one big file that I split, I just made multiple 10-second samples manually.
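
For anyone who does start from one big file, cutting it into 10-second pieces can be sketched with Python's standard wave module. This is a naive fixed-length cut, not how the audiosplitter from the video works; the function name split_wav and the output naming scheme are made up for illustration, and it assumes the input is an uncompressed PCM .wav that the wave module can read.

```python
import wave
from pathlib import Path


def split_wav(src, out_dir, chunk_seconds=10):
    """Cut a PCM WAV file into consecutive chunks of chunk_seconds; return the new paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(src), "rb") as reader:
        params = reader.getparams()
        frames_per_chunk = reader.getframerate() * chunk_seconds
        index = 0
        while True:
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the recording
            target = out_dir / f"{Path(src).stem}_{index:04d}.wav"
            with wave.open(str(target), "wb") as writer:
                # Same channels, sample width and rate as the source;
                # the frame count in the header is corrected on close.
                writer.setparams(params)
                writer.writeframes(frames)
            written.append(target)
            index += 1
    return written
```

Because the cut points ignore silence, a word can be split across two chunks; the last chunk is simply whatever is left over.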

    • @Jarods_Journey 1 year ago +1

      What do you think of tortoise? I'm looking at coqui tts to create vits models ATM as I've heard tortoise is slow

    • @RobertJene 1 year ago

      @Jarods_Journey it _is_ freaking slow, but what it outputs is amazing. I'm going to use it for some characters in an upcoming video

  • @Syphuss 1 year ago

    Huh, for some reason the voice recorder is recording in slow motion on my mic. Is there any way to fix that?

  • @user-go1ii6ez6y 1 year ago +1

    Hello. Please tell me how to train the trained model further. I trained the model for 134 epochs and I got files D_134.pth, G_134.pth. Where should they be placed? I know that when retraining, I need to enter data again in the "Copy your dataset" cell, but if I enter D_134.pth, G_134.pth there, then I get an error when enabling the "!svc pre-config" cell: "ValueError: too few files in dataset/44k/Me". Can you please tell me where and what should be placed during re-training?

    • @Jarods_Journey 1 year ago +1

      If you followed all of the steps, you shouldn't be moving the files around, as that may cause issues. The cells should be run in order once again to continue training. Other than that, unfortunately it's beyond me to debug Colab any further :/

  • @morsclue 1 year ago +1

    Love ur bgm btw. Sharou bgms are bangers

    • @Jarods_Journey 1 year ago +1

      Oh absolutely, I love the stuff he's produced and I got introduced to them by watching hololive streamers lol

  • @InfinityPiano 1 year ago

    The audio splitter application doesn't work for me. It launches, I type in my option and select the audio file, and then it just closes immediately.

    • @Jarods_Journey 1 year ago

      It doesn't process anything other than wav files unfortunately as that would require having to install ffmpeg. I might make changes to the script to do this, but for the time being, you have to feed it wav files.
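
Since the script only takes wav input, a quick pre-check of the folder can save a confusing crash. The sketch below is not part of the audiosplitter; find_non_wav is a hypothetical helper that flags every file Python's standard wave module cannot open. That is a stricter test than "has a .wav extension": renamed MP3s are caught, and so are WAV files in encodings the wave module does not support.

```python
import wave
from pathlib import Path


def find_non_wav(folder):
    """Return the names of files in folder that the wave module cannot open."""
    rejected = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # skip sub-folders
        try:
            with wave.open(str(path), "rb"):
                pass  # opening is enough to validate the header
        except (wave.Error, EOFError):
            # wave.Error: not a RIFF/WAVE file or unsupported encoding;
            # EOFError: empty or truncated file.
            rejected.append(path.name)
    return rejected
```

Anything the helper lists would need converting to wav before being fed to the splitter.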

  • @Zhenya_01 1 year ago

    Thank you for the amazing guide! One question. Can I try the inference part without stopping to train the model? Or, if I stop it, can I continue from where I left, after testing?

    • @Jarods_Journey 1 year ago

      I actually haven't tried this yet surprisingly 😅 (probably should) but I don't think there should be a problem. If you have it to where it saves checkpoints, it'll stop at the latest checkpoint as determined by save frequency

  • @lennyb.9616 1 year ago +1

    Hey, thank you so much for the tutorial, I wanted to know how do you add more epoch to an existing model using google colab (like if you wanted more than 134 epochs on your own model)

    • @Jarods_Journey 1 year ago +1

      You should be able to run all of the cells and then just increase epochs in the config file

    • @lennyb.9616 1 year ago +1

      @Jarods_Journey Oh ok, thank you so much for answering so fast!!

  • @IslamUnited 1 year ago +2

    Please make a tutorial for doing all tasks on a local machine with a GPU, starting from the initial steps. I don't want to use Google Colab since it's asking for money for compute.

    • @Jarods_Journey 1 year ago +2

      I have the video, just need to finish editing 🤟

  • @Luganoff 1 year ago +1

    Hi, thanks for the awesome tutorial. The only problem I have is that when launching training, I get a Google page with error 403 instead of TensorBoard. How can I fix this?

    • @Jarods_Journey 1 year ago +1

      Hmm, not sure. Could be a wifi issue that's causing it and if you're running a VPN or on a server, it could be getting that forbidden access. Though, you might just have to restart it.

    • @Luganoff 1 year ago +1

      @Jarods_Journey I couldn't resolve the problem, but it is still operating normally despite the page not loading, so it's not a big deal!

  • @royalpickle1481 1 year ago +1

    Hey, when I try to segment it, it finishes at 100%, but then the files don't show up in the folder that was made.

  • @Dr-Yazan 1 year ago

    What if I want to train another model? Do I need to delete all files and configure everything again, or can I just add a new folder in the dataset folder and keep everything old?

    • @Jarods_Journey 1 year ago

      For colab, delete all of the old stuff or move all of the old stuff to somewhere else.

  • @user-du3ow9er1y 1 year ago

    i did everything you said to the letter, and when i got here :
    @title Train
    %load_ext tensorboard
    %tensorboard --logdir drive/MyDrive/so-vits-svc-fork/logs/44k
    !svc train --model-path drive/MyDrive/so-vits-svc-fork/logs/44k
    it shows :
    The tensorboard extension is already loaded. To reload it, use:
    %reload_ext tensorboard
    Reusing TensorBoard on port 6006 (pid 7336), started 0:06:09 ago. (Use '!kill 7336' to kill it.)
    [10:32:08] INFO [10:32:08] Created a temporary directory
    and stops and doesn't do any training at all, what did i do wrong ???

    • @Jarods_Journey 1 year ago

      This most likely means that something went wrong during the pre-hubert step and it didn't finish. You can try adding -n 1 to the end of the line in the cell that has the pre-hubert stage.

  • @MGNYmusic 6 months ago

    ValueError: rate must be specified when data is a numpy array or list of audio samples
    how to fix this. someone help

  • @ravkhangurra7522 1 year ago

    When I try to check the GPU I get the following error (I am running Windows 10 64-bit with an RTX 2080 Ti):
    /bin/bash: nvidia-smi: command not found

    • @Jarods_Journey 1 year ago

      Are you running locally or with colab? This is the colab version of the video, so that means your colab runtime isn't connected to a GPU. You have to change that in the settings.
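
The same check can be sketched in Python for anyone who prefers a yes/no answer over reading shell output. This is only an illustration of the reply above: gpu_visible is a made-up helper name, and it reports nothing more than whether the nvidia-smi tool is on the PATH and exits cleanly.

```python
import shutil
import subprocess


def gpu_visible(command="nvidia-smi"):
    """Return True if the given NVIDIA tool exists on PATH and exits successfully."""
    if shutil.which(command) is None:
        # Same situation as "command not found" in the shell.
        return False
    result = subprocess.run([command], capture_output=True, text=True)
    return result.returncode == 0


if __name__ == "__main__":
    print("GPU runtime detected" if gpu_visible() else "No GPU runtime detected")
```

If this reports no GPU inside Colab, the runtime type still needs to be switched to a GPU in the settings, as described above.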

  • @yusufbenhassine 1 year ago

    Hi Jarod, I think you forgot to mention how you create the folder "config", which doesn't exist in my so-vits-svc. Also, why is your folder "Me" not in your drive as previously mentioned, but in the folder "dataset_raw"?

    • @Jarods_Journey 1 year ago +1

      Hey Yusuf, the config folder is automatically created by the Colab notebook, so you don't need to create it manually.
      The Me folder is inside of "dataset" because Google Colab expects a certain file path. Once it's copied into the Colab environment the names change, but you don't have to worry about how those are named.

  • @justjewellent8129 1 year ago

    Hello, first thanks for the tutorial, no one explains like you. 2nd do you have a working link for so-vits-svc that works with Mac?

    • @Jarods_Journey 1 year ago

      Appreciate it! As for Mac, I'm not too sure unfortunately...

  • @joohyekang2767 1 year ago

    Hello! I have a question: when I hit the training button and TensorBoard shows up, it says "No dashboards are active for the current data set". Even if I refresh it, it still shows the same thing. And the one time it led me to a screen like yours, the graph was empty. I think the training is not working at all. Do you know how to fix it? Also thank you for this tutorial!!!
    +It says "ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
    google-colab 1.0.0 requires ipython==7.34.0, but you have ipython 8.14.0 which is incompatible."
    on install dependencies

    • @Jarods_Journey 1 year ago

      This may be an issue with Google colab, check the issues tab on the GitHub page to see if anyone else is getting these issues

  • @strayriffs 1 year ago +1

    Hi! I've been making a lot of progress using your video as a guide. However, the vocals are way out of tune when I'm done training. I think I need to adjust autopitch. Do you have any insights for how I can do this (or if there's a better way to fix out of tune vocals)? Thank you for any help!

    • @Jarods_Journey 1 year ago +1

      When you run it, make sure the autotune part is off by following what I did at the end of the video. The GUI version has a way to adjust the pitch value, but I'm not sure how to do it in the Colab version.

    • @wunderkindt 1 year ago

      THIS, THIS RIGHT HERE. This is what saved me like 100+ extra hours, I made like several voice models but none of them came out with proper pitch, this was a super simple fix. You're a big help

  • @lakshgarg6480 1 year ago

    awesome brother thanks

  • @717MERCURY 1 year ago

    The audiosegmenter is not working, only the audio splitter :(

  • @wowfunbuy 1 year ago +1

    Wonderful video!! I am trying to train my own model now!! Meanwhile, I am wondering why the audiosplitter doesn't work on my computer. The segmenter generated no audio files after execution. Has anyone else faced the same issue?

    • @Jarods_Journey 1 year ago

      The audiosplitter that I made only works with .wav files so you'd need to convert it to that format. That's my oversight, I'll try to get to fixing that soon so that it can take other file types as well.

    • @MajickRs 1 year ago

      Same issue, doesn't work even with wav files. Were you able to find a fix?

    • @rajatdubey6854 11 months ago +1

      @MajickRs I encountered the same issue. You can use Audacity for audio splitting: there is an option "Analyse > Silence Finder"; set the threshold to 5 dB and click OK. Finally, in the File tab, choose Export Multiple Files.

  • @ebomb-bb3ec 8 months ago +1

    Woah, the amount of likes for the quality of this video is criminal. New sub for the amazing effort you put into helping us🫡🫡🫡

  • @JustFallenOfficial 1 year ago

    How do I retrain a voice model I already have? How do I make it better? Train new words or something, please? ^^

  • @senomichaelsantosa4500 1 year ago +1

    Hi, I want to ask about continuing the training for the AI, cuz mine stopped at around 4000 epochs when it disconnected while I was asleep. Do I have to repeat all those steps in order to continue, or is there another way? Love ur video btw, it's easy to listen to

    • @Jarods_Journey 1 year ago

      Appreciate it! Yes, to begin training, you have to run all of the cells again in order 🤟, make sure the config is set up again as well!

    • @senomichaelsantosa4500 1 year ago

      @Jarods_Journey so, I've repeated all your steps, but when I started training it just gave a checkmark, although my current epoch is at 4000 while the goal is 10000. Did I miss something or did I make a mistake?

    • @Jarods_Journey 1 year ago

      @senomichaelsantosa4500 Unfortunately, this seems to be a common error, and I don't know where it comes from. I believe it generally has to do with the pre-hubert stage: you can add -n 1 to the end of that line and see if the pre-hubert works. The other thing to try is rerunning the copy configs cell multiple times.
      I believe it could also depend on how loaded Google's servers are, as it seems to work just fine at one time and not at another :/

    • @senomichaelsantosa4500 1 year ago

      @Jarods_Journey okay I'll give it a try, thanks for ur help

  • @RezaKuntang 1 year ago

    I have an issue with the audiosplitter.exe: when I enter the folder name, the command window disappears (force close). Can you please tell me what to do?

    • @Jarods_Journey 1 year ago

      Make sure all files are wav files, it cannot accept anything that is not

  • @paulmaudibatreidis 1 year ago

    Hi! came from the "Realtime AI Voice Changer Using RVC (Retrieval-based Voice Conversion w./ w-okada)" Video in a journey of baiting my friends. I wanted to clarify coming from that video the marine.pth file used in 5:20 of the aforementioned video, is it the same file as any of the files shown in 29:13? If not how can I convert this trained file into a singular file like marine.pth?

    • @paulmaudibatreidis 1 year ago

      If I'm not mistaken, the G file with the highest number is the most-trained .pth file. If this is the file I'm looking for, the only question I have left is... how can I make it convert audio smoothly with a non-high-end GPU haha. I'm currently using a laptop 1660 Ti.

    • @Jarods_Journey 1 year ago

      If you're using sovitssvc, I believe you have to select it instead of RVC when you go into the GUI and then use the G file. I completely recommend RVC over so vits though, and this tutorial you're following is so vits svc
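
Finding "the G file" with the highest number, as discussed above, can be sketched like this. The helper name latest_generator is made up, and the G_<number>.pth naming is taken from the file names mentioned in this thread (for example G_134.pth); the point of the sketch is that the numbers are compared as integers, since a plain alphabetical sort would put G_999.pth after G_1000.pth.

```python
import re
from pathlib import Path


def latest_generator(log_dir):
    """Return the G_<number>.pth path with the largest number, or None if there is none."""
    pattern = re.compile(r"^G_(\d+)\.pth$")
    best_path, best_number = None, -1
    for path in Path(log_dir).iterdir():
        match = pattern.match(path.name)
        if match and int(match.group(1)) > best_number:
            best_path, best_number = path, int(match.group(1))
    return best_path


# Hypothetical usage with the log folder used in the notebook's train cell:
# print(latest_generator("drive/MyDrive/so-vits-svc-fork/logs/44k"))
```

D_ files in the same folder are ignored, matching the advice above to use the G file.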

    • @paulmaudibatreidis 1 year ago

      @Jarods_Journey oh I didn't read the title properly... LOL

    • @paulmaudibatreidis 1 year ago

      @Jarods_Journey tq! I'll give it another shot

  • @user-tp1jx2sf8b 1 year ago

    Does this software support the NVIDIA A4500? Please let me know, thanks

  • @strayriffs 1 year ago

    Thanks! Super helpful. Best vid I've found. So, if I want to use a trained model over and over how do I keep my models separate and repoint to each model as I need it?

    • @Jarods_Journey 1 year ago

      Appreciate it! I personally rename my models and config and change which folder they're in. Then you just have to select those in the GUI that opens up
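
One way to do the renaming described above, as a rough sketch rather than the author's actual workflow: copy the checkpoint and its config into a folder named after the voice, giving both files that same name. The helper archive_model, the folder layout, and the example paths are all assumptions for illustration.

```python
import shutil
from pathlib import Path


def archive_model(checkpoint, config, dest_root, name):
    """Copy a checkpoint and its config into dest_root/name/ under matching names."""
    target_dir = Path(dest_root) / name
    target_dir.mkdir(parents=True, exist_ok=True)
    model_copy = target_dir / f"{name}{Path(checkpoint).suffix}"
    config_copy = target_dir / f"{name}{Path(config).suffix}"
    # copy2 keeps timestamps; the originals stay where the trainer expects them.
    shutil.copy2(checkpoint, model_copy)
    shutil.copy2(config, config_copy)
    return model_copy, config_copy


# Hypothetical usage:
# archive_model("logs/44k/G_134.pth", "configs/44k/config.json", "models", "me_v1")
```

Copying instead of moving is a deliberate choice here, so an interrupted training run can still find its files.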

  • @christinazheng3594 1 year ago

    How to use the programs you showed with someone else’s voice for example a k pop singer?

  • @user-rb1bo2mh1l 1 year ago

    Hi, I tried using your audio splitter but it isn't working at all; a file is created but it is empty.

  • @official_vegoku 1 year ago +1

    Great Video but i have a problem
    Automatic processes:
    /bin/bash: svc: command not found
    What does that mean?

    • @Jarods_Journey 1 year ago

      Since you're running Colab, it means there was an issue with the installation of the repo. Make sure all of the cells were run without failure and it should work!

    • @darlingred6054 1 year ago

      @Jarods_Journey So I'm having problems with this too. On restarting the runtime environment I came across:
      error: subprocess-exited-with-error

      × Building wheel for pyworld (pyproject.toml) did not run successfully.
      │ exit code: 1
      ╰─> See above for output.

      note: This error originates from a subprocess, and is likely not a problem with pip.
      Building wheel for pyworld (pyproject.toml) ... error
      ERROR: Failed building wheel for pyworld
      Failed to build pyworld
      ERROR: Could not build wheels for pyworld, which is required to install pyproject.toml-based projects
      This. And
      Automatic processes:
      /bin/bash: svc: command not found
      Still persist. I'm going to keep working and see if I find a solution.

    • @darlingred6054 1 year ago

      So the errors above probably have nothing to do with the SVC command not found issue.

    • @darlingred6054 1 year ago

      I've restarted it 3 times, doing it from scratch. Not sure what the issue was, wish I was techy enough to solve it.

    • @darlingred6054 1 year ago

      I see I might have the wrong version of Python. I have version 8.14.0 which is apparently incompatible with 7.34.0. I'm going to uninstall and re-install python and see if that fixes it.

  • @kennysdead500 1 year ago

    Hello! I had a problem with this application. How do I reuse it? It took me a while to get it to work, and the results were mixed (I think it had to do with vocal layering but will try 10 second increments), but when I reopened it later, I couldn't get it to work. I tried to reinstall it again (there's no way I have to do that every time?) but it just confused the system. I'd appreciate any help. I tried it on a Mac but may use a PC for my next go-around. Thanks!

    • @Jarods_Journey 1 year ago +1

      Google Colab is a separate PC on Google's servers, meaning it doesn't matter what OS you use. You will have to run everything each time. As for any Colab issues, check the issues tab on GitHub to see if anyone else is having the same problems.

  • @chlisart9804 1 year ago

    How to continue training my model after google colab reached its time limit? Rerun all the cells you mention in the video next day?

  • @ah89971 1 year ago

    It took 30 minutes for only 5 epochs. I tried Paperspace, not a big difference.

  • @InquireWithin 1 year ago +1

    How many epochs until you felt it was studio quality? I’m training locally right now, over 200 high quality samples (about 45 minutes worth). I’m at 2800 epochs and it is getting better but still seems a long ways off.

    • @Jarods_Journey 1 year ago +1

      Hmm, I haven't actually gotten my voice to studio quality yet, but I also need to do more testing. People say 25k steps is generally a good starting point, but I've found that it varies widely based on your training data. In your case, if you're doing something like a batch size of 20, you'd already be there. You might want to vary your learning rate to see if that does anything, or run it for more epochs.
      One other thing: it seems to do a much better job of inferring the voice if the base vocals are closer to the pitch of the trained model. Not saying it can't do other pitches, as you can take a female voice and infer it with a male voice to get a good result; it just seems to be a bit more accurate.
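The link between epochs and steps in the reply above can be checked with a little arithmetic. This is a minimal sketch, assuming one optimizer step per batch and that a partial final batch still counts as a step; the function name is illustrative, and the numbers are the ones quoted in this thread.

```python
import math


def total_steps(num_samples: int, batch_size: int, epochs: int) -> int:
    """Approximate optimizer steps, assuming one step per batch."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch * epochs


# Figures from the question above: about 200 samples and 2800 epochs,
# with the batch size of 20 used as the example in the reply.
steps = total_steps(num_samples=200, batch_size=20, epochs=2800)
print(steps)  # 28000, already past the 25k-step rule of thumb
```

With a smaller batch size the same 2800 epochs would yield more steps, which is why the step count is a more portable yardstick than the epoch count.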

    • @Jarods_Journey
      @Jarods_Journey  Год назад +1

      @@RobertJene So I wasn't aware of the first two before, but after a little investigation, you can definitely change the gradient accumulation in the config file (default is 1) and adjust the training speed by lowering the learning rate or adjusting other variables like batch size and eval interval.
      The vectors per token I'm not too sure about. There are some mel spec variables that can be changed, but I'm not sure of their effects on the final output.
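Editing those settings by hand works, but a small helper makes the change repeatable. This is only a sketch: it assumes the config is a JSON file with a `"train"` section, and the key names in the usage comment (`learning_rate`, `batch_size`, `eval_interval`) are guesses based on the settings named above, so check them against your own config file first. The path is hypothetical.

```python
import json
from pathlib import Path


def update_train_settings(config_path, **overrides):
    """Rewrite selected keys in the "train" section of a JSON config."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    # Assumption: training options live under a top-level "train" key.
    train = config.setdefault("train", {})
    train.update(overrides)
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return train


# Hypothetical usage; adjust the path and key names to match your setup:
# update_train_settings("configs/44k/config.json",
#                       learning_rate=5e-5, batch_size=8, eval_interval=400)
```

Keys that are not passed in are left untouched, so the rest of the config stays the same between sessions.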

    • @RobertJene
      @RobertJene Год назад

      @@Jarods_Journey cool

  • @hlasyvhlave
    @hlasyvhlave 9 месяцев назад

    svc pre-config gives me a warning and 0%. Any help?

  • @user-cd6zr9jc7j
    @user-cd6zr9jc7j Год назад

    Hey, thanks a lot for the video! Also, is it still going to work if my voice samples are not in English?

  • @mtraining3256
    @mtraining3256 Год назад

    Great! Do I need to install Python on my PC? Thank you!

    • @Jarods_Journey
      @Jarods_Journey  Год назад +1

      Not if you're using the Colab version, but if you're running locally on supported hardware, you do.

  • @SineEyed
    @SineEyed Год назад

    Bro, where did you find a copy of the Harvard recommended practice for speech quality measurements doc? I spent over an hour today trying to track it down using Google cache and every other trick I know, but I always hit a paywall. Can you help me out?

    • @Jarods_Journey
      @Jarods_Journey  Год назад +1

      www.cs.columbia.edu/~hgs/audio/harvard.html

    • @SineEyed
      @SineEyed Год назад

      @@Jarods_Journey hell yeah bro - thank you!.. 🙇‍♂

  • @Darmathrama
    @Darmathrama Год назад

    Thanks for the video

  • @anthonylucio1
    @anthonylucio1 Год назад

    Is it possible to train a model and create a finished song on a Mac? It looks like all the vids I've seen (okay, two lol) use a PC instead. Just trying to figure that out before I possibly waste my effort and time if it's not possible.

    • @Jarods_Journey
      @Jarods_Journey  Год назад +1

      Works universally for Colab users. As for local... it is technically possible if your specs match the equivalent Windows PC. However, I know there are issues with AMD, as most of these tools rely on Nvidia GPUs, but I can't verify this 😅

    • @Povoa
      @Povoa Год назад

      @@Jarods_Journey Is there any Audiosplitter for Mac?

  • @canadianhowl
    @canadianhowl Год назад

    Seem to be having issues with the gpu suddenly disconnecting whenever I try and use the model on audio. Everything else runs smoothly but it suddenly disconnects a minute or so in and the gpu refreshes

    • @Jarods_Journey
      @Jarods_Journey  Год назад

      Not sure, might be a Colab-specific issue, and I'm not sure how that works. Might have to run it on a different day.

  • @tungchan1735
    @tungchan1735 Год назад

    "ValueError: rate must be specified when data is a numpy array or list of audio samples." when infer :(

  • @angelherondale3618
    @angelherondale3618 Год назад

    Great video btw

  • @kamille6397
    @kamille6397 Год назад

    When I try to train the model, it doesn't start a graph like yours. Even with the .wav file in, it doesn't start the little purple bars! Everything worked like the video, but maybe it's because I have too few samples? I'm not sure how to make it train.

    • @Jarods_Journey
      @Jarods_Journey  Год назад

      Are there any outputs? I know many users have had issues with Google Colab, and it's usually because the pre-hubert step never finished. You can try appending -n 1 to the end of the line in that cell to see if it helps.

  • @love_ramgarhia13
    @love_ramgarhia13 Год назад

    Bro, please help me fix an error... it's showing "NameError: name 'vocals' is not defined". I did every other step like you did. Guide me, please.

    • @Jarods_Journey
      @Jarods_Journey  Год назад

      I believe you have to edit the name correctly according to your file name.

  • @JTowers97
    @JTowers97 Год назад

    I received an error that reads as follows: " Building wheel for pyworld (pyproject.toml) did not run successfully.
    │ exit code: 1
    ╰─> See above for output.

    note: This error originates from a subprocess, and is likely not a problem with pip.
    Building wheel for pyworld (pyproject.toml) ... error
    ERROR: Failed building wheel for pyworld
    Failed to build pyworld
    ERROR: Could not build wheels for pyworld, which is required to install pyproject.toml-based projects"
    I fixed it by running the cell normally again, then editing the code in the cell to the below (basically just pinning a different version of pip, as it seems the new one is bugged):
    #@title Install dependencies
    #@markdown pip may fail to resolve dependencies and raise ERROR, but it can be ignored.
    %pip install pip==21.1.3
    %pip install -U ipython
    #@markdown Branch (for development)
    BRANCH = "none" #@param {"type": "string"}
    if BRANCH == "none":
        %pip install -U so-vits-svc-fork
    else:
        %pip install -U git+https://github.com/34j/so-vits-svc-fork.git@{BRANCH}

  • @user-fr3cc5xr8t
    @user-fr3cc5xr8t Год назад

    Hi! I have GTX 1660ti Mobile GPU and i5 processor. Is it enough for somewhat efficient training?

    • @Jarods_Journey
      @Jarods_Journey  Год назад

      If it's 2 GB, no; if 4 GB, maybe, just barely. Might wanna use the Colab version.

  • @jameslin7457
    @jameslin7457 Год назад

    Compared with Tortoise, which one has better quality in terms of voice cloning?

    • @Jarods_Journey
      @Jarods_Journey  Год назад +1

      They're completely different architectures: Tortoise is text-to-speech and SVC is speech-to-speech. Quality-wise, they're both pretty impressive.

  • @kiran_akamatsu
    @kiran_akamatsu Год назад +1

    I encountered an error in the trained model that stated: "ValueError: rate must be specified when data is a numpy array or list of audio samples." Can someone help me

    • @Jarods_Journey
      @Jarods_Journey  Год назад

      Is this when you're trying to generate a response with the trained model? What type of file are you using?

    • @kiran_akamatsu
      @kiran_akamatsu Год назад

      @Jarods_Journey Yes, this is when I'm trying to generate a response from the model. I am using a .wav file; also, it shows that line 5 is the cause of the error.

    • @Jarods_Journey
      @Jarods_Journey  Год назад

      @@kiran_akamatsu hmm, well I'm not too versed in the technical coding side of so vits, but if you wanted to hop in my discord, it might be easier to send and share screenshots to maybe see if I can help out there
      I would say try other wav files and see how those go first to see if it's only an issue with that one or all of the other ones
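One possible cause of this `ValueError`, offered as an assumption rather than a confirmed diagnosis: if the notebook hands IPython's `Audio` a path string that does not point to an existing file, the string is treated as raw sample data, and the missing `rate` triggers this message. Under that assumption, checking that the inference step actually wrote a file helps narrow things down. The helper name and the path below are hypothetical.

```python
from pathlib import Path


def output_ready(path: str, min_bytes: int = 1) -> bool:
    """Return True if `path` is a file holding at least `min_bytes` bytes."""
    p = Path(path)
    return p.is_file() and p.stat().st_size >= min_bytes


# Hypothetical output path; substitute whatever the inference cell wrote.
result = "results/vocals.out.wav"
if not output_ready(result):
    print(f"{result} is missing or empty; rerun inference and read its log")
```

If the file is missing, the real error is earlier in the inference output, not in the playback line.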

  • @candyman3537
    @candyman3537 Год назад

    Since Colab can only run for a certain amount of time, is it possible to resume the last training run?

  • @lZompal
    @lZompal 10 месяцев назад

    How do I delete folders so I can do other voices? I can't delete folders.

  • @duquera1
    @duquera1 Год назад

    good job dude ;)

  • @BirdKeeperCel
    @BirdKeeperCel Год назад

    I keep getting "name error" at the use trained model section even though it is named correctly, not sure how to fix this

    • @Jarods_Journey
      @Jarods_Journey  Год назад

      Is it in the curly { } brackets still? Make sure those are deleted.
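The curly-bracket advice follows from how Python f-strings work: anything inside `{ }` is evaluated as a Python expression. A minimal sketch of the failure, assuming the notebook cell builds the file path with an f-string placeholder; the actual cell is not reproduced in this thread, and `vocals` is simply the file name from the error message.

```python
# Wrong: the file name was typed inside the braces, so Python looks for
# a variable called `vocals` and raises NameError because none exists.
try:
    path = f"{vocals}.wav"  # intentionally undefined name
except NameError as err:
    print(err)  # name 'vocals' is not defined

# Right: remove the braces so the name is literal text...
path = "vocals.wav"

# ...or keep the braces and define the variable first.
NAME = "vocals"
path_from_variable = f"{NAME}.wav"
assert path == path_from_variable
```

Either fix works; what matters is that no undefined name is left inside braces.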

  • @SYRUPPINK
    @SYRUPPINK Год назад

    Hi, question: if I wanna continue training but I'm at D_100 and G_100, for example, do I delete the D_0 and G_0 stuff, or is there no need to? And can I use the same method as the first time or not? Thanks!

    • @Jarods_Journey
      @Jarods_Journey  Год назад +1

      Leave all of the "D" and "G" files if you wanna continue training. For Colab, you can just run all of the cells before the train cell, as you did to initially set it up, and it should begin training from your most recent checkpoint (that is, if training reached a checkpoint at one of the log intervals).
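To see which checkpoint a resumed run would most likely pick up, you can list the saved files and take the highest number. This sketch only inspects file names of the `G_100` / `D_100` style mentioned above, assuming a `.pth` extension; it makes no claim about how so-vits-svc-fork itself selects a checkpoint, and the folder in the usage comment is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional


def latest_checkpoint(model_dir: str, prefix: str = "G") -> Optional[Path]:
    """Return the checkpoint with the highest number, e.g. G_100.pth over G_0.pth."""
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+)\.pth$")
    best, best_step = None, -1
    for candidate in Path(model_dir).glob(f"{prefix}_*.pth"):
        match = pattern.match(candidate.name)
        if match and int(match.group(1)) > best_step:
            best, best_step = candidate, int(match.group(1))
    return best


# Hypothetical usage; point this at the folder holding your G_*/D_* files:
# print(latest_checkpoint("drive/MyDrive/so-vits-svc-fork/logs/44k"))
```

Comparing the numbers as integers matters here, because a plain alphabetical sort would place `G_20.pth` after `G_100.pth`.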

    • @SYRUPPINK
      @SYRUPPINK Год назад

      @@Jarods_Journey Thanks !

  • @obase79
    @obase79 Год назад

    Hi Jarod!
    Where can I download the Liam Gallagher AI voice model?

    • @Jarods_Journey
      @Jarods_Journey  Год назад +1

      I know there are discord groups and stuff out there, but I don't know of much unfortunately. You'll have to do a deep dive into google to see if people have pre-trained svc models and I know there are some youtube videos out there that go over this as well.

  • @Narziss1
    @Narziss1 Год назад

    Thank you!

  • @stnhndg
    @stnhndg Год назад

    10000 epochs (default in config.json) is too much. Right now the process runs at about 2 min per epoch, which means the estimated training time is about 330 hours...

    • @Jarods_Journey
      @Jarods_Journey  Год назад +1

      Adjust based on sample size. That sounds like an hour or more of audio; you should be fine with about 10-20 minutes.
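The estimate in the question above is easy to reproduce, and the same arithmetic shows how many epochs fit into one session. A small sketch, assuming a steady pace per epoch; the 12-hour budget is the free-tier limit another commenter mentions in this thread, not a verified figure, and the function names are illustrative.

```python
def training_hours(epochs: int, minutes_per_epoch: float) -> float:
    """Total wall-clock hours for a run at a steady pace per epoch."""
    return epochs * minutes_per_epoch / 60


def epochs_within(hours: float, minutes_per_epoch: float) -> int:
    """Whole epochs that fit into a time budget at the same pace."""
    return int(hours * 60 // minutes_per_epoch)


print(round(training_hours(10_000, 2)))  # 333
print(epochs_within(12, 2))              # 360
```

Setting the epoch count in the config closer to what is reachable makes the progress output easier to interpret.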

  • @mmmdawe
    @mmmdawe Год назад

    pre-hubert never finishes even with -n 1 or -n 2. what do?

    • @Jarods_Journey
      @Jarods_Journey  Год назад

      Might just have to try again unfortunately, colab is finicky here :(

  • @IlIlIlIlIlIlIlIlIlIlIlIlIlllll

    I have a problem, when I start the training after 20 seconds it ends and nothing else happens.

    • @Jarods_Journey
      @Jarods_Journey  Год назад

      Other users have had this issue, and it may be your pre-hubert stage or something else. Assuming it's the pre-hubert stage, add "-n 1" to the end of the line, making sure there's a space between it and the last character, and try rerunning that stage. One user even reported that rerunning the config cell helped them.
      Other than that, I believe this is an issue with Colab and GPU availability. I haven't had the time to look into it more, unfortunately.

  • @nepaliitlessons4136
    @nepaliitlessons4136 Год назад

    Do we need to isolate the vocals for singers before training the model too?

    • @Jarods_Journey
      @Jarods_Journey  Год назад +2

      Yep! The model trains by comparing its output to the audio you feed it, so if the audio has instrumentals in it, well, it's gonna try to match all of it.

  • @professormeme6584
    @professormeme6584 Год назад

    When I get to using a trained AI, I get the error
    ValueError: rate must be specified when data is a numpy array or list of audio samples. I have no idea how to fix this. Can you help?

    • @Jarods_Journey
      @Jarods_Journey  Год назад

      There might be a corrupt file or one that is too small. Make sure no file is smaller than 200 KB.
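Checking file sizes by eye gets tedious with a few hundred clips, so the check can be scripted. A minimal sketch, assuming the clips are `.wav` files under one folder; the 200 KB threshold comes from the reply above (read here as 200 × 1024 bytes, which is an assumption), and the folder name in the usage comment is hypothetical.

```python
from pathlib import Path
from typing import List

MIN_BYTES = 200 * 1024  # threshold from the reply above, read as KiB


def undersized_wavs(folder: str, min_bytes: int = MIN_BYTES) -> List[Path]:
    """List .wav files under `folder` that are smaller than `min_bytes`."""
    return sorted(
        p for p in Path(folder).rglob("*.wav") if p.stat().st_size < min_bytes
    )


# Hypothetical usage; point this at the folder that holds your clips:
# for clip in undersized_wavs("dataset_raw"):
#     print(clip, clip.stat().st_size, "bytes")
```

A size check will not catch every corrupt file, but it quickly flags clips that are too short to be useful.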

    • @professormeme6584
      @professormeme6584 Год назад

      @@Jarods_Journey I figured it out and got it working. Thank you

    • @daoizt4032
      @daoizt4032 8 месяцев назад

      same error, how did you fix it?

  • @callestenqvist7634
    @callestenqvist7634 Год назад

    Do you think you will get equal or better results if the sample set includes singing instead of talking / read sentences?

    • @Jarods_Journey
      @Jarods_Journey  Год назад +1

      I've had users report that singing samples produce a better output if you wanna mimic singing. I haven't tried it, but I don't think it's harmful!

    • @callestenqvist7634
      @callestenqvist7634 Год назад +1

      @@Jarods_Journey Ok. Sounds good. Another question, if I get more samples that I want to feed it, do I need to start over or can I just add more samples to the same folder and continue the training that I already started?

  • @Shacharzadik
    @Shacharzadik Год назад +2

    Hey! I'm running it, but after about 6 hours it stopped the process. I got the error "Cannot connect to GPU backend. You cannot currently connect to a GPU due to usage limits in Colab." I'm on a free Colab account. Any tips on how to overcome this? I understand that there's a 12-hour limit, but it looks like I'll need much more time to train the model. It took 5.5 hours to reach 1000 epochs, and now it doesn't let me connect to a GPU. Thank you!

    • @Jarods_Journey
      @Jarods_Journey  Год назад +1

      Due to Google's policy, you'll have to wait till the limit is up! Unfortunately, those are the limitations they put on a free account 😅. You could technically move all your data over to another Google account and begin training there, but that's up to you (you'd just move the entire so-vits-svc-fork folder).

    • @Shacharzadik
      @Shacharzadik Год назад +1

      @@Jarods_Journey thanks! But looks like in order to train the model I'll need about 2 days of running it. If there's a limit of 12 hours how can I reach it? Also - how do I continue training from the same point?

    • @Jarods_Journey
      @Jarods_Journey  Год назад +1

      @@Shacharzadik It'll just take longer for you to finish due to the limits. To continue training each time, just run all the previous cells before the train cell, and make sure your config file stays the same.

    • @Shacharzadik
      @Shacharzadik Год назад

      @@Jarods_Journey Cool! I've finished the training, but when I try to use it, I get an error saying "NameError: name 'vocals' is not defined", and that's after I already changed the file name. Do you maybe know why? Thanks!

    • @Jarods_Journey
      @Jarods_Journey  Год назад

      @@Shacharzadik If you're running on a different day, make sure to re-mount your drive so that it has access. If you have, double-check that everything is named correctly, including the .wav extension!

  • @omnimanagement8957
    @omnimanagement8957 Год назад

    This doesn't work at all on Mac, and when I try it on Windows, the Audiosplitter just closes when it's supposed to be removing silence and creates an empty folder.

    • @Jarods_Journey
      @Jarods_Journey  Год назад

      The splitter doesn't work on files that aren't WAV files, unfortunately. I'll have to build that in and include ffmpeg to handle it, since the splitter currently just uses a built-in Python library. The Google Colab notebook shouldn't differ between platforms, though, as it all runs on Google's servers.
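Before running the splitter, it can help to confirm that each input really is a WAV file and not another format with a `.wav` extension. The reply above does not name the library the splitter uses, so this sketch uses Python's standard `wave` module as a stand-in; a file that passes this check is not guaranteed to work with the splitter, and the function name is illustrative.

```python
import wave


def is_readable_wav(path: str) -> bool:
    """Return True if the standard-library `wave` module can open `path`."""
    try:
        with wave.open(path, "rb") as wav:
            # A header with zero audio frames is treated as unusable.
            return wav.getnframes() > 0
    except (wave.Error, EOFError, OSError):
        # Not a RIFF/WAVE file, truncated, unsupported encoding, or missing.
        return False
```

Files that fail the check would need converting to WAV first, for example with ffmpeg as the reply suggests.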

    • @MajickRs
      @MajickRs Год назад

      @@Jarods_Journey I tried the audio splitter with WAV files and it didn't work for me either.

  • @RandomPlayer717
    @RandomPlayer717 Год назад

    I did everything you did and ended up with a G240.pth and a D240.pth. When I put these 500 MB files into my realtime voice changer, they do nothing. All my other .pth files that I get elsewhere work, but not yours. Why?

    • @Jarods_Journey
      @Jarods_Journey  Год назад

      Use RVC, not So-Vits; I believe there are incompatibility issues.