I found that Crepe is a LOT faster than Harvest, BUT Crepe has robotic low tones when using O and A sounds. Can't fix that even with the quality or best settings. Harvest on the other hand has no robotic tones at all, but it's much slower than Crepe. Overall, Harvest is definitely the best quality in exchange for more latency
Holy hell you fixed it, I was trying to do whatever I can to take off the robotic tones and all but nothing was working. When all looked lost I saw your comment and changed it to harvest. It worked! Thank you so much
Anyone know why, when I use client, all my input and output options disappear? It was working fine, and then I just couldn't select them anymore. Edit: I figured it out, but for anyone with the same problem, I'll leave this here. It only occurs when you make a shortcut version of the start file and try to run it. The way to fix it is to open the original start file, and switch between server and client on one of the sample voices. If it doesn't work, just repeat until your input and output options reappear.
please do you have a fix for: [Voice Changer] Pipeline is not initialized. [Voice Changer] Waiting generate pipeline... pls it would be much appreciated
+12 or -12 pitch can be a loooooot. For nuance you should use your ear to listen to the average difference between yourself and the voice you’re emulating, but 8 is plenty of pitch deviation between sexes. Anime voices esp loli voices are more extreme ofc but yeah it’s often good to resist the urge to go overly cutesy if the goal is realism.
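For anyone wondering what those numbers mean in practice: assuming the pitch (tune) setting is in semitones, which the comment above doesn't spell out, each step multiplies the frequency by 2^(1/12). A rough sketch (the function name and the 120 Hz base pitch are made up for illustration):

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift given in semitones (equal temperament)."""
    return 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    base_hz = 120.0  # hypothetical speaking pitch, purely for illustration
    for shift in (12, 8, -12):
        ratio = semitone_ratio(shift)
        print(f"{shift:+d} semitones: x{ratio:.3f} -> {base_hz * ratio:.1f} Hz")
```

So +12 is a full octave (double the frequency), while +8 is roughly a 1.59x shift, which is already a big jump.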
@@eagels3131 Best I've found is to fiddle with the settings somewhere I can hear it (like in a call with an alt discord account). But no good fixes really.
I tried fixing it, but i have a problem where the audio is said like 4 times in a row, with decreases in loudness, and the echo stuff does nothing to fix it. Would you have any ideas on how to correct this?
Hi im the one on your discord server who created a ticket about the gl not loading and you said to change the chunk to max, i just switched the audio to server and it worked perfectly for me, i guess theres a weird thing going on with client option. Just wanted to put it out there
man i really wanted this to work for me but for some reason as soon as i press start it just starts saying random voice lines like someone else is speaking on my mic and i dont even know what could be causing that
I tried this tool out. Honestly... I can see this kind of tool being put on a list of banned software at some point in the future. I gave it a go mostly to see if the DirectML implementation works on intel Arc and... well it didn't. Which is fair. I have a 1080ti as well so that's no biggy.
Ah, I think that's a slippery slope. If it does get banned, that leaves only bad actors to use it as nothing will ever stop them lol. It's still hardware intensive atm so newer Nvidia are still needed
@Jarods Journey do you have any resources for voice models, or do you train your own? All of your voice models sound much better than what I could find online. What settings do you use to train your models at?
I'm very sad, I've been trying to configure this for days, but my pc doesn't have any good gpu, only cpu, and I think that's why my voice is laggy too much, and it takes a long time to come out, I can't find a perfect cpu setting, please help me, because the voice doesn't even want to come out right, is just laggy 😢😢
I have an RTX 3050 laptop GPU with 4 GB. I usually run it on 384 chunks and 4096 extra, and the voice comes out after 1 second while only using 2.6 GB of VRAM. What should I do to decrease the response time? I tried reducing the chunks, but it resulted in a choppy voice.
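A back-of-the-envelope way to see where that 1 second could come from, assuming each chunk unit is 128 samples at a 48000 Hz sample rate (an assumption on my part, not something confirmed in this thread):

```python
SAMPLES_PER_CHUNK_UNIT = 128  # assumption: one chunk unit = 128 samples


def chunk_to_ms(chunk: int, sample_rate: int = 48000) -> float:
    """Approximate audio duration of one chunk, in milliseconds."""
    return chunk * SAMPLES_PER_CHUNK_UNIT / sample_rate * 1000.0


if __name__ == "__main__":
    for chunk in (192, 320, 384):
        print(f"chunk {chunk}: about {chunk_to_ms(chunk):.0f} ms of audio per chunk")
```

Under that assumption, chunk 384 is about 1024 ms of buffered audio before any processing time is added, so a lower delay means a lower chunk, which only works if the GPU can keep up.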
bro, mine same like u but even worse cause my voice wont let out and after while it get error message.. btw my res is 0 and i think my gpu not detected bruh but im not using amd (on average amd doesn't detect gpu)
Hey there Jarod, I have a question regarding "Cuda Cores" / Cuda core compatible graphic cards. If i get this straight : Your GPU must be a good one , i.e: a newer and faster one (That much is clear) but it also must have cuda cores, since all the models are cuda core based. So a 1080 ti which is a strong gaming card and a favourite of many gamers cant handle the ai voice changer program. The program defaults to CPU usage. Thats what happens on my end with a 1080 ti (which has no cuda cores). The solution would be to go with the OTHER download (what you called "for amd users". Is my assessment correct?
CUDA cores are included in all modern NVIDIA GPUs, actually. The software will always default to CPU. The benefit of having a higher-end card would be the faster GDDR6 memory, larger amount of VRAM, and many other factors. I am not sure if RVC uses CUDA cores in any way, but a GTX 1080 Ti should still work; just increase the delay, run it through server, and have the sound driver set to ASIO or WASAPI if you can. Note: Only use the alternate AMD version if you have an AMD GPU; if you have an NVIDIA graphics card, use the standard download you already have. I tried out the software with my GTX 1650 Ti on my laptop and it worked pretty well, once set to a 5 sec delay. Your GTX 1080 Ti should do far better than my mobile GPU.
@@calebpeters191 I have done exactly that, but for me the delay needs to be really high to sound any good. For some reason, the GPU still sits at only 30% at max. The sound is still robotic and just sounds like i am a 8 year old kid with a generic voice morpher. Besides, sometimes (rarely) my CPU goes to 99% and on the more common times : 20-30% (5800x3d), which made me think that the GPU was ignored and cpu used instead
I believe the 1080ti does have cuda cores... But has 0 tensor cores which allows for much more efficient cycle usage of the GPU. CUDA should still be your best bet, but I'm not sure why it's not noticing your card
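One quick way to check whether a CUDA card is visible at all is to ask PyTorch directly. A minimal sketch, assuming you run it in a Python environment that has PyTorch installed (the voice changer ships its own bundled environment, so this may not reflect exactly what the app sees; the function name is made up):

```python
def describe_cuda() -> str:
    """Report whether PyTorch in this environment can see a CUDA device."""
    try:
        import torch  # third-party; may not be installed in this environment
    except ImportError:
        return "PyTorch is not installed in this Python environment"
    if not torch.cuda.is_available():
        return "PyTorch is installed but cannot see a CUDA device"
    return f"CUDA device 0: {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print(describe_cuda())
```

If this reports no CUDA device on an NVIDIA card, the driver or the PyTorch build is the first thing I would look at.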
I assumed that the clear setting button at the top would clear out a slot. For instance, I've been testing a bunch of slots and I want to set them back to the default blank, but apparently clear setting does not do what I thought it did. In fact, it seemed to screw up the whole program.
RVC Quality should be in HIGH if you want to use this in a social game as male - female etc. You might not notice much of a difference but there's a world of difference actually, people notice. What you do is you keep it in HIGH and the program will make sure the output is always the best; that way people don't suspect a thing.
@@leexy3395 You are very very wrong in the context I spoke of, many people have good hearing, of course your friends who aren't paying much attention or taking it seriously won't notice a difference. But people in the context I mentioned absolutely will.
No matter what, I can personally hear the artifacts in RvC, and all other neural TTS systems if I’ve been exposed to them at least once, this goes for other kinds of audio processing as well, since I do audio engineering, people do absolutely know the differences, trust me on this.
Hey idk if its too late but can i pls have some hlp here i'm trying to install it but i just cannot seem to be able to hear or make it work. If possible can i contact you and arrange smth or a zoom or smth. Pls
Mine sometimes has a weird voice for half a second before it says what i say. I thought it was just background noise but even with all the suppression and sensitivity being lower, it still happens. It kinda sounds like glitchy breathing
Thanks for the video, but when I browse and select the onnx file in one of the slots, do I just leave the index blank? Because if I select the onnx file, then click upload and try to use the voice it doesn't work. Also, with piper it seems to require a json file as well but how is this generated? rvc doesn't give this to me.
MX250 gpu here (potato gpu). Best I can get is 2+ seconds delay (chunk 320 extra 4096) and WASAPI. - lower chunk will glitch, higher chunk will add more delay. - changing MME to WASAPI helps a lot (from 3+ to 2+ seconds delay), but I have to change my audio sample rate first. - I can't change the model after I use it because the gpu memory (2 GB) will be overloaded. I have to restart the client first before changing the model. I will try converting to onnx later hoping it will go faster (export onnx is currently broken)
i have to figure out a way to offload the processing to my laptop so it doesn't bug out when playing demanding games. is there a way to access it from a lan client? ps: i thought the directml version worked for my 6900xt, but it's processing on the cpu. but, on the plus side, converting to onnx literally halves the load on the cpu. when running this exclusively, i managed to get a total lag of 400ms buffer + processing. not bad for an octacore cpu (5800x3d)
UUUUUH HELP, since i downloaded this "AI Voice Changer Client" for some reason is not working for "me only" but who knows maybe people are having the same problem as me cause when i open the client and use it, none of the voice changers are clickable, like I'm sorry why is it not working I'm confused can someone help i need answers please 🤨❓
great video. I did everything and it worked 100%. A question, whenever I talk, for some reason the audio is too slow and cuts every 3 seconds. So i cant say a word. Is there a way to fix this? or is it because of my GB is too low?
Hi Jarod. I'm trying to use my bluetooth mic as an input for voice but it is not showing in the input options! Any help in this regard will be appreciated!
Maybe it's picking up on some background noise with your mic, but it could also be the models that you're using as it's still not a perfect voice-to-voice conversion
Hello sir, do I need to download any extra files prior to downloading these as shown in this video, or will everything work just fine when following the same process in this video? I have heard you talking about downloading RVC first or maybe I am mistaken. Thank you.
Thank you, sir, for replying. But I am done downloading the software but I think the voice changes often say things that I didn't say and weird noises and twist echo. I have an AMD potato laptop with just 4gb vram and 32gb ram, in the gpu drawer I see just gpu on/off option no room for choosing between GPU or CPU not sure why that is like that on AMD played with it all day yet the software hasn't been able to say a single word of mine only making a weird noise and short utterances. Not sure though, do you think upgrading the AMD drive would make any much difference? Or maybe time to change to Nvidia. Note. The laptop comes with a hybrid gpu AMD Rx / Vega 8. I think the Vega is interfering. 🤯
@@ivw1286 I have a rented A100, a $10,000 GPU, and it still struggles a little. I don't think you meet the minimum requirements to even run the program.
so whenever i boot it up and do all the settings and whatnot, it never changes my voice even if i follow every step, i can hear my voice yeah and ive already fiddled with the tune, yet nothing changes even if i turn it on
there is no way to mute the output feedback, right? so that, when you talk, you don't actually hear yourself back but still can be recorded or heard by other people....
So it works pretty good for me, but my issue is under the Chunk and Extra options, I don't have anything. I just have something that says GPU (dml) with an option for on and off. I'm using a 3060 for reference.
my voice changer worked perfectly when i just installed it, but after reopening it, it started to give errors like "GL is not supported" and refuse to work. any suggestions how to fix this?
Hi, I have an idea to use the RVC client at a karaoke party. I'm struggling to make the delay short enough to sing in real time. It's just not possible. I bought a top graphics card, a Zotac RTX 4090 with 16500 CUDA cores. The delay (even in passthru mode) is ca. 0.3 sec and there's no way to get rid of it. The only sound driver that works is MME, but the latency is enormous. I can't make it work (in server mode) with any of the ASIO drivers I have. It's mute. I've downloaded ASIO4All. Mute. Bought a Focusrite 212 interface. ASIO is dead. Realtek ASIO with the built-in sound card doesn't work either. What am I doing wrong? My CPU is an Intel i9 12900K.
Great Video as always! but im having a problem with my gpu, i cant seem to find my current gpu (AMD) on the options for GPU (and i mean cpu is the only one appearing), how do i make it appear to be an option?
I have successfully put on voice models and played around with them a little. Next time I opened the program it waits for web server indefinitely and the voice changer native client pop up is permanently all stuck white. (Yes I deleted "stored_setting.json", it changed nothing)
All the audio is filtered, not just the microphone that is used, I tested it on Discord and in games and the audio of anyone's voices is filtered by the voice audio, no one has a decent answer so I guess there is no solution there in these versions of the program :/
Hello and nice video! I have a big problem with my output lag (I hear my voice after a few seconds, like 10-12). I have buf: 300ms and res: like 10k or something. How can I fix that?
I don't know why, but every time I get onto a game, it always glitches out mid-sentence and becomes more evident for the people I'm trolling. Is there any possible solution for this?
because the RES comes out with 8000 up to 29000 you can lower it because it takes a long time to load the voice and sometimes you don't even hear anything
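A rough way to think about those numbers, assuming buf is how much audio one chunk holds and res is how long one chunk takes to process (my reading of the GUI, not confirmed here): whenever res is larger than buf, every chunk falls further behind and the delay keeps stacking. A toy model with made-up names:

```python
def backlog_after(n_chunks: int, buf_ms: float, res_ms: float) -> float:
    """Accumulated delay (ms) after n_chunks, if a new chunk arrives every
    buf_ms and each chunk takes res_ms to process, one after another."""
    backlog = 0.0
    for _ in range(n_chunks):
        # Each chunk adds res_ms of work while only buf_ms of time passes;
        # the backlog can never go below zero.
        backlog = max(0.0, backlog + res_ms - buf_ms)
    return backlog


if __name__ == "__main__":
    print(backlog_after(3, buf_ms=300, res_ms=10000))  # processing slower than real time
    print(backlog_after(3, buf_ms=300, res_ms=200))    # processing keeps up
```

With buf 300 and res 10000, three chunks are already 29100 ms behind in this model, so the goal is to get res below buf (for example with a bigger chunk or a lighter f0 detector, as others in the comments suggest).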
Why did I download it, turn it on and go to the voice changer client, but the GPU section shows (cpu, gpu0, gpu1, gpu2, gpu3) while my computer has a Geforce RTX 3050 Laptop Gpu card?
Hey, I have integrated it with Teams and for me it's working perfectly, but the person at the other end of the meeting hears his own voice back. What is the solution for this?
I've got an error that says "TypeError Cannot read properties of null (reading 'enableServerAudio')" when I'm changing between character avatar and also I can't connect the sound changer to discord, even the app doesn't detect my sound but I alr set the input to my mic. Can someone help me with this problem?
sometimes it works when i use an audio file (i have no gpu so its slow) but other times it just creates popping noise? it seems really inconsistent why is this?
I’ve been tweaking and playing with the models all night and I can’t get them to sound clear as they do in your videos. I’m using a rtx3050 which I was told should be enough but they still sound unintelligible a lot and sound robotic like you can tell it’s an ai making the voice. I’ve tried toying with the chunks and stuff but still nothing seems to get it to work right. Idk what could be the issue other than maybe my pc isn’t cut out for it somehow or my mic isn’t good enough but it’s still very clear and I figured it’d be more than sufficient to be able to use for this purpose. If anyone could help me figure out what this is id appreciate it.
@@instabs I did some more playing with it the night after and was able to get it to sound better, I’m not exactly sure what all I did to make it work better. Though I did find that lowering the additional chunk size (at certain refresh rates(idk what it’s actually called but I think yk what I mean)) did help make it clearer somehow, I do remember that. I also bought a clearer microphone and that helped too.
VB-Cable causes popping when I talk through it. The voice changer is clear, but audio that comes from the cable to talk on Discord sounds robotic and idk how to fix it.
Can someone please help me please, my changer comes in “choppy” like it’s not even close to as clear as Jarod’s .. so I change the chunk up and down and it still comes in choppy no matter what I do. Is there any solutions? Thank you
Great video. thank you
I'm one of the VCClient and RVC contributors. There are some additions to the content of the video.
Regarding the difference between the f0 estimator harvest and crepe, in addition to the sound quality, harvest uses a CPU and crepe uses a GPU. Crepe can improve latency if you have a good GPU.
In server mode you can choose the sound driver. VCClient measures latency within VCClient, but additional latency is added when connecting to other devices.
Besides MME, WASAPI and ASIO can be selected, so if you can use them, I recommend using them.
For the protect item in advanced options, if protect is set to less than 0.5, the ratio of retrieved features will be reduced in cases where f0 estimation is unsuccessful (silence or breath sounds).
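To make the protect behaviour a bit more concrete, here is a toy sketch of that kind of blending. It assumes f0 is 0 on frames where pitch estimation failed and that features are plain lists of floats; the function and variable names are made up and this is not the actual RVC code:

```python
def apply_protect(retrieved, original, f0, protect):
    """Blend index-retrieved features with the original ones per frame.

    Assumption: frames with f0 == 0 are unvoiced (silence, breath), and on
    those frames only a `protect` share of the retrieved features is kept.
    At protect >= 0.5 nothing is changed in this sketch.
    """
    if protect >= 0.5:
        return [list(frame) for frame in retrieved]
    blended = []
    for ret_frame, orig_frame, pitch in zip(retrieved, original, f0):
        weight = 1.0 if pitch > 0 else protect
        blended.append(
            [weight * r + (1.0 - weight) * o for r, o in zip(ret_frame, orig_frame)]
        )
    return blended


if __name__ == "__main__":
    retrieved = [[1.0, 1.0], [1.0, 1.0]]
    original = [[0.0, 0.0], [0.0, 0.0]]
    f0 = [220.0, 0.0]  # second frame: pitch estimation failed
    print(apply_protect(retrieved, original, f0, protect=0.33))
```

In this toy version the voiced frame keeps the retrieved features as they are, while the unvoiced frame only keeps about a third of them.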
I've seen, appreciate your work and thank you for the additional information!
It would be better if next to every option there were a ? icon that shows a popup with an explanation on hover. It would help a lot.
Yo app just mines BTC stop the cap.
@@meoqtx proof?
@@paradym777 My Premium Kaspersky version 😀
From a musician experience: if you have ASIO supporting soundcard - use ASIO instead of MME. It decreases the audio delay provided by audio tract (e.g. on my PC guitar/mic recording delay for 1024 samples chunk is 180ms for standard MME, and 14ms for ASIO). Theoretically WASAPI can also work fast however I don't have WASAPI supported hardware.
It stutters for me when use my asio soundcard or just doesnt work at all
@@realxdey same here
Where do I change the settings for asio?
ASIO4All and any other ASIO doesn't work.
@@realxdey change the sample size lol
If anyone has issues exporting an ONNX file and getting an error message in the GUI (it usually just says error message: no error message), but if you check in the console it says that pytorch has tried to allocate VRAM and has failed. A quick workaround for this that worked for me was changing in the GUI to use my CPU instead of my GPU and then exporting the ONNX worked. Afterwards you can change it back to your GPU.
Smart, didn't even cross my mind xd
i still get an error in the GUI..
if it helps,this is what is shown on the console : "[Voice Changer] get_onnx ex: Can't get source for . TorchScript requires source access in order to carry out compilation, make sure original .py files are available."
yooooooooo thank you so much
What do you mean changing the gui?
@@Nebularban changing the setting in the GUI
i literally just upgraded from my 1050ti to a 4070 today just to use this + other AI tools. love these tutorials
GZ! Very solid card right there :D
i'm probably going to buy an rtx 4090 for AI stuff too. otherwise i'd buy a 7900xtx. it costs half as much, has the same amount of vram and the performance of a 4080. too bad amd sucks with these things.
man im using intel HD card 💀, will it work?💀
I'm still using rtx 2060
I'm using 920mx😀
I just wanna say thanks for the video, it helped me better understand what settings I needed to change to get the desired result I wanted. I was currently using a different program until recently when it decided to break after an update, so I had to start looking around at alternatives. Your video helped me understand this program very quickly, and it made things a lot more manageable for me as well. A lot less trial and error having to try and figure it with stuff on my own. So once again, thank you.
Thanks for all the help! You have responded to all of the comments and provided everything Ive needed. Anyways keep up the great work and keep doing what you are doing 👍
I'm so lost. My window is just empty when I click on the native client exe...
This worked so great on my first Voice Changing experience. Other videos on your channel are also great! Thank you very much.
Hear Botan/Marine speaking clear English is kind of weird&awesome at the same time XD
I hope they will make it in VST3 format so i can just put it on the daw track which my microphone is routed through. it would be so amazing holyy
Considering how calculation heavy this thing is it's unlikely.
I might try to make a VST3 client for this stuff.. but I couldn't find any API description on their page.
why cant you just use VAC and use VAC Line in as your input mic in your DAW?
@@PsychicType hmm ill try that ty!!!
I've found using this in games causes your mic to cut out and stutter, as well as spiking CPU usage. I hope in the near future it's even more optimized ^^
It's to be expected. The results outside of games are when it's able to fully utilize your GPU.
While playing a game, the game is eating some amount of resources. Especially so if the game isn't able to run at a steady framerate, which may very well cause the voice client to have unreliable processing times.
Have you figured out a solution for that yet? Bc recording or even ingame mic check, it sounds fine. But people ingame hear it stutter a lot
Great video as always glad you went through everything with a good explanation for everything! Keep up the great work, and i am excited for what will come after RVC!
it feels like the voice changer is picking up too much background noises and saying things I didnt say is that normal?
You have ghosts in ur house
1650 Super and even 1060 are doing surprisingly fine using this tool.
Awesome to hear! I had great success on my 2070 super so I'm glad to hear the pascal cards are still chugging along
hello mind telling me what settings you are using? i have a 1660 super and its super laggy nothing is understandable
@@raidentatsunoko same, i have 1660, but voice very bad and laggy
@@soluckymoon sad, still same to me.
@@raidentatsunoko have you found a fix?
Tech is getting so cool, great video!
do you have any idea what happened? The voice changer was running pretty well a couple weeks ago but it looks like they had an update and now all the voices seem slurred and choppy, it was perfectly fine a couple weeks ago and I've made no changes to the settings, i don't know what happened, it affected all my models
Not sure, this would only have happened if you had installed a newer version of it. I would recommend just use and stay with the version that worked for you previously as there seems to be issues using other versions
@@Jarods_Journey I’ll try and play around with it, does it remove your models if you install it again?
@@kongk5772 yup, you gotta start from scratch
@@Jarods_Journey so far I've redownloaded everything, and everything is up to date, I optimized my settings the same as the older versions but for some reason, the millisecond per response keeps stacking, all the way up to 30k ms per response all the while it's eating my CPU, this didn't happen before and I'm not sure why, do you have a fix for this?
which version is the one that works for you?
even at highest chunk i cant get it to sound good
Im having issues where everything comes out in short breaths or stutters, barely comprehensible, I have pretty good specs and I tinkered around with as many settings as I could but I always get the same result, how do I fix this??
Same
I did all of this and can't use it. I hear my own voice changed but other can't. How can I solve this?
Run it through a virtual audio cable.
Just a quick question, the voice changer works perfectly but it always cuts off at the end of the sentence. Any clues why?
+1
same
my only question is how do you keep yourself from hearing the changed voice on your own end? i have adhd and when i hear people talking it makes me lose my train of thought- so when i hear myself sayin exactly what i just said in the changed voice it throws me off a lot.
is there a way to fix this?
It's simple, below Audio (Client) you'll see in, out and mon. I'm assuming mon is for monitoring. Just leave it blank
Do you think we could get a download for the Marine voice?
For high CPU usage problems....
1) Buy $10k PC setup. jk
2) Set index to 0. Even setting it to Index: 0.1 maxes out CPU.
3) Set Extra to a lower amount. Higher Extra uses more CPU. Lower Extra uses more GPU. (Use 16384; any lower doesn't decrease CPU usage, any higher and CPU usage doubles.)
4) Use crepe_full (uses most GPU).
5) Set S.R. to 48000. (Don't go beyond, as echo forms.)
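If you want to keep these values around as a preset, something like this would do. The field names are made up for illustration and are not the app's real setting keys:

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Preset:
    """Hypothetical container for the settings suggested in the list above."""
    index_ratio: float
    extra: int
    f0_detector: str
    sample_rate: int


# Values taken from the comment; the field names are illustrative only.
LOW_CPU = Preset(index_ratio=0.0, extra=16384, f0_detector="crepe_full", sample_rate=48000)

if __name__ == "__main__":
    for name, value in asdict(LOW_CPU).items():
        print(f"{name}: {value}")
```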
this works for amd gpu, i was getting crazy delays. followed this settings and now its only 2 seconds delay. thank you!
thx bro now im only getting a 1/2 second delay on amd gpu
What does S.R mean? I don't see this option.
@@IG7799-c4u same i dont know what that means
you're a legend!! insane video quality and tutorial, can't believe i found pure gold at 4 am.
i guess youtube can also be a chad and recommend really good content wow
same...
I forgot, has there been a place for a collection of trained voices so far? I definitely don't have the system nor time to find a character, compile voice lines and then train off that.
AI Hub has a comprehensive collection. I can't post their discord invite here, but it isn't hard to find.
@@gjvyigfghjghivff how to join their server? can u give me the invite code only. not the full link
@@ozymogaming It's "aihub"
There is in fact and it's the AI Hub discord. I plan on making a vid of it in the future just quickly going over it
I got issues and need serious help...
issue 1: The AI is using the CPU when I have the program selected for "GPU 0" which is my main GPU.
issue 2: The audio is choppy, it cuts too often.
Chunk 192, Extra 8192. F0 Det rmvpe_onnx. Run audio off server.
question, how to train? what does train do?
i followed all the previous tutorials and the voice output only sounds distorted and repetitive. im using a ryzen 3600x with a gtx 1050ti and 16gb ram.
If you've already trained, you might need better audio samples, more samples, or adjust settings to better fit your system via chunks/extra
Great video! I'm glad you showed what this program is capable of on a 4090. It seems we're not quite there yet with AI voices. I wonder if this is a small hurdle that will be overcome soon or an insurmountable mountain like hands are to AI art.
Ah, actually some AI voice models that I've tried are actually pretty scary accurate, meaning my models need a bit more training lol. I would say we're not that far away from indistinguishable voices
So what were you saying about hands again?
1 year later, it's done
I find that the DirectML version (AMD) tends to randomly stop working and sometimes my GPU driver would crash, so that could use some more work to make it stable. And thanks for this guide! I barely know what each option really means.
Wait is your gpu recognized in the software?
@@Pepijaaj It is, it's an RX 6900XT.
how do u get it to use ur gpu mine only uses the cpu even on the directml version 😭😭
@DesuVR wtf my rx 6800xt is not how did you do 💀
@@Pepijaaj Honestly, I don't know if it's even using the GPU at all, I get ~50-80% CPU usage when RVC is active. I thought you meant if the software worked at all on AMD, my mistake. ^^;
However, I got it running okay-enough on my 5800X3D with these settings; INDEX: 0, F0 Det.: harvest, S.Thresh.: 0, CHUNK: 320, EXTRA: 4096
i seem to have a problem with delay, theres a long delay before the voice starts working, is there a way i can fix it?
I found that Crepe is a LOT faster than Harvest, BUT Crepe has robotic low tones when using O and A sounds.
Can't fix that even with either quality or best settings.
Harvest on the other hand has no robotic tones at all, but it's much slower than Crepe.
Overall, Harvest is definitely the best quality in exchange for more latency
Holy hell you fixed it, I was trying to do whatever I can to take off the robotic tones and all but nothing was working. When all looked lost I saw your comment and changed it to harvest. It worked! Thank you so much
Anyone know why, when I use client, all my input and output options disappear? It was working fine, and then I just couldn't select them anymore.
Edit: I figured it out, but for anyone with the same problem, I'll leave this here. It only occurs when you make a shortcut version of the start file and try to run it. The way to fix it is to open the original start file, and switch between server and client on one of the sample voices. If it doesn't work, just repeat until your input and output options reappear.
please do you have a fix for:
[Voice Changer] Pipeline is not initialized.
[Voice Changer] Waiting generate pipeline...
pls it would be much appreciated
did u find a solution?
+12 or -12 pitch can be a loooooot. For nuance you should use your ear to listen to the average difference between yourself and the voice you’re emulating, but 8 is plenty of pitch deviation between sexes. Anime voices esp loli voices are more extreme ofc but yeah it’s often good to resist the urge to go overly cutesy if the goal is realism.
I like +12 and -12 because you can sing and not have the apparent pitch change, because the pitch is going up a whole octave.
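For anyone wondering why 12 lines up with an octave: assuming the pitch setting is counted in semitones (which the octave remark above implies), the frequency scaling follows the equal-temperament formula ratio = 2^(n/12). A minimal sketch, with a function name made up purely for illustration:

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency multiplier for a shift of `semitones` in 12-tone equal temperament."""
    return 2.0 ** (semitones / 12.0)


# Compare the shifts mentioned in this thread.
for shift in (-12, -8, 8, 12):
    print(f"{shift:+d} semitones -> x{semitone_ratio(shift):.3f}")
```

So +12 doubles every frequency (a full octave up), while +8 scales it by roughly 1.59, which is why the smaller value tends to sound less extreme.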
I notice my voice cuts out a lot while speaking. So it misses a few words sometimes. Whats the best way to solve that?
I'm also having this issue, so I'm tagging on to your comment.
any fixes?
@@Radhaun any fixes?
@@eagels3131 Best I've found is to fiddle with the settings somewhere I can hear it (like in a call with an alt discord account). But no good fixes really.
How do I make Res lower?
I raise chunk, but res rises along with it,
and it turns out that all models stutter.
Most likely hardware limited, it will rise if your hardware can't keep up with the rate that it's set at
I've got an average PC, and my RES still goes up. Any ideas on what to do? I even tried it on the lowest settings and it's still not working!
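A rough way to read the buf and res numbers in this thread, assuming buf is the length of one audio chunk in milliseconds and res is how long the machine takes to convert that chunk (this reading is an assumption, and the function names are hypothetical): if res is larger than buf, the conversion falls behind real time and the output stutters or piles up delay.

```python
def realtime_factor(buf_ms: float, res_ms: float) -> float:
    """Processing time divided by chunk length; values above 1.0 mean falling behind."""
    return res_ms / buf_ms


def keeps_up(buf_ms: float, res_ms: float) -> bool:
    """True when each chunk is converted before the next one arrives."""
    return res_ms < buf_ms


# Example values are illustrative, not measurements.
for buf_ms, res_ms in [(512, 120), (300, 10000)]:
    status = "ok" if keeps_up(buf_ms, res_ms) else "falling behind"
    print(f"buf={buf_ms}ms res={res_ms}ms factor={realtime_factor(buf_ms, res_ms):.2f} ({status})")
```

Under that assumption, raising chunk only helps if res grows more slowly than buf does; if both rise together, the hardware is likely the limit, as the reply above suggests.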
I'm here because I can't even get it running 😭
Same here
@@mmm-zr3dy mine is super lacking and Idk why
@@mmm-zr3dy When changing the voices live, you have to sit there for a moment so the software can adjust.
I tried fixing it, but i have a problem where the audio is said like 4 times in a row, with decreases in loudness, and the echo stuff does nothing to fix it. Would you have any ideas on how to correct this?
This is happening to me too and honestly even i dont know how to fix it im just as clueless as you
Same issue for me. I would like to get this sorted out
Are you on Windows 11?
@@linuxtuxvolds5917 Yeah 😅
Hi im the one on your discord server who created a ticket about the gl not loading and you said to change the chunk to max, i just switched the audio to server and it worked perfectly for me, i guess theres a weird thing going on with client option. Just wanted to put it out there
man i really wanted this to work for me but for some reason as soon as i press start it just starts saying random voice lines like someone else is speaking on my mic and i dont even know what could be causing that
SAME did u figure it out?
I've got an ARC A750, but it's not detecting my graphics card. I only get CPU option.
Intel is a no go, try the directml version but it may not work.
Damnn, the intro transition was goood
Its amazing but I wish it was better optimized for AMD, I use a 7900xtx and it randomly loops the last syllable and stops.
Yeah, a lot of people have had issues with AMD. You might benefit from running chunk size lower or extra lower to see if it prevents the loopage.
@@Jarods_Journey alrighty! thank you so much!
@@DippinDoughnutz Could you drop the settings that work for you? I can't get it to work at all without being super laggy and very cut up
@@faded8975 for me no, nothing helps. the only ones that work are the ones that come with the voice changer
Why does my voice changer say "no embedder"? When I changed to a custom voice it said failed. Idk why it says failed, and I can't even hear the voice changer.
i really want to try this but my 1060 says no.
help!!! my ai keeps repeating what i said after i finished talking. it like mumbles exactly what it just finished saying
I tried this tool out. Honestly... I can see this kind of tool being put on a list of banned software at some point in the future. I gave it a go mostly to see if the DirectML implementation works on intel Arc and... well it didn't. Which is fair. I have a 1080ti as well so that's no biggy.
Ah, I think that's a slippery slope. If it does get banned, that leaves only bad actors to use it as nothing will ever stop them lol.
It's still hardware intensive atm so newer Nvidia are still needed
@Jarods Journey do you have any resources for voice models, or do you train your own? All of your voice models sound much better than what I could find online. What settings do you use to train your models at?
I train all of my models using RVC v2. No real special settings, I just clean and curate my data so it's crystal clear input audio.
@@Jarods_Journey Trained a couple models already, came out great. Thanks for all the help between comments and videos.
@@Jarods_Journey HOW DO U CLEAN DATA
my voice echoes and i can hear what i say again 2-3 times any suggestions?
Turn on echo filter and set higher protection. (but i don't think you actually need it after one year passed)
I'm very sad, I've been trying to configure this for days, but my pc doesn't have any good gpu, only cpu, and I think that's why my voice is laggy too much, and it takes a long time to come out, I can't find a perfect cpu setting, please help me, because the voice doesn't even want to come out right, is just laggy 😢😢
Praying for the day it works on AMD, i'm currently stuck with a 5700XT.
I have a rtx 3050 laptop gpu 4 gb i usually runs in on 384 chunks and 4096 extra and the voice comes out after 1 second and it was only using 2.6gb vram so what should i do to decrease the response time? I tried to reduce the chunks but it resulted in choppy voice
I think your gpu just ain't fast enough
bro, mine same like u but even worse cause my voice wont let out and after while it get error message.. btw my res is 0 and i think my gpu not detected bruh but im not using amd (on average amd doesn't detect gpu)
im on a 3050ti laptop and i can run mine at 96 chunk, 131072 extra perfectly
Hey there Jarod, I have a question regarding "Cuda Cores" / Cuda core compatible graphic cards.
If I get this straight: your GPU must be a good one, i.e. a newer and faster one (that much is clear), but it also must have CUDA cores, since all the models are CUDA core based. So a 1080 ti, which is a strong gaming card and a favourite of many gamers, can't handle the AI voice changer program. The program defaults to CPU usage. That's what happens on my end with a 1080 ti (which has no CUDA cores). The solution would be to go with the OTHER download (what you called "for amd users").
Is my assessment correct?
CUDA cores are included in all modern NVIDIA GPUs actually. The software will always default to CPU. The benefit of having a higher end card would be the faster GDDR6 memory, larger amount of VRAM, and many other factors. I am not sure if RVC uses CUDA cores in any way, but a GTX 1080ti should still work: just increase the delay, run it through server, and have the audio driver set to ASIO or WASAPI if you can.
Note: Only use the alternate AMD version if you have an AMD GPU. If you have an NVIDIA graphics card, use the standard download you already have. I tried out the software with my GTX 1650ti on my laptop and it worked pretty well once set to a 5 sec delay. Your GTX 1080ti should do far better than my mobile GPU.
@@calebpeters191 I have done exactly that, but for me the delay needs to be really high to sound any good. For some reason, the GPU still sits at only 30% at max. The sound is still robotic and just sounds like I am an 8-year-old kid with a generic voice morpher.
Besides, sometimes (rarely) my CPU goes to 99% and on the more common times : 20-30% (5800x3d), which made me think that the GPU was ignored and cpu used instead
@@calebpeters191 thanks nonetheless dude
I believe the 1080ti does have cuda cores... But has 0 tensor cores which allows for much more efficient cycle usage of the GPU.
CUDA should still be your best bet, but I'm not sure why it's not noticing your card
@@Jarods_Journey I just bought a 4070 ti a few hours ago, and let it run. The performance is LEAPS better, thanks for the answer Jarod
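If you want to check whether a card is visible to CUDA at all, one option is to ask PyTorch directly from a separate Python environment. This is only a sketch under assumptions: it is not part of VCClient, it assumes a CUDA-enabled PyTorch build is installed, and the helper name is made up. It simply returns an empty list when PyTorch is missing or no CUDA device is found.

```python
from typing import List


def visible_cuda_gpus() -> List[str]:
    """Names of the GPUs PyTorch can reach through CUDA (empty list if none)."""
    try:
        import torch  # third-party; assumed to be installed separately
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]


gpus = visible_cuda_gpus()
print(gpus if gpus else "No CUDA GPU visible (or PyTorch is not installed)")
```

A GTX 1080 Ti should show up in that list when the NVIDIA driver and a matching PyTorch build are in place; an empty list points at the driver or install rather than at the card lacking CUDA cores.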
I assumed that the clear setting button at the top would clear out a slot. For instance, I've been testing a bunch of slots and I want to set them back to the default blank, but apparently clear setting does not do what I thought it did. In fact, it seemed to screw up the whole program.
RVC Quality should be in HIGH if you want to use this in a social game as male - female etc.
You might not notice much of a difference but there's a world of difference actually, people notice.
What you do is you keep it in HIGH and the program will make sure the output is always the best that way people don't suspect a thing.
Doesn't sound that different and it's not worth the hit on CPU resources.
I compared High and Low with my friends (Markiplier model) and they said there is no difference at all.
edit: I'm using 1060
@@leexy3395 You are very very wrong in the context I spoke of. Many people have good hearing; of course your friends who aren't paying much attention or taking it seriously won't notice a difference. But people in the context I mentioned absolutely will.
No matter what, I can personally hear the artifacts in RvC, and all other neural TTS systems if I’ve been exposed to them at least once, this goes for other kinds of audio processing as well, since I do audio engineering, people do absolutely know the differences, trust me on this.
the tutorial is good, but i have a problem. The voice sounds cut off. I think it's because my "RES" is too high, what can i do?
Using a RTX 3070 with:
Chunk 256
Extra 131072
Sounds perfect on these even with the half second delay!
Hey idk if it's too late but can i pls have some help here, i'm trying to install it but i just cannot seem to be able to hear anything or make it work. If possible can i contact you and arrange smth, a zoom or smth. Pls
why is mine so laggy??
Mine sometimes has a weird voice for half a second before it says what i say. I thought it was just background noise but even with all the suppression and sensitivity being lower, it still happens. It kinda sounds like glitchy breathing
Can we get a download for the marine voice please?
Yes this
Thanks for the video, but when I browse and select the onnx file in one of the slots, do I just leave the index blank? Because if I select the onnx file, then click upload and try to use the voice it doesn't work. Also, with piper it seems to require a json file as well but how is this generated? rvc doesn't give this to me.
MX250 gpu here (potato gpu). Best I can get is 2+ seconds delay (chunk 320 extra 4096) and WASAPI.
- lower chunk will glitch, higher chunk will add more delay.
- changing MME to WASAPI helps a lot (from 3+ to 2+ seconds delay), but I have to change my audio sample rate first.
- I can't change the model after I use it because the gpu memory (2 GB) will be overloaded. I have to restart the client first before changing the model.
I will try converting to onnx later hoping it will go faster (export onnx is currently broken)
but u can run smooth with high delay? cause if u can theres hope for my 1660ti
@@soimpressivesodogetothemoo8027 yes, as long as there's enough delay, it will run smoothly
Onnx is fixed, wasapi is a good driver as that one makes mine faster as well
How do i change it from MME to smt else
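On the chunk versus delay trade-off in this thread: the minimum delay added by a buffer is just its length in samples divided by the sample rate, and the driver (MME, WASAPI, ASIO) plus the conversion time stack on top of that. A small sketch of the arithmetic; the function name is hypothetical, and how VCClient's chunk value maps to samples is not assumed here.

```python
def buffer_ms(samples: int, sample_rate_hz: int) -> float:
    """Duration in milliseconds of `samples` audio samples at `sample_rate_hz`."""
    return 1000.0 * samples / sample_rate_hz


# Illustrative buffer sizes at a 48 kHz sample rate.
for samples in (1024, 4096, 16384):
    print(f"{samples} samples @ 48000 Hz = {buffer_ms(samples, 48000):.1f} ms")
```

A 1024-sample buffer at 48 kHz is only about 21 ms of audio, so most of a multi-second delay comes from the driver path and from the model not keeping up, not from the buffer length itself.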
i have to figure out a way to offload the processing to my laptop so it doesn't bug out when playing demanding games. is there a way to access it from a lan client?
ps: i thought the directml version worked for my 6900xt, but it's processing on the cpu. but, on the plus side, converting to onnx literally halves the load on the cpu. when running this exclusively, i managed to get a total lag of 400ms buffer + processing. not bad for an octacore cpu (5800x3d)
Great! can't wait to try this out to see if it improves performance. If it doesn't I might just install windows 11 to match your settings exactly.
UUUUUH HELP, since i downloaded this "AI Voice Changer Client" for some reason is not working for "me only" but who knows maybe people are having the same problem as me cause when i open the client and use it, none of the voice changers are clickable, like I'm sorry why is it not working I'm confused can someone help i need answers please 🤨❓
I have an RTX 4060 ti .
I am hearing some weird sounds (like predator lol). I've changed many settings and yet haven't figured it out.
How do you do it with obs cause I can't record microphone (that voice) it seems to detect the voice but i can't hear it on recording
Have you solved this problem yet?
Setup with vb audio cable, then set up microphone source in OBS as vb audio cable
@@Jarods_Journey thx for advice but i still can't manage to record it. I don't know what i'm doing wrong :
Hello, I have a GPU but for some reason it’s not selectable in the processing option
there is an echo after the first sentence and it gets so squeaky till u can't hear it anymore
I have an issue, I can hear myself that my voice is changed, but when I tried to use sound recorder, the result is that my voice is not changing, why?
great video. I did everything and it worked 100%. A question, whenever I talk, for some reason the audio is too slow and cuts every 3 seconds. So i cant say a word. Is there a way to fix this? or is it because of my GB is too low?
Hi Jarod. I'm trying to use my bluetooth mic as an input for voice but it is not showing in the input options! Any help in this regard will be appreciated!
for me the main issue for me is no matter what i change i still have static in my playback. any idea on what i should do?
Maybe it's picking up on some background noise with your mic, but it could also be the models that you're using as it's still not a perfect voice-to-voice conversion
Hello sir, do I need to download any extra files prior to downloading these as shown in this video, or will everything work just fine when following the same process in this video? I have heard you talking about downloading RVC first or maybe I am mistaken.
Thank you.
You will need to download or train AI models for voices. It only comes with 4 preinstalled
Thank you, sir, for replying. I'm done downloading the software, but the voice changer often says things that I didn't say, along with weird noises and a twisted echo. I have an AMD potato laptop with just 4gb vram and 32gb ram. In the GPU drawer I only see a GPU on/off option, with no way to choose between GPU and CPU, and I'm not sure why it's like that on AMD. I played with it all day, yet the software hasn't been able to say a single word of mine; it only makes a weird noise and short utterances. Do you think upgrading the AMD driver would make much difference? Or maybe it's time to change to Nvidia.
Note. The laptop comes with a hybrid gpu AMD Rx / Vega 8. I think the Vega is interfering. 🤯
@@ivw1286 I have a rented A100, a $10,000 GPU, and it still struggles a little. I don't think you meet the minimum requirements to even run the program.
so whenever i boot it up and do all the settings and whatnot, it never changes my voice even if i follow every step, i can hear my voice yeah and ive already fiddled with the tune, yet nothing changes even if i turn it on
there is no way to mute the output feedback, right?
so that, when you talk, you don't actually hear yourself back but still can be recorded or heard by other people....
So it works pretty good for me, but my issue is under the Chunk and Extra options, I don't have anything. I just have something that says GPU (dml) with an option for on and off. I'm using a 3060 for reference.
I've got the same thing going on, as well as my buff is pretty high as well
@@blueblue4078 you guys are using the AMD versions, make sure you download the CUDA option
my voice changer worked perfectly when i just installed it, but after reopening it, it started to give errors like "GL is not supported" and refuse to work. any suggestions how to fix this?
Hi, I have an idea to use the RVC client at a karaoke party. I'm struggling to make the delay short enough to sing in real time. It's just not possible.
I bought a top graphics card, a Zotac RTX 4090 with 16500 cuda cores. The delay (even in passthru mode) is ca. 0.3 sec and there's no way to get rid of it. The only sound driver that works is MME, but the latency is enormous.
I can't make it work (in server mode) with any of the ASIO drivers I have. It's mute. I've downloaded ASIO4All. Mute. Bought a Focusrite 212 interface. ASIO is dead. Realtek ASIO with the built-in sound card doesn't work either. What am I doing wrong?
My CPU is Intel i9 12900K
my voice is glitching out like: "A-a-a-a-a-a-a-a- A" and the buffer is very high for no reason
2:00 Sir, where do you download extra voice models?
he has a video of it
@@xiaoch_n Thanks mate
Very helpful 👌 . Just where can you get models?
Great Video as always! but im having a problem with my gpu, i cant seem to find my current gpu (AMD) on the options for GPU (and i mean cpu is the only one appearing), how do i make it appear to be an option?
You will have to download the directml version of the package, it should appear there so I've heard after this version is installed.
You ever figure out how to get it to work on AMD?
@@faded8975 download the directml version not the gpu cuda one
Can you tell me why when I say something through the AI it keeps repeating it?
I have successfully put on voice models and played around with them a little. The next time I opened the program, it waited for the web server indefinitely and the voice changer native client pop-up is permanently stuck all white. (Yes, I deleted "stored_setting.json"; it changed nothing.)
All the audio is filtered, not just the microphone that is used, I tested it on Discord and in games and the audio of anyone's voices is filtered by the voice audio, no one has a decent answer so I guess there is no solution there in these versions of the program :/
Can you make a video on how to train a custom model for this? Thanks!
Completed :), check out the RVC videos playlist
Hello and nice video! I have a big problem with output lag (I hear my voice after 10-12 seconds). I have buf: 300ms and res: around 10k or something. How can I fix that?
Sometimes it lags a bit, give it a minute or so and the res will go down. I turn off my mic and it speeds up the process tremendously
I don't know why, but every time I get onto a game, it always glitches out mid-sentence and becomes more evident for the people I'm trolling. Is there any possible solution for this?
My RES comes out at 8000 up to 29000. Can you lower it? It takes a long time to load the voice, and sometimes you don't even hear anything.
why cant i download it , it says i need to wait 24 hours the thing is i've been waiting for 1 week yet still couldnt
I downloaded it, turned it on and went to the voice changer client. Why does the GPU section show (cpu, gpu0, gpu1, gpu2, gpu3) when my computer has a GeForce RTX 3050 Laptop GPU?
Is microphone important on sound quality too? Because I'm using my headset microphone and it sounds ass and robotic
Yes, but model quality plays a role here too. If you can upgrade your mic, it may help out, but it may not if the models are not good.
Hey, I have integrated it with Teams. For me it's working perfectly, but the person at the other end of the meeting hears his own voice back. What is the solution for this?
I've got an error that says "TypeError Cannot read properties of null (reading 'enableServerAudio')" when I'm changing between character avatar and also I can't connect the sound changer to discord, even the app doesn't detect my sound but I alr set the input to my mic. Can someone help me with this problem?
did you get a virtual audio cable driver?
sometimes it works when i use an audio file (i have no gpu so its slow) but other times it just creates popping noise? it seems really inconsistent why is this?
I’ve been tweaking and playing with the models all night and I can’t get them to sound clear as they do in your videos. I’m using a rtx3050 which I was told should be enough but they still sound unintelligible a lot and sound robotic like you can tell it’s an ai making the voice. I’ve tried toying with the chunks and stuff but still nothing seems to get it to work right. Idk what could be the issue other than maybe my pc isn’t cut out for it somehow or my mic isn’t good enough but it’s still very clear and I figured it’d be more than sufficient to be able to use for this purpose. If anyone could help me figure out what this is id appreciate it.
i have an rtx 3060ti and im having the same exact issues as you, even the preset voices sound bad for me.
@@instabs I did some more playing with it the night after and was able to get it to sound better, I’m not exactly sure what all I did to make it work better. Though I did find that lowering the additional chunk size (at certain refresh rates(idk what it’s actually called but I think yk what I mean)) did help make it clearer somehow, I do remember that. I also bought a clearer microphone and that helped too.
@@doomkitty6401 yea it could be my microphone since i use a headset mic
Hey I tried to use this today but while my mic is on sometimes out of nowhere there is a robotic static.. is there any way to fix it?
vb cable causes popping when i talk through it voice changer is clear but audio that comes from the cable to talk on discord sounds robotic
and idk how to fix it
Bro every voice changer I use lags, like when i say hello it says hhhhhn
Can someone please help me please, my changer comes in “choppy” like it’s not even close to as clear as Jarod’s .. so I change the chunk up and down and it still comes in choppy no matter what I do. Is there any solutions? Thank you