please do you have a fix for: [Voice Changer] Pipeline is not initialized. [Voice Changer] Waiting generate pipeline... pls it would be much appreciated
@Jarods Journey do you have any resources for voice models, or do you train your own? All of your voice models sound much better than what I could find online. What settings do you use to train your models at?
Hey there Jarod, I have a question regarding "Cuda Cores" / Cuda core compatible graphic cards. If i get this straight : Your GPU must be a good one , i.e: a newer and faster one (That much is clear) but it also must have cuda cores, since all the models are cuda core based. So a 1080 ti which is a strong gaming card and a favourite of many gamers cant handle the ai voice changer program. The program defaults to CPU usage. Thats what happens on my end with a 1080 ti (which has no cuda cores). The solution would be to go with the OTHER download (what you called "for amd users". Is my assessment correct?
CUDA cores are included in all modern GPUs actually. the software will default to CPU always. The benefit to having a higher end card would be the faster GDDR6 memory, larger amount of VRAM, and many other factors. I am not sure if RVC uses CUDA cores in any way, but a GTX 1080ti should still work, just increase the delay, run it through server, and have the audio encoder set to ASIO or WASAPI if you can. Note : Only use the alternate AMD version if you have an AMD GPU, if you have an NVIDIA graphics card, use the standard download you already have. I tried out the software with my GTX 1650ti on my laptop and it worked pretty well, once set to a 5sec delay. Your GTX 1080ti should do far better than my mobile GPU.
@@calebpeters191 I have done exactly that, but for me the delay needs to be really high to sound any good. For some reason, the GPU still sits at only 30% at max. The sound is still robotic and just sounds like i am a 8 year old kid with a generic voice morpher. Besides, sometimes (rarely) my CPU goes to 99% and on the more common times : 20-30% (5800x3d), which made me think that the GPU was ignored and cpu used instead
I believe the 1080ti does have cuda cores... But has 0 tensor cores which allows for much more efficient cycle usage of the GPU. CUDA should still be your best bet, but I'm not sure why it's not noticing your card
MX250 gpu here (potato gpu). Best I can get is 2+ seconds delay (chunk 320 extra 4096) and WASAPI. - lower chunk will glitch, higher chunk will add more delay. - changing MME to WASAPI helps a lot (from 3+ to 2+ seconds delay), but I have to change my audio sample rate first. - I can't change the model after I use it because the gpu memory (2 GB) will be overloaded. I have to restart the client first before changing the model. I will try converting to onnx later hoping it will go faster (export onnx is currently broken)
because the RES comes out with 8000 up to 29000 you can lower it because it takes a long time to load the voice and sometimes you don't even hear anything
man i really wanted this to work for me but for some reason as soon as i press start it just starts saying random voice lines like someone else is speaking on my mic and i dont even know what could be causing that
Anyone know why, when I use client, all my input and output options disappear? It was working fine, and then I just couldn't select them anymore. Edit: I figured it out but for anyone with the same problem, Ill leave this here. It only occurs when you make a shortcut version of the start file and try to run it. The way to fix it is to open the original start file, and switch between server and client on one of the sample voices. If it doesn't work just repeat until your input and output options reappear.
Mine sometimes has a weird voice for half a second before it says what i say. I thought it was just background noise but even with all the suppression and sensitivity being lower, it still happens. It kinda sounds like glitchy breathing
Hello sir, do I need to download any extra files prior to downloading these as shown in this video, or will everything work just fine when following the same process in this video? I have heard you talking about downloading RVC first or maybe I am mistaken. Thank you.
Thank you, sir, for replying. But I am done downloading the software but I think the voice changer often says things that I didn't say, plus weird noises and a twisted echo. I have an AMD potato laptop with just 4gb vram and 32gb ram, in the gpu drawer I see just a gpu on/off option, no room for choosing between GPU or CPU, not sure why it is like that on AMD. Played with it all day yet the software hasn't been able to say a single word of mine, only making a weird noise and short utterances. Not sure though, do you think upgrading the AMD driver would make much difference? Or maybe time to change to Nvidia. Note. The laptop comes with a hybrid gpu AMD Rx / Vega 8. I think the Vega is interfering. 🤯
@@ivw1286 I have a rented A100, a $10,000 GPU, and it still struggles a little. I don't think you meet the minimum requirements to even run the program.
Hey idk if its too late but can i pls have some hlp here i'm trying to install it but i just cannot seem to be able to hear or make it work. If possible can i contact you and arrange smth or a zoom or smth. Pls
I'm very sad, I've been trying to configure this for days, but my pc doesn't have any good gpu, only cpu, and I think that's why my voice is laggy too much, and it takes a long time to come out, I can't find a perfect cpu setting, please help me, because the voice doesn't even want to come out right, is just laggy 😢😢
great video. I did everything and it worked 100%. A question, whenever I talk, for some reason the audio is too slow and cuts every 3 seconds. So i cant say a word. Is there a way to fix this? or is it because of my GB is too low?
Maybe it's picking up on some background noise with your mic, but it could also be the models that you're using as it's still not a perfect voice-to-voice conversion
Hi, I have an idea to use the RVC client at a karaoke party. I'm struggling to make the delay short enough to sing in real time. It's just not possible. I bought a top graphics card - Zotac RTX 4090 with 16500 CUDA cores. The delay (even in passthru mode) is ca. 0.3 sec and there's no way to get rid of it. The only sound driver that works is MME, but the latency is enormous. I can't make it work (in server mode) with any of the ASIO drivers I have. It's mute. I've downloaded ASIO4All. Mute. Bought a Focusrite 212 interface. ASIO is dead. Realtek ASIO with the built-in sound card doesn't work either. What am I doing wrong? My CPU is an Intel i9 12900K
UUUUUH HELP, since i downloaded this "AI Voice Changer Client" for some reason is not working for "me only" but who knows maybe people are having the same problem as me cause when i open the client and use it, none of the voice changers are clickable, like I'm sorry why is it not working I'm confused can someone help i need answers please 🤨❓
Great instructions! I saw you don't have any of the original models that come with the software loaded up anymore, how did you delete those? I can't seem to figure out how to get rid of them lol
@@Jarods_Journey Gotcha, it didn't seem to work when I did that so maybe I'll try reinstalling or using an older version. I downloaded whatever the most recent version was as of yesterday (7/10/23) and the UI looked a bit different than yours too so maybe some changes the dev made broke overwriting. Thanks man, appreciate you!
When I lower my chunks, the time for it to process is so much slower, then it goes all jerky (16 chunks). If I go back to 512, it's faster and sounds clear, but 16 chunks should be nearly real time but less clear (using an RTX 4070)
All the audio is filtered, not just the microphone that is used, I tested it on Discord and in games and the audio of anyone's voices is filtered by the voice audio, no one has a decent answer so I guess there is no solution there in these versions of the program :/
i have to figure out a way to offload the processing to my laptop so it doesn't bug out when playing demanding games. is there a way to access it from a lan client? ps: i thought the directml version worked for my 6900xt, but it's processing on the cpu. but, on the plus side, converting to onnx literally halves the load on the cpu. when running this exclusively, i managed to get a total lag of 400ms buffer + processing. not bad for an octacore cpu (5800x3d)
Great video. thank you
I'm one of the VCClient and RVC contributors. There are some additions to the content of the video.
Regarding the difference between the f0 estimator harvest and crepe, in addition to the sound quality, harvest uses a CPU and crepe uses a GPU. Crepe can improve latency if you have a good GPU.
In server mode you can choose the sound driver. VCClient measures latency within VCClient, but additional latency is added when connecting to other devices.
Besides MME, WASAPI and ASIO can be selected, so if you can use them, I recommend using them.
For the protect item in advanced options, if protect is set to less than 0.5, the ratio of retrieved features will be reduced in cases where f0 estimation is unsuccessful (silence or breath sounds).
I've seen, appreciate your work and thank you for the additional information!
Would be better if next to every option will be ? icon when on hover you will see popup with explanation. It will help a lot.
Yo app just mines BTC stop the cap.
@@Erlraith proof?
@@paradym777 My Premium Kaspersky version 😀
From a musician's experience: if you have a sound card that supports ASIO, use ASIO instead of MME. It decreases the audio delay added by the audio path (e.g. on my PC the guitar/mic recording delay for a 1024-sample chunk is 180ms for standard MME, and 14ms for ASIO). Theoretically WASAPI can also be fast, but I don't have WASAPI-supported hardware.
It stutters for me when use my asio soundcard or just doesnt work at all
@@realxdey same here
Where do I change the settings for asio?
ASIO4All and any other ASIO driver don't work.
@@realxdey change the sample size lol
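As a rough sanity check on numbers like the ones earlier in this thread: the time one buffer of audio represents is just samples divided by sample rate, and driver and hardware overhead come on top of that. A minimal Python sketch, assuming nothing about VCClient itself (the function name and the sample rates are made up for illustration; the thread doesn't say which rate was used):

```python
def buffer_duration_ms(samples: int, sample_rate_hz: int) -> float:
    """Milliseconds of audio held in one buffer of `samples` frames."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    return 1000.0 * samples / sample_rate_hz


# Hypothetical sample rates; only the 1024-sample size comes from the thread.
for rate in (44100, 48000, 96000):
    print(f"{rate} Hz: {buffer_duration_ms(1024, rate):.1f} ms per 1024-sample buffer")
```

This only covers the buffer itself, so it can't explain the MME vs ASIO gap on its own; that part comes from how many buffers each driver stacks up.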
i literally just upgraded from my 1050ti to a 4070 today just to use this + other AI tools. love these tutorials
GZ! Very solid card right there :D
i'm probably going to buy an rtx 4090 for AI stuff too. otherwise i'd buy a 7900xtx. it costs half as much, has the same amount of vram and the performance of a 4080. too bad amd sucks with these things.
man im using intel HD card 💀, will it work?💀
I'm still using rtx 2060
I'm using 920mx😀
If anyone has issues exporting an ONNX file and getting an error message in the GUI (it usually just says error message: no error message), but if you check in the console it says that pytorch has tried to allocate VRAM and has failed. A quick workaround for this that worked for me was changing in the GUI to use my CPU instead of my GPU and then exporting the ONNX worked. Afterwards you can change it back to your GPU.
Smart, didn't even cross my mind xd
i still get an error in the GUI..
if it helps,this is what is shown on the console : "[Voice Changer] get_onnx ex: Can't get source for . TorchScript requires source access in order to carry out compilation, make sure original .py files are available."
yooooooooo thank you so much
What do you mean changing the gui?
@@Nebularban changing the setting in the GUI
Hear Botan/Marine speaking clear English is kind of weird&awesome at the same time XD
I just wanna say thanks for the video, it helped me better understand what settings I needed to change to get the desired result I wanted. I was currently using a different program until recently when it decided to break after an update, so I had to start looking around at alternatives. Your video helped me understand this program very quickly, and it made things a lot more manageable for me as well. A lot less trial and error having to try and figure it with stuff on my own. So once again, thank you.
Thanks for all the help! You have responded to all of the comments and provided everything Ive needed. Anyways keep up the great work and keep doing what you are doing 👍
I'm so lost. My window is just empty when I click on the native client exe...
This worked so great on my first Voice Changing experience. Other videos on your channel are also great! Thank you very much.
Great video as always glad you went through everything with a good explanation for everything! Keep up the great work, and i am excited for what will come after RVC!
Tech is getting so cool, great video!
1650 Super and even 1060 are doing surprisingly fine using this tool.
Awesome to hear! I had great success on my 2070 super so I'm glad to hear the pascal cards are still chugging along
hello mind telling me what settings you are using? i have a 1660 super and its super laggy nothing is understandable
@@raidentatsunoko same, i have 1660, but voice very bad and laggy
@@soluckymoon sad, still same to me.
@@raidentatsunoko have you found a fix?
I'm here because I can't even get it running 😭
Same here
@@mmm-zr3dy mine is super lacking and Idk why
@@mmm-zr3dy When changing the voices live, you have to sit there for a moment so the software can adjust.
Same here
you're a legend!! insane video quality and tutorial, can't believe i found pure gold at 4 am.
i guess youtube can also be a chad and recommend really good content wow
same...
it feels like the voice changer is picking up too much background noises and saying things I didnt say is that normal?
You have ghosts in ur house
I've found using this in games causes your mic to cut out and stutter, as well as spiking CPU usage. I hope in the near future it's even more optimized ^^
It's to be expected. The results outside of games are when it's able to fully utilize your GPU.
While playing a game, the game is eating some amount of resources. Especially so if the game isn't able to run at a steady framerate, which can very well cause the voice client to have unreliable processing times.
Have you figured out a solution for that yet? Bc when recording or even in the ingame mic check, it sounds fine. But people ingame hear it stutter a lot
I hope they will make it in VST3 format so i can just put it on the daw track which my microphone is routed through. it would be so amazing holyy
Considering how calculation heavy this thing is it's unlikely.
I might try to make a VST3 client for this stuff.. but I couldn't find any API description on their page.
why cant you just use VAC and use VAC Line in as your input mic in your DAW?
@@PsychicType hmm ill try that ty!!!
Great video! I'm glad you showed what this program is capable of on a 4090. It seems we're not quite there yet with AI voices. I wonder if this is a small hurdle that will be overcome soon or an insurmountable mountain like hands are to AI art.
Ah, actually some AI voice models that I've tried are actually pretty scary accurate, meaning my models need a bit more training lol. I would say we're not that far away from indistinguishable voices
So what were you saying about hands again?
1 year later, it's done
i seem to have a problem with delay, theres a long delay before the voice starts working, is there a way i can fix it?
do you have any idea what happened? The voice changer was running pretty well a couple weeks ago but it looks like they had an update and now all the voices seem slurred and choppy, it was perfectly fine a couple weeks ago and I've made no changes to the settings, i don't know what happened, it affected all my models
Not sure, this would only have happened if you had installed a newer version of it. I would recommend just staying with the version that worked for you previously as there seem to be issues using other versions
@@Jarods_Journey I’ll try and play around with it, does it remove your models if you install it again?
@@kongk5772 yup, you gotta start from scratch
@@Jarods_Journey so far I've redownloaded everything, and everything is up to date, I optimized my settings the same as the older versions but for some reason, the millisecond per response keeps stacking, all the way up to 30k ms per response all the while it's eating my CPU, this didn't happen before and I'm not sure why, do you have a fix for this?
which version is the one that works for you?
even at highest chunk i cant get it to sound good
+12 or -12 pitch can be a loooooot. For nuance you should use your ear to listen to the average difference between yourself and the voice you’re emulating, but 8 is plenty of pitch deviation between sexes. Anime voices esp loli voices are more extreme ofc but yeah it’s often good to resist the urge to go overly cutesy if the goal is realism.
I like +12 and -12 because you can sing and not have the apparent pitch change because the pitch is going up a whole octave.
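For anyone wondering why +12 lands on a clean octave: in equal temperament each semitone multiplies the frequency by 2^(1/12), so 12 of them double it. A small Python sketch (the function name is made up for illustration, and it assumes the pitch setting is counted in semitones, which is how the comments above use it):

```python
def semitone_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift of `semitones` (12 per octave)."""
    return 2.0 ** (semitones / 12.0)


# Shifts mentioned in the comments above: a full octave down/up, and 8 semitones.
for shift in (-12, 8, 12):
    print(f"{shift:+d} semitones -> frequency x{semitone_ratio(shift):.3f}")
```

So 8 semitones is a bit under 1.6x the original frequency, while 12 is exactly 2x, which is why the melody still sounds "in key" when singing.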
For high CPU usage problems:
1) Buy a $10k PC setup. jk
2) Set index to 0. Even setting it to Index: 0.1 maxes out CPU.
3) Set Extra to a lower amount. Higher Extra uses more CPU. Lower Extra uses more GPU. (Use 16384; any lower doesn't decrease CPU usage, and any higher doubles it.)
4) Use crepe_full (uses most GPU).
5) Set S.R. to 48000. (Don't go beyond, as echo forms.)
this works for amd gpu, i was getting crazy delays. followed this settings and now its only 2 seconds delay. thank you!
thx bro now im only getting a 1/2 second delay on amd gpu
What does S.R mean? I don't see this option.
@@IG7799-c4u same i dont know what that means
I did all of this and can't use it. I hear my own voice changed but other can't. How can I solve this?
Run it through a virtual audio cable.
question, how to train? what does train do?
i followed all previous tutorial and the voice output only sounds distorted repetitive. im using ryzen 3600x with gtx 1050ti 16gb ram.
If you've already trained, you might need better audio samples, more samples, or adjust settings to better fit your system via chunks/extra
my only question is how do you keep yourself from hearing the changed voice on your own end? i have adhd and when i hear people talking it makes me lose my train of thought- so when i hear myself sayin exactly what i just said in the changed voice it throws me off a lot.
is there a way to fix this?
It's simple, below Audio (Client) you'll see in, out and mon. I'm assuming mon is for monitoring. Just leave it blank
Just a quick question, the voice changer works perfectly but it always cuts off at the end of the sentence. Any clues why?
+1
same
my voice is glitching out like: "A-a-a-a-a-a-a-a- A" and the buffer is very high for no reason
I found that Crepe is a LOT faster than Harvest, BUT Crepe has robotic low tones when using O and A sounds.
Can't fix that even with either quality or best settings.
Harvest on the other hand has no robotic tones at all but its much slower than Crepe.
Overall, Harvest is definitely the best quality in exchange for more latency
Holy hell you fixed it, I was trying to do whatever I can to take off the robotic tones and all but nothing was working. When all looked lost I saw your comment and changed it to harvest. It worked! Thank you so much
I tried this tool out. Honestly... I can see this kind of tool being put on a list of banned software at some point in the future. I gave it a go mostly to see if the DirectML implementation works on intel Arc and... well it didn't. Which is fair. I have a 1080ti as well so that's no biggy.
Ah, I think that's a slippery slope. If it does get banned, that leaves only bad actors to use it as nothing will ever stop them lol.
It's still hardware intensive atm so newer Nvidia are still needed
I find that the DirectML version (AMD) tends to randomly stop working and sometimes my GPU driver would crash, so that could use some more work to make it stable. And thanks for this guide! I barely know what each option really means.
Wait is your gpu recognized in the software?
@@Pepijaaj It is, it's an RX 6900XT.
how do u get it to use ur gpu mine only uses the cpu even on the directml version 😭😭
@DesuVR wtf my rx 6800xt is not how did you do 💀
@@Pepijaaj Honestly, I don't know if it's even using the GPU at all, I get ~50-80% CPU usage when RVC is active. I thought you meant if the software worked at all on AMD, my mistake. ^^;
However, I got it running okay-enough on my 5800X3D with these settings; INDEX: 0, F0 Det.: harvest, S.Thresh.: 0, CHUNK: 320, EXTRA: 4096
Great! can't wait to try this out to see if it improves performance. If it doesn't I might just install windows 11 to match your settings exactly.
Im having issues where everything comes out in short breaths or stutters, barely comprehensible, I have pretty good specs and I tinkered around with as many settings as I could but I always get the same result, how do I fix this??
Same
Same
I'm having the same issues have you solved it?
I got issues and need serious help...
issue 1: The AI is using the CPU when I have the program selected for "GPU 0" which is my main GPU.
issue 2: The audio is choppy, it cuts too often.
Chunk 192, Extra 8192. F0 Det rmvpe_onnx. Run audio off server.
Praying for the day it works on AMD, i'm currently stuck with a 5700XT.
RVC Quality should be in HIGH if you want to use this in a social game as male - female etc.
You might not notice much of a difference but there's a world of difference actually, people notice.
What you do is you keep it in HIGH and the program will make sure the output is always the best that way people don't suspect a thing.
Doesn't sound that different and it's not worth the hit on CPU resources.
I compared High and Low with my friends (Markiplier model) and they said there is no difference at all.
edit: I'm using 1060
@@leexy3395 You are very very wrong in the context I spoke of, many people have good hearing, of course your friends who aren't paying much attention or taking it seriously won't notice a difference. But people in the context I mentioned absolutely will.
No matter what, I can personally hear the artifacts in RvC, and all other neural TTS systems if I’ve been exposed to them at least once, this goes for other kinds of audio processing as well, since I do audio engineering, people do absolutely know the differences, trust me on this.
I forgot, has there been a place for a collection of trained voices so far? I definitely don't have the system nor time to find a character, compile voice lines and then train off that.
AI Hub has a comprehensive collection. I can't post their discord invite here, but it isn't hard to find.
@@gjvyigfghjghivff how to join their server? can u give me the invite code only. not the full link
@@ozymogaming It's "aihub"
There is in fact and it's the AI Hub discord. I plan on making a vid of it in the future just quickly going over it
Why does my voice changer say "no embedder"? When I changed to a custom voice it said failed, idk why. I can't even hear the voice changer.
Using a RTX 3070 with:
Chunk 256
Extra 131072
Sounds perfect on these even with the half second delay!
Settings? I’m on a Rtx 3070
How? 😮 Mine is constantly breaking
why is mine so laggy??
Do you think we could get a download for the Marine voice?
I notice my voice cuts out a lot while speaking, so it misses a few words sometimes. What's the best way to solve that?
I'm also having this issue, so I'm tagging on to your comment.
any fixes?
@@Radhaun any fixes?
@@eagels3131 Best I've found is to fiddle with the settings somewhere I can hear it (like in a call with an alt discord account). But no good fixes really.
Hi, I'm the one on your Discord server who created a ticket about the gl not loading, and you said to change the chunk to max. I just switched the audio to server and it worked perfectly for me, so I guess there's a weird thing going on with the client option. Just wanted to put it out there.
There is an echo after the first sentence and it gets so squeaky until you can't hear it anymore.
I've got an ARC A750, but it's not detecting my graphics card. I only get CPU option.
Intel is a no go, try the directml version but it may not work.
I tried fixing it, but i have a problem where the audio is said like 4 times in a row, with decreases in loudness, and the echo stuff does nothing to fix it. Would you have any ideas on how to correct this?
This is happening to me too and honestly I don't know how to fix it either, I'm just as clueless as you.
Same issue for me. I would like to get this sorted out
Are you on Windows 11?
@@linuxtuxvolds5917 Yeah 😅
It's amazing but I wish it was better optimized for AMD. I use a 7900 XTX and it randomly loops the last syllable and stops.
Yeah, a lot of people have had issues with AMD. You might benefit from running chunk size lower or extra lower to see if it prevents the loopage.
@@Jarods_Journey alrighty! thank you so much!
@@DippinDoughnutz Could you drop the settings that work for you? I can't get it to work at all without being super laggy and very cut up
@@faded8975 for me no, nothing helps. the only ones that work are the ones that come with the voice changer
The thing that kills me is the lisp on literally all of the voices, is there any way to stop that?
omg i bet is the chunk sdhjfsjdf
My voice echoes and I can hear what I say again 2-3 times. Any suggestions?
Turn on echo filter and set higher protection. (but i don't think you actually need it after one year passed)
Thanks, I really need to try this one.
please do you have a fix for:
[Voice Changer] Pipeline is not initialized.
[Voice Changer] Waiting generate pipeline...
pls it would be much appreciated
Did you find a solution?
@Jarods Journey do you have any resources for voice models, or do you train your own? All of your voice models sound much better than what I could find online. What settings do you use to train your models?
I train all of my models using RVC v2. No real special settings, I just clean and curate my data so it's crystal clear input audio.
@@Jarods_Journey Trained a couple models already, came out great. Thanks for all the help between comments and videos.
@@Jarods_Journey HOW DO U CLEAN DATA
i really want to try this but my 1060 says no.
How do I make RES lower?
When I raise the chunk size, RES rises along with it,
and it turns out that all models stutter.
Most likely hardware limited, it will rise if your hardware can't keep up with the rate that it's set at
I've got an average PC, and my RES still goes up. Any ideas on what to do? I even tried it on the lowest settings and it's still not working!
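A rough way to reason about the RES problem in this thread: the changer has to finish converting one chunk before the next one arrives, so stutter is expected once the processing time (RES) is longer than the chunk's own duration. The sketch below is only an illustration. It assumes, without confirmation from anyone in the thread, that one CHUNK unit equals 128 samples and that audio runs at 48 kHz; those constants and the function names are made up for the example.

```python
# Assumptions (not confirmed in the thread): one CHUNK unit = 128 samples, audio at 48 kHz.
SAMPLES_PER_CHUNK_UNIT = 128
SAMPLE_RATE_HZ = 48_000


def chunk_duration_ms(chunk, samples_per_unit=SAMPLES_PER_CHUNK_UNIT,
                      sample_rate_hz=SAMPLE_RATE_HZ):
    """Length of audio covered by one chunk, in milliseconds."""
    return chunk * samples_per_unit * 1000 / sample_rate_hz


def keeps_up(chunk, res_ms):
    """True if the reported processing time fits inside one chunk of audio."""
    return res_ms <= chunk_duration_ms(chunk)


if __name__ == "__main__":
    # Hypothetical RES reading; replace with the value shown in your own GUI.
    example_res_ms = 300
    for chunk in (16, 192, 256, 320, 512):
        duration = chunk_duration_ms(chunk)
        status = "ok" if keeps_up(chunk, example_res_ms) else "will stutter"
        print(f"chunk {chunk}: {duration:.1f} ms of audio, RES {example_res_ms} ms -> {status}")
```

Under these assumptions a bigger chunk buys more time per conversion at the cost of more delay, which matches the replies here: if RES keeps climbing past the chunk duration at every setting, the hardware is the limit.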
Hey there Jarod, I have a question regarding "Cuda Cores" / Cuda core compatible graphic cards.
If I get this straight: your GPU must be a good one, i.e. a newer and faster one (that much is clear), but it also must have CUDA cores, since all the models are CUDA based. So a 1080 Ti, which is a strong gaming card and a favourite of many gamers, can't handle the AI voice changer program. The program defaults to CPU usage. That's what happens on my end with a 1080 Ti (which has no CUDA cores). The solution would be to go with the OTHER download (what you called "for AMD users").
Is my assessment correct?
CUDA cores are included in all modern NVIDIA GPUs, actually. The software will default to CPU always. The benefit of having a higher-end card would be the faster GDDR6 memory, larger amount of VRAM, and many other factors. I am not sure if RVC uses CUDA cores in any way, but a GTX 1080 Ti should still work. Just increase the delay, run it through server, and have the audio driver set to ASIO or WASAPI if you can.
Note: Only use the alternate AMD version if you have an AMD GPU. If you have an NVIDIA graphics card, use the standard download you already have. I tried out the software with my GTX 1650 Ti on my laptop and it worked pretty well once set to a 5 sec delay. Your GTX 1080 Ti should do far better than my mobile GPU.
@@calebpeters191 I have done exactly that, but for me the delay needs to be really high to sound any good. For some reason, the GPU still sits at only 30% at max. The sound is still robotic and just sounds like I am an 8-year-old kid with a generic voice morpher.
Besides, sometimes (rarely) my CPU goes to 99%, and more commonly it sits at 20-30% (5800X3D), which made me think that the GPU was ignored and the CPU used instead.
@@calebpeters191 thanks nonetheless dude
I believe the 1080 Ti does have CUDA cores... but it has 0 tensor cores, which allow for much more efficient cycle usage of the GPU.
CUDA should still be your best bet, but I'm not sure why it's not noticing your card
@@Jarods_Journey I just bought a 4070 ti a few hours ago, and let it run. The performance is LEAPS better, thanks for the answer Jarod
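For anyone in the same spot as this thread (GPU selected but the CPU doing the work), one quick sanity check is to ask PyTorch whether it can see a CUDA device at all. This is a sketch, assuming the standard NVIDIA build of the client runs on PyTorch and that you can run Python in the same environment; `describe_torch_device` is a made-up helper name, while `torch.cuda.is_available()` and `torch.cuda.get_device_name()` are regular PyTorch calls.

```python
def describe_torch_device():
    """Report whether PyTorch can see a CUDA-capable GPU."""
    try:
        import torch  # imported lazily so the script still runs without PyTorch
    except ImportError:
        return "torch not installed"
    if torch.cuda.is_available():
        # Device index 0 is the first GPU PyTorch enumerates.
        return f"cuda: {torch.cuda.get_device_name(0)}"
    return "cpu only"


if __name__ == "__main__":
    print(describe_torch_device())
```

If this prints "cpu only" on a machine with an NVIDIA card, the problem is more likely the driver or the installed PyTorch build than the card itself.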
I have an issue: I can hear that my voice is changed, but when I tried to use a sound recorder, my voice in the recording is not changed. Why?
MX250 gpu here (potato gpu). Best I can get is 2+ seconds delay (chunk 320 extra 4096) and WASAPI.
- lower chunk will glitch, higher chunk will add more delay.
- changing MME to WASAPI helps a lot (from 3+ to 2+ seconds delay), but I have to change my audio sample rate first.
- I can't change the model after I use it because the gpu memory (2 GB) will be overloaded. I have to restart the client first before changing the model.
I will try converting to onnx later hoping it will go faster (export onnx is currently broken)
But you can run it smooth with high delay? Because if you can, there's hope for my 1660 Ti.
@@soimpressivesodogetothemoo8027 yes, as long as there's enough delay, it will run smoothly
ONNX is fixed. WASAPI is a good driver, as that one makes mine faster as well.
How do I change it from MME to something else?
My RES comes out anywhere from 8000 up to 29000. Can you lower it? It takes a long time to load the voice and sometimes you don't even hear anything.
Can we get a download for the marine voice please?
Yes this
Man, I really wanted this to work for me, but for some reason as soon as I press start it just starts saying random voice lines, like someone else is speaking on my mic, and I don't even know what could be causing that.
SAME, did you figure it out?
Anyone know why, when I use client, all my input and output options disappear? It was working fine, and then I just couldn't select them anymore.
Edit: I figured it out, but for anyone with the same problem, I'll leave this here. It only occurs when you make a shortcut version of the start file and try to run it. The way to fix it is to open the original start file and switch between server and client on one of the sample voices. If it doesn't work, just repeat until your input and output options reappear.
Very helpful 👌 . Just where can you get models?
Mine sometimes has a weird voice for half a second before it says what i say. I thought it was just background noise but even with all the suppression and sensitivity being lower, it still happens. It kinda sounds like glitchy breathing
help!!! my ai keeps repeating what i said after i finished talking. it like mumbles exactly what it just finished saying
this is where my vtuber career starts
Hello sir, do I need to download any extra files prior to downloading these as shown in this video, or will everything work just fine when following the same process in this video? I have heard you talking about downloading RVC first or maybe I am mistaken.
Thank you.
You will need to download or train AI models for voices. It only comes with 4 preinstalled
Thank you, sir, for replying. I'm done downloading the software, but the voice changer often says things that I didn't say, plus weird noises and a twisted echo. I have an AMD potato laptop with just 4 GB VRAM and 32 GB RAM. In the GPU drawer I only see a GPU on/off option, with no room for choosing between GPU and CPU; not sure why it's like that on AMD. I played with it all day, yet the software hasn't been able to say a single word of mine, only making weird noises and short utterances. Do you think upgrading the AMD driver would make much difference? Or maybe it's time to change to Nvidia.
Note. The laptop comes with a hybrid gpu AMD Rx / Vega 8. I think the Vega is interfering. 🤯
@@ivw1286 I have a rented A100, a $10,000 GPU, and it still struggles a little. I don't think you meet the minimum requirements to even run the program.
"And it's going to give you the best audio 👹" 9:05
8:38 Alas, Houshou Marine finally speaks English🙏🙏🙏
Its cool bro, As expected I want to prank my virtual friends using a voice changer😂😂
I subscribe you channel😂😂
One day I'll make my own ai voice using my voice to then mimic myself with an ai voice of myself.
Is microphone important on sound quality too? Because I'm using my headset microphone and it sounds ass and robotic
Yes, but model quality plays a role here too. If you can upgrade your mic, it may help, but it may not if the models are not good.
The tutorial is good, but I have a problem: the voice sounds cut up. I think it's because my "RES" is too high. What can I do?
Hey, idk if it's too late, but can I please have some help here? I'm trying to install it but I just cannot seem to hear anything or make it work. If possible, can I contact you and arrange something, a Zoom or something? Pls
Great video !
I fear that one day i'll listen to a really sexy asmr and it turns out a 60 year old man is behind the ai-modified voice
I'm very sad. I've been trying to configure this for days, but my PC doesn't have a good GPU, only a CPU, and I think that's why my voice is so laggy and takes a long time to come out. I can't find a perfect CPU setting. Please help me, because the voice doesn't even come out right, it's just laggy 😢😢
Great video. I did everything and it worked 100%. A question: whenever I talk, for some reason the audio is too slow and cuts every 3 seconds, so I can't get a word out. Is there a way to fix this? Or is it because my GB is too low?
For me the main issue is that no matter what I change, I still have static in my playback. Any idea what I should do?
Maybe it's picking up on some background noise with your mic, but it could also be the models that you're using as it's still not a perfect voice-to-voice conversion
Omg i have never heard an english fluent Suisei before
If you use a voice changer, you won't be able to laugh, because when you laugh your voice will become like a robot
Bro, every voice changer I use lags, like when I say hello it says hhhhhn
I have an RTX 4060 Ti.
I am hearing some weird sounds (like Predator lol). I've changed many settings and still haven't figured it out.
How to fix a 50000 ms delay? I have a GeForce GTX 1650 and an Intel Core i7.
Hello, I'm encountering a similar issue as you. Have you managed to find a potential solution by any chance?
How do you do it with OBS? I can't record the microphone (that voice). It seems to detect the voice but I can't hear it in the recording.
Have you solved this problem yet?
Setup with vb audio cable, then set up microphone source in OBS as vb audio cable
@@Jarods_Journey thx for advice but i still can't manage to record it. I don't know what i'm doing wrong :
Hi, I have an idea to use the RVC client at a karaoke party. I'm struggling to make the delay short enough to sing in real time. It's just not possible.
I bought a top graphics card, a Zotac RTX 4090 with 16500 CUDA cores. The delay (even in passthru mode) is about 0.3 sec and there's no way to get rid of it. The only sound driver that works is MME, but the latency is enormous.
I can't make it work (in server mode) with any of the ASIO drivers I have. It's mute. I've downloaded ASIO4All. Mute. Bought a Focusrite 212 interface. ASIO is dead. Realtek ASIO with the built-in sound card doesn't work either. What am I doing wrong?
My CPU is Intel i9 12900K
UUUUUH HELP, since I downloaded this "AI Voice Changer Client", for some reason it's not working for "me only", but who knows, maybe other people are having the same problem. When I open the client and use it, none of the voice changers are clickable. Why is it not working? I'm confused, can someone help, I need answers please 🤨❓
Can you tell me why when I say something through the AI it keeps repeating it?
Please tell me a 1050 Ti and an i5 7300 could do this somewhat okay??
I tried running this, but it's only a white screen. What does it mean?
Great instructions! I saw you don't have any of the original models that come with the software loaded up anymore, how did you delete those? I can't seem to figure out how to get rid of them lol
You can just overwrite them by uploading new ones :)!
@@Jarods_Journey Gotcha, it didn't seem to work when I did that so maybe I'll try reinstalling or using an older version. I downloaded whatever the most recent version was as of yesterday (7/10/23) and the UI looked a bit different than yours too so maybe some changes the dev made broke overwriting. Thanks man, appreciate you!
You need to combo real time translator with text to speech so whatever you say is going to be heard by others in japanese with "waifu" voice
too bad japanese sentence structure is almost backwards compared to english.. :3
@@InverseCh I'm pretty sure translator takes care of it...
@@203tron not in real time unless you do it sentence by sentence. The fundamental difference in sentence structure
Bro. You are too good . A wizard
I have an AMD CPU with nvidia graphics card so idk if it matters which one I use right?
When I lower my chunk size (16 chunks), processing gets so much slower that the audio goes all jerky. If I go back to 512, it's faster and sounds clear. But 16 chunks should be nearly real time, just less clear. (Using an RTX 4070.)
All the audio is filtered, not just the microphone that is used, I tested it on Discord and in games and the audio of anyone's voices is filtered by the voice audio, no one has a decent answer so I guess there is no solution there in these versions of the program :/
How well should it work with a GTX 1650? I tried with this GPU but it doesn't work :'v
Sounds so laggy
Hardware limitation, unfortunately. People with that GPU have reported a 2-3 second delay. I recommend an Nvidia 30-series or newer card.
7:00 sounds like Tricia Takanawa from Family Guy
Hello, I have a GPU but for some reason it’s not selectable in the processing option
i have to figure out a way to offload the processing to my laptop so it doesn't bug out when playing demanding games. is there a way to access it from a lan client?
PS: I thought the DirectML version worked for my 6900 XT, but it's processing on the CPU. On the plus side, converting to ONNX literally halves the load on the CPU. When running this exclusively, I managed to get a total lag of 400 ms, buffer + processing. Not bad for an octa-core CPU (5800X3D).
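If the DirectML build seems to be falling back to the CPU like this, one thing worth checking is which execution providers the installed ONNX Runtime actually exposes. This is a sketch under the assumption that the DirectML build does its inference through ONNX Runtime; `list_onnx_providers` is a made-up helper name, while `onnxruntime.get_available_providers()` is a regular ONNX Runtime call.

```python
def list_onnx_providers():
    """Return the execution providers ONNX Runtime reports, or [] if it is not installed."""
    try:
        import onnxruntime  # imported lazily so the script still runs without it
    except ImportError:
        return []
    return list(onnxruntime.get_available_providers())


if __name__ == "__main__":
    providers = list_onnx_providers()
    print("available providers:", providers)
    # DirectML shows up as "DmlExecutionProvider" when the DirectML package is installed.
    if "DmlExecutionProvider" not in providers:
        print("DirectML provider not found; inference will likely run on the CPU")
```

Seeing only "CPUExecutionProvider" in the list would be consistent with the CPU-only behaviour described above.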
Does it work with Vietnamese or only English?
It'll work in Vietnamese
doesnt work, says "[Voice Changer] Waiting generate pipeline... [Voice Changer] Pipeline is not initialized."
Cool video. I've been wondering if I can use it to talk to other people, as I don't really want people to hear my voice. Like in Discord or something.
Mb i found the vid
@@AxooD1 what’s the title of the vid