I just tried this. Holy hell. This is some fast rendering. Thank you for sharing your repo. I tested it with Stereo Mix on a YT video. It did not miss a beat.
This looks really promising, I will try to test it on my programs! Thanks for the work, my dude!
Thank you for showing us your library in action as well as letting us know how we can support it!
Nice one! I look forward to trying this out
This is awesome! Thanks
That's incredibly accurate. Nice work! Can you active-transcribe AND wake-word for commands? It'd be great if you could have it always listening and then do something on wake word.
No, currently not. The idea is good, I can see some use-cases for this. I'll think about that.
Great work. Do you have any ideas for reducing latency in text to speech? I'm working on it.
I'd like to use it, but there aren't code snippets with just specific functions. I only want the live speech transcription and real-time TTS in a simple script.
I don't need all of the code for wake words etc., and there are like 2000 lines of code, so I can't even figure out which parts belong to that specific feature.
Please look at the github.com/KoljaB/LocalAIVoiceChat/ project and how both Realtime libraries are used in ai_voicetalk_local.py.
Also please look into the tests folder of both Realtime libs. There are simple code examples that don't use wake words in the RealtimeSTT tests folder.
@Linguflex Okay, thanks for replying!
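For anyone else looking for just the live transcription part, here is a minimal sketch, not taken from the repo's tests. It assumes RealtimeSTT is installed via pip and that a microphone is available. The model choice ("tiny.en") and the clean_text helper are my own illustration; no wake-word options are passed, so it should start listening right away.

```python
def clean_text(text: str) -> str:
    """Collapse runs of whitespace so each utterance prints on one tidy line."""
    return " ".join(text.split())


def main() -> None:
    # Imported here so the helper above works even without the package installed.
    from RealtimeSTT import AudioToTextRecorder  # pip install RealtimeSTT

    # Small English-only model chosen for speed (an assumption, swap as needed).
    recorder = AudioToTextRecorder(model="tiny.en", language="en")
    print("Speak now (Ctrl+C to stop)...")
    try:
        while True:
            # text() blocks until a spoken phrase ends, then returns its transcript.
            print(clean_text(recorder.text()))
    except KeyboardInterrupt:
        recorder.shutdown()


if __name__ == "__main__":
    main()
```

The tests folders mentioned above remain the better reference for TTS and for the full set of options.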
I'd like to use this and customize it for our undergrad thesis. Is that okay?
Yes, sure. It's MIT-licensed, so you can use it for whatever you like.
I'm gonna try to make a VRChat STT app that puts the words above my head using their OSC system :D
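A rough sketch of the OSC side of that idea, using only the standard library. It assumes VRChat's OSC receiver is enabled and listening on UDP port 9000 on the local machine, and that the chatbox address is /chatbox/input taking a string plus two booleans (send immediately, play notification sound). Those details are from my recollection of VRChat's OSC docs, so please verify them; the function names are made up for illustration.

```python
import socket


def osc_string(value: str) -> bytes:
    """Encode an OSC string: null-terminated, zero-padded to a multiple of 4 bytes."""
    data = value.encode("utf-8") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def chatbox_message(text: str) -> bytes:
    """Build an OSC packet for /chatbox/input with arguments (text, True, True)."""
    # Type tags: s = string, T = boolean true (booleans carry no payload bytes).
    return osc_string("/chatbox/input") + osc_string(",sTT") + osc_string(text)


def send_to_vrchat(text: str, host: str = "127.0.0.1", port: int = 9000) -> None:
    """Send one chatbox message over UDP (host and port are assumed defaults)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(chatbox_message(text), (host, port))


if __name__ == "__main__":
    send_to_vrchat("hello from speech-to-text")
```

Each transcribed sentence from the STT side could then be passed to send_to_vrchat.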
Is it possible to make the voice dictation instantaneous at the cost of accuracy? I want to try controlling the servos on an animatronic mouth with voice dictation. It doesn't have to be accurate, it just needs to be accurate enough to be convincing and as fast as possible.
You probably want to use whisper.cpp with a quantized tiny model and grammar sampling. Look up Georgi Gerganov's chess example.
You could also train a wake word model to do this. They are crazy fast and reliable, but specialized for a few keywords. Check openWakeWord or pvporcupine.
I don't understand how to use it...
What do you want to do?
The "tests" folder contains some examples of how you can use it:
github.com/KoljaB/RealtimeSTT/tree/master/tests
The "tests" of RealtimeTTS may also help, since they use RealtimeSTT a lot:
github.com/KoljaB/RealtimeTTS/tree/master/tests