Hey, this is pretty cool! I'm thinking about doing a similar project, but using what my headphones are picking up (plus my microphone): it would transcribe the audio, and you could pass it to something like ChatGPT. It should be fast and reliable, with minimal delay. It would have many use cases, such as transcribing meetings and voice calls, and you could get summaries, feedback, and other nice outputs. Does that sound interesting to you?
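A minimal sketch of the pipeline described above, in Python. The helper names, the prompt wording, the chunking strategy, and the `openai` client usage with the `gpt-4o-mini` model are all my assumptions, not part of any existing project; swap in whatever transcription backend (e.g. Whisper) and LLM you prefer:

```python
"""Sketch of the capture -> transcribe -> summarize idea.

Everything here is illustrative: the prompt text, chunk size, and the
OpenAI wiring are assumptions, not a confirmed design.
"""


def build_summary_prompt(transcript: str) -> str:
    """Wrap a raw transcript in a request for a summary plus action items."""
    return (
        "Summarize the following transcript and list any action items:\n\n"
        + transcript.strip()
    )


def chunk_transcript(transcript: str, max_chars: int = 4000) -> list:
    """Split a long transcript so each piece fits the model's context window."""
    chunks, current = [], ""
    for word in transcript.split():
        if current and len(current) + 1 + len(word) > max_chars:
            chunks.append(current)
            current = word
        else:
            current = f"{current} {word}".strip()
    if current:
        chunks.append(current)
    return chunks


def summarize(transcript: str) -> str:
    """Send one transcript chunk to an LLM (hypothetical wiring)."""
    from openai import OpenAI  # imported lazily so the sketch runs without the package

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name
        messages=[{"role": "user", "content": build_summary_prompt(transcript)}],
    )
    return resp.choices[0].message.content


if __name__ == "__main__":
    demo = "Alice: let's ship Friday. Bob: I'll finish the tests by Thursday."
    for chunk in chunk_transcript(demo, max_chars=40):
        print(build_summary_prompt(chunk))
```

The chunking step matters for long meetings: a one-hour call can easily overflow a model's context window, so splitting before summarizing (and then summarizing the summaries) is a common workaround.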
Really great! If you want to reduce the response time, instead of ChatGPT you could run an LLM locally on your system; that would help.
For LLMs I'd need much more than the 12 GB of VRAM on my 3080 Ti :(
Take a look at this project: github.com/collabora/WhisperFusion. Is it similar to what you want?
Really cool, but still a bit long.
I reduced the ChatGPT response time and got the total down to 7 seconds. But yes, it's still too long.