Use Meta's Voice SDK for Wake Word Detection & Speech To Text (STT)

  • Published: 26 Dec 2024

Comments • 16

  • @jandecker3110 4 months ago +1

    Very well-structured and well-paced tutorials. I'm eager to see the follow-up with the OpenAI integration!

  • @mixedworld-yt 4 months ago +1

    Great tutorial! Thanks for the walkthrough and the additional info regarding cleanup.

  • @ethanordorica2808 2 months ago +2

    Hello, I am finding the voice recognition to be terribly inaccurate. For instance, although my wake word is "hey matt", saying "bloople" will still activate the speech to text. Additionally, after I give a transcription and hear the sound that confirms it was received, the App Voice Experience does not stop listening and continues to take more prompts, even past 20 seconds. Any tips?

    • @Sapix_ 1 month ago

      You could try a different model

  • @kipsadams 4 months ago +1

    Thanks for the video! I checked out your other video about Whisper in Apple OS as well and came up with a question: the Voice SDK from Meta works through Wit and requires a network connection. Does Quest provide any way to transcribe speech on device? Or is the only option, if I want fast speech recognition, to use whisper.cpp inside the app?

    • @blackwhalestudio 4 months ago +1

      Yes, the only way would be to use a speech model directly on the device, such as Whisper.

  • @Th3Shnizz 3 months ago +1

    I followed along with this tutorial, but the appVoiceExperience doesn't start on awake. It will not respond to my wake word on play. Instead, I have to manually press Activate on the App Voice Experience component. However, when I do this, the application stops listening for my wake word and instead always triggers the wake word detected event and the complete transcription event within milliseconds of each other. Do you have any advice?

  • @iABOoDxZ 4 months ago

    I'm new to VR games and I'm working on a project that makes an airplane move, rotate, and stop in response to my hand gestures. I've tried XR Hands, but it seems buggy and has some glitches. Do you know any other way to set up hand tracking?

  • @immanuel6954 4 months ago

    Would this work for mobile augmented reality app development?

    • @blackwhalestudio 4 months ago

      Not really, this is part of a Meta XR SDK. It might be able to run certain things on a mobile phone, but it is not intended to do so, and I personally have never tested that. Let me know the results if you try it!

    • @immanuel6954 3 months ago

      @blackwhalestudio Yes, somehow it worked after I made some adjustments here and there. However, the result is not what I expected. It seems the microphone is always on instead of semi-on. It constantly detects voice input until it hears the wake word and executes the prepared function. Also, if you use it in a noisy or crowded place, the recording will be useless because it can't hear your voice unless you speak louder. Yeah... it runs, but it lags somewhat because it's always listening. Well, maybe it's because my smartphone's specs aren't very high. I'm using a Vivo Y100.

  • @abhikhubby 4 months ago

    Great video! I know you briefly mentioned why, but can you elaborate on why using intents and wake words is better than using the dictation experience? Is it a higher-quality language model?
    Right now I am using Whisper + ChatGPT, but I am open to switching approaches given that the Meta transcription options are free.

    • @blackwhalestudio 4 months ago +2

      Hey, glad you liked it!
      Dictation functions similarly to voice commands, except that it does not process the resulting text with natural-language processing. The dictation feature is not designed to be used for voice commands, but rather as a text input modality.
      Whisper + ChatGPT is certainly viable as well, and Whisper can also be run on the headset without requiring internet (check my last video). Running Whisper locally is faster, but I found the Whisper model I used to be far less accurate than Wit.ai. I guess it depends on what exactly you are trying to build. I like that the Voice SDK also offers cool TTS capabilities (I will look into that for my next video)!

    • @kipsadams 4 months ago

      @blackwhalestudio Thanks for explaining!