Build your own AI Robot - Ep9 (Speech Transcription)

  • Published: Nov 4, 2024

Comments • 29

  • @AlaaSalomon
    @AlaaSalomon 28 days ago +1

    Amazing work! Thanks a lot for sharing this project. I was searching online because I wanted to start building something similar, and so far it seems you are one of the first people to start something like this, which I really appreciate. I’ll be trying to do something similar and will keep you posted about using OpenAI's new real-time capabilities. Other than that, I am very excited about the new video. Thanks again for the great work!

    • @LarrysWorkbench
      @LarrysWorkbench 28 days ago +1

      @AlaaSalomon Thank you for your kind words. Check out my series of how-to videos, and I definitely have more coming. I've also posted my latest Python code at Github.com/LarrysWorkbench. Keep me posted on your project!

  • @davelandry646
    @davelandry646 2 months ago +1

    Thank you for posting your script! I'm going to try to get my RPi to talk back to me!! Great job with this series.

    • @LarrysWorkbench
      @LarrysWorkbench 2 months ago +1

      Thank you! ChatGPT helped me a lot with writing the Python syntax. Let me know how your project goes.

    • @davelandry646
      @davelandry646 2 months ago +1

      @LarrysWorkbench Will do...

    • @davelandry646
      @davelandry646 2 months ago +1

      @LarrysWorkbench Got Floyd talking to me! Your suggestion to have ChatGPT help me did the trick! I used parts of your code, and ChatGPT showed me how to install the various modules I needed. Much fun indeed!

    • @LarrysWorkbench
      @LarrysWorkbench 2 months ago +1

      Wow that's fantastic!

  • @adrianinvents
    @adrianinvents 2 months ago +1

    Excellent work. And thanks for sharing the code. Your videos inspired me to use a Hiwonder MasterPi robot to build a huge 3D printer. I should be publishing a video about it soon. Thank you and keep up the good work.

    • @LarrysWorkbench
      @LarrysWorkbench 2 months ago +2

      Wow, a 3D printer, I would love to see that! They seem to have lots of interesting products. I've got more videos coming as well, especially now that Floyd has a camera to actually see his surroundings.

  • @thenoblerot
    @thenoblerot 2 months ago +1

    Great project - I'm doing something similar with Claude Haiku... Maybe going to migrate to Gemini because Google has not only free inference but also free fine-tuning to play with!

    • @LarrysWorkbench
      @LarrysWorkbench 1 month ago +1

      Interesting. I don't have much familiarity with any of the models other than OpenAI's.

    • @thenoblerot
      @thenoblerot 1 month ago +1

      @LarrysWorkbench I also just learned today that Gemini can take audio input natively! (Video too, though it basically sees it at 1 fps.) No need for speech-to-text in the pipeline, and it can interpret tone of voice! I have generally found Google's models to be underwhelming, but free is free, especially for something so frivolous. (I know the API costs are fractions of a cent, but I have a mental block about just letting the robot go tearing around the house, cuz oh no, it'll cost 25 cents. Downside: Google authentication is a pain to set up.) Long-term I'm thinking I'll curate robo-Claude's outputs into a dataset to fine-tune Gemini. I still use OpenAI a bit, but imho the Claudes are better, for now anyways!

    • @LarrysWorkbench
      @LarrysWorkbench 1 month ago +1

      @thenoblerot Wow, that's great information! I haven't even thought about migrating, but if OpenAI doesn't give me the 4o advanced voice capabilities by the end of the year I might.
      I'm still quite a beginner in this space, so until now I've had my hands full just trying to learn one model.
      What hardware are you working with?

    • @thenoblerot
      @thenoblerot 1 month ago +1

      @LarrysWorkbench I'm really new too. Hadn't coded since the 90s before ChatGPT came out. I've got an old secondhand TurtleBot 2 type platform, running ROS Noetic. I was using a Pi 4 but was finding it a little limiting, so I recently upgraded to a repurposed motherboard from a busted-up laptop.

    • @LarrysWorkbench
      @LarrysWorkbench 1 month ago +1

      @thenoblerot I did some light coding in college back in the 90s, but when ChatGPT came out I got inspired. The Raspberry Pi seems ok for what I'm doing so far, but I'm *really* looking forward to OpenAI tightening up the latency on their API calls. Your project sounds interesting. Do you have any videos anywhere?

  • @danielgaskin4734
    @danielgaskin4734 1 month ago +1

    Incredible work! I have a couple of questions. You mention that you use a Raspberry Pi 4B; is there an added benefit to using a Raspberry Pi 5? Also, is there anything that can be done to reduce the response times? Perhaps using 4o-mini or a better internet connection?
    Thanks and again amazing work!

    • @LarrysWorkbench
      @LarrysWorkbench 1 month ago +2

      @danielgaskin4734
      Thank you!
      Right now the biggest latency comes from the fact that I have to do three API calls for every prompt/response pair (transcription, chat, TTS). The Pi 5 and/or 4o-mini would perhaps speed things up just a bit. But the biggest game changer is when I can implement 4o “realtime”, which has been trained as a speech-to-speech model. So that will be one API call instead of three. I believe that will have a profound effect……
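
      [Editor's sketch] The three-call chain described in this reply can be pictured roughly as below. This is only an illustration, not the code from the video: `transcribe`, `chat`, and `synthesize` are hypothetical local stand-ins for the three network calls, and `run_pipeline` is an assumed helper name. Because each stage has to finish before the next one starts, the per-stage delays add up, which is why collapsing three calls into one should help.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-ins for the three network calls. A real robot would
# replace each body with a request to a speech-to-text, chat, or TTS service.
def transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")      # pretend the audio is already text

def chat(prompt: str) -> str:
    return f"You said: {prompt}"      # pretend this is the model's answer

def synthesize(text: str) -> bytes:
    return text.encode("utf-8")       # pretend these bytes are spoken audio

STAGES: List[Tuple[str, Callable]] = [
    ("transcription", transcribe),
    ("chat", chat),
    ("tts", synthesize),
]

def run_pipeline(audio: bytes, stages=STAGES) -> Tuple[bytes, Dict[str, float]]:
    """Run the stages strictly in order and record how long each one took."""
    timings: Dict[str, float] = {}
    value = audio
    for name, stage in stages:
        start = time.perf_counter()
        value = stage(value)          # each stage waits for the previous one
        timings[name] = time.perf_counter() - start
    return value, timings

if __name__ == "__main__":
    reply_audio, timings = run_pipeline(b"hello Floyd")
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.6f} s")
    print(f"total: {sum(timings.values()):.6f} s")
```

      Swapping the stand-ins for real API calls and printing the timings would show which of the three hops dominates the wait on a given setup.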

    • @danielgaskin4734
      @danielgaskin4734 1 month ago +1

      @LarrysWorkbench Just looked it up and you’re right, that should make a huge difference! Would be amazing if you kept your GitHub up to date when you do make the switch :) Thanks again for your reply.

    • @LarrysWorkbench
      @LarrysWorkbench 1 month ago +2

      @danielgaskin4734 I’ll post the code, although I suspect it’s going to take a fair bit of rewriting work. Per the API docs I think I need to open some sort of web socket to allow for two way data transfer, so there’ll be some learning curve as well. But the result will be a dramatic improvement…..
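
      [Editor's sketch] The "two way data transfer" idea mentioned here can be tried out without any network at all. The snippet below does not use a real WebSocket or any OpenAI API; it is an assumption-laden simulation in which two `asyncio.Queue` objects play the role of the socket and `fake_server` is a hypothetical echo service. What it shows is the shape of the code: one task keeps sending chunks while another task receives replies at the same time.

```python
import asyncio
from typing import List, Optional

async def fake_server(inbox: asyncio.Queue, outbox: asyncio.Queue) -> None:
    """Hypothetical service: answer every chunk; None marks end of stream."""
    while True:
        chunk: Optional[str] = await inbox.get()
        if chunk is None:
            await outbox.put(None)    # tell the receiver the stream is over
            break
        await outbox.put(f"reply:{chunk}")

async def send_chunks(chunks: List[str], inbox: asyncio.Queue) -> None:
    """Push chunks one at a time, yielding so other tasks can run."""
    for chunk in chunks:
        await inbox.put(chunk)
        await asyncio.sleep(0)
    await inbox.put(None)

async def receive_replies(outbox: asyncio.Queue) -> List[str]:
    """Collect replies until the end-of-stream marker arrives."""
    replies: List[str] = []
    while True:
        reply = await outbox.get()
        if reply is None:
            return replies
        replies.append(reply)

async def converse(chunks: List[str]) -> List[str]:
    """Run sender and receiver concurrently over the simulated connection."""
    inbox: asyncio.Queue = asyncio.Queue()
    outbox: asyncio.Queue = asyncio.Queue()
    server = asyncio.create_task(fake_server(inbox, outbox))
    _, replies = await asyncio.gather(
        send_chunks(chunks, inbox),
        receive_replies(outbox),
    )
    await server
    return replies

if __name__ == "__main__":
    print(asyncio.run(converse(["chunk-1", "chunk-2", "chunk-3"])))
```

      With a real socket, the two queues would be replaced by the connection's send and receive calls, but the sender/receiver split would likely stay the same.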

    • @danielgaskin4734
      @danielgaskin4734 1 month ago +1

      @LarrysWorkbench Yep, I saw that too. The cost doesn’t sound great yet either... might need to wait for a “realtime mini” 😅. Appreciate your responsiveness, and if you crack the new API code and share it, that would be amazing :)

    • @LarrysWorkbench
      @LarrysWorkbench 1 month ago +2

      @danielgaskin4734 I’ll share that for sure whenever I get it figured out. I don’t get paid anything for this project so it really motivates me to know that people are following along and finding it interesting :)
