Gemini 2.0 - How to use the Live Bidirectional API

  • Published: 13 Dec 2024

Comments • 24

  • @mattkydd
    @mattkydd 16 hours ago +4

    It pains me to say it -- but I'm absolutely blown away by G2.0 bi-directional conversational quality

  • @lgmuk
    @lgmuk 8 hours ago +2

    Gemini 2.0 is incredible! 🤩
    Great video!

  • @leonvanzyl
    @leonvanzyl 14 hours ago +10

    For a video titled "how to use the API", you never once showed how to use the API 😅. You might want to rename this to "how to use AI Studio".

  • @nufh
    @nufh 16 hours ago +1

    Do you remember the controversy in the early days of AI when Google was trying to catch up by using pre-prompt tricks for their model showcases? Now, they've truly delivered.

  • @NLPprompter
    @NLPprompter 15 hours ago +2

    the era of realtime conversation AI is here...wohooooo

  • @alexwoxst
    @alexwoxst 17 hours ago +2

    When do we get the Pydantic AI RAG app? :)

    • @samwitteveenai
      @samwitteveenai 16 hours ago +2

      Will try and write something this weekend. Been flat out with work.

    • @alexwoxst
      @alexwoxst 16 hours ago +1

      @samwitteveenai Thanks man, no stress, take your time with it, just wanted to let you know there is interest. Many frameworks are hot for 4-5 days and then fall off, but Pydantic AI seems to have some staying power. Maybe that's something you could comment on in the video? :)

    • @samwitteveenai
      @samwitteveenai 16 hours ago +3

      Love your comment about frameworks for 4-5 days, so so true. I have started using Pydantic in a few things so will certainly make more vids about it.

  • @BM-ni4uz
    @BM-ni4uz 13 hours ago +1

    Hi Sam, thanks for the great videos. Do you think that the bidirectional API will be cost-effective for real-world applications (once officially released)? I imagine building apps that use this API for continuous periods of time. What are your thoughts?

  • @immortalbk00
    @immortalbk00 4 hours ago

    Damn... I could terminate my child's English tutor and save on monthly tuition fees.
    I gave it the system prompt below and a screenshot of his school's final-year exam English paper, and it worked wonders!
    "You are an english language tutor. You would help the student on english language questions but not providing the answer directly. If a student gets the answer wrong, guide the student towards the correct answer by explanation and a series of questions to assist him or her."
    RIP tuition centres

  • @MrKrzysiek9991
    @MrKrzysiek9991 15 hours ago

    Thanks, as always a great video :)

  • @oOserkanCakmakOo
    @oOserkanCakmakOo 11 hours ago

    Thank you very much

  • @thenoblerot
    @thenoblerot 16 hours ago +1

    It's strange to me that the textual output of the model almost seems to be a speech-to-text transcription. I mean, it often has incorrect punctuation, capitalization, even the wrong words sometimes.
    Not that one can trust a model to know about itself, but it says it generates text first and TTS is secondary. Odd

    • @samwitteveenai
      @samwitteveenai 16 hours ago +1

      It actually could be that. I don’t think the model outputs the text and audio together

    • @thenoblerot
      @thenoblerot 15 hours ago

      @samwitteveenai If true... what a bizarre pipeline. I always thought dear sweet dumb Moshi had a cool approach, with distinct internal monologue, text, and audio streams.

  • @DanielWeikert
    @DanielWeikert 15 hours ago

    I tried it, and it is quite good in English. Sometimes I notice a slight delay in the conversation. Have you experienced that? I also tried it in German, and it's really bad, not even close to ChatGPT's voice mode. I fully understand that German is difficult, I just hope there will be some improvement there. br

    •  14 hours ago

      Same here. It’s unusable in Czech and French.

    • @samwitteveenai
      @samwitteveenai 13 hours ago

      Did you try multiple times? I had some issues with it before the release where one time the non-English language would be good and the next time extremely bad. It was as if a seed issue was changing the voice. This affected speed and accent. It is still an experimental model, so I will pass the feedback along.

    •  9 hours ago

      @samwitteveenai I went back and tried a few times with different voices and also a Czech system prompt. Nothing. It is not just bad. It is absolutely awful -- like an old-style English TTS reading out a text in a foreign language. But it could understand anything I said in Czech. Also, when I switched the Realtime API to text output, the Czech was just fine. So it's just pronunciation - GPT4o Realtime, btw, is perfect at this. I'm working on a project on language teaching with LLMs, so I was really looking forward to trying this as an alternative.

  • @unclecode
    @unclecode 3 hours ago

    The first playground for developers that non-developers can actually use for daily life issues 😅 amazing

  • @rcoding513
    @rcoding513 4 hours ago

    so.... no api action here!!!!

  • @joselobo6902
    @joselobo6902 17 hours ago

    First 😅

  • @mrchongnoi
    @mrchongnoi 1 hour ago

    Google got it. Sometimes it is best to come from behind. Reviewed the golang API. Looks good. Will test it out later this week.