OpenAI Realtime API - The NEW ERA of Speech to Speech? - TESTED

  • Published: 26 Nov 2024

Comments • 46

  • @KCM25NJL
    @KCM25NJL 1 month ago +29

    Yeah, no basement-dweller devs are gonna be messing with that API until the costs drop by at least 100x, which I honestly only see as a near-term incentive for Meta to get a Llama Voice model cookin'

    • @jamesjonnes
      @jamesjonnes 1 month ago

      I'll use it, but can't wait for an uncensored open source version. Text only is too boring. I lack the patience to use text only for too long for the tasks I want, like learning languages.

    • @karmcy
      @karmcy 3 days ago +1

      Well said. 3 tests today, ~2 mins per conversation: $1.50. Yikes!
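For scale, the figures in this comment work out to roughly a quarter of a dollar per minute of conversation. A quick check, assuming the $1.50 total covers all three ~2-minute tests and nothing else (the comment does not break the cost down further):

```python
# Figures taken from the comment above; treating the total as evenly
# spread over the three conversations is an assumption.
total_cost_usd = 1.50
conversations = 3
minutes_each = 2

total_minutes = conversations * minutes_each
cost_per_minute = total_cost_usd / total_minutes
cost_per_hour = cost_per_minute * 60

print(f"${cost_per_minute:.2f} per minute, about ${cost_per_hour:.2f} per hour")
```

At that rate an hour of talking would cost about $15, which is the kind of number the parent comment is reacting to.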

  • @almirkaza
    @almirkaza 1 month ago +8

    can you share the url to the repo?

  • @boxeemusic
    @boxeemusic 1 month ago +8

    where can i find the code? pls help

  • @DarrenJohn10X
    @DarrenJohn10X 1 month ago +8

    Looking forward to seeing your alleged "spaghetti" code! (Right now your latest repo is from 2 weeks ago.)

  • @OliNorwell
    @OliNorwell 1 month ago +3

    Great work! You must have had a busy couple of days getting it working

    • @meetsummdev
      @meetsummdev 1 month ago +1

      you can really implement it in a few hours

  • @sykexz6793
    @sykexz6793 1 month ago +10

    I don't think this is the same model as advanced voice mode.

  • @三川富資訊股份有限公
    @三川富資訊股份有限公 1 month ago +2

    The Realtime API cost is high. I suggest a cheaper way:
    1. Use Google STT to transcribe the user's speech to text.
    2. Send the text to GPT.
    3. Get the response from GPT.
    4. Send the response to Google TTS.
    5. The user gets the AI response as both text and voice.
    The response time is longer, but it costs less.
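The steps in this comment can be sketched as a simple three-stage pipeline. This is only an illustration under stated assumptions: `run_turn`, `TurnResult`, and the `fake_*` stages are hypothetical names, and the stand-in functions replace the real Google STT, GPT, and Google TTS calls, which are not shown here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TurnResult:
    user_text: str      # step 1: transcript of the user's speech
    reply_text: str     # step 3: text response from the model
    reply_audio: bytes  # step 4: synthesized speech for the response


def run_turn(
    audio: bytes,
    speech_to_text: Callable[[bytes], str],
    chat: Callable[[str], str],
    text_to_speech: Callable[[str], bytes],
) -> TurnResult:
    """Run one conversational turn through the three stages in order."""
    user_text = speech_to_text(audio)         # steps 1-2: transcribe, then send on
    reply_text = chat(user_text)              # step 3: get the model's reply
    reply_audio = text_to_speech(reply_text)  # step 4: synthesize the reply
    return TurnResult(user_text, reply_text, reply_audio)  # step 5: text and voice


# Hypothetical stand-in stages so the sketch runs without any API keys.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_chat(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


result = run_turn(b"hello", fake_stt, fake_chat, fake_tts)
print(result.reply_text)
```

Because the stages run one after another, their latencies add up, which matches the commenter's point that this route is slower but cheaper.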

    • @李征-u3n
      @李征-u3n 3 days ago +1

      In that case, you don't need the Realtime API. I think the OpenAI chat completion API works just fine.
      I think the key point is that the Realtime API doesn't lose any information from your voice (tone, intonation, or accent), which means it can respond to you like a real person would, or at least it tries to.

  • @tommoves9935
    @tommoves9935 1 month ago

    Happy to be the first to comment. Kris, you are always up to date. Once again, cool stuff from you. Spaghetti code... 🤣. Great that you talked about the costs as well. I like your creative and often really funny ideas. Please keep up the great work! Regarding your phone call: I saw a video from a guy in the US weeks ago (no Realtime API) where he let his AI order a pizza, and it worked great. Latency even back then was good enough, so it should work perfectly. Maybe try it with an Italian accent 😉. Thx from Tom!

  • @ibrahimaba8966
    @ibrahimaba8966 1 month ago +1

    I just integrated it on Twilio, it changes everything, but it took me a bit of time.

  • @jamesyoungerdds7901
    @jamesyoungerdds7901 1 month ago

    Great video, thanks Kris! I'm interested in the function calling and structured output from the voice websocket return. Can you use agents or agentic flows with constrained and structured outputs in voice mode? 🤔

  • @Bangs_Theory
    @Bangs_Theory 1 month ago +2

    Which function controls the interruption?

  • @李征-u3n
    @李征-u3n 3 days ago

    I don't quite understand what "realtime" means here, especially in the text version.
    In the voice version, yes, you can interact with it as if you were really talking to a person: you can interrupt the conversation, and maybe OpenAI can pick up extra information from your tone, intonation, or accent.
    But in the text version, I don't see any difference from just using the OpenAI chat completion API.

  • @pjm17
    @pjm17 1 month ago +1

    Could you achieve these results in an app just using text-to-speech and speech-to-text with native iOS features alongside OpenAI's NON-realtime APIs?

  • @JaredVBrown
    @JaredVBrown 1 month ago

    Would love to bankrupt myself with your code. I won't judge spaghetti. Tried for 20 prompts with the new Claude to get it up and running, no dice. Examples would be much appreciated :)

  • @d3xrd527
    @d3xrd527 5 days ago

    Where to find code?

  • @DeepSucess
    @DeepSucess 1 month ago

    can we have speech/voice as input to this app using websockets and get result as text as output?

  • @DesignDesigns
    @DesignDesigns 1 month ago

    This is mindblowing...

  • @DeepSucess
    @DeepSucess 1 month ago

    Can it work for other languages such as Urdu or Hindi?

  • @nmana9759
    @nmana9759 1 month ago

    Why wouldn't you share the repo?

  • @drewpeer
    @drewpeer 1 month ago

    Does everyone have access to this beta? Anything we have to do?

  • @Akander20
    @Akander20 1 month ago

    where can i get the repo?

  • @Dea07thox
    @Dea07thox 1 month ago

    Can't you just prompt it better to be less talkative, so you don't have to break its response that often? That would make a big difference and make everything more seamless :)

  • @Cutestreetcats
    @Cutestreetcats 27 days ago

    where is the code?

  • @micbab-vg2mu
    @micbab-vg2mu 1 month ago

    Thanks :)

  • @icydemon9749
    @icydemon9749 28 days ago

    Can you provide the code, please?

  • @dievas_
    @dievas_ 1 month ago

    I still don't have access to it :/

  • @alarconfilms1
    @alarconfilms1 1 month ago +1

    What is the code used?

  • @MagagnaJayzxui
    @MagagnaJayzxui 1 month ago

    What is AVA?

  • @saksham3
    @saksham3 1 month ago

    Doesn't it have emotions?

  • @AI_Escaped
    @AI_Escaped 1 month ago

    No one is going to be even able to develop at these prices other than those with deep pockets. Just testing and figuring things out would be too expensive to even try.

  • @contentfreeGPT5-py6uv
    @contentfreeGPT5-py6uv 1 month ago

    I tested it yesterday, but:
    Connection error: 403
    Access denied. Check your API key and your permissions to use the Realtime API.

  • @thenoblerot
    @thenoblerot 1 month ago

    By telling it it is playing a game with the user, it might be failing on purpose to let you win!

  • @TheTrainstation
    @TheTrainstation 1 month ago

    I'm waiting to hear the Irish accent to be sure

  • @benbrahimjamil1976
    @benbrahimjamil1976 1 month ago

    How do I get the repo?

  • @DhairyaMarwah-l1u
    @DhairyaMarwah-l1u 1 month ago +5

    Can you share the repo link?

  • @khanhhq2044
    @khanhhq2044 1 month ago +3

    Can you share the repo link?