Love it, I am still waiting for the production like video :)
Talked with it for 5 min in the playground today. The cost was $2.35. Not too shabby.
That’s pretty expensive. Especially if you wanted to build something with this for consumers, think about how pricey it would get. Monthly subscriptions would have to be like $50.
That's a 1990s sexline...
What service would work at that price?
The price is $20/hour. Like a junior sales rep.
What I want to know is if you can interrupt it?
@MaliRasko yes you can interrupt it, and it has automatic voice detection. So you pay only for the time you speak, not for silence. Still $20 for an hour of conversation requires a solid use case.
Great tool, if this was cheaper I would develop with it. Also, just emailed you about a sponsor opportunity. Cheers!
Cheers - I’ll have a look. Agree, I think as the price comes down it will be much more viable for more apps
@DevelopersDigest Hey Developers Digest, following up here. Did you see my email? Thanks!
How do I end the call, and how do we know if the last audio has been played?
What would the API usage cost be in a scenario where call volume reaches about 200,000 minutes in a given month? That's on average, since it involves calls that go on for hours and 10,000s of calls.
When I give a phone number as voice input, the numbers get mixed up. Could you help me?
I have the same issue, and it also has difficulty understanding the last name. Twilio was more accurate.
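For the 200,000-minute question above, here is a rough back-of-envelope sketch in Python. It uses only figures quoted by commenters in this thread (the $20/hour estimate and the two playground reports), not official pricing, and it assumes cost scales linearly with minutes, which may not hold. The names `RATES_PER_MINUTE`, `monthly_cost` and `estimate_all` are made up for illustration.

```python
# Per-minute rates derived from numbers quoted in this thread.
# These are commenters' figures, NOT official pricing.
RATES_PER_MINUTE = {
    "thread estimate ($20/hour)": 20 / 60,
    "playground report ($2.35 for 5 min)": 2.35 / 5,
    "playground report ($12 for 3.6 min)": 12 / 3.6,
}

MONTHLY_MINUTES = 200_000  # volume from the question above


def monthly_cost(minutes: float, rate_per_minute: float) -> float:
    """Linear estimate: minutes multiplied by a flat per-minute rate."""
    return minutes * rate_per_minute


def estimate_all(minutes: float) -> dict[str, float]:
    """Return the estimated monthly cost in dollars for each quoted rate."""
    return {
        label: round(monthly_cost(minutes, rate), 2)
        for label, rate in RATES_PER_MINUTE.items()
    }


if __name__ == "__main__":
    for label, cost in estimate_all(MONTHLY_MINUTES).items():
        print(f"{label}: ${cost:,.2f}")
```

Depending on which quoted rate you believe, that works out to roughly $66,667, $94,000, or $666,667 per month. The spread between the reports is itself a sign that real cost depends heavily on how much audio goes in and out.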
This is crazy!!
Any luck deploying?
I haven’t had a chance to circle back to this yet! I did see cloudflare had a really nice looking relay for this though that I have been meaning to try!
thanks :)
Thanks for watching!
What was the latency? Also is there a way to have it await the function call return via the websocket? Def a non starter if we just have to deal with it coming back in pieces
This API is too expensive; I think we should avoid sending all chunks. We need a local VAD (Voice Activity Detection) to send only the chunks that contain voice; otherwise, it could become costly.
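A minimal sketch of the local VAD idea, assuming 16 kHz, 16-bit mono PCM in the machine's native byte order. It is a plain energy threshold (RMS per 20 ms frame), not a production-grade VAD, and the threshold and hangover values are guesses that would need tuning per microphone. The names `frame_rms` and `voiced_frames` are hypothetical. Whether this actually lowers the bill depends on how silence is billed, which commenters here disagree on.

```python
import array
import math

SAMPLE_RATE = 16_000   # assumed input sample rate (Hz)
FRAME_MS = 20          # frame length in milliseconds
FRAME_BYTES = SAMPLE_RATE * FRAME_MS // 1000 * 2  # 2 bytes per 16-bit sample


def frame_rms(frame: bytes) -> float:
    """Root-mean-square amplitude of one frame of 16-bit PCM."""
    samples = array.array("h")  # signed 16-bit, native byte order
    samples.frombytes(frame)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def voiced_frames(pcm: bytes, threshold: float = 500.0, hangover: int = 5):
    """Yield only frames that look like speech.

    A frame is kept if its RMS is at or above `threshold`, or if it falls
    within `hangover` frames after the last loud frame, so word endings
    are not clipped. Any trailing partial frame is dropped.
    """
    remaining = 0
    for start in range(0, len(pcm) - FRAME_BYTES + 1, FRAME_BYTES):
        frame = pcm[start:start + FRAME_BYTES]
        if frame_rms(frame) >= threshold:
            remaining = hangover
            yield frame
        elif remaining > 0:
            remaining -= 1
            yield frame
```

In use, you would run microphone audio through `voiced_frames` and forward only the yielded frames to the API instead of every chunk.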
can you make a cartoon character voice with it?
Always getting 429 (too many requests).
Oh interesting - I hadn’t thought about the rate limit for this offering. I haven’t run into any issues yet
Crazy expensive: 3.6 min cost $12.
Yup, just tinkering around to figure out how things work will drain your account. I don't see many people using this unless they have huge funding. Guess most of us will have to wait for open source or for when OpenAI drops the price later. Horrible pricing, OpenAI.
And the voice sounds like crap
@AI_Escaped I'm sure it's going to drop in price a year from now... but I was hoping to start using this today for many use cases. Like many others, I cobbled together a version of this using VAD, STT and TTS to/from GPT Chat Completions, which wasn't overly fast to initial response (3-6 seconds) but was otherwise a decent two-way conversation. I am going to try handling VAD and STT myself (sending text is 1/10th the cost) to see whether converting to text is worth the tradeoff for the lower cost.
It’s normal; this system sends everything to the model, even if you’re not saying anything. It keeps filling the buffer, so we need to add a local VAD.
@johnnylarue3933 This is the way.