How to Build Your Own AI Phone Assistant for Just 1¢/Minute (No Cloud, 1 Second Latency)

Bart Slodyczka

Просмотров 5 тыс.

Добавить в
- Мой плейлист
- Посмотреть позже
Поделиться

HTML-код

Размер видео:

Показать панель управления

Автовоспроизведение

Автоповтор

Опубликовано: 24 дек 2024

Комментарии • 92

@BartSlodyczka Месяц назад
📺 WATCH PART 2 - AI Cold Caller With Google Calendar: ruclips.net/video/J3d92Ak-P7o/видео.html
👉 GET THE CODE FOR FREE: bartslodyczka.gumroad.com/l/zsjdn
🛠 Hire me to build out an EPIC AI Voice Assistant for you: bart@supportlaunchpad.com
🧠 If you are interested in joining my incubator please fill out this form: forms.gle/KJxiqhB3aWxbgGoh8
📋 Take This Quick Survey: forms.gle/otAr1xUamgyYZE5y7
@JhnyBravos 14 дней назад
Don’t download the free code; it doesn’t work. Don’t anger yourself.
@mmdls602 29 дней назад ⁺²
Works flawlessly. The peeps mentioning latency -- its most likely your connection. I have consistently achieved sub 1 second, almost realtime performance with this. Nicely done dude. Function calling would be neat; especially crud ops with a db
@BartSlodyczka 29 дней назад
Noiccee!!!! 💪
@malikjaid5163 4 дня назад
Amazing video
I have one question, why are we using replit, can we deploy it on own servers like ec2 , and what things we need to change if done so.. thankyou
@arnabing 24 дня назад
This is amazing work! How does this compare in intelligence of the OpenaAI realtime api?
@BartSlodyczka 24 дня назад
Realtime API is MUCH better and if you can afford it, I would use that. The main reason is because the backend of the realtime api is a built in thread so you’re having a conversation with an “agent” - whereas in this set up we’re sending calls to the completions endpoint along with the entire conversation history. So it’s still very good, but inherently it is not an “agent” (so to speak). For basic calls/ tasks this current set up works great :)
@arnabing 23 дня назад
@ appreciate that! Also there’s the conversion delay. I wish the realtime was cheaper and had other voices.
@gurindersingh1713 Месяц назад
Yes really wanna see function calling like book appointments and transfer calls. Btw isn't it easier to do with livekit?
@BartSlodyczka Месяц назад
Good suggestions, will pencil them in 💪 have never used live kit before will check it out :)
@gurindersingh1713 Месяц назад
@@BartSlodyczka bro you can handle alot with livekit more easily. make sure you check it out. you will thank later, thats how good it is
@wordpressobsessed9067 Месяц назад
Thanks for this video! I've been meaning to set this up with the real time Twilio API, but just haven't gotten to it yet. Been using Vapi but its so expensive. i would like to see how to transfer a call to a real person, or actually book an appointment in a Google calendar. Definitely Eleven Labs integration too!
@BartSlodyczka Месяц назад
Great suggestions, google calendar keeps coming up so I will also look into this :)
@radoslav07 Месяц назад
Can you interrupt current voice response? Or can you try to finish your thought if you didn’t manage to say it in full and the agent started voice response? Like saying “continue” which will interrupt the response keeping the previous input prompt and allowing you to properly finish input prompt.
I implemented this Command words using Microsoft Azure speech services with continuous voice recognition.
+1 for adding function calling
@BartSlodyczka Месяц назад
You can do interruptions and toward the end of my video in the final demo I interrupt and continue speaking about the same topic, and the response was in line with what I was saying. The mechanism that sends API calls to the GPT actually holds all conversation items (user message and agent response) and sends the entire history with each api call, so each response is always contextually correct. I don't know how efficient this process is, but it works for now. And haven't thought about commands just yet, but good idea! And noted on function calling 🙏
@brentpope1497 29 дней назад
Yes 11 Labs, definitely!
Also, would love to see how you would implement a script rather than a faq
@BartSlodyczka 29 дней назад
Script is a solid idea, will do more thinking about this :)
@bradleyfraser4026 Месяц назад
I would like to see more the infrastructure side. How to have a small call centre structure
@BartSlodyczka Месяц назад
Very interesting suggestion! I will do more thinking about this 💪
@aliabassi1 Месяц назад ⁺¹
Solid build man amazing job!
@BartSlodyczka Месяц назад
Thank you legend :)
@danielpistola Месяц назад
why not use openai's realtime API? just because of the voices, right? please pardon my ignorance
@BartSlodyczka Месяц назад ⁺²
I’ve got other videos showing how to do that too 💪 but the realtime api is currently like 30 cents per minute to run, and since it’s still in beta it has some stability issues. But realtime api is very fast and I’m sure all the kinks will be ironed out soon :) great question to ask legend
@danielpistola 29 дней назад
@@BartSlodyczka
@emmanuelkolawole6720 Месяц назад
When I interrupt, the agent stops talking. Is there some kind of bug? I think it has to do with speaker. When I put my phone call on speaker the agent does not reply with audio after the third or fourth interaction. But when I take the phone off speaker it works fine
@BartSlodyczka Месяц назад
Hmm, that is strange. When I demo'd the interaction on youtube I had it on speaker and I had multiple conversation turns (so I spoke many times and the ai replied many times). Not really sure what it could be 🙏
@matt.lehodey Месяц назад
Need to figure out how to make that reasoner model that formulates the text think on graph now hmm
@BartSlodyczka Месяц назад
Very interesting 🤔
@neozys 26 дней назад
great it works! Can you expand on implementing function calling and eleven labs or cartesia as an alternative for TTS
@BartSlodyczka 24 дня назад
Awesome! And done will pencil it in 💪
@cryptnyuz6842 Месяц назад
can this ai agent can also speaks in different languages or just restricted to english only ?
@BartSlodyczka Месяц назад ⁺¹
Haven't tested but should be able to speak in different languages!
@mmdls602 29 дней назад
@@BartSlodyczka Tried it; doesn't come out as good as chatgpt, but it definitely works. I just added a line "you can understand and reply in Punjabi" in the prompt haha. The bottleneck in this pipeline is Deepgram's transcription.
@KasanThe Месяц назад
hmm what about using gsm modem for calling - AT commands and you are in home. or use voip gateway. Second thought i was thinking about building same purpose app but my main goals are be independent - selfhosted and do it as 'realistic ' as possible with low latency. Using external api it is to easy, building whole from scratch is a good challange to get to know with whole llm - ai -stuff.
@BartSlodyczka Месяц назад
I have heard of people using a local LLM to run the backend and it is possible, fast, and cheap if you did it this way. I haven't looked into this yet but there may be other videos about this online already. As for calling with GSM modem or VOIP, great ideas!
@reider340 Месяц назад
Hello Bart,
If you were to use deepgram's TTS Streaming service instead of plain REST api calls, wouldn't the response time be faster?
@BartSlodyczka Месяц назад
Hey legend, yes you're 100% correct, would be even faster than standard REST api calls. I think using elevenlabs streaming would be faster yet again. So really, there is so much opportunity in this code to have a really fast, really cheap AI Caller 💪
@danielpistola Месяц назад
can we do this connecting it to a custom GPT?
@BartSlodyczka Месяц назад
Yes you can, but this will be slightly more unstable as the assistants api is in beta (and there are like 5 or 6 api calls per request)
@danielpistola 29 дней назад
@@BartSlodyczka That makes sense. Thanks a lot for taking the time to respond!
@wawaldekidsfun4850 Месяц назад ⁺²
Cool tech demo, but let's think twice about automating every customer interaction just because we can. Sure, AI phone systems are cheaper than human staff, but real human connection in customer service is priceless. Personal relationships, genuine empathy, and human judgment are what build lasting customer loyalty. Maybe instead of replacing humans, we should use AI to help them do their jobs better? Sometimes the 'old way' with real people is still the best way, even if it costs more than 1¢ per minute. 🤔 Great tutorial though - the technical implementation is impressive!
@BartSlodyczka Месяц назад ⁺¹
Thank you and excellent point, for pretty much my entire journey with ai I have this assumption/ belief that initially businesses will adopt ai to save costs and have faster experiences, but then when everyone uses ai, the question will become “what is actually a good support experience?” And for that I think businesses will revert back to human support. It might not be 100% human, but maybe 50/50 with ai and human. Either way, I still use a 100% human customer support team for my ecommerce brand, but I do give my agents ai tools and augment other parts of our support experience with ai (eg ai chatbot, ai search on our help desk). I agree the tech is cool but we should use it wisely 💪 love the comment, I always want to see this kind of discussion 🤝
@ColdCallSteve Месяц назад
I couldn’t find your video where you layout how to use Ai on how to help real humans do their jobs. Any help?
@danielpistola Месяц назад
What about the MANY times customer service doesn't give a damn about their job and treat customers as if they were asking for a favor. What about the long waiting times? What about the lack of good manners?
@reserseAI Месяц назад
Its priceless when employing “customer service” not lazy employees
@VijiJohn-w3p 27 дней назад
It's the pareto 80/20 rule. 80% of CS requests are easily manageable and answerable through the various channels (bots, agents, knowledge base etc). It's then augmenting this with the human experience for the 20% of more involved requests of support and service.
@solarexclusivePL Месяц назад
Hello Bart! Do you think its possible to create something like this for polish market? But without using Twilio cause their rates are crazy
@BartSlodyczka Месяц назад
Siema! I'm not sure what Twilio alternatives work in Poland but you should be able to forward calls from the provider to the Replit code :) And I'm pretty sure you can also change the language to polish - so then you'd have a mega AI Caller 💪
@emmanuelkolawole6720 Месяц назад
Outbound agent please? In a way that we can schedule multiple calls one after another, to different customers
@BartSlodyczka Месяц назад
Great suggestion, will pencil it in!
@zubairkhankharooti3621 Месяц назад
hi bart... First of all thankew... Secondly... are you going to extend this video... like adding functions/tools..... that's the main purpose of building these callers.....
@BartSlodyczka Месяц назад ⁺¹
Hey legend! Yeah I will make a part 2 video with function calling 💪
@zubairkhankharooti3621 Месяц назад
@@BartSlodyczka thanks legends Chief...
@robertfigueroa425 29 дней назад
thank you so much.amazing video.i look forward to your other videos. im looking to create super reliable appointment booking ai assistants.i would definitely apppreciate a video on that subject.thank you.
@BartSlodyczka 29 дней назад ⁺¹
great suggestion my man!
@asithakoralage628 Месяц назад
You’re a legend mate,, great work. I’m learning a lot from your videos.. thanks mate.
@BartSlodyczka Месяц назад
Thank you very much 🤝 keep going man 🚀🚀
@mastermason Месяц назад ⁺²
Awesome! Thank you for sharing this. I have big plans for you.
@BartSlodyczka Месяц назад
Shit yeah!! 💪
@TheSopk Месяц назад
Thanks, what about Deepgram Voice Agent API Real Time?
@BartSlodyczka Месяц назад
Haven't thought about this before! Nice suggestion 💪
@cb4623 Месяц назад
Function calling booking appoinments
@BartSlodyczka Месяц назад ⁺¹
Penciling it in 💪
@mikew2883 Месяц назад
Very cool stuff! Function call would be nice to see. 👍
@BartSlodyczka Месяц назад ⁺¹
Thank you and done will pencil this in 💪
@vladimirrumyantsev7445 Месяц назад
Very nice explanation, love, watching your videos 👍
@BartSlodyczka Месяц назад
Thank you 💪
@erickmarin228 Месяц назад
Awesome! Thanks for sharing. I will definitely give it a try
@BartSlodyczka Месяц назад
Woot woot! Enjoy :)
@victorvanvas Месяц назад
FIRE CONTENT AS USUAL
@BartSlodyczka Месяц назад
Thank you Viski 💪
@digitalsoultech Месяц назад
Sorry but how is this 1c per minute? I'd really love to know how you came to that conclusion
@BartSlodyczka Месяц назад
I calculated the number or transcription minutes (STT) along with the characters spoken (TTS) via deepgram, then I compared this to the total cost spent via deepgram. This came to ~0.89 Cents (so under 1 Cent). From there I looked at OpenAI API Usage for the same period, which was negligible. So then I decided to just say it was 1 cent total. Hope this makes sense 💪
@Scienceiscool355 4 дня назад
Eleven labs plz
@micbab-vg2mu Месяц назад
thanks:)
@BartSlodyczka Месяц назад
Always 🤝
@sanjuburkule Месяц назад
this is 2s latency. didn't work.
@BartSlodyczka Месяц назад ⁺¹
Can be even faster with streaming api for deep gram TTS and even faster with streaming TTS elevenlabs
@zubairkhankharooti3621 Месяц назад ⁺¹
The problem is in sanju not in the app..
@sanjuburkule 28 дней назад
@zubairkhankharooti3621 You try it. Let me know if you are able to get 1s latency. Text to speech and speech to text WITH interruption support from India did not work. But I do want it to work. I will retry and post my findings. If it works, then awesomeness 👌
@sanjuburkule 28 дней назад
@mmdls602 mentioned he tried it and it worked for him. Let me find the fault in my deployment.
@Dispo-co4po Месяц назад
🔥🔥🔥🔥🔥🔥🔥
@BartSlodyczka Месяц назад
Letsss goooo 💪💪
@magicaldocs 17 дней назад
But this definitely has HORRIBLE turn taking, emotion detection and latency ..
Or Am i wrong ? Thats what the secret sauce of Retell, Vapi is :)
@BartSlodyczka 17 дней назад
Yeah the value prop here is the 1 cent per minute cost, and I agree that other purpose built tools like Retell and Vapi are better at the backend operations of AI calling systems 💪

Следующие

Автовоспроизведение

How to Code Your Own AI Cold Caller for Just 1¢ Per Minute