World’s Fastest Talking AI: Deepgram + Groq

  • Published: 27 Jan 2025

Comments • 175

  • @geopopos
    @geopopos 10 months ago +3

    I built a voice bot with this exact setup, streaming and all, quite a few months ago, and the biggest issue was the latency! So excited to see this is no longer a problem!

  • @matten_zero
    @matten_zero 10 months ago +29

    Deepgram is the most slept-on AI player.

    • @DataIndependent
      @DataIndependent 10 months ago +3

      yeah they have a ton of good data to work with

    • @deepbhatt6339
      @deepbhatt6339 7 months ago

      Hi, is this completely free and open source?

    • @kelvindimson
      @kelvindimson 7 months ago

      @@deepbhatt6339 you want deepgram to be free? 😂😂😂😂 you lot are funny

  • @StuartJ
    @StuartJ 10 months ago +7

    I purchased a ReSpeaker Mic Array v2.0 for this purpose. It captures speech with great clarity. It works out of the box on Linux, so it should be possible to build a standalone voice assistant with it.

    • @DataIndependent
      @DataIndependent 10 months ago +3

      Cool, that sounds good, please share when it's out

  • @theflipbit01
    @theflipbit01 10 months ago +5

    I experimented with integrating Groq with a Siri shortcut, and it was quite interesting. The response time was pretty impressive.

    • @kaiwenhe5518
      @kaiwenhe5518 6 months ago +1

      how?

    • @theflipbit01
      @theflipbit01 6 months ago

      @@kaiwenhe5518 Groq provides a free API, and it was just a matter of using that by selecting one of the free open-source models with Siri shortcuts. You can see the examples on my channel. I've made the Siri shortcut available in the video description too, so you can edit and try it yourself...😃 I'm afraid it would be considered spamming if I posted a link here lol.

  • @stevecoxiscool
    @stevecoxiscool 10 months ago +6

    Great heads-up on new STT/LLM/TTS technology. I had been working on an Unreal MetaHuman demo which got pretty close to real time using Google STT/ChatGPT/TTS. One of the other things to think about, if one wants to get into 2D/3D talking-head chat apps, is streaming back viseme data as well as TTS audio. Plus maybe emotion tokens of some kind. I can't wait for all of this to be offered on one platform/API service.

    • @nerdyboi-p4l
      @nerdyboi-p4l 8 months ago

      you can only get somewhat close to real time with Google or Deepgram; you should go with on-device

  • @avi7278
    @avi7278 10 months ago +9

    This is exactly what I wanted to do this weekend. Great timing.

    • @damien2198
      @damien2198 10 months ago

      Same, I want to try to do translation (language identification per speaker / switching Whisper models could be tricky)

    • @DataIndependent
      @DataIndependent 10 months ago

      nice to both of you!

    • @avi7278
      @avi7278 10 months ago +1

      @@DataIndependent I ended up pulling down your project and ran into a few issues, fixed them, and made a few improvements like response interruption. It's pretty good, but Groq being limited to Llama 2 and Mistral is a shame. It is at least able to fill the latency gap of the more powerful models, so it still feels more natural even if the first 10 seconds are filler until OpenAI or Claude 3 ingests the input and starts streaming.

    • @DataIndependent
      @DataIndependent 10 months ago

      @@avi7278 That's awesome! If you were open to doing a PR to add interruptions I would definitely want to share it with the community.
      Did you add filler too?

    • @avi7278
      @avi7278 10 months ago

      @@DataIndependent it's on my list for this weekend, cheers, appreciate the starter.

  • @xXWillyxWonkaXx
    @xXWillyxWonkaXx 3 months ago +1

    So in a nutshell, streaming basically means using a websocket: sending the input in small chunks so analysis can start immediately, rather than waiting for the whole thing to be sent at once?
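
    Roughly, yes. A toy sketch of the interim/final transcript pattern a streaming socket delivers (field names loosely modeled on Deepgram's `interim_results` messages; everything here is illustrative, not the SDK):

    ```python
    # Toy illustration of websocket-style streaming transcription: interim
    # messages keep refining the current phrase as audio chunks arrive, and
    # only messages flagged is_final are committed to the transcript.
    messages = [
        {"transcript": "what is", "is_final": False},      # interim guess
        {"transcript": "what is the", "is_final": False},  # refined interim
        {"transcript": "what is the weather", "is_final": True},
        {"transcript": "today", "is_final": True},
    ]

    committed = [m["transcript"] for m in messages if m["is_final"]]
    print(" ".join(committed))  # what is the weather today
    ```

    The point of the interim messages is latency: downstream steps (the LLM, filler words) can start working off a good-enough partial transcript before the final one lands.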

  • @balubalaji9956
    @balubalaji9956 10 months ago +4

    this is exactly what i am looking for.
    thank you youtube.
    love you lots

  • @Slimshady68356
    @Slimshady68356 10 months ago +3

    Thanks Greg, you do a lot for the community. I have respect for what you did with semantic chunking in the langchain repo

  • @clumsymoe
    @clumsymoe 8 months ago +5

    Thank you for creating this tutorial, it's exactly what I was looking for. Great content!

  • @HideousSlots
    @HideousSlots 9 months ago +1

    Conversational endpointing is a great idea, but I'd like to see it combined with a small-model agent that only looks for breaks in the conversation and an appropriate time to interject, maybe with a crude scale for the length of the response. If the user pauses mid-point, we don't want them interrupted and the conversation moved on; a simple acknowledgement would be more appropriate. But once the point is complete, we would then pass back that we want a longer response.

    • @frothyphilosophy7000
      @frothyphilosophy7000 9 months ago

      This. I need something like this for a project, but I'm not very familiar with Groq or Deepgram yet; just starting to dig in. This thing starts responding at the first little pause, so it constantly cuts me off when I'm just pausing momentarily to think of how I want to phrase the rest of my sentence. If it wants to send data at every minor pause in order to understand context, predict the full query, and begin formulating a response, that's fine, but it needs to wait until I've finished my entire input before sending its response. Out of the box, this is like a person who doesn't actually listen to what you're saying and is just waiting for their turn to speak. Is there an easy way to adjust the response timing and/or the detection of when the user has finished a full thought, or do I need to develop that logic from scratch?

    • @HideousSlots
      @HideousSlots 9 months ago

      @@frothyphilosophy7000 not that I've seen. And this would be a massive leap in improving conversation. It literally just needs a small model to parse the text at every pause and see if it's an appropriate time to interject, just the same as a polite human would do. The Groq API should be able to do it. I'm really surprised we haven't seen this effectively enabled anywhere yet.

    • @frothyphilosophy7000
      @frothyphilosophy7000 9 months ago

      @@HideousSlots Gotcha. Yeah, guess I'll need to implement that, as it's unusable otherwise.
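
    The interjection idea in this thread can be sketched with a trivial heuristic standing in for the small classifier model. Everything below is illustrative; a real version would replace `utterance_seems_complete` with a fast LLM call at each detected pause:

    ```python
    def utterance_seems_complete(text: str) -> bool:
        """Crude stand-in for a small 'is it my turn?' classifier: treat the
        utterance as complete if it ends with terminal punctuation and doesn't
        trail off on a hanging conjunction or filler word."""
        text = text.strip()
        if not text:
            return False
        hanging = {"and", "but", "so", "because", "um", "uh"}
        if text.lower().split()[-1] in hanging:
            return False
        return text.endswith((".", "!", "?"))

    def on_pause(text: str) -> str:
        """At each detected pause, either take the turn or just back-channel
        ('mm-hm') and keep listening."""
        return "respond" if utterance_seems_complete(text) else "acknowledge"

    print(on_pause("I think the main issue is latency."))  # respond
    print(on_pause("I think the main issue is, um"))       # acknowledge
    ```

    The key design point is that pause detection and turn-taking are separate decisions: the endpointer only says "silence happened", while this layer decides whether the silence means the floor is free.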

  • @IvarDaigon
    @IvarDaigon 10 months ago +2

    FYI you don't need to use an API for speech-to-text or TTS: both can be run locally, using Faster-Whisper for speech-to-text and Coqui for TTS, even if you don't have the world's most expensive GPU, because both only use a couple of GB of video RAM. Going forward, on-device will be the way to go for TTS and STT because they simply don't require that much processing power.

    • @DataIndependent
      @DataIndependent 10 months ago +1

      Nice - what kind of latency are you getting with those? Along with accuracy

    • @vaisakh_km
      @vaisakh_km 10 months ago +1

      There is also Piper for TTS; it's open source and works really well on my non-GPU laptop

    • @moodiali7324
      @moodiali7324 9 months ago +2

      there is a caveat with your solution: Faster-Whisper does not detect silence out of the box, so you wouldn't know if the user has finished talking or not, which Deepgram does.

  • @abdelkrimdakouan7211
    @abdelkrimdakouan7211 10 months ago +2

    Rasa NLU would be good for intent detection, like greeting, closure, and domain or custom intents

    • @DataIndependent
      @DataIndependent 10 months ago

      I'm not familiar with that - thanks for sharing, I'll check it out

  • @damien2198
    @damien2198 10 months ago +4

    I suppose Whisper with the tiny model would be faster than this Deepgram? Have you tried?

    • @DataIndependent
      @DataIndependent 10 months ago +1

      Nope not yet - let me know how it goes

  • @106rutvik
    @106rutvik 10 months ago +1

    Hi Greg, Rutvik here. We have created something similar to this, but using GPT as the LLM and ElevenLabs as TTS. We are facing issues with silence detection with Deepgram. I know you mentioned in your video at 3:53 that we need to make sure we don't talk too slowly. And unfortunately Deepgram only has a MAX value of 500 ms for endpointing (silence detection). Can you confirm whether we are using the proper configuration with Deepgram? Here it is:
    'punctuate': True,
    'interim_results': True,
    'language': 'en-US',
    'channels': 1,
    'sample_rate': 16000,
    'model': 'nova-2-conversationalai',
    'endpointing': 500
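
    One option worth checking for the silence-detection problem above: besides `endpointing`, Deepgram's streaming API documents a separate `utterance_end_ms` option (it requires `interim_results`) that emits a distinct `UtteranceEnd` message after a longer gap between words, which suits turn detection better than a short endpoint. A sketch, with parameter names taken from Deepgram's docs but worth verifying against the current API reference:

    ```python
    # Sketch: separate fast segment finalization from end-of-turn detection.
    # Parameter names follow Deepgram's streaming docs; verify against the
    # current API reference before relying on them.
    options = {
        "punctuate": True,
        "interim_results": True,      # required for UtteranceEnd messages
        "language": "en-US",
        "channels": 1,
        "sample_rate": 16000,
        "model": "nova-2-conversationalai",
        "endpointing": 500,           # ms of silence to finalize a segment
        "utterance_end_ms": 1500,     # longer gap signaling the speaker is done
    }

    def is_end_of_turn(message_type: str) -> bool:
        """Only the longer UtteranceEnd gap ends the user's turn; ordinary
        finalized Results segments just extend the running transcript."""
        return message_type == "UtteranceEnd"

    print(is_end_of_turn("Results"))       # False
    print(is_end_of_turn("UtteranceEnd"))  # True
    ```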

  • @souravbarua3991
    @souravbarua3991 10 months ago +2

    Looks cool. Thank you for showing this idea. Will definitely implement it in my project.

    • @DataIndependent
      @DataIndependent 10 months ago

      Nice! Good luck! What is the project you're building?

    • @souravbarua3991
      @souravbarua3991 10 months ago

      @@DataIndependent I haven't started yet. But I will start soon.

  • @AdrianIsfan
    @AdrianIsfan 10 months ago +1

    Am I the only one not able to hear the TTS back even though I installed ffmpeg? What am I missing? I tried both from VS Code and from a normal terminal... nothing plays, no errors though. The connection to Deepgram is checked and successful.
    Any hints?

    • @TheColdharbour
      @TheColdharbour 10 months ago

      I'd also love to know this. I'm in the same boat! It works... just silently for me too

    • @DataIndependent
      @DataIndependent 10 months ago +1

      I updated the code, try again or make these changes yourself
      github.com/gkamradt/QuickAgent/commit/21ae2b0e286759e186e12a76addd250a5a491381

    • @ChigosGames
      @ChigosGames 25 days ago

      I had the same, tried everything, didn't work. Ended up implementing it with other TTSs like ElevenLabs.

  • @Todorkotev
    @Todorkotev 10 months ago

    Getting an error when firing off the "request" in "speak()": "err_msg":"Failed to deserialize JSON payload. Please specify exactly one of `text` or `url` in the JSON body." It works, though, if I take the "voice" attribute out of the "payload" AND also change the model to the one in their docs, which is "aura-helios-en". Other than that, thank you so much for sharing! It's hard work!

    • @DataIndependent
      @DataIndependent 10 months ago +1

      Thanks for this - I found the same problem. I had it working in beta (before the video came out), so these were the changes needed for their prod version.
      Updated!

    • @Todorkotev
      @Todorkotev 10 months ago

      @@DataIndependent Thanks Greg! Awesome content! I appreciate you!
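
    For reference, a minimal sketch of the corrected request shape this thread describes: the model is selected via the query string, and the JSON body carries only `text`. The model name and URL shape come from the thread itself; double-check Deepgram's current docs before relying on them.

    ```python
    import json

    def build_speak_request(text: str, model: str = "aura-helios-en"):
        """Build a Deepgram Aura TTS request per the fix in this thread:
        model in the query string, body containing exactly one key, `text`
        (no `voice` attribute, which triggers the deserialization error)."""
        url = f"https://api.deepgram.com/v1/speak?model={model}"
        body = json.dumps({"text": text})
        return url, body

    url, body = build_speak_request("Hello there")
    print(url)   # https://api.deepgram.com/v1/speak?model=aura-helios-en
    print(body)  # {"text": "Hello there"}
    ```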

  • @ajaykumarreddy8841
    @ajaykumarreddy8841 6 months ago +1

    Hi Greg. Great video! Thanks for sharing.
    But I have some issues when running the code:
    Firstly, the speech-to-text performance is not very good. I literally have to shout into my mic for it to hear me. I thought it was a microphone issue, but I tested the mic in a simple voice recorder and it worked as expected.
    Secondly, the text-to-speech output keeps breaking up. Not sure if that is expected because of ffplay, but it definitely wasn't as smooth as what you showed in the video.
    Thirdly, voice input is not recognized immediately after a response. There is a small but noticeable delay between when the voice response finishes and when I can start speaking again, even though the console says "Listening". I have to wait for 5-10 seconds before the program will recognize my voice.
    Is anyone else facing the same issue?

  • @jeffersonhighsmith7757
    @jeffersonhighsmith7757 6 months ago

    "Filler words" - this is hugely important, IMO, because it's literally how human beings speak. They use these delaying tactics constantly.

  • @ScottSummerill
    @ScottSummerill 10 months ago +1

    The original Groq demo had an impressive speech demo. How did that work? The interviewer interrupted Groq repeatedly.

    • @DataIndependent
      @DataIndependent 10 months ago +1

      Can you link it?

    • @ScottSummerill
      @ScottSummerill 10 months ago

      ruclips.net/video/pRUddK6sxDg/видео.htmlsi=Kg15nRUEEr1AHTGx

    • @ChigosGames
      @ChigosGames 25 days ago

      @@DataIndependent I think Scott means this clip: ruclips.net/video/pRUddK6sxDg/видео.htmlsi=LYowVW7oODcbfqhh&t=233

  • @darkreader01
    @darkreader01 10 months ago +2

    This is exactly what I was looking for. Thanks. But the text-to-speech function doesn't seem to be working on my Windows machine. I tried writing the audio to a WAV file (for debugging purposes), but that file can't be played either. I thought it might be a codec issue, so I tried converting the file to MP3 online, but got an error message: "Invalid data found when processing input".
    Any idea how I can get the text-to-speech function working? Another note: it doesn't show any error message in the terminal, it just doesn't play the audio.

    • @ocin3055
      @ocin3055 10 months ago +1

      Same issue here. Hoping for an answer, too.

    • @DataIndependent
      @DataIndependent 10 months ago

      Try again with these changes
      github.com/gkamradt/QuickAgent/commit/21ae2b0e286759e186e12a76addd250a5a491381

  • @cameronlanier7169
    @cameronlanier7169 10 months ago

    Super fast example - really highlights the power of this tech

  • @jellybeanthe2nd223
    @jellybeanthe2nd223 4 months ago

    I'm having a lot of problems with the "building blocks" you put on GitHub. Could you release the full finished script?

  • @DadEncyclopedia
    @DadEncyclopedia 5 months ago

    I don't know why there is no sound when running TTS.

    • @didarkamiljanow4488
      @didarkamiljanow4488 4 months ago

      same issue, did you figure out why the sound isn't playing?

    • @Blampa1456
      @Blampa1456 4 months ago

      @@didarkamiljanow4488 same issue, have you figured it out? Haha

  • @VeronicaLightspeed
    @VeronicaLightspeed 8 months ago +1

    how can we interrupt the ai??? plsss helpp

    • @kapilkevlani145
      @kapilkevlani145 7 months ago

      Looking for the same. Did you get any solution?

  • @urglik
    @urglik 9 months ago +1

    This app won't find my API keys, either Groq or OpenAI, though they are there. Too bad. Any suggestions, Greg?

    • @urglik
      @urglik 8 months ago

      APIs aren't being found either!

    • @ChigosGames
      @ChigosGames 27 days ago

      Try it with a Deepgram API key instead of OpenAI

  • @hjoseph777
    @hjoseph777 5 months ago

    Can this be installed locally? Can you provide more detail on how to do the first step? The transcription is extremely fast

  • @personal1872
    @personal1872 5 months ago

    I am not able to install PyAudio in WSL2

  • @abhijoy.sarkar
    @abhijoy.sarkar 15 hours ago

    I think you can get faster using audio-to-audio models like Ultravox.

  • @You_Got_Us
    @You_Got_Us 10 months ago

    Great video, Greg!

  • @aravindchandrasekaran8838
    @aravindchandrasekaran8838 8 months ago

    Could you also add the latency for audio-to-text of your voice?

  • @BitBlendAi
    @BitBlendAi 10 months ago

    Facing WARNING: API key is missing
    Could not open socket: server rejected WebSocket connection: HTTP 401
    on the STT model

    • @DataIndependent
      @DataIndependent 10 months ago +1

      Ya… you'll need an API key, which I can call out better in the readme

    • @BitBlendAi
      @BitBlendAi 10 months ago

      @@DataIndependent I have my own API key with $300 credit and it's still showing this error. Can you please send me a screenshot of your .env file (hide your API key)?

  • @yudyjimeneztv692
    @yudyjimeneztv692 10 months ago

    I have this error: C:\Users\vnt>pip install deepgram
    ERROR: Could not find a version that satisfies the requirement deepgram (from versions: none)
    ERROR: No matching distribution found for deepgram

    • @Todorkotev
      @Todorkotev 10 months ago

      Are you trying to use their Python SDK? Maybe try "pip install deepgram-sdk"

    • @DataIndependent
      @DataIndependent 10 months ago +1

      Yeah - the suggestion here is the key
      github.com/deepgram/deepgram-python-sdk?tab=readme-ov-file#installation

  • @InterpretingInterpretability
    @InterpretingInterpretability 8 months ago

    Could you help me understand what's going on? I'm running this in Docker and keep getting an error when it gets to running the .py file, where it's trying to use the ffmpeg audio player and PyAudio.

  • @crystalstudioswebdesign
    @crystalstudioswebdesign 9 months ago

    Can this be added to a website?

  • @cyberthugFi
    @cyberthugFi 4 months ago

    Do you have a GitHub repository I can git clone?

  • @kapilkevlani145
    @kapilkevlani145 7 months ago

    Can somebody help me with interruption handling, as in the video? I have created a voicebot which runs in a UI, but unfortunately it cannot handle voice interruptions.

  • @kevinduck3714
    @kevinduck3714 10 months ago

    Holy hell that is incredibly fast

  • @michielsmissaert
    @michielsmissaert 9 months ago

    Great stuff! Did you cut the video to reduce the waiting time for the LLM response? If you did not, the speed is impressive! Thank you so much!

  • @NoidoDev
    @NoidoDev 4 months ago

    I want an LPU for home (local hosting).

  • @TheRealHassan789
    @TheRealHassan789 10 months ago +1

    How's this compare against Vapi?

    • @DataIndependent
      @DataIndependent 10 months ago

      Haven't tried that yet

    • @mmdls602
      @mmdls602 10 months ago +1

      Vapi is extremely expensive. I think in a couple of months we should be close to Vapi's performance with open-sourced models and tooling.

    • @Blampa1456
      @Blampa1456 4 months ago

      @@mmdls602 I have been using Vapi. It works perfectly for how I want to integrate this for my clients, but it's $0.12/minute, which is too expensive. Do you have any suggestions? Thank you!

  • @viralbakchodi2296
    @viralbakchodi2296 9 months ago

    Sir, can you provide a full tutorial of this plzzzz

  • @elirothblatt5602
    @elirothblatt5602 10 months ago

    Awesome, subscribed on the strength of this video.

  • @BezosAIDirector
    @BezosAIDirector 10 months ago +4

    as a total beginner, I can't find a way to run this.

    • @mnagy0101
      @mnagy0101 9 months ago

      Same here..

    • @hamishbrindle9754
      @hamishbrindle9754 8 months ago

      aww

    • @lakshaynz
      @lakshaynz 5 months ago

      Whisper is free - a demo is on Hugging Face

    • @lakshaynz
      @lakshaynz 5 months ago

      Google "groq AI" for the demo of ultrafast AI

  • @Javed.humayun
    @Javed.humayun 6 months ago

    how can I put this in a web app, any idea?

  • @arunarun0386
    @arunarun0386 5 months ago

    Can you provide the GitHub URL for this code?

  • @nessrinetrabelsi8581
    @nessrinetrabelsi8581 9 months ago

    Thanks! How does it compare with AssemblyAI Universal-1? Do you know which speech-to-text supports Arabic with the best accuracy in real time?

  • @princecanuma
    @princecanuma 10 months ago

    Great video, Greg!
    You gave me some interesting ideas :)

    • @DataIndependent
      @DataIndependent 10 months ago

      Awesome - excited to see what you build

  • @tecnopadre
    @tecnopadre 10 months ago

    On Windows, having some problems with libraries. After fixing all the dependencies, etc., I think I'm having problems with ffplay sending the audio to the speakers. Taking a look. Somehow I can't hear ffplay.

    • @aiamaazing
      @aiamaazing 10 months ago +1

      Debug the response from TTS without streaming first. I had to change the URL (it was some beta URL); they have changed it now, and the one on GitHub right now returns an internal error (non-200 response), so the audio won't stream.

    • @DataIndependent
      @DataIndependent 10 months ago

      Both good points, I'll update the URL at least

    • @ferencdalnoki-veress163
      @ferencdalnoki-veress163 10 months ago

      @@DataIndependent I tried it on Ubuntu Linux and (I think) I also have a problem with ffplay. While it does convert my voice to text and the LLM responds with text, it does not convert the text to sound. To test it I did different checks to verify that the connection with Deepgram is working. I used a test script where I called the Deepgram API with a text message and streamed the audio directly back to ffplay for playback, and it worked. So that is why I am puzzled that the code is not working on the Linux side. Any help is appreciated. I truly enjoy your thoughtful tutorials and videos.

    • @ferencdalnoki-veress163
      @ferencdalnoki-veress163 10 months ago +1

      There was an issue with the payload. When I commented out the "voice": self.MODEL_NAME it worked. I also changed from ffplay to pygame on Linux, but that may not have been the issue.

  • @VeronicaLightspeed
    @VeronicaLightspeed 8 months ago

    how could we interrupt the voicebot, can anyone help (pls)

  • @aliabassi1
    @aliabassi1 10 months ago

    Thanks for this! Testing it now... the TTS audio isn't streaming and I'm struggling to fix it, but... THANK YOU for sharing your code! Super helpful and informative video!

    • @TheColdharbour
      @TheColdharbour 10 months ago +1

      I'm in the same situation: got the listening and response working great, but no TTS. Spent all day breaking and fixing it 😂 still confused why it won't talk! Great guide (I needed Python 3.11 to get it working and had some issues with dotenv; ended up hard-coding the API keys) - it's a great piece of work! 👍👍

    • @DataIndependent
      @DataIndependent 10 months ago

      Try these updates
      github.com/gkamradt/QuickAgent/commit/21ae2b0e286759e186e12a76addd250a5a491381

  • @deeplearningdummy
    @deeplearningdummy 9 months ago

    Awesome, Greg! Best TTS-STT demo yet. Do you have any ideas on how to modify your example for two people having a conversation, with the AI participating as a third person? For example, debate students are debating and want the AI to be the judge to help them improve their debate skills. I would love to hear your thoughts on this. Thanks for this tutorial. I've been looking for this solution since the '90s!

    • @ChigosGames
      @ChigosGames 25 days ago

      This would only work if the two people keep talking to each other without any pauses. But I am also interested in seeing this working.

  • @FedeTango
    @FedeTango 9 months ago

    Is there any alternative for Spanish? I cannot find one.

  • @ParthivShah
    @ParthivShah 3 months ago

    Pretty Amazing! Thanks.

  • @andrejss
    @andrejss 8 months ago +1

    Thank you! Amazing!

  • @gjsxnobody7534
    @gjsxnobody7534 10 months ago

    how do I add those filler words?

  • @loryo80
    @loryo80 10 months ago

    thank you so much for this video. I have a problem when I run the script; I get this error message: "Could not open socket: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:997)"
    any help please

    • @DataIndependent
      @DataIndependent 10 months ago +1

      I had this problem a while back too and it was super annoying. The solution was a lot of googling, and I think I might have even used it as a push to upgrade my Python

    • @loryo80
      @loryo80 10 months ago

      @@DataIndependent This doesn't look good for me then, hhhhhh. Was the solution obvious or did it require a lot of fiddling?
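
    For anyone else hitting `CERTIFICATE_VERIFY_FAILED`: it usually means Python can't find an up-to-date CA bundle (common with python.org installs on macOS, where running the bundled "Install Certificates.command" is the standard fix). Another common workaround, assuming the `certifi` package is available, is to hand the websocket/HTTPS client an SSL context built from certifi's bundle:

    ```python
    import ssl
    import certifi  # third-party package shipping an up-to-date CA bundle

    # Build a default SSL context that trusts certifi's CA bundle instead of a
    # possibly stale system store; pass this context to the websocket or HTTPS
    # client that raised CERTIFICATE_VERIFY_FAILED.
    ssl_context = ssl.create_default_context(cafile=certifi.where())

    print(ssl_context.verify_mode == ssl.CERT_REQUIRED)  # True
    ```

    Prefer this over disabling verification entirely, which silently removes the protection the error is warning about.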

  • @janyshosmon7414
    @janyshosmon7414 10 months ago

    This is great, thanks for the awesome demo. I'm not a tech person, but can I clone my sales agent's voice using ElevenLabs and integrate it into this process? I guess I can train the LLM to respond in his tone and sales style? Thanks

    • @DataIndependent
      @DataIndependent 10 months ago

      Yeah, you could for sure. It would take a bit of work back and forth, but it's there

  • @zahaby
    @zahaby 10 months ago

    Thank you very much for the video. How can I integrate RAG into this pipeline?

    • @DataIndependent
      @DataIndependent 10 months ago +1

      Put it in the LLM step: set up a retriever and then call to it

    • @zahaby
      @zahaby 9 months ago

      @@DataIndependent Yes, thank you very much.
      vectordb = FAISS.from_documents(all_splits, embeddings)
      retriever = vectordb.as_retriever()
      BUT the LLM doesn't accept a retriever param,
      so I used ConversationalRetrievalChain
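
    The "put it in the LLM step" suggestion boils down to: retrieve context first, then prepend it to the prompt before the LLM call. A toy, dependency-free sketch of that flow (a real pipeline would use a vector store such as FAISS, as in the reply above; all names and the overlap scoring here are illustrative):

    ```python
    def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
        """Toy retriever: rank documents by word overlap with the query.
        Stands in for a real vector-store similarity search."""
        q = set(query.lower().split())
        return sorted(docs, key=lambda d: len(q & set(d.lower().split())),
                      reverse=True)[:k]

    def build_prompt(query: str, docs: list[str]) -> str:
        """Stuff the retrieved context into the prompt before calling the LLM."""
        context = "\n".join(retrieve(query, docs))
        return f"Context:\n{context}\n\nQuestion: {query}"

    docs = [
        "Deepgram handles speech to text.",
        "Groq serves the LLM with low latency.",
        "The TTS step streams audio back.",
    ]
    print(build_prompt("Which service handles speech to text?", docs))
    ```

    In the voice pipeline this fits between the finalized transcript and the Groq call, so retrieval latency adds directly to response time and is worth measuring.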

  • @maxxflyer
    @maxxflyer 5 months ago

    which languages will work? where can I check? thanks

  • @digitald74
    @digitald74 10 months ago

    very nice, many thanks. The latency of the LLM is still way too high. Maybe 6-12 months to bring it down to

  • @lets-talk-ai
    @lets-talk-ai 6 months ago

    Amazing video as always, thanks Greg. Did anyone else have issues with ffplay?

    • @ChigosGames
      @ChigosGames 25 days ago

      I think all Windows users... No sound.

  • @mnagy0101
    @mnagy0101 9 months ago

    Is there any developer who can help me set this up?

  • @hiranga
    @hiranga 10 months ago

    @DataIndependent great video! Do you know how to get ChatGroq / Mixtral working with Langchain 'bind_tools'? I'd love to swap out my ChatOpenAI for ChatGroq if possible!

    • @DataIndependent
      @DataIndependent 10 months ago

      I haven't tried that out yet - sorry!

  • @Celso-tb6eb
    @Celso-tb6eb 9 months ago

    I cloned the code but the response time is like 12 seconds. 4 weeks have passed and I'm late to the party

  • @dgpreston5593
    @dgpreston5593 7 months ago

    Nicely done

  • @BlayneOliver
    @BlayneOliver 10 months ago

    Wow that was fast… Grok literally went open source 😮

    • @DataIndependent
      @DataIndependent 10 months ago

      I was able to sneak early access for the video

  • @sethjchandler
    @sethjchandler 10 months ago +2

    Would be great for training lawyers to take depositions and do other oral tasks.

    • @J3R3MI6
      @J3R3MI6 10 months ago +2

      Yep it would… But unfortunately lawyers are probably done for… like 99% of them.

    • @DataIndependent
      @DataIndependent 10 months ago

      It'll help augment roles ha, not replace

  • @RADKIT
    @RADKIT 10 months ago

    Showcase it within a Streamlit app doing Langchain shenanigans! Please!

    • @DataIndependent
      @DataIndependent 10 months ago +1

      Like what kind of shenanigans?

    • @RADKIT
      @RADKIT 10 months ago

      @@DataIndependent
      Starter difficulty:
      A Streamlit app using Langchain and Deepgram that lets you upload and embed a PDF; then, when ready, we can simply chat with it live, asking questions and retrieving information.
      Advanced difficulty / aspirational:
      An agent using function calling and a set of tools like web search / calculators etc...
      If we could have an asynchronous continuous conversation with one supervisor agent who can asynchronously "ask" other agents to do time-consuming tasks, like being able to talk to the supervisor agent in a Langgraph schema

  • @JOSEGARCIA-ch2jp
    @JOSEGARCIA-ch2jp 7 months ago +1

    pricing, pricing, pricing always pricing.

  • @paulhilton74
    @paulhilton74 10 months ago

    Amazing video. Just wish it had a walkthrough of the instructions

  • @andrewtschesnok5582
    @andrewtschesnok5582 9 months ago +1

    Nice. But in reality your demo is 3,500-4,000 ms from when you stop speaking to getting a response. It does not match the numbers you are printing...

  • @ahmadalis1517
    @ahmadalis1517 10 months ago

    I miss your videos!

  • @interspacer4277
    @interspacer4277 10 months ago

    ElevenLabs Turbo with Deepgram STT. As tested. Can't beat it.

  • @suvarnadhiraj
    @suvarnadhiraj 10 months ago

    do you know if any of the open-source models (STT and TTS) with Groq give the same latency?

    • @DataIndependent
      @DataIndependent 10 months ago

      My guess is no - OSS usually isn't as fast as paid

  • @igornefedovi
    @igornefedovi 10 months ago

    Wow - incredible!

  • @thedoctor5478
    @thedoctor5478 10 months ago

    Doesn't this suck? I've been subscribed to Deepgram for quite a while now. You'd think they would have some good competition. Alas, none have stuck around, and open source is infested with dragons.

  • @JasonHamilton38
    @JasonHamilton38 10 months ago

    I'm not a coder, but can someone please build one of these that I can pay for, so I can basically have my own Jarvis to talk to on my computer?

  • @gaborm5673
    @gaborm5673 10 months ago

    I can't drop the LLM below 3000 ms :( maybe I'm fucking it up?

  • @SzamBacsi
    @SzamBacsi 3 months ago

    Groq performs well with English but struggles with more complex languages such as Hungarian.

  • @coopernelson6947
    @coopernelson6947 8 months ago

    Yuppppp

  • @gaijinshacho
    @gaijinshacho 10 months ago

    No good can come of this.....lol

    • @DataIndependent
      @DataIndependent 10 months ago

      yeah - it'll be tricky to navigate but there will be use somewhere

  • @EDUSidekick
    @EDUSidekick 6 months ago

    The voices are all terrible, imho. They would work well in a cyberpunk game - but that's about it. :D

    • @ChigosGames
      @ChigosGames 25 days ago

      Try it with ElevenLabs voices.