Creating J.A.R.V.I.S.

  • Published: May 15, 2024
  • A sneak peek of a voice-to-voice chat assistant.
    🦾 Discord: / discord
    ☕ Buy me a Coffee: ko-fi.com/promptengineering
    🔴 Patreon: / promptengineering
    💼Consulting: calendly.com/engineerprompt/c...
    📧 Business Contact: engineerprompt@gmail.com
    Become a Member: tinyurl.com/y5h28s6h
    💻 Pre-configured localGPT VM: bit.ly/localGPT (use Code: PromptEngineering for 50% off).
    Sign up for Advanced RAG: tally.so/r/3y9bb0
    All Interesting Videos:
    Everything LangChain: • LangChain
    Everything LLM: • Large Language Models
    Everything Midjourney: • MidJourney Tutorials
    AI Image Generation: • AI Image Generation Tu...
  • Science

Comments • 50

  • @MeinDeutschkurs 1 month ago +2

    Wooohooo!! Yeah, can't wait for it! ⭐️

  • @barackobama4552 1 month ago +2

    Impressive, thanks!

  • @comfyuiadrian 1 month ago

    Wahooo... really looking forward to your new project!

  • @Techonsapevole 1 month ago +1

    It's fast. Which TTS and STT did you use?

  • @Thorin632 1 month ago

    Please make a beginner-friendly, step-by-step tutorial on how to integrate this with localGPT 🙏🙏

  • @brianpereira7757 1 month ago +2

    That doesn't sound like Jarvis. I want the real Jarvis voice!!!

    • @engineerprompt 1 month ago +1

      Good point, I think ElevenLabs has that. Will try to integrate that :)

    • @sayantandas7544 1 month ago

      @engineerprompt How about adding a little UI as well? And maybe a button to take screenshots continuously at a regular interval. That way, you will be releasing OpenAI's demo app before OpenAI does.

  • @aa-xn5hc 1 month ago

    Great, looking forward to it.

  • @RickySupriyadi 1 month ago

    Yes please! Is it going open source?

  • @user-jq1gc8lt7s 1 month ago

    I LIKE IT GREAT JOB

  • @joepropertykey3612 1 month ago

    Right on, bro, RIGHT ON... but we need the voice of Cortana for this, for when we are sitting around in our Mark V armor and coding :)

  • @3choff 1 month ago

    Very interesting project! Do you use any VAD to detect the end of the request?

  • @GetzAI 1 month ago

    EXCITED!

  • @GroqSummarizer 1 month ago

    Nice!

  • @RickySupriyadi 1 month ago

    Also, I'd like to request a video comparing this with GPT-4o.

  • @themax2go 1 month ago

    You should edit the title to add "using OpenAI".

  • @im-notai 1 month ago

    Idk why, but there has been a folder on my desktop named Jarvis-v6 for 5 months, and surprisingly it also does the same job 😮

    • @engineerprompt 1 month ago

      Would love to see what's in the folder :D I am at v0 now.

    • @im-notai 1 month ago

      @engineerprompt It's going to get interesting. I thought I was the one who was able to crack speech while streaming to reduce the latency.

  • @KiyotokaAyanakoji-ss1gn 1 month ago +2

    What TTS are you using, and is it running locally?

    • @engineerprompt 1 month ago +3

      Whisper, but via the API. Nothing is running locally in this video. A local version will be coming soon.

    • @KiyotokaAyanakoji-ss1gn 1 month ago

      @engineerprompt Loved it 👍

    • @Gun_ForFun 1 month ago +1

      @engineerprompt But Whisper is ASR, not TTS??

    • @snapman218 1 month ago

      Gross.

    • @themax2go 1 month ago

      Someone already made a fully local version that works with little latency and with voice training. There are already projects on GitHub for continuous speech that use a keyword to trigger recording, and a version with a push-to-talk (PTT) implementation instead of a keyword.

  • @borisrusev9474 1 month ago

    I don't get it, how's that different from GPT-4o?

    • @engineerprompt 1 month ago +1

      You are right, very similar in functionality. In fact, this version is using GPT-4o for text generation. But the voice functionality is not available in GPT-4o yet.

  • @smoofwah3552 1 month ago

    Is there a way to speed it up?

    • @engineerprompt 1 month ago

      Yes, Groq has Whisper support now. Going with that, but the issue is the rate limit!

    • @alx8439 1 month ago

      Use rhasspy3 as a base. It streams audio directly to the ASR model.

  • @Soniboy84 1 month ago

    How is it different from GPT-4o voice?

  • @danieldjinishiandebriquez1858 1 month ago

    What APIs are being used?

    • @engineerprompt 1 month ago

      Currently everything is OpenAI. Just got access to Whisper from Groq; will update it and hope it will be much faster!

    • @danieldjinishiandebriquez1858 1 month ago

      @engineerprompt Great! Looking forward to the tutorial or git repo. Literally yesterday I was searching for Jarvis haha

  • @temp911Luke 1 month ago

    Nice, but it would be great without that annoying 2-3 sec delay.

    • @engineerprompt 1 month ago

      I agree, I just got access to Groq Whisper. Will be interesting to see how that works.

    • @fontende 1 month ago

      @engineerprompt George Hotz on stream called Groq a scam...

  • @themax2go 1 month ago +2

    Not local. Not the Jarvis voice. Misleading title. Disappointed.

    • @javiergimenezmoya86 1 month ago

      Why do you think it is not local? The only bad thing is that he does not use voice streaming to make it faster (I did it that way).