Mistral - MoE | The Most Unusual Release & How to Run

  • Published: 1 Dec 2024

Comments • 23

  • @henkhbit5748
    @henkhbit5748 11 months ago +3

    We'll have to wait and see about the new Mistral; the 7B model has been fine-tuned multiple times with very good results. Thanks for the update 👍

  • @blacksage81
    @blacksage81 11 months ago +17

    Eric Hartford and TheBloke will have this model trained, quantized, and singing within 2 weeks of the official release.

    • @marshallmcluhan33
      @marshallmcluhan33 11 months ago +1

      Andreessen Horowitz (a16z) bankrolls all of them.

    • @jeffwads
      @jeffwads 11 months ago +1

      Try more like 4 hours or so. He said so on HF.

  • @sarfarazhassan
    @sarfarazhassan 11 months ago +3

    Yes please. It would be great if you could explain the "mixture of experts" concept, and maybe how it's different from a multi-model setup.

    • @excido7107
      @excido7107 11 months ago

      MoE is reportedly how GPT-4 and similar models work: imagine multiple models fine-tuned on different tasks and topics, merged together, so that when you input a prompt, a different "expert" handles it depending on the nature of the prompt. At least I'm pretty sure that's the gist of it.
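
      A rough, hedged sketch of the routing idea described above, written as a simplified top-2 gate over expert feed-forward blocks in PyTorch. Names like SimpleMoELayer are illustrative, not Mistral's code; note that in Mixtral the router actually picks experts per token at each layer, rather than one expert per prompt.

      ```python
      # Minimal sketch of a top-2 mixture-of-experts feed-forward layer (PyTorch).
      # Illustrative only -- SimpleMoELayer is a made-up name, not Mistral's implementation.
      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      class SimpleMoELayer(nn.Module):
          def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
              super().__init__()
              self.top_k = top_k
              # Each "expert" is an ordinary feed-forward block.
              self.experts = nn.ModuleList(
                  nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
                  for _ in range(n_experts)
              )
              # The router scores every expert for every token.
              self.router = nn.Linear(d_model, n_experts)

          def forward(self, x):                      # x: (tokens, d_model)
              scores = self.router(x)                # (tokens, n_experts)
              weights, chosen = scores.topk(self.top_k, dim=-1)
              weights = F.softmax(weights, dim=-1)   # normalize over the selected experts
              out = torch.zeros_like(x)
              for slot in range(self.top_k):
                  for e, expert in enumerate(self.experts):
                      mask = chosen[:, slot] == e    # tokens routed to expert e in this slot
                      if mask.any():
                          out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
              return out

      layer = SimpleMoELayer()
      print(layer(torch.randn(4, 512)).shape)        # torch.Size([4, 512])
      ```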

  • @MikewasG
    @MikewasG 11 months ago

    Thanks for sharing, it's very helpful! Looking forward to your other videos!

  • @satalaj
    @satalaj 11 months ago +1

    Yes please. Let us know how it works 🙏

  • @spotterinc.engineering5207
    @spotterinc.engineering5207 11 months ago +1

    Please explain the "mixture of experts" concept

  • @888cromartie
    @888cromartie 11 months ago +1

    I just ran a better implementation of the inference engine on an M1 Mac; it's definitely possible to run a quantized version on consumer hardware.
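
    A minimal sketch, assuming the llama-cpp-python bindings and a quantized Mixtral GGUF file, of how a setup like the one described above might look; the file name and settings are assumptions, not the commenter's actual configuration.

    ```python
    # Illustrative: load a quantized Mixtral GGUF with llama-cpp-python and generate text.
    # Model path and settings are assumptions, not the commenter's exact setup.
    from llama_cpp import Llama

    llm = Llama(
        model_path="mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf",  # hypothetical local GGUF file
        n_ctx=4096,         # context window
        n_gpu_layers=-1,    # offload all layers to Metal / GPU if available
    )

    out = llm("[INST] Explain mixture of experts in one sentence. [/INST]", max_tokens=128)
    print(out["choices"][0]["text"])
    ```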

    • @engineerprompt
      @engineerprompt  11 months ago

      Nice, can you share some details? What was the memory usage like?

  • @MadsBuch
    @MadsBuch 8 months ago

    Funny watching this video while I'm running Mixtral on consumer hardware (MacBook M1 and X1 Extreme + RTX 4060) quite efficiently.

  • @Aarifshah-A
    @Aarifshah-A 11 months ago +6

    Hindi and text is right :)

  • @ripper5941
    @ripper5941 11 months ago

    Very based of Mistral, love it

  • @Nick_With_A_Stick
    @Nick_With_A_Stick 11 months ago

    Is it just me, or does it feel like Gemini is trained on benchmarks? Like, how is it possibly so bad at coding, yet destroys GPT-4 on HumanEval?

    • @michaelbarry755
      @michaelbarry755 11 months ago +2

      I think they have too much to lose to cheat like that; they would be caught immediately, and it would be extremely damaging to their reputation. Keep in mind it's fully multimodal from the ground up, so it should behave differently from what we're used to.

    • @luckerooni1153
      @luckerooni1153 11 months ago +2

      @michaelbarry755 But they were caught lol

    • @irabucc469
      @irabucc469 11 months ago +1

      @luckerooni1153 And you don't even know what logic means 😂

    • @michaelbarry755
      @michaelbarry755 11 months ago

      @luckerooni1153 They weren't caught cheating on evaluations; they just spiced up their PR, which definitely backfired. I'm not trying to defend Google, as I'm all for open source, but I do believe we should give credit where it's due. Accusing them of deliberately training on test data without evidence is libel and could land you in hot water if you had anything to lose.

  • @ayrengreber5738
    @ayrengreber5738 11 months ago

    It dropped because the EU passed an AI law.

  • @GiovanneAfonso
    @GiovanneAfonso 11 months ago

    How not to trust someone13574