We'll have to wait and see about the new Mistral; the 7B model has been fine-tuned multiple times with very good results. Thanks for the update 👍
Eric Hartford and TheBloke will have this model trained, quantized, and singing within 2 weeks of the official release.
Andreessen Horowitz (a16z) bankrolls all of them.
Try more like 4 hours or so. He said so on HF.
Yes please. It would be great if you could explain the "mixture of experts" concept, and maybe how it's different from a multi-model setup.
MoE is how GPT-4 is rumored to work: imagine multiple expert sub-networks, each better at different tasks and topics, merged into one model. When you send a prompt, a router decides which "expert" handles it, so only part of the model runs for any given input. At least I'm pretty sure that's the gist of it.
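Rough sketch of the idea in plain Python (a toy illustration, not GPT-4's or Mixtral's actual code): a small gating network scores the experts for a given input and only the top-scoring ones do any work, so you get the capacity of many experts at roughly the cost of running a few.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

class MoELayer:
    """Toy mixture-of-experts layer: a gate picks the top-k experts per input."""
    def __init__(self, num_experts=8, dim=16, top_k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.gate = rng.normal(size=(dim, num_experts))            # router weights
        self.experts = [rng.normal(size=(dim, dim)) for _ in range(num_experts)]
        self.top_k = top_k

    def forward(self, x):
        scores = softmax(x @ self.gate)                            # relevance of each expert
        top = np.argsort(scores)[-self.top_k:]                     # keep only the best k experts
        out = np.zeros_like(x)
        for i in top:
            out += scores[i] * (x @ self.experts[i])               # weighted sum of chosen experts
        return out

layer = MoELayer()
print(layer.forward(np.random.default_rng(1).normal(size=16)).shape)  # (16,)
```

(In real MoE transformers the routing happens per token inside each layer, not once per prompt, but the gate-then-combine step is the same.)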
Thanks for sharing, it's very helpful! Looking forward to your other videos!
Yes please. Let us know how it works 🙏
Please explain the "mixture of experts" concept
I just ran a better implementation of the inference engine on an M1 Mac; it's definitely possible to run a quantized version on consumer hardware.
Nice, can you share some details? What was the memory usage like?
Funny watching this video while I'm running Mixtral on consumer hardware (MacBook M1 and X1 Extreme + RTX 4060) quite efficiently.
Hindi and text is right :)
Very based of Mistral, love it
Is it just me, or does it feel like Gemini is trained on benchmarks? Like how is it possibly so bad at coding, yet destroys GPT-4 on HumanEval?
I think they have too much to lose to cheat like that; they would be caught immediately and it would be extremely damaging to their reputation. Keep in mind it's fully multimodal from the ground up, so it should behave differently from what we're used to.
@@michaelbarry755 but they were caught lol
@@luckerooni1153 and you don't even know what logic means 😂
@@luckerooni1153 They weren't caught cheating on evaluations, they just spiced up their PR, which definitely backfired. I'm not trying to defend Google, as I'm all for open source, but I do believe we should give them credit where it's due; accusing them of deliberately training on test data without evidence is libel and could land you in hot water if you had anything to lose.
It dropped because the EU passed an AI law.
How not to trust someone13574
:)