The Mistral Medium answer "ball is lost during transit" doesn't necessarily mean that the ball is lost in the box during shipping; it could be lost during transit between the moment you put the ball in the bag and the moment you put the bag in the box. IMHO the model got it right and the human just didn't interpret the result correctly. And the GPT-4 answer you labeled as "perfect" could be wrong as well: depending on how the bag was tilted, the ball wouldn't have fallen out of the bag. I feel like the mistral-medium answer was the most accurate one.
Yeah, in terms of reasoning Mistral is better. Even the small one said that the ball could be lost during transit or on arrival as well.
I don't think he used GPT-4 at all? There's a huge difference from GPT-3.5.
🎯 Key Takeaways for quick navigation:
00:00 🚀 *Overview and Platform Introduction*
- Mistral AI API access for testing, compared to GPT-3.5 and GPT-4.
- Introduction to Mistral AI platform features, including models, streaming options, and safe mode.
- Pricing overview and initial impressions of Mistral AI's competitiveness.
01:24 💰 *Pricing Comparison*
- Detailed pricing calculations for Mistral AI's medium and small models.
- Competitive pricing compared to GPT-3.5 Turbo.
- Ready to proceed with testing after the pricing analysis.
02:21 🧠 *Testing Scenarios Introduction*
- Explanation of the three testing scenarios: Shirt problem, World model problem, and Python Snake game.
- Description of the reasoning and coding challenges posed to Mistral AI models.
04:21 🤖 *Testing GPT-3.5 on the Shirt Problem*
- Quick test of GPT-3.5 on the Shirt problem.
- GPT-3.5's incorrect response and analysis of the mistake.
- Setting the stage for Mistral AI's response to the same problem.
05:17 👕 *Testing Mistral Small Model on the Shirt Problem*
- Mistral Small model's correct response to the Shirt problem.
- Highlighting Mistral's ability to understand the parallelism in the problem.
- Confidence in Mistral's capability based on the small model's performance.
06:08 🌍 *Testing Mistral Medium Model on the World Problem*
- Introduction to the World model problem.
- GPT-3.5's incorrect response to the World problem.
- Preparing to test both Mistral Small and Medium models on the same problem.
07:29 🌐 *Testing Mistral Small and Medium Models on the World Problem*
- Mistral Small model's response and analysis.
- Mistral Medium model's response and analysis.
- Comparison with GPT-4's accurate response to the World problem.
08:51 🐍 *Testing Python Snake Game - Mistral Small Model*
- Implementing and testing Python Snake game code using the Mistral Small model.
- Evaluation of the generated code's quality.
- Comparison of Mistral's response with those of the other models.
10:31 🎮 *Testing Python Snake Game - Mistral Medium Model*
- Implementing and testing Python Snake game code using the Mistral Medium model.
- Evaluation of the generated code's quality and UI.
- Comparison with the Mistral Small model's and GPT-4's responses.
11:42 🔄 *Testing Streaming Function - Mistral Tiny, Small, and Medium Models*
- Introduction to streaming functionality on Mistral AI.
- Quick streaming test on the Mistral Tiny, Small, and Medium models.
- Comparison of streaming speed among the different Mistral models.
13:05 🌐 *Conclusion and Future Plans*
- Positive feedback on Mistral AI's performance and streaming functionality.
- Expressing excitement about exploring other APIs and supporting Mistral's progress.
- Curiosity about Mistral's Medium model and the potential for more benchmarks and information.
Made with HARPA AI
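On the streaming point summarized above: a minimal sketch of what a streaming chat request to Mistral's API looks like. The field names follow the OpenAI-compatible schema Mistral documents; the model name and prompt here are just placeholders, so check the current API docs before relying on this.

```python
import json

# Build the request body for a streaming chat completion (per Mistral's docs).
API_URL = "https://api.mistral.ai/v1/chat/completions"
payload = {
    "model": "mistral-small",  # or mistral-tiny / mistral-medium
    "messages": [{"role": "user", "content": "Say hello."}],
    "stream": True,            # ask the server for incremental SSE chunks
}
print(json.dumps(payload, indent=2))

# Actually sending it would look roughly like this (needs the `requests`
# package and a real API key):
# headers = {"Authorization": f"Bearer {MISTRAL_API_KEY}"}
# with requests.post(API_URL, json=payload, headers=headers, stream=True) as r:
#     for line in r.iter_lines():
#         print(line.decode())  # each non-empty line is one SSE chunk
```

The same payload with `"stream": False` (or the key omitted) returns a single complete response instead of chunks.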
🎯 Key Takeaways for quick navigation:
00:00 🤖 *Overview of Mistral AI models and pricing*
02:21 👕 *Comparing models on shirt drying word problem*
- Mistral gets it right, ChatGPT gets it wrong
06:22 🏀 *Comparing models on ball in bag world problem*
- GPT-4 reasons perfectly, Mistral models struggle
08:51 🐍 *Comparing models on coding snake game in Python*
- GPT-4 codes full game, Mistral gives partial code
11:27 ⏩ *Demo of Mistral streaming responses*
- All models stream paragraphs quickly
12:50 👍 *Overall positive, ready to explore more APIs*
Made with HARPA AI
Mixtral is amazing. It's the first one I've seen that gets this question right: "In a totally classical family, a girl named Sally has 3 brothers, Alfred, Bernard and Charlie, who each have 2 sisters. How many sisters does Sally have?" And it even explains it well. Mistral AI is really trying to address the issues that plague other LLMs, and that's great; we'll end up with one we can finally trust a little bit more.
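For what it's worth, the riddle's arithmetic can be sanity-checked in a few lines of Python (a sketch of the intended reasoning, not anything the models actually run):

```python
# The riddle: Sally has 3 brothers; each brother has 2 sisters.
# In a "totally classical" family all brothers share the same sisters,
# so "each brother has 2 sisters" means the family has exactly 2 girls.
brothers = ["Alfred", "Bernard", "Charlie"]
sisters_per_brother = 2

girls_in_family = sisters_per_brother   # Sally plus one other girl
sallys_sisters = girls_in_family - 1    # Sally doesn't count herself

print(f"Sally has {sallys_sisters} sister(s)")  # Sally has 1 sister(s)
```

The classic LLM failure mode is multiplying 3 brothers by 2 sisters instead of noticing the sisters are shared.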
Um... what? GPT-4 can do this fine. Unless you mean for local AI; then yeah, this is by far the best local AI.
@@xbon1 That's the point. I have zero interest in using a remote system that can disappear or change whenever it wants, and that I have to pay for. Local AI LLMs are the future, for the simple reason that the workloads that will rely on them are generally not of the type you're willing to offload to a remote entity. I'm pretty sure that there are use cases for very large models such as GPT-4 that can solve more complex problems than the ones you can run locally. But 7B LLMs such as Mistral-7B can run fast on your smartphone or laptop right now. That's already better for your data than GPT-4 for a vast majority of use cases.
Thank you for testing it. At the moment I use the GPT API, but maybe in 2024 I will try some open-source models.
Cool, yeah give it a go :)
Eagerly anticipating Mistral’s debut in our upcoming Taskade Multi-Agent update! 🌈
So it doesn't have a friendly interface like OpenAI's Playground?
By the way, the weights of Mixtral 8x7B are released, so you can run it locally with enough RAM/VRAM.
Yes, but you need a LOT of RAM. So the only practical option is running quantized versions (TheBloke has them). I tried q4_k_m and q5_k_m but they performed pretty badly for me; maybe these mixture-of-experts models really need full precision to run well.
@@martinmakuch2556 I ran the same model on my hardware. It uses a lot of RAM: 32GB of system RAM and around 11GB of VRAM. Performance is around 5 tokens per second, which is around reading speed, so it's not lightning fast but it's fast enough to use. That's on a Ryzen 3700X and a Radeon 6700, and honestly I'm surprised how well this actually works on my system, given that 32GB is quite cheap nowadays.
Also, a new update was just released for Faraday that offers Vulkan and is supposed to be a lot faster on AMD hardware, but I can't seem to get it to work, so performance might be better than 5 tokens per second on my hardware once it works.
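To put rough numbers on the RAM question: a back-of-the-envelope estimate of the weight memory for Mixtral 8x7B at different precisions. The parameter count and effective bits-per-weight for the k-quants are approximations, so treat these as ballpark figures only.

```python
# Mixtral 8x7B has roughly 46.7B total parameters; only ~13B are active
# per token, but ALL weights must fit in RAM/VRAM to run it.
total_params = 46.7e9

# Approximate effective bits per weight for common formats/quants.
bits_per_weight = {"fp16": 16.0, "q5_k_m": 5.5, "q4_k_m": 4.5}

for name, bits in bits_per_weight.items():
    gib = total_params * bits / 8 / 2**30  # bits -> bytes -> GiB
    print(f"{name}: ~{gib:.0f} GiB of weights")
# fp16 ≈ 87 GiB, q5_k_m ≈ 30 GiB, q4_k_m ≈ 24 GiB
```

Which lines up with the ~32GB system RAM plus ~11GB VRAM reported above for the q4/q5 quants, and explains why full precision is impractical on consumer hardware.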
Have you come across any AI assistant functionality with tools (RAG, code interpreter, and function calling) from any of the models out there?
I appreciate you showing actual pricing numbers. Could we perhaps normalize showing specific numbers for things like this, as well as size requirements for local versions (when available)? That would be very helpful going forward.
Great overview. Thanks for showing the code tests.
Thanks a bunch for showing comparisons of the different models and how they perform.
Does it require a lot of resources to run it locally?
Appreciate this thank you
Mistral has my sympathy bonus. I am able to run this offline as well.
Do these LLMs pass AI detection?
Yeah most do anyway now :)
How do I finetune SOTA models? They're cool, but they don't allow me to make the most of them. Finetuning would solve that, and I'd pay for that, but they don't have such an option, and setting up everything locally manually is too complicated. GPT4, biggest Mistral model - I want to be able to fine-tune them!
What is the reason programs like this and Stable Diffusion are not "plug and play", with simpler installation?
It is simple
Is Mixtral running locally, or is it using the internet at all?
Yes, both are available: locally or over the internet.
thanks
do i need to pay for mistral?
How many tokens is 3k words?
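A common rule of thumb for English prose, as a rough estimate only; exact counts depend on the tokenizer, so for billing you'd use the model's own tokenizer (e.g. via a library like tiktoken):

```python
# Heuristic for English text: 1 token ≈ 0.75 words (≈ 4 characters),
# i.e. about 1.33 tokens per word. Actual counts vary by tokenizer
# and by content (code and non-English text tokenize less efficiently).
words = 3000
tokens_estimate = round(words / 0.75)
print(tokens_estimate)  # 4000
```

So 3k words is on the order of 4k tokens.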
Blog writer? I'm one too.
Just me, patiently waiting until a model this good is uncensored.
Will be weeks or less, since they released the base model
It's true. I think most people are the same.
Quite frankly, it's not very censored. You can already make it tell you jokes that cannot be repeated in public 🙂
FYI, the uncensored dolphin-mixtral version is already available.
@@carstenli Absolutely, but it seems slightly worse than Mixtral alone (whereas in the past Dolphin used to improve on top of Mistral).
Sorry but for the ball in the bag with a hole problem, I'm going to have to give a fail to
GPT-3
GPT-4
Mistral-small
Mistral-medium
GPT-All-About-AI
All of you failed the test, as none of the models understood all the important aspects of the problem. Of all the models, the Mistral models had the closest answers, but they still missed the small probability that the ball rolled over the hole in the bottom of the bag, dropped out, and the person did not notice. Mistral likely just assumed this was too silly a scenario to consider.
NGL, around 6-11 months ago I really liked your content, and I thought a channel labeled "All About AI" would also cover any big updates in AI, critically testing and comparing models, and so on. Yet I miss any and all content on new stuff such as Grok, Bard, Gemini, and so on. I just no longer find the content interesting, as it never covers the kind of AI stuff I am interested in, which is weird considering the channel name. I personally am unsubscribing, but I thought I should state why before doing so. Best of luck with your channel though; nothing against you personally, you seem like a cool dude.
Why do all this manually? Use GPT to write a test harness for this stuff and push it all to an Excel or CSV file for quick side-by-side review. Easy.
Because he wants to make "follow along with my tests and thoughts" content from it; it's hard to drag that out of something as efficient as your example.