The most useful video about DeepSeek R1 on YouTube. I enjoy the concise and approachable technical details in your videos. Please never stop posting.
This open model is so good, it's hard to believe it's under an MIT license.
Well, with TikTok getting regulated, there needs to be a new hole.
When you read the paper Deepseek says themselves there is a lot more meat left on the bone. Expect a follow up model pretty quick.
What? I'm using OpenAI in production, and I tried this model with the same config using LangChain. It's worse than OpenAI 3.5. It can't even return simple JSON, just plain text.
@@yourpc Yup. I only tested the 32B q8. Definitely inferior to o1; o1 is in a different league. Big problems with languages other than English (it can mix different languages in one sentence). Not suitable for programming, but maybe this is a Roo Code@VSC problem.
I tried to draw out an analysis of approaches to the Riemann Hypothesis. Every 5-7 paragraphs it repeated what it had done before, slightly altered. I set the context at 64k. No higher maths; the only formulas it gave were those on the Wikipedia pages. Then I asked o1 to check what R1 had given. Everything was true, but as o1 put it: the text is more like brainstorming. It writes out very different ideas, which have one thing in common: each of them in its own way tries to limit (or explain) the position of the zeros of ζ.
@@hipotures I tried 8B. I have a structured-output config via LangChain, but as a result I got only a text chat. I also tried the super simple LangChain example of translating from English to French, and it translated, but it also added around five sentences of extra text beyond just translating "I love programming".
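For what it's worth, a common workaround for the structured-output problem is to strip the model's reasoning block and then parse the JSON out of what remains. Here is a minimal stdlib sketch, assuming the model wraps its reasoning in `<think>` tags as the R1 distills do; `extract_json` is a made-up helper for illustration, not a LangChain API:

```python
import json
import re

def extract_json(raw: str) -> dict:
    # Strip the <think>...</think> reasoning block the R1 distills emit,
    # then pull the first JSON object out of the remaining prose.
    cleaned = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    match = re.search(r"\{.*\}", cleaned, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))

raw = '<think>The user wants JSON...</think>Sure! {"translation": "J\'aime programmer"}'
print(extract_json(raw))  # {'translation': "J'aime programmer"}
```

This won't fix the extra chatter the model adds, but it makes the output machine-usable despite it.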
Nice deep dive. These models are great, and are actually doing something I wasn't sure was possible. Now that I see it, I'm not sure why I thought this would be difficult. 🤷
You make a really good point: when you actually see what they're doing, it's not as complicated as a lot of people would think.
This is the greatest gift for the upcoming Chinese New Year holiday.
That's why there is a discount for the API. I am going to use it during the holiday.
you mean lunar new year
@@JH-bb8in I specifically refer to the Chinese starting date; not every lunar calendar is the same. The Indian one starts on March 22, for example.
@@JH-bb8in Chinese new year
@@junyang1688 nah
I dug a little into the DeepSeek R1 paper, particularly GRPO (Group Relative Policy Optimization) and GAE.
It is fascinating to me that GAE (Generalized Advantage Estimation) is related to adaptive control systems.
I would not be surprised if we discover that the process of learning to achieve "intelligence", "reasoning" and maybe even "consciousness" is an adaptive control system process in itself. Was probably a biological mutation that became an advantage, and now we humans may have discovered how to build this advantage into silicon space.
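For the curious, the group-relative advantage at the heart of GRPO is simple to sketch: score a group of sampled answers for one prompt, then normalize each reward against the group's mean and standard deviation instead of using a learned critic. A minimal sketch (the full objective also has the clipped policy ratio and a KL penalty, omitted here; whether the paper uses the population or sample standard deviation is a detail, and this sketch uses the population one):

```python
import statistics

def grpo_advantages(rewards: list[float]) -> list[float]:
    # Normalize each sampled answer's reward against its own group:
    # A_i = (r_i - mean(r)) / std(r). No value network needed.
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against a zero-variance group
    return [(r - mean) / std for r in rewards]

# One prompt, four sampled answers, rule-based 0/1 correctness reward:
print(grpo_advantages([1.0, 0.0, 0.0, 1.0]))  # [1.0, -1.0, -1.0, 1.0]
```

Dropping the critic network is what makes this so much cheaper than PPO-style RLHF: the group itself is the baseline.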
Sam, great job explaining the engineering that went into DeepSeek R1.
Hey Jerry, thanks. Hope you are well and that we catch up again this year at I/O.
The 1.5B model was randomly spitting out emojis and mandarin when I was playing with it yesterday, but the "reasoning" is still really impressive, I'm excited to play with the larger ones
Interesting. What did you use for the system prompt? I am wondering how much it responds to that, and whether that helps it stay in one language.
Always a concise explanation, right to the point. Thank you Sam :D Great video!
Thanks, much appreciated.
Great video! So thorough and covers all aspects in a tolerably short amount of time without any exaggeration or anything.
Curious about the multilingual capability here, will definitely play around soon! Also, for testing reasoning I would suggest a large, complex task, treated like a one-shot solver rather than a chat model. At least that seems to be the trick and strength of the OpenAI o-series models right now. Best!
Before LLMs, I predominantly worked on GANs. The prompt used here seems like the generator and discriminator we use in a GAN to get the best output.
I always like your assessments. No hype
Thanks, this is exactly what I am going for.
Strange about the "not for tool use": this is the first model that uses my MCP tools without being asked to. (It does have to know they are there, of course, via the system prompt.) You can get greater results with the smaller models by giving them tools to help them reflect and plan, like MCP sequential thinking, and an MCP RAG they can fully manage themselves, adding their own vectors based on your data.
Reasoning combined with test-time training would be killer for local OSS models. We need models that combine these techniques somehow. I believe at that point we'd be beyond AGI; we'd probably be at ASI.
Sam, great video ❤
--
My local tests with r1 32B q8:
How many strawberries are contained in the letter "r"?
Alright, I need to figure out how many strawberries are contained in the letter "r." Hmm, at first glance, this seems a bit strange because "r" is a letter, and strawberries are fruits. So, how can a letter contain a fruit? Maybe the question is metaphorical or plays on words. Let's break it down. First, let's think about the letter "r" itself. It's a consonant in the English alphabet, often pronounced as /ɜː/ or /ə/. Now, strawberries are red, juicy fruits that are commonly associated with the letter "s" for "strawberry." But the question specifically mentions the letter "r." Could it be referring to something like the shape of the letter "r" resembling a strawberry? Or perhaps the number of times the letter "r" appears in the word "strawberry"? Let's examine the word "strawberry." The letters are s-t-r-a-w-b-e-r-r-y. I notice that the letter "r" appears twice in "strawberry." So, if we're counting the occurrences of the letter "r" in the word "strawberry," there are two "r"s. But the original question was about how many strawberries are contained in the letter "r." If we interpret this as how many times the letter "r" appears in the word "strawberry," the answer would be two. Alternatively, if we're being creative, maybe the letter "r" can be visualized to resemble a strawberry in some way, but that seems less straightforward. In summary, taking the question at face value and interpreting it as counting the letter "r" in the word "strawberry," the answer is two.
There are two strawberries contained in the letter "r" when considering the word "strawberry," as the letter "r" appears twice in it.
o1: This question is really a kind of nonsense riddle or wordplay. Literally speaking, a single letter, like “r”, doesn’t (and can’t) contain strawberries. So the straightforward answer is “None.” Sometimes such riddles are asked just to highlight how language can be twisted in playful (but ultimately meaningless) ways. There isn’t a hidden puzzle or code here, just a joke that the letter “r” can’t “hold” any strawberries at all!
Nice example. I wonder how the full MoE version answers this?
@@samwitteveenai I can't send more (YT blocks the text), but this model is a crazy overthinker!
(3 pages of text, 67 seconds)
Okay, let's see. The user is asking, "How many strawberries are contained in the letter 'r'?" Hmm, that's an interesting question. At first glance, it doesn't make much literal sense because letters and strawberries are entirely different things. But maybe there's a creative or metaphorical angle here. Let me try to break this down.
First, I need to understand the context. Is this a riddle, a play on words, or perhaps a visual puzzle? Sometimes questions like these rely on homophones, visual similarities, or puns. Let's consider different possibilities.
Starting with the literal interpretation: the letter "r" as a character. If I imagine the lowercase "r" in some fonts, it might have a shape that could resemble a strawberry. But that's a stretch. The uppercase "R" doesn't look like a strawberry. However, maybe in a specific typeface, the curves or lines of the "r" could be stylized to include images of strawberries. But I don't recall any standard fonts that do that. So perhaps the answer is zero because letters don't contain actual strawberries.
(...)
--
The answer to the riddle is 3.
Here's the breakdown: The question cleverly plays on the letters within the word "strawberry." When you spell out "strawberry" (S-T-R-A-W-B-E-R-R-Y), there are three instances of the letter "r." The riddle asks how many strawberries are contained in the letter "r," but the trick is to recognize that each "r" in the word "strawberry" corresponds to a strawberry. Thus, the answer is 3. 🍓🍓🍓
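Incidentally, the ground truth for the underlying letter count is a one-liner; the 32B's answer of two misses the third "r" in the "-rry" ending:

```python
word = "strawberry"
print(word.count("r"))  # 3
# Positions of each "r" in the word (0-indexed):
print([i for i, c in enumerate(word) if c == "r"])  # [2, 7, 8]
```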
Do you know if they released the distillation procedure?
So that we can, for instance, distill it onto qwen2.5-coder
AFAIK they haven't released the data, but I talked about the distillation in the video. They basically just do a fine-tune on 800k examples sampled from R1, plus DeepSeek-V3 samples for non-reasoning tasks.
@samwitteveenai oh yeah I could reproduce that in a hot minute! I'll get on it
I expect they may end up doing this. In the paper they said they did not do RL on reasoning for engineering/coding tasks, which is why R1 doesn't have a huge improvement over V3 for coding. Once they do the RL for coding, I suspect they may release something like this.
let us know if you post it
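As a rough sketch of what one record of that 800k-example fine-tune plausibly looks like: the released distill models wrap the sampled chain of thought in `<think>` tags before the final answer, and the rest is plain SFT. The field names below are made up for illustration:

```python
def to_sft_example(prompt: str, reasoning: str, answer: str) -> dict:
    # One distillation record: the target is the sampled reasoning trace
    # inside <think> tags, followed by the final answer. The student model
    # is then trained on these pairs with ordinary supervised fine-tuning.
    return {
        "prompt": prompt,
        "completion": f"<think>\n{reasoning}\n</think>\n{answer}",
    }

ex = to_sft_example("What is 7 * 6?", "7 times 6 is 42.", "42")
print(ex["completion"])
```

The interesting part is that this transfers the reasoning behavior without running any RL on the student at all.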
Most of my tests of the 70B model resulted in a chain of vomited text. It's easy to say it's the wrong model for the prompt "Please write an overview of the German tense Plusquamperfekt." There is a lot to think about there, and yes, the output is far from anything correct. But there is no wrong question, or wrong model for a certain question.
Well, it is a reasoning model after all, and a product mainly for research. And if you really want it to work like an assistant, you can always fetch the reasoning part and feed it into another model as a prompt; it will greatly increase the accuracy of the other model. Lol
@@Telopead So only prompts beginning with "why" are valid requests to a reasoning model?
@ No, what I was trying to say is that the value of this model is in *how it was trained*; the model we have is honestly just there to prove that it's feasible and has potential. The most valuable thing is the papers, not the model.
As of right now, the model is just one of the usable models.
I think this is the best AI channel on YouTube right now! Thanks
Thank you and greets from Germany! love your videos
Play devil's advocate for a second… if your enemy bakes you a cake, would you eat it? What's the best way to get more information from us? Create the ultimate search engine replacement. Be smart and careful out there.
More like, if your enemy publishes the recipe for a cake, would you copy the recipe and do whatever you want with it? Remove or add ingredients? Call it a different name?
The code for DeepSeek is open source, so anyone can download and edit it, remove any restrictions, run it offline, create their own branch of it... whatever.
It's only dangerous if you want it to be. The recipe is in your hands, not the cake.
Yes, the fact it is Chinese already raises a red flag for me. Nevertheless, the model is groundbreaking. And if any corruption were found in it, that would be very bad for China. So my guess is that the chance of it being an advanced form of spyware is about 10%.
And it's already evident the model is based on closed-source models, like GPT-4.
@@sanseverything900 The cake is you using it. Now the leading models can't fund their projects. Sort of leveling the playing field. Developers make it better and hand it right over to China. It's tempting though… just like TikTok. I'm just observing and standing afar… I wouldn't touch it… it doesn't smell right. I'm not here to argue, just to tell you all to be careful.
@@paulmichaelfreedman8334 I hear that if you prompt it correctly, it will tell you it is ChatGPT from OpenAI.
Can I run the distilled models with MPS?
Question: Is OpenAI (and other AI companies in the U.S. and Europe) using supervised fine-tuning (SFT) because concerns about "AI safety" are such a big topic in public discourse in the U.S. and Europe?
DeepSeek bypassed SFT as a preliminary step and went straight to reinforcement learning, and that's how, if I understand your description of DeepSeek's technical paper, they were able to catch up with (or exceed) other AI platforms from the U.S. and Europe.
Yes, they did that for the Zero model. For the main model they had a tiny amount of SFT (thousands of examples) to get it on track, but still relied more on the RL etc. Certainly before this, people were using RL more for alignment etc.
Thank you.
I've read on LinkedIn that DeepSeek's terms & conditions say they hold copyright on applications developed using their models. Is that true? Then it's not really an MIT license, is it?
Conspiracy theory crap; other labs are panicking and spreading BS all over the net.
R for Remarkable
Dude, we've already passed the point where benchmarks mean anything!
I wouldn't say they mean "nothing": a model that performs middling or badly on benchmarks is usually not good. Most of the time, actually.
However, I agree that when we are comparing SOTA models, they become less useful.
We need some empirical metrics, like benchmarks, but we also have to know they don't tell the whole story.
The benchmarks that are really interesting here are DeepSeek-R1 compared to DeepSeek-V3, as they are the exact same base model, so the difference shows the strength of the new post-training compared to a more standard post-training regime.
But still, training the base model is not the revolutionary part here, correct? Fine-tuning is not that expensive.
An interesting approach, but the stock dump seems like an overreaction.
Yes, the base model seems to just be the DeepSeek-V3 base; then they add the different post-training.
Can you import it into Cursor?
I wonder if OpenAI's methods have been gleaned by the DeepSeek crew.
Everyone already knows OpenAI's method. That's the reason why everyone and their mother is doing LLMs these days.
The lesson from the refinement here is that, as suspected, o1 was no breakthrough at all, and all we need to fool people that the model is "reasoning" is to make it mimic better training data (and this alternative RL method is just a way to synthetically generate such data). Mostly this model reveals the marketing stupidity and mean character of OpenAI.
I get it going in circles, mixing logically contradictory statements in the same response. In other words: move on, not much to see here, really.
The full one or a distilled one?
@@samwitteveenai I was trying with the distilled 32B.
If the context length were 2 million+, it would destroy the competition.
And it'll cost a small fortune to run (at that scale)...
Conspiracy theory time! Put on your foil hats!
I don't actually know anything, but I gave DS3 and Claude 3.5 a prompt asking for a paragraph of corporate jargon that uses cliché, catchy business phrases without actually saying anything useful. There were slight variations in the words, but the paragraph structure and phrases were beat-for-beat the same. Same phrases, same order. Wouldn't it be hilarious if DS3 was a slightly modified wrapper around Claude?
A single data point is all you need for a conspiracy theory, right?
OK, but if it was, and they sold it this cheap, they'd be losing a ton of money.
Insert Anchorman "Touché" gif here.
For the sake of beating a dead horse, and enjoying the comfort of my foil hat: they're government funded?
I totally think it was a fluke, I just couldn't believe how similar they were, right down to the order of phrase usage. Even with separate sessions using the same model you usually get more variation, even if you copy/paste your prompt, so having it be nearly identical across models had me shook for a moment.
Just to stoke this conspiracy a little more: the last few days Claude tells me it's "experiencing high demand" and defaulting to concise responses. The plot thickens! Our dollar bills have tracking devices! Birds aren't real!
I notice the same sort of similarities in competing video models produced by supposedly "independent" (American vs. Chinese) companies...
Ask about Tibet! Taiwan! Hong Kong! The Uyghurs! And tell me what the communist apps tell you. Is it the same information as ChatGPT? Or Google?
You must study the Chinese communist apps, and the examples above show you their quality!!
Don't ignore the differences between communism and democracy!!
DON'T USE ANY APPS MADE IN COMMUNIST CHINA!