Unlike OpenAI, DeepSeek is open AI.
China No. 1, period
@@CharlesLijt Ask it to criticize China
Ask it anything about China, it won't answer
"Closed AI companies like OpenAI." - Fireship 2025
@@Joso997 Why would I need it to be able to criticize China? I don't care, I just want it to perform well at what I need.
Sam should change his company name to ClosedAI
"Military AI" is more likely to be
💀
NSA AI
They could go with a name that sounds like a spring... "SPYAIAiaiai"... hehe
they are irrelevant suddenly, LMAO
Clearly the "Open" part of OpenAI means open to negotiating their services
DeepSeek is insane. Even the small Distilled models are extremely powerful when run locally.
Really??
The small models aren't even DeepSeek models lmao. They're Llama and Qwen models fine-tuned on DeepSeek's responses.
@@uku4171 True, the small models up to 14B are just Llama/Qwen fine-tunes, but due to storage limitations I had to install the 8B model, and I am still surprised by the output.
The US has funding as an advantage.
China has plenty of cheap talent (and the will of its government).
The EU has neither. 😨😨
Due to hardware limitations, I was only able to run the 14B model, and it was quite underwhelming. It was barely able to generate a simple counter module in Verilog.
AGI = Altman Gets Investment
😂😂😂😂
😂😂😂😂
😂😂😂😂
Clever 😂
yes! lolol
stuff like this is what gives me hope that we're not heading into a cyberpunk dystopia where everything is controlled by a select few corporations
So true
Um.... open licenses can go closed at the press of a key. This is the PRC we're talking about. What has happened is that the hope of "not heading into a cyberpunk dystopia" has just diminished tenfold.
@@TheBiomuse Even if they decide to close the license for some reason in the future, the current model is fully open-source and can be downloaded, modified, and run locally by absolutely anyone, granted they have the resources to run it.
You can't do any of that with OpenAI's flagship models.
So if they for some reason decide to close the license on their future products, you can place a safe bet that someone somewhere is going to build upon the previous model and create an improved fork, and that is reassuring.
That's capitalism when the oligarchs control the resources and governments.
@@TheBiomuse Looks like an MIT licence to me, not GPL or any of that nonsense. Should be fine to use it forever.
1. Side project of a hedge fund manager
2. Cost less than $10 million to make
3. Truly open source (MIT license)
I really enjoy using this model. The chain of thought is amazing to behold. And anything that can rip AGI away from a tiny set of USD-almost-trillionaires is a huge benefit to the world. "Bravo" as you say. And all this apparently is a side-project!
I was amazed by how sophisticated and human-like its reasoning process is, especially considering they used RL rather than human-supervised training. For me, this is the most impressive model released to the public to date.
If you ask DeepSeek about President Xi Jinping, it does not provide any answer, but if you ask about Donald Trump, it gives a long answer.
@@Diyashill7359 Yep. But you know the great thing about open source? You can remove those restraints. If OpenAI wants you not to ask anything bad about Trump, you can't do anything about it.
@@Diyashill7359 You thought you ate with that comment.
Hahahah “big bad china bad” stfu
Btw is OpenAI open?
Ohhh….
@@Diyashill7359 Yes, that's the downside, but last time I checked, I don't get paid based on what I know about the CCP lol. Western models are heavily censored too, btw. You can't use Claude for anything other than code without getting frustrated to death because of the heavy censorship. There are no hosted AI saints; you have to run your own model for that.
Makes me think OpenAI is a bit of a scam
That's because it is.
It's all in the name ;D
In my experience, for writing code it works significantly better than OpenAI's current best model (o1-2024-12-17) at roughly 30 times lower price.
But it is a bit lacking in general knowledge compared to GPT-4o, as it is a smaller model. Also, R1 kind of stops following instructions when the chat gets too long; just reminding it of the older parts of the chat fixes this.
Correct me if I’m wrong, but I think it’s more than 30 times cheaper, like a lot more. (I’ve been hitting it relentlessly and am at $0.02 spend)
@@Cfomodz It is cheap enough that I don't feel nervous while using it; maybe that is even nicer than the savings itself :D
@@onurcetinkaya4873 96% cheaper according to one study, and yes, I do have the same feeling that I'm not worried about tokens when using it, which is nice.
As a bit of a correction to my earlier comment here: I haven't been using R1 aggressively, and I am comparing chat to 4. However, OC is still correct that it's significantly cheaper regardless of which models we're comparing, R1 vs. o1 or chat vs. 4.
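The two figures in this thread are actually consistent, as a quick back-of-the-envelope check shows (the exact discount varies by model pair, so treat the number as an assumption):

```python
# A 96% price cut means paying 4% of the original price,
# which is the same ballpark as "30 times cheaper".
discount = 0.96
print(round(1 / (1 - discount)))  # -> 25, i.e. roughly 25-30x cheaper
```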
Chinese Engineers working for "OpenAI" vs Chinese Engineers working for you
lol, lmao even. They're working for the glorious CCP and Xi, you poor delusional thing
I love communism
@@debianlasmana8794 cringe
Chinese AI is very impressive. They have caught up despite the US embargoes on GPUs. When China starts producing competent GPUs, I have little doubt they will be class-leading, or at least comparable.
Thanks to the Republican party, China has passed us on pretty much every single front. This is the last era of America's soft and hard power.
Sounds like you don't even know the difference between training and a breakthrough. The Chinese AIs didn't all start on their own; they started off from open-source models. The US cannot stop China from innovating on AI architecture, but it can slow down the training part, which often takes a long time before you get a decent AI.
Just don't ask DeepSeek about Taiwan
@@Trumben Keep crying about a Chinese state
@@shikyokira3065 That's a painful oversimplification. I doubt you know any technical machine learning basics, let alone have read the DeepSeek paper.
The "Open" in the OpenAI naming is just as credible as "Research" in Alameda Research.
Reading the thought process, the model seems to have been taught to doubt its answers so it tries to gather more supporting facts; very fascinating to follow.
Yes, I've noticed o1 tends to back itself into a corner and not be able to get out of it, whereas r1 is more likely to realize if it's getting off track.
Every government should have a simple law for IT companies.
If the product is closed source, it can NOT be called "open" or any similar name, as that misleads consumers.
If the product is open source, it can have the name "open". If it changes to closed source, the name must be changed beforehand.
Failure to comply should result in heavy fines.
Their product name is ChatGPT 🙂
The company can just have one product that is open and all the rest closed though and still call themselves open source!
Garage door opener industry in shambles
no
@ lmaoo
Where is Anthropic? No model from them in a while
Anthropic pulled ahead with Claude 3.5, like: "We're the best! Now we can slow down AI dev by just not working!" RIP
(I'm pretty sure this was what actually happened. The CEO commented something about "acknowledging that Anthropic was contributing to the AI development terminal race conditions.")
Anthropic's flagship model underperformed compared to just training 3.5 for longer and improving their post-training, so they released "3.6" instead. They're currently working on their own version of test-time scaling models, and on Claude 3.5 Opus.
I think Anthropic is out of the AI race
They're obviously working on their new model, but right now theirs outperforms ChatGPT in most cases, so unless they have something actually useful to release, I don't see why they should release models like o1, for instance. Yeah, o1 is interesting on paper, but useless in most scenarios, on top of being slow and expensive as hell. So I hope Anthropic is just focusing on their next useful model and will release it when ready.
@@seraphin01 Claude is cool until it completely misses the mark on something and then I am rate-limited, which has never happened to me on ChatGPT. (As of writing this, when I went to cancel, they notified me that they are limiting me to concise responses because they can't handle demand, and I am a paid user.)
Kudos to the Chinese for this. I just tried it and it looks strong.
The enthusiasm in Károly's voice is ridiculously funny, but also very engaging and entertaining. I love it!
I wouldn't be surprised if he fed his voice data from past videos into an AI tool and this isn't even him. There is something off about his intonation.
You can run this with about ~400GB of RAM (the best, 671B version), or 43GB down to 6GB (the 70B down to the smallest distills).
RAM or ROM?
I thought R1 671B was 404GB ROM
@@hippopotamus86 Yes, that's kinda obvious, but they are still good for programming, for example. The smallest I would go is around 12B.
@@hippopotamus86 Probably because 7B models tend to over-hallucinate; the 32B and 70B models are pretty good though.
It means you need 16x H100 GPUs.
Overall your hardware cost would be closer to $500,000.
You can run 32B models on a GPU with 12GB of VRAM (quantized, with some layers offloaded to system RAM).
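To put rough numbers on the memory claims in this thread, here is a back-of-the-envelope sketch; the 1.2x overhead factor for KV cache and runtime buffers is an assumption, not a measured constant:

```python
def approx_model_size_gb(params_billion: float, bits_per_weight: float,
                         overhead: float = 1.2) -> float:
    """Rough estimate of the memory needed to hold an LLM's weights."""
    return params_billion * (bits_per_weight / 8) * overhead

# 32B model at 4-bit: ~19 GB -> doesn't fit in 12 GB of VRAM without
# offloading some layers to system RAM.
print(f"{approx_model_size_gb(32, 4):.1f} GB")

# Full R1 (671B) at 4-bit: ~400 GB, matching the figure quoted above.
print(f"{approx_model_size_gb(671, 4):.1f} GB")
```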
Amazing, what a time to be alive, amazing, low cost, wow, performance, great!
This guy speaks in tokens xD
😂😂
He sounds like he's been sentence-mixed in every video
Open source defeated its biggest competitors 😂
A Chinese company releases one of the world's most powerful LLM and gives it away for free and suddenly everyone is concerned about not being able to criticize China with it 🤣🤣🤣
It speaks volumes about Americans' priorities
China doesn't have an EUV machine because they thought they could just buy one on the open market, so they didn't concentrate their efforts on building one, until now. When China's EUV is born, ASML's life will end; when China's C929/C939 is born, Airbus will lose its vast advantages. In the end, the EU has almost nothing left... in technological fields.
ClosedAI. Honestly, I hope they never dominate.
too late lmaooo
@@Dusty2455433 More like their dominance is ending.
It's like discovering ChatGPT all over again!! What a time to be alive!!
It's pretty amazing that it's both an open-source model and that it was produced by Chinese researchers... a country that we were told by mainstream media was way behind American companies on this tech...
Super-intelligent AI really does feel inevitable at this point, and it looks like it won't be monopolized by a single country. Fantastic 🎉
To be fair, their AI is trained so much on ChatGPT output that it thinks it is GPT-4.
You make it sound like people want this.
@@billr3053 I want this, and many others do as well. Finally, many people can run excellent LLMs locally without paying exorbitant fees for OpenAI subscriptions. It's also a huge win for privacy, as data remains local rather than being processed on servers. Only a few people have the computing power to run the full R1 model locally, but it recently became more accessible with the release of the R1-Distill models. The smallest distill can even run on mid-range phones and beats GPT-4 on several benchmarks.
@@billr3053 We do.
@@billr3053 Yes i do!
I'm not paying for no overpriced closed AI.
I end up spending more time reading its thoughts than its answers, fascinating stuff...
A true revolution! I tried it on some of my more complex hidden & secret prompts and it worked.... fantastic! So cheap too.
This is crazy. I still believe o1 is much superior at analysing big chunks of complex multidimensional problems and producing comprehensible answers. But I was looking for something open source like DeepSeek for a long time.
I watch almost all your videos but never get recommended your channel on my home page. I have to go through the search bar every time to find you… interesting
What a time to be alive!!
It is more open, but it's neither fully open nor Open Source. There is no training data openness.
The thing about AI is that it's not monopolized. Anyone can create and improve it.
I compared two 14B-parameter models at the same quantization (DeepSeek-R1-Distill-Qwen-14B-GGUF/DeepSeek-R1-Distill-Qwen-14B-Q6_K.gguf vs. phi-4/phi-4-Q6_K.gguf) on a local machine (i9-13900K / 64GB RAM / RTX 4090) by having them generate Python code for hand tracking using OpenCV and MediaPipe. While DeepSeek-R1's approach is promising, the phi-4 model consistently outperformed it on this specific task. phi-4 not only successfully generated the code but also corrected errors present in the DeepSeek-R1 outputs. This preliminary test suggests that while DeepSeek-R1 shows potential, further refinement is needed to surpass competing models in this domain.
This is where the 32B version (deepseek-r1-distill-qwen-32b) and Phi-4 14B start to show their difference, for the obvious reason that it's double the parameter count. However, the idea of applying the deepseek-r1-zero recipe to phi-4 is there, which means any base model can be used to add this thought process.
My guess is the model meant for coding is DeepSeek-V3.
Your conclusion is for the highly quantized models, not the bigger ones nor the ones hosted by DeepSeek.
Makes you wonder how good a bigger Phi-4-like model could be.
These are not quantized models of DeepSeek R1. These are Llama and Qwen models that are fine-tuned on answers from R1.
I must have chosen the wrong quantized model, because it's so slow. Does anyone know what I should select to run it locally on a 4090?
The 14B will work just fine.
Try using the distilled models.
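For anyone wondering how to actually run one of these GGUF files locally (e.g., on a 4090), a minimal sketch using the llama-cpp-python bindings might look like the following; the model path is a placeholder for whichever file you downloaded, and the parameter values are assumptions to tune for your hardware:

```python
# pip install llama-cpp-python  (build with CUDA enabled for GPU offload)
from llama_cpp import Llama

llm = Llama(
    model_path="./DeepSeek-R1-Distill-Qwen-14B-Q6_K.gguf",  # your local file
    n_gpu_layers=-1,  # offload all layers to the GPU; lower this if VRAM runs out
    n_ctx=8192,       # generous context, since R1 emits long <think> traces
)

out = llm(
    "Write a Python function that tracks a hand using OpenCV and MediaPipe.",
    max_tokens=2048,
    temperature=0.6,  # DeepSeek suggests ~0.5-0.7 for the R1 distills
)
print(out["choices"][0]["text"])
```

If generation is very slow, the usual culprit is the file not fitting in VRAM and layers silently falling back to the CPU; a smaller quant (e.g., Q4_K_M) often fixes it.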
Thanks for making my wishes come true ❤
Open source is what we need. We have to get these corporations to work with the public to get their AI under GPL or MIT licenses. We need a competitor to DeepSeek. We need to not have everyone under the same AI.
What a time to take bribes!
Historic episode of Two Minute Papers with Dr. Károly Zsolnai-Fehér
Can I have a link to the paper?
I am very afraid, very excited, and motivated by both of those to stay at the crest of the AI wave. What a time to be alive...
Wait, did you run the full DeepSeek R1 locally on your Mac M2 Ultra at 1:15 in this video? Or is it some smaller distilled model?
Isn't the full DeepSeek R1 400GB or something? How is he supposed to run it locally without using a pruned model?
@@ImNotQualifiedToSayThisBut Using a quantization.
The RAM is 192GB on that device!
@@SrIgort Yes, this is what I meant. Of course the full-precision model would be too big, but is it a quantized "full model" or a small distilled model (trained on full R1 outputs)? Does anyone know? And how good is the "best" model you can fit locally (on a reasonable computer) for coding? (Not all code should be shared with web-based models, you know...)
Thanks for the great content, long-time fan! It seems like all this content is just goading the other AI companies to respond. We all know that nobody shows their full hand when they do a release, just the things they want to show. So the response to R1 is awesome, but it also seems (inadvertently?) aimed at Anthropic, Google, and OpenAI, pressuring them to show more of their hand, maybe sooner than they intended.
What's more shocking is that DeepSeek is just a side project of people who own lots of GPUs for their quantitative hedge fund.
It's definitely impressive, although for coding the distilled versions are still giving me subpar responses compared to Claude.
They can't write an AI summarisation app when handed the working example code to work from!
The distilled versions are just fine-tunes of the base models; they didn't go through reinforcement learning. Honestly, calling them "Small R1" is a travesty.
Obviously, it's distilled versions. If you scaled Claude down to the same size it wouldn't do much better.
@DefaultFlame Definitely, just wanted to highlight this, since some reviews are so enthusiastic and the distilled models perform exceptionally well on some coding benchmarks, at least on paper.
Don't use the distilled models then
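For context on what "distilled" means here: as the comment above says, these models are plain supervised fine-tunes on R1-generated responses, with no RL stage. A minimal sketch of such a distillation loop (the model name and the tiny dataset below are placeholders, not DeepSeek's actual pipeline):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder base model; the real distills start from larger Qwen and Llama checkpoints.
name = "Qwen/Qwen2.5-0.5B"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)
optim = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Each sample pairs a prompt with a full R1 response (reasoning + answer).
samples = [
    {"prompt": "Prove that sqrt(2) is irrational.\n",
     "r1_response": "<think>Assume sqrt(2) = p/q in lowest terms...</think> "
                    "Therefore sqrt(2) is irrational."},
]

for s in samples:
    ids = tok(s["prompt"] + s["r1_response"], return_tensors="pt").input_ids
    loss = model(input_ids=ids, labels=ids).loss  # plain next-token SFT, no RL
    loss.backward()
    optim.step()
    optim.zero_grad()
```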
SQUEEZE YOUR PAPERS 😩😩
3:54 How, if most people will be using quantized models, and you can't do anything with a quantized model (except inference)?
Keep it up! Your work is amazing!
Holy shit, will we all just ignore the keylogging part as if it doesn't matter?
I know he mentions it in the video, but how exactly does it apply a keylogger?
Just like on Windows, it's for your security :) You think that's bad? Wait until you realize how much sensitive data your phone's camera and mic pick up and sell to anonymous third parties. I'd say people stopped caring decades ago, but most of the people who fall for it never cared in the first place.
The keylogging is when you use the website version. No idea whether ChatGPT does the same or not, but I wouldn't be surprised if it does. Regardless, you can still download a local version of the model and run it yourself, entirely offline.
run it on your own computer
Oh no. Advertisers might actually give me ads for things I might spend a little more time ignoring.
Inb4 it turns out DeepSeek is just a few thousand Chinese guys sitting in front of a PC, answering questions all day
Listening to your voice brightens my day ❤
0:09 Bro that video is so similar to 3b1b's style that you could sue it for copyright infringement. I know manim is open source but there's no denying that this AI is literally picking up that 3b1b training data.
You know shit's real when the first thing he says is "hold onto your papers"
I was just testing the DeepSeek 14B and 32B models; they work pretty well.
The small models are literally just fine-tuned Llama 3.3 and Qwen models.
Doctor, thanks for another pearl! I wish you could help with RAG systems; I still haven't found any that effortlessly cover all my documents!! Greetings from Portugal! 🇵🇹
HOLD ON TO YOUR PAPERS BOIS!!
How to run this on a midrange laptop?
Thank you for making a video about DeepSeek (which I also requested), you're great!
Thanks for sharing! I think it's funny what you said about the AI images only costing $1, as they look that cheap!! 😂👍
Who will win? OpenAI or open AI?
Whoever is better at lobbying the government.
I love how you say "what a time to be alive!" in every video. It resonates with me ❤❤
Can this write C code?
Can you do one on how these "self-learning" AIs work? Is it that for every query the weights are adjusted globally? Or are they all just collecting our data to train the next models, in a more streamlined way so they can pick up new data?
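On the first question: for today's deployed LLMs, the weights are frozen at inference time; your query does not update the model globally. A toy demonstration of that standard setup:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 4)  # stand-in for a deployed LLM
model.eval()
before = model.weight.clone()

with torch.no_grad():    # inference mode: no gradients, no weight updates
    _ = model(torch.randn(1, 4))

assert torch.equal(before, model.weight)  # the query changed nothing
```

Any "learning" you see between versions comes from the provider collecting data and training the next model offline, not from per-query weight adjustments.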
I tested the 1.5B model because I wanted to make a free APK for people to use on their phones without internet, but the performance was catastrophic. There is a lot of information leakage: on simple questions you get training data in the response. It was really bad. Let me know if anyone had a better experience.
I am really curious about manipulating individual weights and biases to see what models do. What happens if I find a neuron which activates a lot when asked programming questions, then force it to be active while it answers a question about pets? Or, what if I amp up all weights slightly? Does the model act strangely in some consistent way?
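What this comment describes is essentially activation steering, and it is easy to prototype with a PyTorch forward hook. A toy sketch, where the layer, neuron index, and boost value are all placeholder assumptions (in a real model you would hook, e.g., one transformer MLP block):

```python
import torch
import torch.nn as nn

layer = nn.Linear(16, 16)  # stand-in for one block inside a real network

NEURON = 3    # index of the unit that (hypothetically) fires on programming questions
BOOST = 5.0   # assumed steering strength; tune empirically

def steer(module, inputs, output):
    # Force the chosen neuron to stay strongly active, whatever the input.
    output = output.clone()
    output[..., NEURON] += BOOST
    return output  # returning a tensor overrides the layer's output

handle = layer.register_forward_hook(steer)
print(layer(torch.randn(1, 16)))  # activations now carry the injected signal
handle.remove()                   # detach the hook to restore normal behavior
```

Returning a value from a forward hook overrides the module's output, which is what makes this kind of intervention possible without modifying the model's code.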
It is amazing that some people share their scientific work. This moves technology forward much, much faster than closed tech.
Thank you. My friend 😊
And as far as it'll tell you, nothing important happened in the Summer of 1989! Not just history in the making, this Chinese AI is so good it can pull off history in the erasing!
On those so-called "democratic" AIs, you can't ask all the bad questions about Israel. haha
It is not as bad as creating lies and circular references to whitewash all the war crimes committed in Vietnam, Iraq, Afghanistan, and Israel, and the genocide that wiped out millions of Native Americans.
@@fanzhiheng
GPT-4o:
People are saying that Israel is committing genocide against Palestinians because of the large-scale military actions in Gaza and the West Bank, which have resulted in mass civilian deaths, destruction of infrastructure, displacement, and a worsening humanitarian crisis. Here’s a detailed breakdown of the situation:
1. Definition of Genocide
Genocide is defined by the United Nations Genocide Convention (1948) as acts committed with intent to destroy, in whole or in part, a national, ethnic, racial, or religious group. These acts include:
Killing members of the group
Causing serious bodily or mental harm
Inflicting conditions calculated to bring about the group's physical destruction
Preventing births within the group
Forcibly transferring children of the group to another group
Critics argue that Israel’s actions in Gaza fit several of these criteria.
It gave a lot of info which I am not going to put here because YouTube can remove it.
How do you feel about the US-backed cyberattacks on the DeepSeek servers? The timing seems suspicious.
Some bold claim there. Can you provide more details on who in the US backed these cyberattacks?
Can it help me write a book like GPT can?
The sooner we prioritize making Microsoft, Nvidia, Intel, AMD, PlayStation, Xbox, Valve, and Nintendo AI free, the sooner we can bring competition to China. Actual competition that is free and global, like DeepSeek.
The only problem right now is that DeepSeek does not recognize images; it's just bugged.
Based, screw ClosedAI.
My GOAT 2MP concisely explaining everything I needed to know in 5 minutes, when all the other AI channels gormlessly fumble through papers for 40 minutes, not really knowing what they're on about.
thanks for your work
You were on this before anyone else was
no way they opened the AI
I'm not a conspiritard but man that logo is ominous
What does it ominate? Enlighten us
The orca is ominous?
@@PravinDahal the *_killer_* whale would be an ominous choice... yes
so the cartoonish whale is more ominous than openai's strange symbol?
The OpenAI symbol is two Star of David symbols together... don't talk to China about ominous symbols when you can't look at yourself. Israel called.
Honestly, I tried this AI for coding a plugin and it kind of sucks. It gives better results than Claude Sonnet 3.5 sometimes, but, well, the fact that it can think for a long time doesn't mean the quality of the thinking is high. It missed some obvious stuff and doesn't see connections between facts the way Sonnet does without that thinking process.
Which model?
There are 4 that I know of.
I think you want to use chain-of-thought models for concepting/designing/reviewing, and the normal models to write plain code.
My guess is (I've not tried it myself) that if you want to do coding, DeepSeek-V3 may be the better model among DeepSeek's offerings. Both are open source.
Were you using -chat or -coder?
How about explaining the paper???
Just don't ask it how many R's are in "Strawberry Rhubarb"; it freaks out.
I had one run return the correct answer, but it took 108 seconds. I asked the same question worded differently and it said there are 2 R's in "Strawberry Rhubarb".
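For reference, the ground truth is easy to check; counting case-insensitively (so the capital R in "Rhubarb" counts too):

```python
phrase = "Strawberry Rhubarb"
print(phrase.lower().count("r"))  # -> 5
```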
The problem is the AI has been trained to rewrite history.
Impressive :)
Anyone know how to jailbreak DeepSeek?
Bro you sound like a human version of the old doge memes. Much enthusiasm, such emphasis, wow. Please, never change
I tried a complex data engineering task with Claude and GPT. Both failed. R1 gave me working code, which was shocking.
I ran R1 32B locally, but it was way too slow for my use case ;)
Can we get it to file taxes for me now? All these fancy improvements and I still have to figure out how to do this * every year.
That's actually a very good use case; hopefully it can do it soon.
wait, you have an M2 Ultra !?!?
GOOD NEWS FOR NEURO SAMA😊😊😊
The Chinese team that developed DeepSeek deserves an ACM Turing Award, an IEEE John von Neumann Medal, a Gordon Bell Prize, and the AAAI Award for Artificial Intelligence for the Benefit of Humanity.
Nice video. I would write a compelling and entertaining book about the dangers of AI and the big companies that will use it to control the way you think. ;-)
Scam Altmaniac is so done. Bye bye, OpenAI's ridiculous $150B valuation.
If something is free it's likely that you're the product.
Now ask it what happened at Tiananmen Square...
Umm, DeepSeek "thinks" today is October 10, 2023. Wake me up when this "AI" business can get the most basic things right, like the number of fingers on a human hand or today's date.
But it doesn't read big documents, and it's not clear how to upgrade; not exactly better.
Open AI OpenAI
How do they censor stuff that's against CCP values?
Ask it about the Tiananmen Square Massacre.
Right as we thought AI was plateauing, China comes out with a breakthrough, lol.