I'm so glad to see ML news back in action with more regularity. You've got actual knowledge and credibility that matters for presenting info in a rapidly crudifying space, the scene is filling up with empty influencer know-nothings, and I want the straight dope and technicals. Thank you.
Yannic's channel was my first experience in ML and AI news when GPT4 exploded. Yannic is the real deal, one of the best and most reliable sources available.
The sunglasses are the most important, followed by the humor; but yeah, okay, I appreciate the information as long as it doesn't interfere with my fandom.
Yeah same here. I know you are busy with open assistant and stuff like that (keep those things up) but we need your AI news and paper reviews! How else am I gonna move on with my own AI project. Half the technology I'll need is still buried in some obscure paper somewhere in the paper pile!
For the Grok model it is worth noting this has no fine tuning, and its performance is bad. I want to give Elon credit but this seems like more of a performative release than a real contribution.
Open-sourcing Grok was cheap; it was a basically useless model in terms of commercial value.
and useless in terms of research since it's JAX
Inflection put out a pretty good product with Pi. The Pi chatbot is tuned to being a companion and even as a kind of therapist. Giving an empathetic digital face to Microsoft will not end well.
The robots are humanoids because the tools have been developed around the human topology, so if you want to have a robot in the future to interface with these tools, humanoid is the most optimal form.
I generally agree with you but there’s also the marketing aspect of it.
WHEN AND WHERE SHALL WE GET THE CAT ROBOT GF!!!!
This is AI world, YOU are the cat.
I'm always happy to see your videos. You give informative breakdowns without getting lost in the sauce about how whatever minor new improvement shocks the industry and marks full on AGI.
Thank you
You joke...but I think mandatory disclosure of AI is the easiest regulatory hurdle that will make a huge impact imo.
Ideally people would disclose such things without regulatory compulsion...but imo it is important, to me at least, that such disclosures exist moving forward.
Just like how the cookie disclosure has had a huge impact, and California's cancer warnings?
@@johnflux1 GDPR had a huge impact; even American companies had to adapt.
@@johnflux1 Yes it did make an impact, since now a typical Norman knows "cookie" is some scary legalistic jargon and not just a magical internet feature which "just works".
i love a good Monday ML news on Tuesday that I watch on Wednesday, cheers Yannic!
Make sure you think about the downsides to open sourcing frontier models also. Open source might be the best way but it's not clear to me the benefits are worth the risks.
I'm all for EU cookie-nags. Just say no, and I always do. Companies maximize their data hogging anyway, so I don't see why there would be less data by default in an alternative universe, where no EU cookie law exist.
As a fellow AI engineer/enthusiast of course I'm happy when we can just release cool stuff, but Europe is going in the right direction by starting to regulate the field. While AI is amazing there are still countless possibilities of misuse, overapplication and even rights infringement (I'm thinking privacy), so regulating big corps early on is necessary.
Or we can just take the US route and have sugar in our bread lmao
@yannic there is a large FP4 literature now (NF4, QLoRa etc) with hundreds of QLoRa models on HuggingFace
Humanoid robots also make sense if you believe that we can get video pre-training to work from human videos
Great To see you again God Bless You 🫵🏼❤️
I don't know if I've missed your coverage, but 1.58bit model training / inference source code has been released, which is interesting because of the scaling law suggested by the associated paper.
Humanoid bots: It's about training. If you want to show a bot how to do something, it needs to be able to follow your example. If it's not a humanoid, it has to solve every task in existence from scratch. If it's humanoid, it can learn from human actions.
Great video. Thanks for the detailed and broad content! ❤
Amazing to hear they finally open sourced Grok-1. No doubt given the channel history you will build it from scratch and validate it matches the distributed weights and doesn’t have any sleeper agents, etc, as you can do with any good open source project. Right? We don’t just have to take the word of the guy that has repeatedly lied and misled many? That’s the power of open source, right? Trust, but verify.
23:30 c'mon Yannick, use the original for "On the internet, no one knows your dog" which is Peter Steiner's 1993 cartoon in The New Yorker.
As I said last week, your curved movie-screen topic thumbnails should snap into a full screen view, not whiz off the side.
Can you please make a video explaining this paper on anomaly detection "Asymmetric Student-Teacher Networks for Industrial Anomaly Detection"? It will be a great help😊
Grok=314B, Pi=3.14..., I assume this is deliberate?
You sure did some deducing there detective
Given that we are talking about Musk, I would guess his next model will have exactly 3141B parameters (and the next next one, 31415B). And that will be a hard requirement given to his engineering team...
naah its going to be 420B first
The performance is quite poor, it's useless to them (and to the vaaast majority of people). If they had a SOTA model, I wonder if they would open-source it.
Also: it's an MoE model, not all parameters are active at the same time. Expect it to require about the same compute as an 80b model.
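For intuition on why an MoE runs much lighter than its total parameter count suggests, here's a back-of-envelope sketch. The 314B total / 8 experts / 2 active figures come from the released specs; the fraction of parameters living in the expert FFNs is my own guess, not anything xAI has confirmed.

```python
# Back-of-envelope sketch of active parameters in an MoE model.
# ASSUMED: ~95% of parameters live in the expert FFN layers; only the
# total count, expert count, and experts-per-token are from the specs.

def active_params(total, n_experts, n_active, expert_fraction=0.95):
    """Parameters actually touched per token: dense shared layers
    (attention, embeddings, norms) always run; expert FFNs run only
    for the n_active selected experts."""
    expert_params = total * expert_fraction
    shared_params = total - expert_params
    return shared_params + expert_params * (n_active / n_experts)

print(f"{active_params(314e9, 8, 2) / 1e9:.0f}B")  # prints 90B
```

That lands in the same ballpark as the ~80B-dense-equivalent compute mentioned above; the exact figure depends on the assumed expert fraction.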
there's actually a paper that shows that ~1.5 bits per weight is enough
And that paper is probably good for wiping butts.
@@Sven_Dongle why?
@@ClaudioMartella If you think there is no difference between 4- and 8-bit quantization you haven't worked with these models at all. And what even is
~1.5 bits per weight? 3 bits per 2 weights? It stretches credulity.
@@ClaudioMartella That paper came from the authors of "RetNet: the successor to transformers" (~half of the authors are the same, to be more exact).
This time they didn't want to be just a mere successor and called it a new era. They are high on their own farts.
Oh, and RetNet was such a successor that the only model released was a pile of garbage and got unreleased. The 1.58-bit paper didn't even compare itself to the "successor" of transformers that they built themselves.
And the authors never released the weights.
I expect their next paper to be called "The bestest revolutionest architecture in the multiverse", with loud claims and no impact.
@@Sven_Dongle You don't seem to understand the difference between the possibility that some model can perform well with 1.5 bpw and the ability to cast any model (presumably pretrained with 16 bpw precision) into 1.5 bpw without losing quality.
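To the "what even is ~1.5 bits per weight?" question above: the paper being debated quantizes each weight to one of three values {-1, 0, +1}, and log2(3) ≈ 1.585 bits of information per weight. A minimal sketch of the "absmean" ternary scheme that paper describes (my simplification, not the authors' code):

```python
import numpy as np

def absmean_ternary(w, eps=1e-6):
    """Quantize weights to {-1, 0, +1}: scale by the mean absolute
    value, round, clip. One shared fp scale survives per tensor."""
    scale = np.abs(w).mean() + eps
    q = np.clip(np.round(w / scale), -1, 1)
    return q, scale

q, scale = absmean_ternary(np.array([0.4, -0.05, 1.2, -0.7]))
# q is now entirely in {-1, 0, +1}; scale carries the magnitude
```

Note this only illustrates the *representation*; whether models trained this way from scratch match fp16 quality is exactly what the thread is arguing about, and it is a separate question from casting a pretrained model down to 1.58 bpw.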
12:45 - All the technology and buildings and tools that mankind has created are made for the humanoid form. Making humanoid robots would mean that they could use everything we use as well. E.g. you want your floor cleaned? Get a robotic vacuum cleaner? No, better give your ordinary vacuum to a humanoid robot so that it cleans the floor with it.
The Google post was to help the SEO shamans start hallucinating in the right direction. Without it, they are prone to all sorts of fun thinking. Blog posts like that keep them mostly sane.
SEO Shamans :)
23:00 It used to be "one guy" they all called, but now they just get LLMs to hallucinate the answers. 😆
Rest of the world: develops AI and makes it open source
The EU: we don’t do that here
very competent …. single source 1400 lines of code 🎉😂
Can we get a 250k subscriber special on ML meme reviews please?? 😃
Khan Academy was one of the first reported partners using OpenAI, so I assume the gpt-khan model name refers to a Khan-specific application rather than just a trained-on-Khan-data nuance.
Love your joke in EU AI Act, watched twice 🤣🤣🤣
3:27 Half a bit on, half a bit off, you're half way to quantum computing!
In India, the government has considered a non-regulatory approach, emphasizing the need to innovate, promote and adapt to the rapid advancement of AI technologies.
The EU should do the same. Innovation is more important than rules about regulation; otherwise we will lag behind the US and China in development. The politicians have no understanding of technology at all, which has major consequences for knowledge technology and our future generations. Many thanks for your great update.
The EU is fully aware that its innovating capacity is miles behind American or Chinese massive corporations, there's just no way Europe will win the research battle in an open field. I think regulating and sort of "protecting" the market from these corporations is a good move, especially since the applications of AI will keep covering more and more ground in the next decades
@@matteofrattini9133 Not so sure about that, the US is the biggest business center, which is different from being the most innovative.
European regulation is likely to influence guidelines across the globe. It's not just about protecting the market, but ensuring some level of safety with a technology that is dangerous.
@@technolus5742 I fully support Europe's effort to regulate a potentially dangerous field. I'm just saying it's also a strategic move, since Europe could never achieve technological dominance in an unregulated field against powerhouses like the US or China
The models are the private deployment capacity for those companies.
🎉 You make people like Mondays
"employ, exploit, extinguish."
Wow
Is the groq inference code as fast as the one they use for hosting?
Hi Yannic, Can you please confirm that you're coming to WebExpo? I've tried a few times to get in touch. Hope to get a reply, and I'm sorry for chasing. Please let us know.
I experimented with rolling my own 4-bit float encodings, and the lack of precision made them challenging to use. Maybe it will be useful with the first several passes of quickprop.
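For reference on how coarse 4-bit floats get, here's a sketch of one common layout (sign / 2 exponent bits / 1 mantissa bit, the E2M1 convention; other 4-bit encodings like NF4 use a different value grid):

```python
def fp4_e2m1_values():
    """Enumerate all 16 values of a sign/2-exp/1-mantissa FP4 format
    (one common E2M1 convention; exponent 0 is treated as subnormal)."""
    vals = []
    for bits in range(16):
        sign = -1 if bits & 0b1000 else 1
        exp  = (bits >> 1) & 0b11
        man  = bits & 0b1
        if exp == 0:
            mag = man * 0.5                       # subnormal: no implicit 1
        else:
            mag = (1 + man * 0.5) * 2 ** (exp - 1)  # normal: 1.m * 2^(e-1)
        vals.append(sign * mag)
    return vals

print(sorted(set(fp4_e2m1_values())))
```

The whole positive range is just {0, 0.5, 1, 1.5, 2, 3, 4, 6}, which is why per-tensor or per-block scaling factors are doing most of the work in any usable FP4 scheme.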
That list of model names appear to have been thoroughly scrubbed... Catbox 404.
I realize nobody probably cares, but GR00T is spelled with zeros, not the letter O. GR00T vs GROOT. It jumps out once you see it printed on a page that shows a capital O, like at 4:00.
Thanks for telling me.
...and me specifically, given that you yourself figured not many would care.
@@Brahvim Just trying to head off the "who cares?" default response from YT comments. Personally I think it is neat to discover things by paying close attention (in this case, visually).
@@erikjohnson9112 At least it isn't a "secret" anymore, thanks to you! People WILL know it's a `0` and not the letter 'O' now!
...
People like _me,_ at least!
It really is great. Keep it up, dear internet stranger!
We should use genetic algorithms to evolve the ideal robot form factor based on parts, cost, and human built environments and tasks. Maybe they won't look like humans!
Microsoft never stopped doing EEE, did it?
6:09 Welp, guess josh is getting fired
How does the 1-bit embedding work? Can someone explain? That would mean things are encoded into binary, right? So does that mean precision doesn't really matter throughout the machine learning model? That at some point it becomes like a pigeonhole thing within the model?
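A toy answer to the question above: in these schemes only the *per-weight* information is binary (the sign); a full-precision scale per row, plus higher-precision activations, is where the remaining precision lives. A minimal illustration (my own sketch, not any particular paper's method):

```python
import numpy as np

# "1-bit" weights keep only the sign of each entry; one fp scale per
# output row recovers the overall magnitude. Precision survives in the
# scales and the activations, not in each individual weight.

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))                    # full-precision weights
scale = np.abs(W).mean(axis=1, keepdims=True)  # one float per row
W_bin = np.sign(W) * scale                     # binarized reconstruction

x = rng.normal(size=8)
print(W @ x)      # original matvec
print(W_bin @ x)  # coarser, but roughly the same direction
```

So it's not that precision "doesn't matter"; it's that most of a trained network's information appears to be in which direction each weight pulls, and the pigeonholing you describe is absorbed by the scales.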
I love you and ML news so much
"First major AI law passed by European LLaMakers"
FP4 maybe we just need 0/1
Wow, does that list expose companies that fine tuned models with OpenAI?
ai banner would be great.
Samay's English has gotten good xd
For one I think the AI act is a good thing for people !
Very very cool AF😎
Those look like OpenAI API adapters. I.e., the APIs that OpenAI can access.
I guess for every GDPR we get a cookie-esque law.
I like how you don’t jump on the hype train as soon as it passes by. I see you were here before it was cool.
"These people had wey too much precision" 🤣
appreciate the information/evaluation density.
Well 1-bit is closer to a 'biological' activation function 🤷
Question is how many groqs it takes to run grok.
22:40 - I hope Google bargained an Android release of iMessage into the deal.
Terrifying robot movement at 4:37
RIP, Josh.
Josh sweating like crazy right now
I'm excited for the future.
I'm not sure if the future needs our support. But that's probably just me. I have a hard enough time maintaining interest in the present, being excited about a future is way beyond my abilities.
If a robot costs more than a humans year salary, it's not worth it.
Yet...
Depends on the upkeep costs and how long it remains how useful, right? Especially if there’s a “rent-to-own” option!
@@drdca8263 I was assuming an average lifetime of 1 year. Sabotage by human coworkers is definitely going to be a thing.
Great work! Keep it up!
Flat repo for grok is hard asf
People should confirm agreement on cookies just once - when they connect to the internet. It should be written in the agreement with provider.
A person who has never used the internet will never know what "agreement on cookies" even is, especially since the agreement can be constructed in misleading but technically correct ways, aka "do you want to create accounts on the internet?". Which would of course implicitly include an account on the provider's site; bonus points if said provider holds a de facto monopoly in the area.
That's akin to implicitly agreeing to be stabbed by a knife just because you bought one for your kitchen.
Very cool ❤
Nice
I kinda feel bad that we got to a point where every time someone mentions Elon, he has to first say "I am not endorsing him / whatever you think about him / yeah he is bad, but this is good". He is really reaching Trump-level rep, isn't he?
He's done it to himself. No one told him he should try as hard as possible to ruin his own reputation.
I love how he said "whatever you think of him" 2x.
Yet you added versions he didn't say that imply or directly say Elon is more on the bad side.
Which is not what he said in the video.
Very weird to read your comment.
It's like subtly changing what happened... but in a way where it doesn't overtly look like lying unless you rewatch the video clip.
Please try to be more accurate in the future.
I mean it's his own fault, he's a lying scumbag
'Whatever you think of him' is a pretty unsubtle code. It's a polite way of distancing yourself from a person.
Added with endorsement for his actions, and he's taking a pretty neutral stance. Which is a wise public stance around polarizing figures.
Regardless of which pole he's actually on, it's foolish to spend his social capital on the subject. That's the message he's actually sending.
@@Dogo.R you can't be that naive. Elon Musk has pretty much forced people to pick sides when forming an opinion on him.
why not 420 billion parameters ?
On open source Sora like models.
Imo there will be no parity because of compute.
The results I have seen from Sora indicate to me that it is not trained on raw video data alone, but rather on a hybrid with synthetic data that uses either a NeRF or Gaussian-splat supplement to create that level of fine control and temporal fidelity.
As impressive as Sora is, it still seems obvious to me that 3D rendering is part of their training pipeline, and imo the most reasonable explanation of how to get large amounts of synthetic data there is with NeRF / Gaussian splatting.
lol...I am not surprised you are impressed with grok considering your 4chan llm.
But as far as I am concerned they are about par in terms of impressiveness relative to state of the art.
Also... not sure how reliable "popularity graphs" are concerning online tools... Whang did a cool vid about how such voting metrics are easily manipulated... Boaty McBoatface examples come to mind.
11:10
I'm slightly confused here, because it sounds like you think the lawmakers can shift their focus onto developing technology instead.
But I know you don't think that, I'm just not really sure what your point is.
Lol Grok-1 “will probably require 69 GPUs to run”, haha at least that many. Probably more like 420 😂
And it's 256 GB for the weights, so you probably need a terabyte of RAM, then a combined terabyte of VRAM across the GPUs.
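The arithmetic behind that, for anyone checking (pure weight storage for 314B parameters; activations, KV cache, and any framework overhead come on top):

```python
# Back-of-envelope: bytes needed just to hold 314B weights at
# various precisions.
params = 314e9
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: {params * bits / 8 / 1e9:.0f} GB")
# 16-bit: 628 GB
#  8-bit: 314 GB
#  4-bit: 157 GB
```

So a ~256 GB checkpoint sits between the 8-bit and 16-bit figures, consistent with mixed or reduced precision for the bulk of the weights.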
Most of the Open Source are rather just Open Sores
The transformer architecture consists of multiple decoder layers, each containing:
- Multi-head attention (MHA) with query, key, and value projections
- Mixture of experts (MoE) layer
- Feedforward layers (linear transformations with activation functions)
- Layer normalization and RMS normalization
The MoE layer uses the `Router` to compute routing probabilities, which determine the experts to route the input to. The selected experts process the input independently, and their outputs are combined based on the routing probabilities.
The multi-head attention mechanism allows the model to attend to different positions in the input sequence, capturing dependencies and relationships between tokens. The rotary position embeddings (RoPE) enhance the model's ability to capture relative position information.
The transformer model takes an input sequence and applies the decoder layers sequentially. At each layer, the input goes through the MHA, MoE, and feedforward layers, with layer normalization and residual connections. The final output of the transformer is the embedded representation of the input sequence.
The code also includes sharding and partitioning utilities to distribute the model across multiple devices for efficient training and inference.
Overall, this transformer architecture incorporates mixture of experts layers and rotary position embeddings to enhance the model's capacity and ability to capture complex dependencies in the input sequence.
Bro what are you doing commenting on yt, go build West World with that big brain or please be the next president
@@WiseCheese587 It's just a regurgitation of the Grok-1 specs, genius.
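As a concrete illustration of the MoE routing step described in the architecture comment above, here's a minimal top-k router sketch (my own simplification, not xAI's actual code):

```python
import numpy as np

def route(x, router_w, k=2):
    """For each token, pick the k experts with the highest router
    logits and softmax over just those logits; expert outputs are
    then combined with these weights."""
    logits = x @ router_w                            # (tokens, n_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]       # indices of k best experts
    sel = np.take_along_axis(logits, topk, axis=-1)
    probs = np.exp(sel - sel.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    return topk, probs

rng = np.random.default_rng(0)
topk, probs = route(rng.normal(size=(3, 16)), rng.normal(size=(16, 8)))
print(topk.shape, probs.sum(axis=-1))  # (3, 2) and all ones
```

With k=2 of 8 experts selected, each token only runs a quarter of the expert FFNs, which is the mechanism behind the active-vs-total parameter gap discussed earlier in the thread.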
Grok-1 256GB for the weights. Good luck.