STRAWBERRY - what OpenAI HIDES from us.

  • Published: 4 Jan 2025

Comments • 119

  • @zxwxz
    @zxwxz 3 months ago +6

    "The key point mentioned here does not explain the source of o1's reasoning ability. Tokenization is something that basically every LLM (large language model) can do, and most players in the agent field can assemble their own. The real breakthrough of o1 is that it implemented AlphaGo Zero-style reinforcement learning in the LLM domain, which is the true source of this new paradigm. Additionally, the integration of MCTS (Monte Carlo Tree Search) is necessary to overcome existing limitations. O1 is not in the form of so-called multi-agent; rather, it is a recurrent reasoning flow with shared memory and logical coherence. The purpose is to resolve the instability of long-tail problems and break through into unknown domains. O1 also indirectly unlocked the limitations of data walls. Internet data, at best, presents memory-related issues rather than reasoning issues, and there's an overwhelming amount of garbage data. After o1's breakthrough, the questions that strawberry now sends to Orion might be difficult even for non-experts in the field to understand."

  • @xscale
    @xscale 3 months ago +4

    Claude passes the rephrase test just fine.

  • @DCinzi
    @DCinzi 3 months ago +6

    Good video.
    However I don't understand why people make it so complex. English is a language, math is a language, coding is a language. Every idea maps to a word, a number, a line of code... language describes thought. All that needs to be said is that LLMs have their own language into which they translate ours, and we are just unsure of what it is, that's all.

    • @Snes64
      @Snes64 3 months ago +3

      But the problem is that it still doesn't have intelligence. I still have to inject my own intelligence into it.

    • @Scripter_story
      @Scripter_story  3 months ago +2

      @DCinzi That's exactly the point of the whole video. Thanks for condensing it into one paragraph.

    • @DCinzi
      @DCinzi 3 months ago

      @@Snes64 I don't think we really want to get to the point where that's no longer true... at least I think.

  • @Random77773
    @Random77773 3 months ago +11

    Great video.
    Practically speaking, how long do you think until the majority (50-60%) of software engineers are replaced entirely by state-of-the-art LLMs? When will that happen?
    Since no company is publicly stating that layoffs are due to AI.

    • @Scripter_story
      @Scripter_story  3 months ago +4

      @Random77773 Thanks. 2-3 years before they stop hiring junior / mid SWEs altogether.

    • @DCinzi
      @DCinzi 3 months ago +2

      @Random77773 Depends how long the install takes.
      The way I see it, right now we are in the install stage... we are all so amazed by the data transfers happening and the machine on the other side being able to take that data, unpack it, and communicate back. But really all we are doing is installing the program. When that is done... well, the change is swift, and we get to see the real program in action.

    • @HoD999x
      @HoD999x 3 months ago +2

      Developers will write prompts instead, at least in the next step.

    • @erkinalp
      @erkinalp 2 months ago

      @@HoD999x Oh, natural language programming, our forever dream of a programming language.

  • @michaelnurse9089
    @michaelnurse9089 3 months ago +6

    The quality of the videos is getting better and better. Also, it's so much easier to listen to without music behind the talking.

    • @Scripter_story
      @Scripter_story  3 months ago +2

      @michaelnurse9089 We are learning)))

    • @Scripter_story
      @Scripter_story  3 months ago

      @michaelnurse9089 Hope politics treats you well.

  • @SourceOfL
    @SourceOfL 3 months ago +4

    After my interaction with o1-preview, I came to the conclusion that all they did was add agents that criticize the AI's answer; after a few cycles it gives you the answer... Secondly, they trained it on 70% generated data. I think this is really stupid: instead of trying to understand the limitations, they just scale the model, thinking that if it's big enough it will become AGI... The question is why people don't need to study the entire internet to understand simple things.

    • @realitytwist-blue
      @realitytwist-blue 3 months ago

      It totally makes sense, as CriticGPT was launched earlier.

    • @spiker.c6058
      @spiker.c6058 3 months ago

      Saying they're stupid is like insulting all the research talent and resources they have! This is literally a new paradigm shift.

    • @SourceOfL
      @SourceOfL 3 months ago

      @@spiker.c6058 I said the idea of generating 70% of the training data is stupid. I don't know them, so I wouldn't call them that.

  • @Rich65501
    @Rich65501 3 months ago +3

    This is my favorite AI channel. Thank you, Scripter.

  • @adoraduca
    @adoraduca 3 months ago +2

    Vector communication between models is an interesting idea. Just thinking: the AI could even develop a highly efficient vector-based programming language.

    • @Scripter_story
      @Scripter_story  3 months ago +3

      @adoraduca
      Thanks. Re a vector-based programming language, I think it's an inevitability.

  • @francisco444
    @francisco444 3 months ago +2

    Happy Friday, Scripter! Just a gentle reminder to pause, take a deep breath, and enjoy the present moment.

    • @Scripter_story
      @Scripter_story  3 months ago +1

      @francisco444 Doing it right now)) In the flow...

  • @jdrake411
    @jdrake411 2 months ago +1

    I don't know. I just tried the rephrase test with ChatGPT, Perplexity, Claude, Gemini. They all passed with flying colors.

  • @nanuqcz
    @nanuqcz 2 months ago

    The video topic starts at 3:15.

    • @Scripter_story
      @Scripter_story  2 months ago

      @nanuqcz Agree. We're still learning. Thanks for the comment - it helps us.

  • @samvirtuel7583
    @samvirtuel7583 3 months ago +2

    It makes the same mistakes as a classic LLM, and it hallucinates in the same way... therefore Strawberry is necessarily based on GPT-4; it simply decomposes the original prompts according to a specific algorithm.
    Half the time it answers this question wrong: Jane has 1 brother and 2 sisters; how many sisters does Jane's brother have?
    Likewise for this question: how many 'r's are in the word strawberry?
    Conclusion: it's not there yet. (A quick check of both questions follows this thread.)

    • @RadiantNij
      @RadiantNij 2 months ago +1

      Exactly, I've proven this by using an o1-type system prompt on GPT-4o, and it goes through the thinking process the same way. It just thinks longer, which allows it to produce better results.
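
      Both test questions above have unambiguous ground-truth answers; here is a purely illustrative sanity check in Python, added for reference:

```python
# Ground truth for the two test questions discussed above.

# Jane has 1 brother and 2 sisters. Her brother's sisters are Jane plus her 2 sisters.
jane_sisters = 2
brothers_sisters = jane_sisters + 1      # = 3

# Count the letter 'r' in "strawberry".
r_count = "strawberry".count("r")        # = 3

print(brothers_sisters, r_count)         # 3 3
```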

  • @GraphicdesignforFree
    @GraphicdesignforFree 3 months ago +2

    Very interesting!

  • @azjaguardesign
    @azjaguardesign 3 months ago +2

    When prompted to re-phrase a previous answer (“respuesta”) … The new #Strawberry-AI, as opposed to earlier models constructed by the “Open AI” company, can now successfully navigate and produce an appropriate, intelligible alternative response sans selected keywords from the original response. And, that’s “replacement theory” on steroids according to Señor Scripter. This ability to innovate “on the fly” denotes a knowledgeable being far beyond Alan Math-i-son Turing’s original qualifying machine constraints. 👽 5:50

    • @Scripter_story
      @Scripter_story  3 months ago +1

      @azjaguardesign Thanks from Senior Scripter)))

  • @DT-ss3ro
    @DT-ss3ro 2 months ago

    Why did you take down the last Friday video?

  • @mahakleung6992
    @mahakleung6992 2 months ago +1

    Very thought-provoking!

  • @mahakleung6992
    @mahakleung6992 3 months ago +1

    But there must be more than just high-performance agent communication, as that only gives you speed. Yet you are saying that Strawberry defines a new class of functionality. Based on what you revealed, just as scale leads to emergent properties, speed leads to emergent properties. But intuitively that seems harder to swallow than scaling. Something is missing here. Not simply a new representation, but how it is manipulated. Are we back to the AI and modeling of thought of the 1990s, now that we have compute and scaling?
    You have done very well. You got the hook and closed with the cliffhanger. On to 20K!!! WELL DONE.

  • @mahakleung6992
    @mahakleung6992 3 months ago +2

    You made it! Going to watch now!

  • @schrodingerbracat2927
    @schrodingerbracat2927 3 months ago

    My question is: is it a Hilbert space? Or a non-linear space?

  • @marshallodom1388
    @marshallodom1388 3 months ago

    Inside our brains there's a multi-dimensionality to every concept we are aware of. It's not written in binary or with letters. There's a map of it being built online, and just like agents speaking in vectors, someday we might tap into our own language center and transfer these concepts directly to and from computers, being able to "read" our minds, record our dreams, or telepathically communicate in digital form with each other or with AIs. Sad to say, we probably won't get to, now that we've already given biased black boxes free rein over our language. We won't be controlling our own destiny from this point forward.

  • @MrSniper2k7
    @MrSniper2k7 2 months ago

    Thanks for always keeping us updated on this technology

    • @Scripter_story
      @Scripter_story  2 months ago

      @MrSniper2k7
      Thanks for being with us)

  • @aware2action
    @aware2action 1 month ago

    How difficult is it to come up with a trigger to identify a rephrasing request (there are fewer than 100 synonyms for it in a thesaurus)? 🤔 Using that as a trigger to proceed with a complementary vector DB should solve two issues: one, trying to be a smart impostor with reasoning🤞;
    another, handling scalability through division. Additionally, by verifying the output from each complementary vector DB, hallucinations can be reduced when accuracy is needed. In the end it is all about faking it until there is no perceived difference.
    I don't think GPT-based LLMs will evolve into a conscious AI anytime in the near future🤞.
    What is interesting is the choice of the name strawberry, which has many seeds on the outside instead of one inside, as in most fruits🤯. Just some thoughts extending what is being said.❤👍

  • @IvanToman
    @IvanToman 2 months ago

    I'm not sure you can use a test intended to detect human imitators on an AI. That sounds to me like using a tape measure to determine someone's weight. It might be approximate, but since it's the wrong tool, it's not proof of someone's weight.

  • @jsalsman
    @jsalsman 2 months ago

    I've stopped using o1 because for more than a dozen of my practical assistance requests I've compared so far, 4o does as well or sometimes better in a tiny fraction of the time.

  • @talleslas
    @talleslas 3 months ago +1

    Unfortunately "Open AI" is "closed AI" nowadays, so all we can do is speculate.
    When you mention that more than one model is involved in o1, is this just a theory based on your usage, or do you have actual evidence that lets you conclude that?
    In my mind o1 is just the good old-fashioned "think step by step" chain-of-thought technique applied/fine-tuned on top of the existing model (a minimal sketch of that baseline follows this thread)...

    • @Scripter_story
      @Scripter_story  3 months ago +1

      @talleslas You are right, it is a theory, a speculation. But it is based on a fair bit of analysis and intuition.
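
      As a point of reference for the hypothesis above, this is roughly what a plain "think step by step" chain-of-thought wrapper over an existing chat model looks like - a minimal sketch assuming the OpenAI Python client and a generic model name, not a claim about how o1 is actually built:

```python
# Minimal chain-of-thought wrapper: ask an existing chat model to reason step by
# step before answering. Purely illustrative of the baseline discussed above;
# the model name and prompt wording are assumptions, not OpenAI's internals.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM = (
    "First think through the problem step by step inside <thinking> tags. "
    "Then give only the final answer inside <answer> tags."
)

def cot_answer(question: str, model: str = "gpt-4o") -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(cot_answer("Jane has 1 brother and 2 sisters. How many sisters does Jane's brother have?"))
```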

  • @MyOkman
    @MyOkman 3 months ago

    So removing the encoder and decoder layers from the communication between the agents is supposed to create a revolution?! How is that?
    If we have two models, they probably have two different encoding/decoding systems, so you can't take the vector output from one model and plug it straight into the other (it is like a Chinese speaker talking to a German).
    Could you make a video about why exactly you think that removing the encoder/decoder layers from the communication between the models is supposed to make a revolution? (One toy way to bridge two different vector spaces is sketched below.)
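
    On the point above about two models having different encoding/decoding systems: one standard workaround is a small learned adapter that maps vectors from one model's space into the other's, so no text round-trip is needed. The sketch below is a toy with made-up dimensions and simulated paired data, not OpenAI's design:

```python
# Toy illustration of a learned adapter between two different vector spaces.
# All dimensions, data, and the "true" relationship between the spaces are made up.
import numpy as np

rng = np.random.default_rng(0)

d_a, d_b, n_pairs = 512, 768, 1000           # hypothetical embedding sizes of model A and model B
true_map = rng.normal(size=(d_a, d_b))       # pretend relationship between the two spaces

# "Paired" vectors: the same concepts as represented by model A and by model B.
A = rng.normal(size=(n_pairs, d_a))
B = A @ true_map + 0.01 * rng.normal(size=(n_pairs, d_b))

# Fit the adapter by least squares: find W such that A @ W is close to B.
W, *_ = np.linalg.lstsq(A, B, rcond=None)

# A vector produced by model A can now be handed to model B without decoding to text.
vec_from_a = rng.normal(size=(1, d_a))
vec_for_b = vec_from_a @ W
print(vec_for_b.shape)                       # (1, 768)
```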

  • @El_Mehdy
    @El_Mehdy 3 months ago

    Very informative, thank you for sharing

  • @human_shaped
    @human_shaped 3 months ago

    The re-phrase test is a nice and very simple idea.

    • @Scripter_story
      @Scripter_story  3 months ago

      @human_shaped Thanks. It returns very interesting results, indeed.

  • @fatboydim.7037
    @fatboydim.7037 3 months ago +1

    So are you speculating that OpenAI have also created true agentic AI as well as reasoners? How far away are we from the innovator level of AI, then?

    • @Scripter_story
      @Scripter_story  3 months ago +2

      @fatboydim.7037
      I think we are already on the exponential curve. Which means soon.

    • @erkinalp
      @erkinalp 2 months ago

      @@Scripter_story Isn't Sakana AI's AI Scientist an attempt towards innovator AIs?

  • @deathwishjoe
    @deathwishjoe 3 months ago

    I don't really understand how vector communication helps with STEM problems but not other use cases. Why don't we have graduate-level creative writing or the like? I haven't used Strawberry, but from my understanding it's amazing at math and science, while philosophy and creative writing are about the same as before.

  • @OnigoroshiZero
    @OnigoroshiZero 3 months ago

    Great video!

    • @Scripter_story
      @Scripter_story  3 months ago

      @OnigoroshiZero Thank you. We tried our best))

  • @uber_l
    @uber_l 3 months ago +1

    Then how do you explain the stupid mistakes, if it is an expert in everything and communicates on the ninth level of vectorial heaven?

  • @mircorichter1375
    @mircorichter1375 3 months ago

    Do you have more concrete data about your tests? A GitHub repo or something? I would like to understand the exact criteria for when rephrasing is achieved vs. when it is not. Ideally everyone should be able to replicate your claims.
    Also, we need to understand how such models can be trained, because afaik Strawberry is the training algorithm, not any particular result.
    But I think you are on the right track. Dr WaKu claims something similar, saying that the model has a property they at OpenAI call 'diversity'.
    I'm thinking of it like this: when you generate a batch of results for any task, the batch is 'diverse' in the sense that the answers are truly different angles on the same problem, while for other models the tensors in the batch are 'similar'.
    This would then be very useful for agent-style prompting in the style of the Q* algorithm from the 1973 paper.
    So please share the method, so that we have a way to quantify diversity (one candidate metric is sketched at the end of this thread).
    The hard part would be to find a way to train models to be diverse. Dr WaKu says in a side note that reinforcement learning might not be right, because it tends to let paths converge and hence make them similar.

    • @mircorichter1375
      @mircorichter1375 3 months ago

      Note that I'm not talking about Q-learning but about "The Q*-Algorithm - a search strategy for a deductive question-answering system", and it is not used in LLM training but later, at the level of the autonomous agent that uses the already-trained diverse model to then give something like o1. The diverse output batch can be the nodes in Q*... Just speculation, of course.

    • @Scripter_story
      @Scripter_story  3 months ago

      @mircorichter1375 I wish there was a simple way to evaluate / quantify this. One of the reasons is that you get a very different answer every time, because the model learns from the previous history of your interaction with it. Test cases are not replicable, especially when you move from simple cases like Earth's rotation to more complex subjects. So it is not an objective test. Yet we've found its results convincing.
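
      For what it's worth, one candidate metric for the 'diversity' discussed in this thread is one minus the mean pairwise cosine similarity of the answers in a batch. This is a sketch, not OpenAI's method, and the embedding model name is just an example:

```python
# Sketch of a batch-diversity score: embed each answer, then report
# 1 - mean pairwise cosine similarity (higher = more diverse batch).
from itertools import combinations

import numpy as np
from sentence_transformers import SentenceTransformer  # example embedding backend

def batch_diversity(answers: list[str]) -> float:
    model = SentenceTransformer("all-MiniLM-L6-v2")          # assumed model; any embedder works
    emb = model.encode(answers)
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)   # unit-normalise rows
    sims = [float(emb[i] @ emb[j]) for i, j in combinations(range(len(answers)), 2)]
    return 1.0 - float(np.mean(sims))

answers = [
    "The Earth orbits the Sun once every 365.25 days.",
    "A full trip of our planet around its star takes about a year.",
    "Earth's revolution period is roughly twelve months.",
]
print(batch_diversity(answers))
```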

  • @savesoil7814
    @savesoil7814 2 months ago

    Did you delete your last video on Tesla Optimus? Also, can you upload something on quantum computers, and on why AI was delayed for so long even before chip tech came into the 'industrial market'?

    • @savesoil7814
      @savesoil7814 2 months ago

      *...why AI was delayed even though the idea of neural networks existed way before chip tech...*

  • @MatterandMind
    @MatterandMind 3 months ago

    Thank you. I subscribed.

  • @savesoil7814
    @savesoil7814 3 months ago

    On point

  • @KyleSSamuelson
    @KyleSSamuelson 3 months ago

    What exact “knowledge” is included in these 500 GB?

    • @Scripter_story
      @Scripter_story  3 months ago +1

      @KyleSSamuelson It is a vector-based concentrate of all concepts of our world.

    • @erkinalp
      @erkinalp 2 months ago

      @@Scripter_story yeah, a highly compressed summary of our world

    • @Scripter_story
      @Scripter_story  2 months ago

      @@erkinalp Exactly

  • @dezmond8416
    @dezmond8416 3 months ago

    ChatGPT correctly answers the question about the movement of the Earth around the Sun, and paraphrases it accurately. Mistral does the same. So what are you talking about?

    • @Scripter_story
      @Scripter_story  3 months ago

      @dezmond8416 On some of the simplest questions, like Earth's rotation, other models can give one or two reasonable rephrases (usually by direct substitution of synonyms). But if you test them on complex subjects, you see the difference immediately. For me personally, the clearest and most persuasive subject was the Pavlovian dog phenomenon.

  • @vrynstudios
    @vrynstudios 3 months ago

    I too have a strong doubt. Recent models are not top incremented knowledge base.

    • @Scripter_story
      @Scripter_story  3 months ago +1

      @vrynstudios Not sure that I got your point.

  • @alkeryn1700
    @alkeryn1700 3 months ago +6

    what a waste of time.

  • @patrickmcguinness1363
    @patrickmcguinness1363 3 months ago +1

    This is BS. First of all, it's reasoning, not knowledge, that makes o1 different. Second, o1 is one model, not a group.

    • @Scripter_story
      @Scripter_story  3 months ago

      I wish I had your level of expertise. Can you give more details?

    • @RadiantNij
      @RadiantNij 2 months ago

      It is just thinking/processing longer; this has been proven. I have the prompt for it, which I use in 4o. When you use it in 4o, you see it outputs a long stream of thoughts the same way.

  • @nemonomen3340
    @nemonomen3340 3 months ago +1

    What's the secret? You don't say what the secret is in the entire length of the video.

    • @deathwishjoe
      @deathwishjoe 3 months ago +1

      The secret is that the AI agents in Strawberry aren't communicating via English text but via vectors, which is apparently more efficient and leads to intelligence (a toy illustration is at the end of this thread).

    • @nemonomen3340
      @nemonomen3340 3 months ago

      @@deathwishjoe Kind of sounds like AI word salad to me. I'm not an AI scientist, but I'm pretty sure all AI uses vectors in their models.

    • @Cingku
      @Cingku 3 months ago +3

      From what I understand, OpenAI's secret was to have the model teach itself using its own agents by rephrasing its knowledge so it can differentiate what's wrong from what's not. That's why they said the AI agents communicate not via English but through vectors. We don't need to convert English into vectors if they're communicating among themselves. No wonder the training is so efficient. Previously we used human alignment, which required providing human text that all needed to be converted into vectors before the AI could understand it - such a waste of energy. I mean, for the first model we need to use human text to give the AI knowledge, but once they have that knowledge, scaling becomes unnecessary and just wastes resources.

    • @deathwishjoe
      @deathwishjoe 3 months ago +2

      @@nemonomen3340 I agree with your characterization of it, but the idea was that instead of chatting and communicating via English or whatever, they communicate via vectors: instead of going vector to English to vector, it's vectors to vectors.

    • @nemonomen3340
      @nemonomen3340 3 months ago

      @@deathwishjoe Yeah, just seems like everything he said could've been said in a minute. Even then, I think OpenAI already gave the explanation that the way this model is different is that it isn't trained to predict text like other LLMs. Rather, it's trained to reinforce the underlying internal logic that happens to produce good and logical outputs.
      Maybe that isn't exactly what he's talking about, but there are only really two possibilities. Either there's no secret because it's publicly available knowledge, or he's just making speculations with questionable basis.
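
      To make the "vectors to vectors" idea in this thread concrete, here is a schematic toy with a made-up encoder, decoder, and vocabulary (not OpenAI's architecture), contrasting a text round-trip with a direct vector handoff:

```python
# Toy contrast of two agent-to-agent pipelines: vector -> text -> vector versus
# vector -> vector. The encoder/decoder here are fake stand-ins so the example
# stays self-contained; only the structural point is meant to carry over.
import numpy as np

rng = np.random.default_rng(42)
VOCAB = ["earth", "orbits", "sun", "yearly"]
EMBED = {w: rng.normal(size=8) for w in VOCAB}   # toy 8-dimensional embeddings

def encode(text: str) -> np.ndarray:
    """Toy encoder: average of word embeddings."""
    return np.mean([EMBED[w] for w in text.split()], axis=0)

def decode(vec: np.ndarray) -> str:
    """Toy decoder: nearest single word in the vocabulary (information is lost here)."""
    return max(VOCAB, key=lambda w: float(EMBED[w] @ vec))

thought = encode("earth orbits sun yearly")

# Pipeline 1: decode to text, then re-encode on the other side (lossy in this toy).
received_via_text = encode(decode(thought))

# Pipeline 2: hand over the vector directly (nothing is lost).
received_via_vector = thought

print("loss via text round-trip:", float(np.linalg.norm(thought - received_via_text)))
print("loss via vector handoff :", float(np.linalg.norm(thought - received_via_vector)))
```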

  • @Anonymous-gu8tk
    @Anonymous-gu8tk 2 months ago

    Just another plagiarism engine, only this time using multiple sources (agents) to disguise the original source/s.

  • @dr.farzanroohparvar7337
    @dr.farzanroohparvar7337 3 months ago

    💚💚🤍🤍❤️❤️

  • @PeterSkuta
    @PeterSkuta 3 months ago

    I have the full research paper on Strawberry, and you speak total nonsense. This video is far from the truth.

    • @marshallodom1388
      @marshallodom1388 3 months ago

      Thanks for sharing

    • @PeterSkuta
      @PeterSkuta 3 months ago

      @@marshallodom1388 Yeah, the YouTuber shared real nonsense, far from the truth - like in Star Wars, far, far away... Which means he spoke BS.

    • @marshallodom1388
      @marshallodom1388 3 months ago

      @@PeterSkuta I meant that in a sarcastic way; I can be an ass at times, I apologize.
      What was different in the paper, if you could summarize and share?

    • @PeterSkuta
      @PeterSkuta 3 months ago

      @marshallodom1388 The YouTuber has a misleading title claiming he cracked Strawberry, which he has not, and I don't like clickbait videos that are far from the truth. The research even explains why OpenAI doesn't show the original process... and several other aspects, in a detailed scientific way.

    • @marshallodom1388
      @marshallodom1388 3 months ago

      @@PeterSkuta gotcha, much appreciated Peter

  • @monsieuralexandergulbu3678
    @monsieuralexandergulbu3678 3 months ago

    It's not true that other models don't pass the rephrase test. First thing I tested: grok-2-2024-08-13 and engine-test do perfectly fine on a question about compilers, a topic on which I'm an expert.

    • @monsieuralexandergulbu3678
      @monsieuralexandergulbu3678 3 months ago

      The author should have tested more models; it's sad to see the idea crumbling to pieces so quickly.

  • @x111-c4f
    @x111-c4f 3 months ago

    why so many Z ?!!

  • @alexandrostopalidis9007
    @alexandrostopalidis9007 3 months ago

    I really appreciate your videos! They strike a perfect balance between delivering important information, offering clear explanations, and focusing on fundamental, strategic, and long-term insights. I will say it again and again: THANK YOU!

    • @Scripter_story
      @Scripter_story  3 months ago

      @alexandrostopalidis9007 Thanks, Alexandr!