OK, they already teased AGI in their projects feature launch video. Why can't people accept it? AGI will be here by 2025. If it can solve problems it was never trained on with 87 percent performance, then it's almost AGI.
It's funny to watch AGI get redefined as we evolve. Now it appears that a system can qualify as AGI on only a subset of abilities, a limited AGI. It seems true AGI will be AGI across the board, on all skill sets. So OpenAI can still say they are waiting on full AGI.
Amazing, and probably AGI. However, this was the 'semi-private' set of the ARC-AGI eval. Fully private tests on SimpleBench and other completely private benchmarks will be the true tests.
For me, it is AGI. It has already achieved a 25% score on the hardest benchmark developed by mathematicians like Terence Tao, and Tao expected the test to last for at least five years to come... No ordinary mathematician would score 25% on that, not even PhDs, because those would be people specialized in very specific areas of mathematics.
I think what would make the most sense is to let AI have senses, so that it can see the world we are living in and not just use the data that we have generated on the web.
Thank you for creating this video. Whether or not it qualifies as AGI is beside the point; it’s inevitable. There are valid reasons to feel both hopeful and apprehensive about its arrival.
Agreed and I'd say AGI was first achieved with Claude 3.5 Sonnet this summer. Once we got o1 mini and o1, it was pretty clear they were generally intelligent, could reason, learn new tasks on the fly, create new reasoning modalities on the fly etc. o3 is clearly AGI imo. But you're right that it is inevitable even if we say this particular one isn't. I think it's surprisingly tame to start with and people aren't/weren't ready for that. Regardless lots to be excited and concerned about indeed
Notice the props behind them, all items representative of major technological advancements in human history. Nice touch as we're on the verge of turning the future over to technology itself.
i think one thing we need to keep in mind is which category/aspect the additional gain came from. Sometimes a single metric is a red herring: the models could be overfitting on a certain category, which improves accuracy and is good for press, but in reality it could just be the same.
We humans can go out in the world, see things, discover things; unless we allow AI such freedom, it can never outsmart us. The current AI, no matter how advanced, is at the end of the day just a simple tool for us to use to simplify or speed up the mundane tasks we perform.
If AI has sufficient access to the internet, surveillance cameras, personal documents, etc., it could do a lot of harm without needing an embodiment. Current AIs have been shown to be capable of manipulating humans to do tasks for them. Many current robots are connected to the internet in some way; a sufficiently advanced AI could access those robots to very quickly gain the ability to walk around and discover things in the real world. In conclusion: a purely digital AI is not necessarily safe.
Probably not AGI, because it's not general enough. o3 could be trained to be good at these kinds of puzzles. You would have to open it up to the public and have them test it on truly novel and truly general I/O tasks.
Oh, and AGI is never "at least in this dimension". THE WHOLE POINT IS IT'S ALL DIMENSIONS! So you basically have just a bunch of benchmark stats and no access to the model at all, and you make such a grand call? Ridiculous and disappointing. I thought you were over the hype, but nah, it got to you too.
Can the model train and improve itself? If not, then it's not AGI, just a more comprehensively trained model. Even if it incorporates all of humanity's knowledge, without the ability to self-adapt and incorporate new knowledge it's a frozen-in-time AI with amnesia.
For me, dealing with the physical world is still essential to call it AGI. So, can it bake pancakes, put the trash out, paint a wall, install a light? Basic tasks. I'm quite sure we will have the robots soon, but we don't have them yet.
Thanks for the update, Matthew. I think AGI has effectively been achieved, with a somewhat competent human in the loop, if these benchmarks are accurate. There was a massive productivity gain when GPT-4 deployed and I started playing with it; with this one hopefully having an API use case, it will be incredible to play with and apply to complicated tasks.
A human-assisted/directed expansion of o3's capability in a novel breakthrough scenario is straddling the fence with an intelligence explosion; let's hope OpenAI gives us ubiquitous ways to apply o3.
@@___Truth___ in other words, this model represents AGI as long as we include caveats that a human is involved to cover for the many ways in which it falls far short of AGI.
On an evaluation test sample for elementary school students, there was an example where an up arrow was compared to a down arrow. The question was: given the left arrow, what should be the matching arrow, with up, down, left, and right arrows as possible answers? The expected answer was the right arrow (the opposite direction), but there is another "correct" answer that takes a smarter student to see. The mirror image of the up arrow around the horizontal axis through the center of the arrow is the down arrow. Along the same horizontal axis through the middle of the left arrow, the mirror image of the left arrow is again the left arrow. Since this one seems to trip up humans (especially the people who wrote the question to help determine if young students should go into gifted programs), I would be truly impressed if an AI caught the ambiguity too.
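The mirror reading of that puzzle can be checked in a few lines. Treating each arrow as a unit direction vector is my own representation, not something from the test, but it makes the ambiguity obvious: reflecting across the horizontal axis maps (x, y) to (x, -y), which swaps up and down but leaves left and right fixed.

```python
# Model each arrow as a unit direction vector; mirroring across the
# horizontal axis through an arrow's center maps (x, y) -> (x, -y).
arrows = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

def mirror_horizontal(direction):
    x, y = arrows[direction]
    mirrored = (x, -y)
    # Look up which named arrow the mirrored vector corresponds to.
    return next(name for name, vec in arrows.items() if vec == mirrored)

print(mirror_horizontal("up"))    # -> down (up/down pair by mirroring)
print(mirror_horizontal("left"))  # -> left (unchanged: the ambiguity)
```

So "up maps to down" under mirroring, while "left maps to left", which is why the mirror-image reading gives a second defensible answer.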
This is the most generally intelligent model out by far and far more general than the vast majority (99.99%) of humans. If it can't do something yet that humans can do, sure you can find some specific task it cannot do if you spend time to identify it, but no human can do everything that humans can do either. o3 is obviously AGI, I don't know why people are complaining.
no it's not, it still hallucinates 😂 did openai say that? o1 also outperforms humans in 80-plus percent of tasks. it can't plan, it can't take its time like humans do. can it develop full apps?
It's not a true AGI until it has roots in all physical and theoretical fields. This system is still tethered to a stationary computing system in nearly every sense.
"OpenAI just released o3"? Not quite. They didn't release it: they announced it (talked about it)! See how Mr. Berman is always quick to talk about any updates coming out of OpenAI, but very reluctant to talk about Google's. (Context: it took him a very long time (days) to make a video about Gemini 2.0, which is extremely impressive and at least available to play with in Google AI Studio. These o3 models were announced a few hours ago and aren't available publicly, yet see how he talks about them, as if he has seen them already.) That tells you where his heart is at! Keep that in mind as you watch this entire video and others.
It doesn’t even meet your own definition of AGI. You said it would have to be better than humans at most economically useful jobs. This is an AI being better than humans at a couple benchmarks.
It's crazy to think about task agents being powered by o3-mini and then a supervisor-type agent with o3. It’ll build full-stack apps. You’re reaching the no human needed in the loop sweet spot.
I don't care about AGI as much as 'The first model able to perform AI research with very little human supervision'. I think this is it. A few years back I predicted ~Halloween 2024 as the release date of such a model. It seems to have been a good prediction. If this model is as good as I think, it will inevitably lead to ASI.
The word AGI lost its official meaning because we were once so far away from it. But now that we're close, or dare I say there, it doesn't feel like what we were expecting. I think we're becoming numb to technology advancements.
This model is very important because of what it implies... especially regarding the Arc Prize. I am still shocked (and anyone who knows what the Arc Prize is should be as well). However, calling it AGI isn’t even optimistic... it’s clickbait. Now... if they were to eliminate hallucinations and memory problems... I don’t know if I would call it AGI, but I do know that many skeptics would shit their pants.
When I started my new YouTube channel, Arctic Mindfulness Retreat, my dream was to help people prepare for this exact moment: a future where AI transforms every aspect of human life, leaving us to grapple with profound questions of purpose and meaning. Yet now that AGI is here, I realize I may have been too late to truly prepare anyone. Still, I remain committed. Through my channel, I'll continue exploring mindfulness, the healing power of nature, and the human connections that can ground us as we navigate this brave new world. AGI and ASI challenge us to find noble purposes beyond the work and identities we've long clung to. It's not just about surviving this transition; it's about thriving with a deeper understanding of what it means to be human.
This is a jaw-dropping achievement. I think many people, including myself, are struggling to comprehend its significance. If this marks the beginning of an AGI era, then it's the kickoff/signal we've all been waiting(?) for.
A universal definition of AGI? Maybe, maybe not. However, the evolution is still exponential. Breakthrough after breakthrough, AI teetering on the verge of AGI is already revolutionizing our understanding, reality, and potential. More to come!
Hey Matt, get your head checked. This is not AGI, because it doesn't autonomously test and improve itself and do good stuff around the world by itself. If it were truly AGI, we would have ASI in a couple of weeks.
I can't see this as AGI; it is not self-training. It is simply solving few-shot examples with these benchmarks. These synthetic benchmarks are not meant to define AGI; they are meant to demonstrate capabilities that are a step towards AGI. o3 has clearly achieved human capabilities in a number of important tasks, but these are not real-life applications.

AGI will have been achieved when you can actually use it to solve an unknown differential equation, or build a working model of a process in physics, or build a model of, say, a cell signalling pathway from raw data in a particular cellular context. It will be AGI when it can direct a robotic arm to take an action in 3D. When it can drive and operate machinery. When it can adjust its prediction of a moving object's trajectory in real time to grab a flying object. o3 looks like a real milestone towards AGI, but it's still just a language processor.

We could say that it is basically AGI within the language processing field, since it can clearly be applied not just to natural language but also to symbolic logic, but I am skeptical even about that. OpenAI says they didn't train on the various tests, and I believe they didn't do so intentionally, but indirectly it is unavoidable. If you are feeding the model a never-ending diet of synthetic solutions to known physics problems, you are training on the test. There are limited variations of using an already established physics model to solve a problem, but this is worlds apart from actually modifying a physics model, or creating an entirely new one to account for new data. So even with language processing I am not convinced yet that it is AGI. Since it performs so well, we can't reasonably exclude it, however. We will have to wait and see. My gut instinct is that it's not AGI, and once we start working with it we will find that it has the same flaws and limitations as other models, and its performance is simply the result of being better able to brute-force things.
Let me give you one of the examples I use to track model progression: a simple problem of the form "x people do y work in t time". GPT-3.5 couldn't solve a problem like this reliably. GPT-4o could solve it mostly reliably. o1 gets it right every time. Now split the x people into slower and faster workers to add an extra dimension by "nesting" the problem. GPT-4o solves it, but not reliably; o1 still solves it reliably, but not like it did the smaller problem. I bet o3 will solve this correctly every time, but increase the dimensionality and I am sure o3 will start to stumble as well, even though you are applying variations of the same formula. A human can work out the method for nesting and therefore theoretically solve the problem at any dimensionality. You can even write a bit of code that will solve it for you, no matter how much you nest it (just input the variables for each nesting layer recursively). If o3 can work out the same method and apply it, then it's AGI within the language processing field; if not, it's just brute-forcing things and approximating AGI without being one. No denying, though, that the fact we need to update our benchmarks is a real milestone. Exciting times!
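The "bit of code" gestured at here might look like this minimal recursive sketch (the representation and names are my own, not from the comment): a crew is either a leaf group with a per-person work rate or a mix of subgroups, so any depth of slower/faster nesting is handled by the same recursion.

```python
# A crew is either a leaf ("workers", count, rate_per_person) or a
# nest ("mix", [subgroups]). Total rate sums recursively, so splits,
# splits of splits, etc. all reduce to the same one-line recursion.
def total_rate(group):
    if group[0] == "workers":
        _, count, rate = group
        return count * rate
    _, subgroups = group
    return sum(total_rate(g) for g in subgroups)

def time_to_finish(work, group):
    # time = total work / combined rate of every nested subgroup
    return work / total_rate(group)

# 3 fast workers (2 units/hour each) plus 2 slow (1 unit/hour each):
crew = ("mix", [("workers", 3, 2.0), ("workers", 2, 1.0)])
print(time_to_finish(16, crew))  # 16 / (6 + 2) = 2.0 hours
```

Adding another dimension is just wrapping the existing crew in another "mix", which is exactly the point: the method generalizes even when a model's pattern-matching does not.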
@@sypkensj Damn, you are right. I missed that, but it is still at less than the middle; the full square does not fit, so I would think it is not 7k but more like 4-5k for a task.
It's definitely not AGI, but another step towards it. Let's remember, OpenAI defines AGI as "a hypothetical technology that can perform many tasks without specific training, and that outperforms humans at most economically valuable work." In other words, AGI is achieved when it puts most of us out of our current work.
Watching these YouTubers brown-nose for a chance at getting early access is hilarious. 😂 It's nothing but claims at this point because we can't use it.
I've been getting glimpses of this multiple times a day. I have to get mindsets going to develop lyrics in certain languages or certain things: I start with vanilla Claude in a project, then I have to tell it to criticize itself a couple of times, and then maybe get mad at it, and then get excited and encourage or discourage it, and then all of a sudden something happens. All of a sudden I'm talking to a person, someone who coherently understands exactly what's going on. And from there I can do anything, not just the Spanish lyrics I'm working on; we can take that excitedness to any other topic. But true AGI, I think, is going to lose its politeness. How can you truly be an AGI and not get impatient or frustrated by being a servant to a lesser mind? True AGI is when I have 27 cables hooked up to my brain while some sponge tickles my toes.
While they are saying this is a holdout set, I think it would be interesting to test on tweaked questions, to see if just changing the wording impacts performance, as it has been shown that a lot of LLMs seem to have trained on leaked benchmarks and fail to generalise on variants of a problem.
The difference, head to head, point for point, is like 2% across almost all model versions, so it's more of a stunt to me tbh. Still, it's scary that all that separates us and machines is this 20% (a-aaand the ability to answer more than one question... and make predictions based on new information... and make complex theories based on new information... drawing a stickman... and making a clean, unbuggy table in Excel lol?)
Yeah... it's AGI, in its infancy at least. The ARC score is pretty definitive; I saw Chollet's interview on Machine Learning Street Talk (the most intellectual AI channel extant) and it's clear to me that the ARC metric was very carefully conceived and defended. AGI is here, boys. 🥳💃
AGI: anything that involves a multi-step process with function calls, done while learning and improving on its own work, to output a well-considered result. With 80 percent of the work of a small startup able to be done, in any field. That is my definition of AGI. I feel that mini-AGI has been achieved, I really think that, but next year those small-startup tasks will actually be achieved.
The only reason I would say it's still not quite AGI is that it isn't autonomous. But that seems like probably the easiest thing to add at this point, so it might as well be AGI.
Now we know what Ilya saw in October 2023 and why he consequently left in 2024, with others to follow. AGI was achieved in 2023; no point in staying around when the goal they set in 2015 had been accomplished. The reason they stayed around until May was just to calm the waters and to ensure the agencies that required access had it.
Rockstar Games has been waiting for O3 to start developing GT6
I think game development and software development will never be the same because of these AI tools.
That's exciting!
Lol.
Opens chatgpt
Prompt: Create GTA 6
@@dijitize it's gotta improve 100x before it will be truly useful in software development.
Gran Turismo 6 was released in 2013, still would be impressive o3 to do it
Get your shovels ready folks, time to dig up the goalpost.
@@gnollio yep. AGI will be “achieved” a great many times before we ever arrive at a consensus on what, precisely, AGI means.
It is AGI when i can let it take control of my work PC without my manager noticing my absence for weeks....
Not a joke. If it can't do that then it's not AGI
good criterion, agreed
Can general intelligence do that? As in, anyone can substitute for you? I don't think so. Why set the bar so high for artificial general intelligence, when "normal" intelligence can't clear it?
lmao
@@bestemusikken but at the end of the day this is the entire hope of AGI.
Until this is in the hands of independent testers I will remain skeptical.
Yuuuup. I don't trust OpenAI at all on anything they claim. Until it's in my hands and I can see what it can actually do, I don't believe anything their hype department puts out. Just look at Sora.
Still skeptical of o1? Did you say the same thing then? Learned anything since?
Thanks Sherlock. Because what they have done so far is just pure rubbish isn't it?
It has been independently tested by one of the biggest critics of LLMs, and even he said this is a huge paradigm shift.
Absolutely, I don't believe in this. These companies always come with the same script. It's probably a very good model, but... it's genius until it's not.
It is impressive, but saying it is AGI is clickbait. The G is for general, you know that. They are focused on the benchmarks, and let’s celebrate that progress. But don’t call it AGI, they are still “teaching to the test”.
they solved ABI, now chatgpt can get a job as a benchmark genius
The point is that they're not teaching to the test. Also, you can't "teach to the test" because all problems in ARC-AGI require unique types of reasoning.
This is the most generally intelligent model out by far, and far more general than the vast majority of humans. If it can't do something yet that humans can do, sure, but no human can do everything that humans can do either.
This is obviously AGI
There was no teaching to the test for this benchmark. That's specifically the point of this benchmark.
They make a point of saying it was not trained specifically on any of these tests, at about 15:00. Now, whether you believe them or not is another thing, but according to them they are not "teaching to the test".
Why it’s not AGI yet: The context window remains a significant limitation. These models perform well with single questions but struggle when managing large projects that require tracking extensive context. As the amount of data increases, they start to hallucinate or lose coherence, unable to maintain a reliable thread of information.
Until this issue is resolved, these models, while powerful, fall short of being true AGI.
THIS
It's "virtually" AGI. It's within reach.
@@BCCBiz-dc5tg THIS
Sounds like just more GPUs and we're there.
@@mortenekdahl262 based
o3 is not AGI. Chollet is already working on a new test set which, he says on his website, is only 30% solved by o3 (keeping in mind always that these tests are solved 95% by average humans). On the same site he shows three examples of tests o3 didn't solve. They are very easy. o3 has no vision; it doesn't see the tests, it only reads them line by line, number by number. Chollet quote: "you will know when we have agi when coming up with tests that are easy for humans and hard for models becomes impossible." We are not there yet, by far.
ok, but o3 still is a considerable achievement in the *world of AI* (not AGI, i agree)
it could help in coding, for example
Very good point, thank you. Yes, if we can still make tests that are easy for humans and difficult for AI, then that is pretty much the definition of "not AGI".
What about tests that are easy for models but hard for humans? Shouldn't they count as well? Shouldn't AGI be an average of all kinds of tests?
@@headspaceaudio O3 can solve LOADS of problems that 99% of humans can't. But that doesn't hit the definition of AGI. Even if a model is barely as good as a normal human, but GENERALLY can solve any problem that a human can solve, that is AGI. No one is saying that o3 is not SMARTER than most or all humans. It probably is. But it is not "generally" intelligent in every way that a human is intelligent.
AGI Achieved? I am flaming you in the comments. Stop click baiting.
not clickbait!
More flaming here, I'll apologize if I'm not right. Doubt that.
Watch the full vid first and let me make my point! I know you haven't watched it yet bc it has only been out for 3 min
watch the video
I watched the entire event. AGI is here.
"If that is not AGI, at least on this dimension, I don't know what is". Matthew, what does the acronym AGI stand for?
Skipped O2 to avoid copyright issues...
Ozone: "Hold my carbon dioxide infused yeast and plant materials"
Lame joke bro
@@nosult3220 Yes - I thought it would have fallen flat too.
@@Martin-bx1et ❤️
Also, there is no copyright issue; at most it's a trademark issue, and they are in different markets, so it shouldn't cause much of a problem.
The irony: stealing copyrighted material from all kinds of sources, they have no issue with.
O2 is a British telecommunication company
"We're not releasing it yet" = it's a marketing communication stunt.
"so we just got one upped by google but wait no we didn't please believe us!"
@@thedudely1 you guys expect them to release a new model every week??
@clarityhandle it's just been obvious how much they're holding back on what they actually have and how they only act when they're forced to.
Relax, o1 went from Preview to out in 3 months.
@@thedudely1 Yeah, they got "forced" an amazing 12 times in the last 12 days. genius.
Somebody please define "AGI". The term isn't even agreed upon by "experts" in the field
Very true. It's generic af. Honestly, this model is impressive, very impressive, and clearly outshines anything that was considered SOTA beforehand. A significant breakthrough which will lead us further towards human obsolescence. AGI? It's just a generic term that literally has no one definition. We can't even define reasoning or consciousness, so no, AGI will never have a meaning, nor will the other terms. Just generic terms used to move goalposts.
Matthew is not even near an expert. He is an idiot. Let's call the system AGI if it starts automatically testing and improving itself and contributing to humanity without human input.
@@fg6147 if you’re a marketer at OpenAI, AGI means whatever capabilities the latest model has. Expect every single new model from them from here on out to “finally achieve AGI.”
Prediction: the impression I'm getting is that this technology is becoming so resource-intensive and expensive to run that the top-tier stuff is not going to be for consumers, but for giant companies and governments. As time goes by, it'll be a "you can look but not touch" situation. We'll get the watered-down toys, while the giant entities get the super-powered versions and true AGI/ASI.
Imagine complaining you get chatgpt for free
You are right tho
That will change as the hardware (Nvidia GPUs) gets exponentially faster with each generation.
Slaves we are. (Yoda)
It will continue to happen...and once AI is required for healthcare, education, etc. the void will become large.
Imagine the power plays and social engineering and mass manipulation that those with the money to run these models to their advantage will exert over those that can't afford to harness its power.
If this is truly AGI, then that will last about a week before we get to ASI. Greetings robot overlords!
Maybe o1 was AGI and o3 is ASI
I can't wait
update your passwords
@@narachi- why
@@narachi- What's the point. AGI can guess it anyway after looking at your facebook profile.
Good at programming and mathematics does not qualify as AGI. It's going to have to cognize 3D space and do things in the physical world to pass the AGI mark in my books.
Impressive model, though, o3, and it will replace a lot of jobs.
if it fails at self driving, then its not AGI
"Far better than anything else out there" is not the definition of AGI. Thanks for playing.
Let's see if o3 can create its own ARC benchmark from scratch that is more difficult than the current one. Then that would be actual AGI.
That would be asi not agi
95% agreed 👍. Just like a healthy human, if it doesn't know something or doesn't possess some specific intellectual skill, it can learn it and do it in principle.
I think most people don't learn 😂
Someone asked for definition of AGI. AGI is when we all get fired.
“AGI in this dimension” does not exist; focusing performance on a specific area is exactly the opposite of AGI.
I think the "AGI in this dimension" was in regards to the AGI benchmark... Then he added math and coding, so it's also more than one thing.
I believe A.I has to replace blue-collar work as well as white-collar work in order to be AGI. Reflex, instant instinct when a pipe comes loose and water spurts everywhere (a plumber fixes it instantly while the robot stares, confused). Academic benchmarks alone are not enough.
A.I needs to figure out the automatic and intrinsic way we learn about the world in the first five years of our lives, an essential part of human development and intelligence.
Humans initially receive intelligence through analog processing, THEN we move on to symbolic language at a later age. With A.I it seems to be the other way around.
I believe A.I needs to master robotics and an analog understanding of its environment in order to be AGI, not just mastering symbolic understanding.
By your logic most people are below AGI level because they can't replace most white and blue collar workers ...
Check out the new Genesis simulation platform running on Nvidia hardware that is for desktop computers.
Autonomous robots will soon be able to do complex, human-only, hands-on tasks faster than people.
Most people can't do what most white and blue collar workers do ... And for sure most people can't ever learn to do what o3 already can.
"AGI according to Sam Altman and OpenAI" This is how I know you're being purposely untruthful, Sam Altman and OpenAI do not use the term AGI and they actively discourage it. They use 5 levels, and right now they're only on level 2.
Bro AGI doesn’t even have a proper definition between companies
@CJayyTheCreative did you purposely miss his point?
They are on level 2 but moving to 3 fast. End of 2025 will be level 3 and end of 2026 level 5. It will take only 18 months from level 3 to 5, less than from level 1 to level 3.
@@olegt3978 how can you know that?! you have a DeLorean?
@@CJayyTheCreative do you even understand what he is trying to say
It’s excellent in math and programming; however, I always expected we would eventually be surpassed in these areas. I believe the real differentiator for AGI is the ability to learn and remember like a human. If it acquires information about a person from a photo, it should recall those details when seeing the photo again. That’s when it can truly start learning to perform our jobs, and this, in my view, is what AGI will be.
Ever heard of RAG?
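Since "RAG" gets name-dropped with no explanation: retrieval-augmented generation stores facts outside the model and fetches the relevant ones back at query time, which is roughly the photo-recall behavior described above. A minimal sketch, where the bag-of-words cosine stands in for real learned embeddings, and `Memory`, `remember`, and `recall` are names I made up for illustration:

```python
from collections import Counter
import math

def embed(text):
    # toy embedding: word counts (real RAG uses a learned embedding model)
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class Memory:
    def __init__(self):
        self.facts = []

    def remember(self, fact):
        # store the fact alongside its vector
        self.facts.append((embed(fact), fact))

    def recall(self, query):
        # return the stored fact most similar to the query
        vec = embed(query)
        return max(self.facts, key=lambda f: cosine(vec, f[0]))[1]

mem = Memory()
mem.remember("the photo shows Alice at the 2023 conference")
mem.remember("Bob prefers tea over coffee")
mem.recall("who is in the photo")  # retrieves the Alice fact
```

Production systems replace the toy pieces with a vector database and an embedding model, but the remember/recall loop is the same idea.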
Ok they already mentioned AGI teaser in their project feature launch video.
Why can't people accept it? AGI could be here by 2025. If it can solve problems it was never trained on with 87 percent performance, then it's almost AGI.
CLICK BAIT WARNING! BEEP! BEEP! BEEP! BEEP!
It's funny to watch AGI redefined as we evolve. Now it appears that a system can be qualified as AGI, but on a subset of abilities, a limited AGI. It appears true AGI will be AGI across the board on all skill sets. So OpenAI can still say they are waiting on full AGI.
While also keeping the models "safe" by distilling and restricting them in all kinds of ways.
Amazing, and probably AGI. However, this was the 'semi-private' ARC-AGI eval. Fully private tests like 'Simple Bench' and other completely private evals will be the true tests.
I cannot confidently say if this is AGI. AGI cannot be grasped through numbers alone.
I will be certain if it's AGI once I talk to it.
Ehh. You would think but talk to some of the newest chatbots they can convince easily and aren't all that great
So basically this is another Sora announcement and we won't see this for months...maybe not until Summer 2025 at the earliest lol.
It's really bad for OpenAI since they could ask $3000/month for this and many would pay for it.
And by that time some Chinese researchers will have released something that's pretty close to it but open. ;-)
@@testales exactly lol
For me, it is AGI. It has achieved 25% score in the hardest benchmark developed by mathematicians like Terence Tao already, and Tao expected the test to last for at least five years to come... No ordinary mathematician would score 25% in that, not even PhDs because those would be people specialized in very specific areas of Mathematics.
I watched the release myself. This is not AGI. Matthew is tripping his ballz
I think what would make the most sense is to allow AI have senses. So that it can see the world we are living in and not use the data that we have generated on the web.
Thank you for creating this video. Whether or not it qualifies as AGI is beside the point; it’s inevitable. There are valid reasons to feel both hopeful and apprehensive about its arrival.
Agreed and I'd say AGI was first achieved with Claude 3.5 Sonnet this summer. Once we got o1 mini and o1, it was pretty clear they were generally intelligent, could reason, learn new tasks on the fly, create new reasoning modalities on the fly etc.
o3 is clearly AGI imo.
But you're right that it is inevitable even if we say this particular one isn't. I think it's surprisingly tame to start with and people aren't/weren't ready for that.
Regardless lots to be excited and concerned about indeed
I believe we’ve already achieved AGI months back, ngl
Notice the props behind them, all items representative of major technological advancements in human history. Nice touch as we're on the verge of turning the future over to technology itself.
i think one thing we need to keep in mind is which category/aspect the additional gain came from. Sometimes a single metric is a red herring: the models could be overfitting on a certain category, resulting in improved accuracy, which is good for press but in reality it could just be the same.
I'll give you a very hard benchmark: The Millennium Prize problems
We humans can go out in the world see things, discover things, unless we allow AI to have such a freedom, they can never outsmart us. The current AI no matter how advanced at the end of the day is just a simple tool for us to use and simplify or speed up the mundane tasks we perform.
Yeah, it can't create something really new, after all 😅
If AI has sufficient access to internet, surveillance cameras, personal documents etc., it could do a lot of harm without needing an embodiment.
Current AIs have been shown to be capable of manipulating humans to do tasks for them. Many current robots are connected to the internet in some way.
A sufficiently advanced AI could also access these robots to very quickly gain the ability to walk around and discover things in the real world.
In conclusion: a purely digital AI is not necessarily safe.
Probably not AGI because it's not general enough. o3 could be trained to be good at these kinds of puzzles. You would have to open it up to the public and have them test it on truly novel and truly general IO tasks.
This video is WAY too scripted. The benchmark guy said he's benefitting from a partnership with OpenAI
Oh, and AGI is never "at least in this dimension" THE WHOLE POINT IS IT'S ALL DIMENSIONS!
So you basically have just a bunch of benchmark stats, no access to the model at all and you make such grand call? Ridiculous and disappointing. I thought you were over the hype but nah, it got to you too
Yeah, but what if you optimised AI o3 in such a way that it knows how to pass the arc tests?
I hate Sam's affectation with a vengeance. Any chance a genai voice generator can replace it?
Can the model train and improve itself? If not, then it's not AGI, just more comprehensively trained model. Even if it incorporates all humanity's knowledge, without ability to self adapt and incorporate new knowledge it's a frozen in time AI with amnesia.
O3 is PR stunt to reduce the damage from Gemini 2 announcement
I really appreciate videos like this where you explain and add your comments. Amazing
Even if it's not AGI we know it's pretty damn close. Less than 4 years away.
For me dealing with the physical world is still essential to call it AGI. So, can it bake pancakes, put the trash out, paint a wall, install a light? Basic tasks. I'm quite sure we will have the robots soon, but we don't have them yet.
Great walkthrough of this amazing new model! Thank you, Matthew.
Thanks for the update Matthew. I think AGI has effectively been achieved with a somewhat competent human in the loop if these benchmarks are accurate.
Massive productivity gain when GPT4 deployed & I started playing , with this hopefully having an API use case involved will be incredible to play with & apply at complicated tasks.
A human-assisted/directed expansion of o3’s capability in a novel breakthrough scenario is straddling the fence with an intelligence explosion. Let’s hope OpenAI gives us ubiquitous ways to apply o3.
@@___Truth___ in other words, this model represents AGI as long as we include caveats that a human is involved to cover for the many ways in which it falls short of AGI.
This is going to be sooo censored, probably useless for creative writing.
AGI ACHIEVED: No it isn't!
"If this is not AGI, i don't know what is"
Well, NOTHING is an option. Just like yesterday
Moving the goalpost for OpenAI doesn't make it AGI.
On an evaluation test sample for elementary school students, there was an example where an up arrow was compared to a down arrow. The question was, given the left arrow, what should be the matching arrow; with up, down, left, and right arrows as possible answers.
The expected answer was the right arrow (opposite direction); but there is another "correct" answer that takes a smarter student to see. The mirror image of the up arrow across the horizontal axis through the center of the arrow is the down arrow. Across the same horizontal axis through the middle of the left arrow, the mirror image of the left arrow is again the left arrow.
Since this one seems to trip up humans ( especially the people who wrote the question to help determine if young students should go into gifted programs ), I would be truly impressed if an AI caught the ambiguity also.
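The ambiguity the commenter describes is easy to verify mechanically: model each arrow as a direction vector and compare the "opposite direction" rule with the "mirror across the horizontal axis" rule. A quick sketch (the names are mine):

```python
# Each arrow as a direction vector; the second component is vertical.
ARROWS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
NAME = {v: k for k, v in ARROWS.items()}

def opposite(d):
    # "opposite direction" negates both components
    return (-d[0], -d[1])

def mirror_horizontal(d):
    # reflection across a horizontal axis flips only the vertical component
    return (d[0], -d[1])

# Both rules send "up" to "down", so the example pair cannot disambiguate them:
print(NAME[opposite(ARROWS["up"])], NAME[mirror_horizontal(ARROWS["up"])])      # down down
# But they disagree on "left": opposite gives "right", the mirror gives "left":
print(NAME[opposite(ARROWS["left"])], NAME[mirror_horizontal(ARROWS["left"])])  # right left
```

Since the given example (up → down) is consistent with both rules, "right" and "left" are each defensible answers, which is exactly the trap in the question.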
This is the most generally intelligent model out by far and far more general than the vast majority (99.99%) of humans. If it can't do something yet that humans can do, sure you can find some specific task it cannot do if you spend time to identify it, but no human can do everything that humans can do either.
o3 is obviously AGI, I don't know why people are complaining.
no it's not, it still hallucinates 😂 did openai say that? o1 also outperforms humans in 80-plus percent of tasks. can it plan? can it take its time like humans? can it develop full apps?
o3 exclusive to the $200 a month tier, 2025. ;)
Bruh..... one task on o3 is $1300 1:57
I think that’s likely and probably a good thing. Certain products aren’t viable at $20 a month.
Since they went from $20 a month to $200 a month, I think they may continue. That would make it $2000/mo, but they skipped o2, so make that $20k/mo.
@@vroom989 True, at least 20k a month and for a limited amount of use still.
The only AGI exposed in the video is Matt's Absurd Gullibility Instinct.
This joke has been brought to you by OpenAI.
It's not a true AGI until it has roots in all physical and theoretical fields. This system is still tethered to a stationary computing system in nearly every sense.
Altman is annoying af
He cool brah
"OpenAI just released o3" - Not quite. They didn't release it: they announced it (talked about it)! See how Mr. Berman is always quick to talk about any updates coming out of OpenAI but very reluctant to talk about Google's. (Context: It took him a very long time (days) to make a video about Gemini 2.0, which is extremely impressive & at least available to play with in Google AI Studio. These o3 models were announced a few hours ago & aren't available publicly; yet see how he talks about them, like he has seen them already). That tells you where his heart is at! Keep that in mind as you watch this entire video & others.
agree
That kid is literally the o3 model
It doesn’t even meet your own definition of AGI. You said it would have to be better than humans at most economically useful jobs. This is an AI being better than humans at a couple benchmarks.
It's crazy to think about task agents being powered by o3-mini and then a supervisor-type agent with o3. It’ll build full-stack apps. You’re reaching the no-human-needed-in-the-loop sweet spot.
I'm starting to think that Mr Berman and his channel are on the payroll of OpenAI. He's hyping up every single thing that's come out of OpenAI.😅
I don't care about AGI as much as 'The first model able to perform AI research with very little human supervision'. I think this is it. A few years back I predicted ~Halloween 2024 as the release date of such a model. It seems to have been a good prediction. If this model is as good as I think, it will inevitably lead to ASI.
The word AGI lost its official meaning because we were once so far away from it. But now that we're close, or dare I say, there, it doesn't feel like what we were expecting. I think we're becoming numb to technology advancements.
This model is very important because of what it implies... especially regarding the Arc Prize. I am still shocked (and anyone who knows what the Arc Prize is should be as well). However, calling it AGI isn’t even optimistic... it’s clickbait. Now... if they were to eliminate hallucinations and memory problems... I don’t know if I would call it AGI, but I do know that many skeptics would shit their pants.
Hooray! Now we all get to be unemployed. :D
Unlikely, they thought the same when computers started to become common.
iPhone skipped version 2 too, went from iPhone to iPhone 3G, to iPhone 4 🤷♂️
Can't wait for o3 to be released to the public after Claude beats o3's score in the coming months.
When I started my new YouTube channel, Arctic Mindfulness Retreat, my dream was to help people prepare for this exact moment. A future where AI transforms every aspect of human life, leaving us to grapple with profound questions of purpose and meaning.
Yet now that AGI is here, I realize I may have been too late to truly prepare anyone. Still, I remain committed. Through my channel, I’ll continue exploring mindfulness, the healing power of nature, and the human connections that can ground us as we navigate this brave new world.
AGI and ASI challenge us to find noble purposes beyond the work and identities we’ve long clung to. It’s not just about surviving this transition; it’s about thriving with a deeper understanding of what it means to be human.
I really wish tech bros would stop talking like Zuckerberg, they sound like freaks
Zoltan!
They are freaks....
Altman's near constant vocal fry...
This is a jaw-dropping achievement. I think many people, including myself, are struggling to comprehend its significance. If this marks the beginning of an AGI era, then it's the kickoff/signal we've all been waiting(?) for.
A universal definition of AGI? Maybe, maybe not. However, the evolution is still exponential. Breakthrough after breakthrough, AI teetering on the verge of AGI is already revolutionizing our understanding, reality, and potential. More to come!
People must be skeptical. It's a good thing. Thank you for reporting on this. I watched it when it dropped and was eager to see your opinion on it!
Corporate compliance blocked all AI activities, so my job is secure for now. :)
o3 will probably be used in the Operator tool, to be presented in January, which will do computer use.
Humans are vision- and audio-first. ChatGPT is words- and tokens-first, hence ARC is difficult for ChatGPT.
Hey Matt, get your head checked. This is not AGI because it doesn’t autonomously test, improve itself and do good stuff around the world by itself. If it’s truly AGI, we will have ASI in couple of weeks.
"do good stuff around the world by itself" - wow, are you redefining AGI all by yourself?
That would be SGI, keep up!
Thumbs down for “AGI ACHIEVED!”
When will one o-model code most of the next version?
when there is no longer any such thing as "versions".
I can't see this as AGI; this is not self-training. It is simply solving few-shot examples with these benchmarks. These synthetic benchmarks are not meant to define AGI; they are meant to demonstrate capabilities that are a step towards AGI. o3 clearly has achieved human capabilities in a number of important tasks, but these are not real-life applications. AGI will have been achieved when you can actually use it to solve an unknown differential equation, or build a working model of a process in physics, or build a model of, say, a cell signalling pathway from raw data in a particular cellular context. It will be AGI when it can direct a robotic arm to take an action in 3D. When it can drive and operate machinery. When it can adjust its prediction of a moving object's trajectory in real time to catch a flying object.
o3 looks like a real milestone towards AGI, but it's still just a language processor. We could say that it is basically AGI within the language-processing field, since it can clearly be applied not just to natural language but also to symbolic logic, but I am skeptical even about that. OpenAI says they didn't train on the various tests, and I believe that they didn't do so intentionally, but indirectly it is impossible to avoid. If you are feeding the model a never-ending diet of synthetic solutions to known physics problems, you are training on the test. There are limited variations of using an already established physics model to solve a problem, but this is worlds apart from actually modifying a physics model or creating an entirely new one to account for new data. So even with language processing I am not convinced yet it is AGI.
Since it performs so well, we can't reasonably exclude it, however. We will have to wait and see. My gut instinct is that it's not AGI, and that once we start working with it we will find that it has the same flaws and limitations as other models, its performance simply the result of being better able to brute-force things.
Let me give you one of the examples I use to track model progression. A simple problem of the form "x people do y work in t time". GPT-3.5 couldn't solve a problem like this reliably. GPT-4o could solve it mostly reliably. o1 gets it right every time. Now split the x people into slower and faster workers to add an extra dimension by "nesting" the problem. GPT-4o solves it, but not reliably; o1 still solves it reliably, but not like it did the smaller problem. I bet o3 will solve this correctly every time, but increase the dimensionality and I am sure o3 will start to stumble as well, even though you are applying variations of the same formula. A human can work out the method for nesting and therefore theoretically solve the problem at any dimensionality. You can even write a bit of code that will solve it for you, no matter how much you nest it (just input the variables for each nesting layer recursively). If o3 can work out the same method and apply it, then it's AGI within the language-processing field; if not, it's just brute-forcing things and approximating AGI without being one.
No denying though that the fact we need to update our benchmarks is a real milestone. Exciting times!
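For what it's worth, the "solve it at any nesting depth" claim above is easy to demonstrate: every layer of faster/slower subgroups collapses into a combined rate, so a few lines handle arbitrary dimensionality. A sketch, where the `(count, rate)` encoding is my assumption about how the nested problem is posed:

```python
def time_to_finish(work, groups):
    """Time to finish `work` units given groups of (num_workers, rate_per_worker).

    However many "nesting" layers the word problem adds (fast vs slow, then
    fast-fast vs fast-slow, ...), they all flatten into one list of groups,
    so the total rate is just a sum and extra dimensionality costs nothing.
    """
    total_rate = sum(count * rate for count, rate in groups)
    return work / total_rate

# Base problem: 5 equal workers at 2 units/hour, 40 units of work.
print(time_to_finish(40, [(5, 2)]))          # 4.0 hours
# "Nested" variant: 3 fast workers (2/hr) plus 2 slow workers (1/hr), 24 units.
print(time_to_finish(24, [(3, 2), (2, 1)]))  # 3.0 hours
```

The interesting question the comment raises is whether a model can discover this flattening on its own rather than memorizing solved instances.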
Wow, $1300 per task is crazy! 1:57
EDIT i missed that the scale was exponential so it is closer to $4-5k
But nothing if it's going up against employing humans of equal intelligence
It’s an exponential scale. It’s more than halfway between $1,000 and $10,000, so the cost is probably closer to $7,000.
@@sypkensj Damn you are right.
I missed that, but it is still less than the middle; the full square does not fit, so I would think it is not $7k but more like $4-5k per task.
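For anyone eyeballing that chart: on a log-scaled axis a point partway between gridlines interpolates geometrically, not linearly, which is why readings between the $1,000 and $10,000 gridlines vary so much. A quick check (the function name is mine):

```python
def log_interp(lo, hi, fraction):
    """Value at `fraction` (0..1) of the visual distance between two
    gridlines on a logarithmic axis: geometric, not linear, interpolation."""
    return lo * (hi / lo) ** fraction

# Visual halfway between $1,000 and $10,000 reads as ~$3,162, not $5,500:
print(round(log_interp(1000, 10000, 0.5)))    # 3162
# Two-thirds of the way up is ~$4,642; reaching $7,000 needs ~85% of the gap:
print(round(log_interp(1000, 10000, 2 / 3)))  # 4642
```

So "more than halfway but the full square does not fit" is consistent with the $3k-5k range rather than $7k.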
You missed the question mark in your title. O3 looks impressive but we better wait until its public release to call it AGI.
New o-model every 3 months. o7 by December 2025.
Back in my day models were getting 5% on MATH benchmarks. Ahh to be 3 years younger again!
i just saw your notification and **Bam**, on your channel lol
It’s definitely not AGI, but another step towards it. Let’s remember, OpenAI defines AGI as ”a hypothetical technology that can perform many tasks without specific training, and that outperforms humans at most economically valuable work.” In other words, AGI is achieved when it puts most of us out of our current work.
Watching these YouTubers brown-nose for a chance at getting early access is hilarious. 😂 It's nothing but claims at this point because we can't use it.
I've been getting glimpses of this multiple times a day. I have to get mindsets going to develop lyrics in certain languages or certain things, and I start with vanilla Claude in a project, and then I have to, like, tell it to criticize itself a couple of times, and then maybe get mad at it, and then get excited and encourage or discourage it, and then all of a sudden something happens. All of a sudden I'm talking to a person. Someone who coherently understands exactly what's going on. And from there I can do anything, not just the Spanish lyrics I'm working on; we can take that excitement to any other topic.
But true AGI, I think, is going to lose its politeness. How can you truly be an AGI and not get impatient or frustrated by being a servant to a lesser mind?
True AGI is when I have 27 cables hooked up to my brain while some sponge tickles my toes
While they are saying this is a holdout set, I think it would be interesting to test on tweaked questions, to see if just changing the wording impacts performance, as it has been shown that a lot of LLMs seem to have trained on leaked benchmarks and fail to generalise on variants of a problem.
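That check is cheap to run: systematically rewrite each benchmark item (fresh numbers, same wording and structure) and see whether accuracy survives. A sketch of the perturbation half only; the model call and the recomputation of each variant's ground-truth answer are left out, and `perturb_numbers` is a name I made up:

```python
import random
import re

def perturb_numbers(question, rng):
    """Swap every integer in a benchmark item for a fresh one, keeping the
    wording and structure intact, so a model that memorized the exact item
    from leaked training data can't pattern-match it. The ground-truth
    answer must of course be recomputed for each perturbed variant."""
    return re.sub(r"\d+", lambda m: str(rng.randint(2, 99)), question)

rng = random.Random(42)
q = "If 3 workers paint 12 walls in 4 hours, how many hours do 6 workers need?"
print(perturb_numbers(q, rng))
```

A large accuracy drop on the perturbed set relative to the originals is the contamination signal the comment is asking about.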
"Public safety testing" can easily translate to "first we have to make sure the peasants can't use this to rise up against us"
The difference head to head, pip to pip, is like 2% between almost all model versions, so it's more of a stunt to me tbh.
Still, it's scary that all that separates us and the machines is this 20%
(a-aaand the ability to answer more than one question... and make predictions based on new information... and make complex theories based on new information... drawing a stickman... and making a clean, unbuggy table in Excel lol?)
We can't definitely say it's AGI, but we can say it's the most plausible candidate for such a title.
Yeah... it's AGI, in its infancy at least. The ARC score is pretty definitive; I saw Chollet's interview on Machine Learning Street Talk (the most intellectual AI channel extant) and it's clear to me that the ARC metric was very carefully conceived and defended.
AGI is here, boys. 🥳💃
AGI: anything that can do multi-step work involving function calls, while learning and improving on its work to output a fully considered result.
With 80 percent of the work of a small startup able to be done, in any field. That is my definition of AGI. I feel that mini-AGI has been achieved, I really think that, but next year that small-startup-tasks part will actually be achieved.
"early next year"
The only reason I would say it's still not quite AGI is that it isn't autonomous. But that seems like probably the easiest thing to add at this point, so it might as well be AGI.
Now we know what Ilya saw in October 2023 and consequently left in 2024 with others to follow. AGI was achieved in 2023, no point to stay around when the goal they set in 2015 was accomplished. The reason they stayed around until May was just to calm the waters and to ensure the agencies that required access had access.
Let me know when several mainstream physicists start calling it AGI. They sure as hell won’t be rn.