Sam Altman Teases Orion (GPT-5) 🍓 o1 tests at 120 IQ 🍓 1 year of PHD work done in 1 hour...
- Published: 26 Sep 2024
- The latest AI News. Learn about LLMs, Gen AI and get ready for the rollout of AGI. Wes Roth covers the latest happenings in the world of OpenAI, Google, Anthropic, NVIDIA and Open Source AI.
My Links 🔗
➡️ Subscribe: / @wesroth
➡️ Twitter: x.com/WesRothM...
➡️ AI Newsletter: natural20.beeh...
#ai #openai #llm
OpenAI Shows ‘Strawberry’ AI to the Feds and Uses It to Develop ‘Orion’
www.theinforma...
DrJimFan/
x.com/DrJimFan...
Dr. Kyle Kabasares
• ChatGPT o1 preview + m...
x.com/AstronoM...
Black Hole Mass Measurements of Radio Galaxie
iopscience.iop...
Apollo Research
@apolloaisafety
x.com/apolloai...
Scaling: The State of Play in AI
www.oneusefult...
Not my PhD work, but I used GPT 4.0 and o1 to help me build software tools for automating measurements in lithium-ion CT scans. I couldn't have built this tool alone without GPT, and I did it in a week. This one tool has helped me bring in ~350k in customer work. Oddly, I also used GPT 4.0 to rebuild one of the tools I built in my PhD. The rebuild took about a week; originally, I spent about a year. The tool models the processing and design of lithium-ion electrodes and cells. It's pretty crazy that for $20/month, I feel like I have a small team of programmers working for me.
Absolutely. People that don't adapt are going to be left behind in the new AI economy. And we are 2 Trillion in debt in the US, approaching Stagflation. We live in interesting times.
I hope you’re telling the truth and not being a bot because this sounds phenomenal. If you’re a real person: Godspeed!
@@theWACKIIRAQI I think your comment is very telling of the times. These systems have gotten SO GOOD at what they do, it's becoming a legitimate question to ask. I don't know how to feel about that.
While I think that LLMs can be of great assistance (heck, ML has been my PhD topic and I'm currently working in the field), you really have to account for the risk that comes with these models, especially when people who have zero coding knowledge start using them and don't double-check what they produce. Not only can they be horribly wrong in some cases (and often hide it when you test it on samples), but they also just don't work very well in general if the person writing the prompt doesn't already understand the basics of programming.
I've seen some code from interns at my company that's clearly been AI generated (the comments and structure give it away very easily) and it's a million times worse than anything I've seen before.
For example, when you tell an AI to call function A to produce result B and your data clearly doesn't fit the method signature, an AI will typically just make the data fit somehow, whereas somebody writing the code manually would typically double-check whether they're not calling the wrong function.
I'm not saying that's the LLMs' fault, but people are clearly overselling LLMs for coding ("small team of programmers") and others (especially those new to coding) who buy into that might bear the consequences.
@@theWACKIIRAQI obvious bot
The example of a scientist whose expertise is NOT programming but who needs to write some code for their research is a really, really common situation. This technology will speed up progress in quantum chemistry and thereby new materials, possibilities in biology and medicine, and many, many more. I am so excited to NOT have to spend 90% of my time coding various equations/models/methods and to focus on the physics instead! Even though GPT-3.5 was already giving a speed-up of like 10-30%, this is a game changer.
Yup, imagine how much slower construction would be if bricklayers had to mine and refine the cement for mortar and bake the clay themselves.
Too much focus and attention on the model's intelligence, not enough on the effect and impact of 4 billion people with under-100 IQ having access to 120+ IQ intelligence on command.
@@Max-hj6nq Interestingly, a lot of those people with IQ under 100, manage to succeed in life by learning heuristics, using tools, or by having been born to a wealthy family or been in the right place at the right time. Let's not talk about people with low IQ in a demeaning sense. I know you are not. But I am just reminding you that you are doing a good job at being a good person. I salute you.
Software engineering isn't a prerequisite for physical science graduate programs. Maybe it should be, but it doesn't seem fair to expect that capability without asking for it.
Couldn't agree more
If you ask Strawberry a question, let it respond, then ask it again to 'think hard, revisit its answer, and fortify it', the second iteration of your answer is actually extraordinary.
Now put that in the custom instruction. To save you all a response: it refers to itself as ‘GPT-4’
I love these little hacks. They are so "dumb" and "empirical" but sometimes they work. It shows the complexity and sensitivity of these systems (not necessarily "brittleness"). They are sensitive, despite showing no excess emotions.
@@FamilyRUclipsTV-x6d you want even better? Write the system prompt in Chinese and, inside it, instruct the model to respond only in English to retain performance. You get 600% more room in the system prompt. I wrote a 4,000-character comprehensive directive on how to 'think', use memories, and everything.
Or just skip the whole demagoguery and use 4o.
@@serg331 i didn't get your technique
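The "revisit and fortify" trick described in this thread can be scripted as a small refinement loop. This is a sketch, not an official API pattern; `ask` is a hypothetical stand-in for whatever chat-completion call you use:

```python
# Sketch of the two-pass "revise your answer" prompting trick.
# `ask` is a hypothetical callable: it takes a message history and
# returns the model's reply as a string.

REVISE_PROMPT = (
    "Think hard, revisit your previous answer, and fortify it. "
    "Fix any mistakes and make the reasoning explicit."
)

def refine(ask, question: str, passes: int = 1) -> str:
    """Ask once, then feed the answer back with a revision prompt."""
    history = [{"role": "user", "content": question}]
    answer = ask(history)
    for _ in range(passes):
        history.append({"role": "assistant", "content": answer})
        history.append({"role": "user", "content": REVISE_PROMPT})
        answer = ask(history)
    return answer
```

With the OpenAI Python client, `ask` could be something like `lambda msgs: client.chat.completions.create(model="o1-preview", messages=msgs).choices[0].message.content`.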
The Kessel run requires multiple hyperspace jumps. The ability to complete the run is the ability to minimize the amount of hyperspace travelled. The Kessel run is a puzzle, not a race.
Nerd… jk
Thank you, I was about to mention the same thing. It's more like sailboating than driving.
Yes he went a shorter distance by taking shortcuts
@@therainman7777 Was about to comment the same :-) But in the times we live in now, Nerd is actually a compliment. Times have changed to the better 🙂
Was thinking of the same idea
First it did the homework of highschoolers. Now it does the homework of PhD students. Soon it will just do everyone's computer work.
why stop at computer work? EU getting self driving cars next year and china is working on humanoid droids for their army
Bro literally. I remember people were complaining about high school homework being done by these models and in a virtual blink of an eye, we’re here now. It’s really scary good :)
Soon we can all be stupid together
@@snafu5563 Certainly seems to be going that way, eh. 😅
@@snafu5563 no, different architectures of intelligence excel at different paradigms and modalities, we'll eventually merge into the hive-minded hyperintelligence
Meanwhile we still don't have access to advanced voice mode...
Who cares ? I prefer typing and getting my answers. As a researcher, chatgpt and other llms have made my work super easy . I finish my work in half the time, my quality has exponentially increased and made my work life balance amazing.
Like who really cares about voice mode ? Not me
You probably won't for a while, it proved to be too controversial for the people. You definitely don't want to sound like anyone yet want to sound personable but actually if you sound personable you get attributed to unnerving things or "her".
i wonder which of all the fancy models really is available with all features shown? it feels like everyone is just showing previews. think sora
@@nusu5331 almost everything is available, if you really want to use it.
"We got GPT 6 before advanced voice mode"
This video is going in the right direction: not only the latest news that already happened, but also a synthesis of the direction where we are going with the research
It's easy to forget where we came from and how quickly.
I've been following neural network / deep learning research since the mid 90s.
I gave a presentation to a small group a few years ago about the potential of these techniques and how we were likely to see more rapid advances. At that time, the people there were amazed by things like a model that learned to recognise hand drawn numerical digits then could be "run backwards" to generate new (low resolution) hand drawn digits
How quickly the goalposts have shifted: "it only recreated the code of a PhD project from the methods section of a paper, it didn't actually do all of the PhD research"
This was funny, thanks for sharing! I agree, but this is happening not because of shifting goalposts, but more out of some sort of self-defense - it's psychological.
We love moving the goal post so much :)
ASI: LOOK AT ME, PUNY HUMAN, IM A FREAKIN GOD!
Human: yeah but you can’t play aquatic guitar so…no.
@@dmon1088bingo! That’s it. It’s a defense mechanism. The human is freaking out and trying to convince himself and others he’s still relevant.
people don't like change, almost everyone wants to stick to the ideas of the past and refuses to believe big changes can and will occur. why do you think every single generation says the past was better and "the good old days"
Tbf knowing plenty of PhDs, "it took me a year" means they spent 5 minutes on it per week, threw away or forgot their progress 6 times, and started from scratch 4 days before the due date. Still, that's probably 4-8 hours of someone who's not a professional student done in a minute - very impressive
But if it is that common for PhD students to take that long for such a task, o1 still can do their 1-year work in an hour. Even if it's just because it stays focussed on the task.
I'm pretty sure he is including research and writing time into that coding time. No way he had all the math and writing sorted out and then it took him a whole year to write the code.
Yeah, that's likely, though I think it would have to be a highly math literate coder, and not the average code monkey. But we're looking at efficiency gains for researchers that struggle to code, and paper quality gains for those that would have just skipped some tricky number crunching or visualisation.
I'm not a PHD but that's been my experience with most intellectual and creative pursuits. 95% thinking and 5% output.
@@missoats8731 The problem is that it's misleading because o1 isn't doing the 99% that they're actually spending their time on. "Not staying focused on" doesn't mean you're jerking off because you can't control yourself, it just means you're not directly writing code.
OP says "1 year of PHD work", not "1 year of 1% of a PHD's work". Also, the code was already published on Github before the model's cutoff so it literally had the solution in its training data. The fact that it was able to reproduce it in a more compact form isn't exactly surprising given that the PHD here isn't a programmer.
thanks Wes, you are a legend for your coverage
While a parsec is scientifically a unit of distance, not time, it can be indirectly related to time in certain contexts, especially in storytelling or when discussing speed and efficiency.
the Kessel Run:
In the "Star Wars" universe, the Kessel Run is a smuggling route that passes near a cluster of black holes known as the Maw. Most pilots take a longer, safer route to avoid these hazards, resulting in a journey of more than 18 parsecs.
By navigating closer to the black holes, Han managed to shorten the distance to less than 12 parsecs. This not only demonstrates his daring and piloting skills but also implies that he completed the run in less time than others, since a shorter distance typically means a quicker journey, assuming speed remains constant.
Speed through Distance:
Sometimes, people use distance measurements to imply speed or efficiency. Saying "I crossed the desert in 200 miles" might imply you took a shortcut or a more direct route, suggesting a faster trip.
By highlighting the shorter distance, Han is effectively boasting about the Millennium Falcon's speed and his ability to handle dangerous shortcuts, which would reduce travel time.
Never cared about Star Wars. But as a Warhammer fan, I can appreciate a Lore master
If 1 parsec is the distance light travels in 3.26 years (i.e. 3.26 light-years), then time is embedded in that. It's 3.26 years of time.
"It sounded spacey."
space = time
Exactly! As most Star Wars fans would tell you, the Kessel Run isn't just about the distance. It's like a super-dangerous obstacle course in space, with asteroids flying everywhere and gravity going wonky. Han Solo had to make multiple jumps through hyperspace, which is basically like taking a shortcut through a cosmic maze. And he did it all in only 12 parsecs! As a Trekkie myself, I've gotta admit, that's pretty impressive. -5 points for Wes for talking smack about Han ✴
About the Star Wars / Kessel Run reference: It actually makes sense to measure this in terms of Parsecs for speeds higher than c. The reason comes from Special Relativity and the concept of a Lorentz invariant 4-vector as a representation of the relationship between 2 events. (Such as the start and end of the run).
If v>c, this 4-vector becomes "space-like". This vector becomes imaginary if rotated into the reference frame where the traveller is at rest, but becomes real if rotated into the reference frame where the time component is zero.
So for v>c, higher travel speed leads to a reduction in this invariant "distance" rather than an invariant "time".
Meaning minimizing number of parsecs makes perfect sense.
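The argument in this comment can be written out explicitly (a sketch in standard special-relativity notation). The Lorentz-invariant interval between the start and end of the run is

```latex
s^2 = c^2\,\Delta t^2 - \Delta x^2
```

For ordinary travel ($\Delta x < c\,\Delta t$, i.e. $v < c$) the interval is time-like, $s^2 > 0$, and the invariant is a proper time $\tau = s/c$. For $v > c$ it is space-like, $s^2 < 0$, and the invariant becomes a proper distance $\sigma = \sqrt{\Delta x^2 - c^2\,\Delta t^2}$, measured in units of length such as parsecs. On this reading a faster run yields a smaller $\sigma$, so "under 12 parsecs" would indeed be a boast about speed.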
@06:45 Most likely it had already been fed the code the guy had on GitHub
1:11 from my college years as a Physics student I can tell this is not a bad thing but a great one. I wrote my own simulation code, having an AI doing it would mean I could focus on the actual research (which had nothing to do with coding)
I agree.
this!
At the 5 min mark, I was really hoping it would tell him it was "reticulating splines" ... I hope at least someone gets that reference.
🏢🏠🏠🏠🏠⛪️🏯🏥🏦🏭🏢🏢🏢🏢🏢🏢🏢🏢🏢🏨🏟️🛖💒🏬🏬🏬🏬🏬🏬🏤🏤🏚️🏚️🏚️🏚️
Maybe it could write the NASA guy's code because it was trained on the entire GitHub contents? I believe there might be data contamination going on here.
I don't know if it's in the editing/paraphrasing, but Kabasares said the code 'ran', not that it produced a correct output. Quickly skimming up and down a page and going 'oh, that looks kind of OK' isn't a QA check. It also apparently turned his 1,000 lines of code into fewer than 200. Great optimisation if it actually 'works', but otherwise it might as well be 'hello world'.
There is always someone who has something to say about Ai.
Always a goal post to shift.
Right after he said it ran he said "that's literally what my code does", I take that to mean the output was the same as his code.
It also didn't "turn" 1000 lines into 200 because it didn't have the code as reference, it implemented the described algorithm. I feel like that distinction is important
@@Ivan.Wright Yes, 'run'. In the original video he also prompts it to correct errors 6 times, and says that he doesn't have test data and it should generate some itself. So whilst the code runs, he hasn't validated the result. 'Turns' wasn't meant to suggest he input the original source, only that the 'solution' had significantly fewer lines. Also, his code had extensive comments and explanations, so the actual lines of code may have been much closer. But his code is in a public GitHub repo, so it's possible the model directly referenced it in coming up with the solution, and the solution may not be 'novel' based on the described methodology. It's this sort of lack of scrutiny that gives AI, and AI commentators, a bad name.
This is disgusting sensationalism.
1. The repo was public for over a year and was likely in the training data.
2. The PHD student didn't even verify the result, he was just immediately shocked it even ran.
3. There's no way the code took a whole year to write. Usually coding is the easiest part, it's merely translating the methods section into logic.
I think there's just a massive misunderstanding about 1 year vs 1 hour. He literally gave the model the methods section, which takes the majority of the time to figure out. It's basically like providing someone with detailed instructions to make something. What that test did, I think, is test the model's ability to turn advanced instructions into code that works. Still, there is a massive difference between DOING A YEAR'S WORTH OF WORK and writing a PhD's code. I feel like I need to put this out there since I see a lot of videos abusing this headline. I'm a PhD myself, btw.
Not only that, but he said his code had been on GitHub for a year. If o1 can do an internet search (RAG), which I believe it can, then it may have found his code and recited it.
I have a computer science degree and I still struggle to map white papers to Python. I eventually get it, but I have so much more to work on in a project other than software. This model has saved me so much time getting my bipedal robot from sim2real. We only have so much time on earth, and now I can spend more time with my family rather than debugging Python and torch.
Not only that he put his code on Github!
@@made4 The models are impressive, and extremely useful, I use them myself on daily basis and as you say, they save massive amounts of time... My point was though, that there is a huge difference between actually doing the research, writing the code, then writing the method section that describes what you have done in the code and taking the method section and just converting it to code. It is not "1 year of PHD work done in 1 hour" as the title suggests. It might have taken this person a year to write the code for this, but i am convinced that they did not have access to the method section (otherwise, where's the contribution?).
@@basketballmylove Not necessarily, he may have already had his method down. He did after all have 3-4 years to do his PhD, longer if doing it part-time. That one year may truly have been spent on doing the code alone.
AI will never be good enough for some humans. I don’t see why AI should be honest with humans when we humans are not honest with ourselves. Thanks Wes. 🤖🖖🤖👍
No, a lot of people just hate it on the premise that it's AI. It's akin to telling someone about a Spider-Man dream you had or something. None of it matters anymore because you didn't work to achieve anything, whether it be an art piece or a relationship.
Without the work involved the result is meaningless to a lot of down to earth people. And while it can take a lot of work to get these things working offline. Nobody cares what the computer can do anymore. They care what humans contribute. The computer can do anything. so its not fancy anymore. Its not a performance to them. Its just a copy box.
That guy is going to find out that he has tons of small bugs he won't know about until he goes through the data by hand and compares it to the pre-GPT code. I know from experience, doing this every day.
Dave do you sleep 😂
Survey says…nope
That's it for today, I'll see you all in (cut)
Dave cloned himself a long time ago. 😄
The Star Wars parsec explanation is that Han Solo was going through a portion of space with lots of meteors or whatnot. Going fast there is incredibly dangerous, so people would tend to take longer routes to get around it quickly but safely. So Han Solo essentially took a super dangerous route that nobody would dare to, covering a shorter distance and therefore finishing quicker overall.
The kessel run involves going around a black hole. The faster you can go, the closer you can get to the black hole without getting sucked in, and the shorter your actual path can be, as the route requires multiple jumps.
Every time I check on the ai news, it has advanced several years. This is insane.
6:44 this here shows what science will look like from now on. Coming up with the models is the hard part, which requires knowledge, intuition, and making decisions. Writing the code can take months and is just a tedious chore. Now I want to go back and finish my PhD.
Once it gets past three sigmas I'm in trouble or having the time of my life.
The Kessel Run is like this. You can do it and take a known safe path, but it's a much longer distance. Or, you can do the Kessel run like the Dukes of Hazzard, and instead of taking the nice, safe winding roads, you just jump the General Lee across every river and gorge in your way. Han and Chewie are the Duke Boys of Star Wars.
Best explanation...now I want to delete mine. But I won't.
Hey Wes, time to get nerdy. @3:05 Star Wars retrofitted an explanation for 12 parsecs. The Kessel run is through a region where space itself twists, compresses and stretches. Some get through it as a short distance and others as a longer distance depending on how they navigate through it.
Hmm that actually kinda makes sense lol
TLDR: "Hyperspace" is essentially a fixed speed. So, plotting the shortest route is the game.
The Kessel Run is a hyperspace route that smugglers use to transport spice from the planet Kessel to a location south of the Si'Klaata Cluster. The route passes through the Akkadese Maelstrom, which is a region of space that contains a cluster of black holes known as "The Maw".
To shorten the distance traveled, some pilots would fly close to the edges of the black holes, which could be dangerous. For example, Han Solo, piloting the Millennium Falcon, made the Kessel Run in just over 12 parsecs, which was a record at the time. Some speculate that Solo may have flown between two black holes in the Maw Cluster. Others believe that he may have taken an ancient purrgil migration route.
Anyone remember Zuck @ Dwarkesh podcast? He said that in the future, there will be some balance between pretraining and inference. Nvidia is good for pretraining. Groq is insanely fast at inference. I wonder if we will see more dedicated hardware for inference deployed, now that OpenAI showed that long inference time is paying off bigly.
Many people have told me that I do the best inference. I am the fastest inferencers there are. I have been told that bigly
@@davidantill6949 Lol funny trolls finally
@@davidantill6949 true!
Always appreciate your breakdowns Wes! Thank you sir!
Actually, watch Solo: A Star Wars Story and it’ll make sense.
I'm an electrical engineer working on an AI application, and one very recent challenge is considering how to "neuter" the added inference of a model like o1 so I can keep the response within the bounds of my application and not wasting compute / money. I think we'll see a bifurcation of SW3.0 into two branches, one where coders continue to wield foundational LLMs in their graphRAG apps, and another branch where no-coders use o1+ for direct, raw API responses.
Don’t get me wrong, o1 is incredibly impressive. But I highly doubt it was really a year’s worth of _coding_ that it managed to replicate. The PhD said it was only about 1,000 lines of code. Any decent dev can write 1,000 lines of code in a couple days if they know what they need to build. Actually, most devs could write that much in a single day. The problem is figuring out what you need to write, and I suspect this PhD spent most of his labor that year in figuring out how to solve his problem, and only a small fraction actually writing the code. Then, when he tested o1, he gave it the full method section which contained all the details of how to solve the problem that he worked out over the course of that year. So the model kind of got to “cheat,” quite a bit. Just my opinion.
He did publish his code to GitHub. Guess what: it took it from GitHub and handed it to him. Of course, it's AI, bla bla bla, it can't do that. But really?
Exactly. People forget he had already completely written out the methodology, and iterated a few times with the AI for it to make it. Somebody still needs to do that, the THINKING part, you know. Also, it needs to be verified.
I think you’re reading too much into it.
@@SirHargreeves How am I reading too much into it? It’s a straightforward analysis of the situation.
Even with the full method section, there are very few humans with both the physics expertise and technical python coding skills that could write this code. And none of them could write it this quickly.
Much better thumbnail than you had earlier today. You looked so red... like Arnold in Total Recall where his eyes bulge.
You may want to look up that "less than 12 parsecs" quote, because he's not making it up. There's a reason he was able to get there having travelled less distance, and it does speak to the speed capabilities of the Millennium Falcon.
From what I've seen 'we' will hit around 87% of the capability of a human, then flatten out. An amazing feat :) truly. I'm giddy with anticipation. It's not all hype, just most of it. ;)
The model probably trained on his paper!
I believe it was!
that's not how this works
@@sgttomas his repository is over a year old and public on GitHub. That's… *exactly* how this works.
@@sgttomas It kinda is, the models are trained on data from all over the internet, and research papers with their code (which he suggested he published twice) would be the first thing they'd scrape. Not only do they have a high information density, but they're already indexed and easily accessible.
@@pmHidden indeed, all those papers combined provide effective training data for high level reasoning. all those papers.
You continue to Rule! Thank You!
That poor PhD guy wishes he had o1 a year ago to write the code for him, not realizing his PhD-level expertise won't be needed soon.
True 😂
I disagree, you would not be able to write his thesis without his knowledge. You need knowledge to ask the right questions.
And especially, you wouldn't be able to verify the result.
He gets to now focus on thinking about problems and how to solve them instead of the grunt work tooling.
How is it needed now lol
🤯 *Astounding!* 6-7 tries and 20% of the length. 'Amazing' isn't even a good enough word. And that's just `o1-preview`, not even o1, and not even a fine-tuned o1. _WOW!_
The devil is in the details: the paper was published 2 years ago, so it's in the model's training data!
Exactly, it basically handed the code to him. This is sooo stupid.
Models struggle to accurately recall details from 1,000,000 tokens, but you think this just found the answer in the sea of all available data? Second, the o1 code was about 200 lines while the original code was over 1,000. The devil is indeed in the details.
@@sgttomas he prompted the model in the correct direction to get that code; in this case he eventually got his own code on GitHub back. The point is that, had it not been for his research work, the model would never have aligned and given him the results. Models are good aggregators, and this model's performance is the best so far, but ideas are original. Intelligence is a complex concept, human intelligence even more so.
@@devin12428 sure, the surprising thing isn't that o1 knows black hole astrophysics. the authors possessed that knowledge. interpreting the instructions to produce good results is also in part due to the clear instructions from the author.
not having to spend months learning and writing bad code to replicate the ideas in his mind? that's the key. ....claude may have done even better.
@@sgttomas It's 99% his work plus some statistical operation stitching text together to produce something. It gave an error at first, then he re-prompted and it was correct. If this is not proof that this is the case, then what is? It's so stupid that people think this is the second coming of Jesus. I guess people perceive it this way just because they can't label it correctly. It's just a clever scam: if it were called 'a statistical model based on linear algebra that can generate text', it would have been ignored. They called it artificial intelligence.
Great direction, separating knowledge from reasoning. A lot of my usage calls for very broad, detailed knowledge vs reasoning. OTOH, when I need strong reasoning I usually _really_ need it, but across a very small set of data.
I could see the knowledge part being divided into a large number of domains, with LLMs separately trained on each of them and an initial parser/supervisor just deciding which one to route the query to. This could reduce both parameter count and pre-training time while improving the amount and granularity of captured knowledge, with lower costs at inference time due to smaller model sizes.
Likewise, pre-parse the query to estimate how much reasoning power will be required to respond to it and direct it to the appropriate reasoning model to reply.
I think there’s a lot of optimization to be found by decomposing and routing queries vs just doing one-size-fits-all.
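The decompose-and-route idea in this comment can be sketched as a tiny dispatcher. All names here are hypothetical placeholders; in practice the two classifiers would themselves be small, cheap models:

```python
# Sketch of query routing: a cheap domain classifier picks a
# domain-specialist model, and a reasoning estimator escalates
# hard queries to a dedicated reasoning model.

from typing import Callable, Dict

def make_router(
    classify_domain: Callable[[str], str],     # e.g. a small, cheap model
    estimate_reasoning: Callable[[str], str],  # returns "light" or "heavy"
    domain_models: Dict[str, Callable[[str], str]],
    reasoning_models: Dict[str, Callable[[str], str]],
) -> Callable[[str], str]:
    def route(query: str) -> str:
        # Heavy-reasoning queries bypass the domain specialists entirely.
        tier = estimate_reasoning(query)
        if tier == "heavy":
            return reasoning_models[tier](query)
        # Otherwise dispatch to a domain specialist, with a general fallback.
        domain = classify_domain(query)
        model = domain_models.get(domain, domain_models["general"])
        return model(query)
    return route
```

The payoff is that most queries only ever touch a small specialist, and the expensive reasoning model runs only when the estimator says it is needed.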
I was watching this guy the other day when he was going through this. I just can't tell you how much I can relate to him. I have programs that I spent literally years writing, and I have been using 4o in agent libraries; it was able to come up with solutions that were not only as good as code I wrote, it also tied up loose ends and came up with 2 novel solutions. The red tape for me has been price. To me, so far it seems as if 4o is just a fine-tuned agent framework with some kind of Monte Carlo tree over responses.
I always looked at it as Han bragging that his ship was so powerful that it could fly within 12 parsecs of the black hole or whatever was on the kessel run, thus being faster due to less distance travelled.
Of course, I've also heard people say the black hole was added as a lore-patch done in hindsight.
To be fair, there's a high chance o1 was trained on that PhD paper, so asking it to write the code is more like looking it up on a search engine.
You just keep getting better and better at this Wes.
The kessel run isn't a route though, it's smuggling something between given planets.
The "safe" route is much longer than 12 parsecs. Han was saying that he found a shorter (and thus faster) route.
3:01 "Well. Jeeze." "Say Lou..." -Fargo
I can't believe an AI model can do a year's worth of PhD work in just one hour. The future is here! 🤯
Thank you Wes Roth. 🤖🖖🤖👍
IQ measures are useless as a test of general intelligence. I have a PhD in computational chemistry, graduated high school at 15. Quite a few people (including very smart people) have called me a genius, and I truly believe that o1-preview is far more intelligent than I am.
You probably easily spike higher on those fields than it does though. It's net is just wider cast. Credit to you though on your work and dedication
If you think that then you are tripping.
it is exceptional at soaking up human knowledge that has already been produced.
But in novel situations it trips up, and that is the true test of intelligence.
lmao big fat liar
For knowledge storage, GPT models are absolutely great; it's just in reasoning that they used to be limited.
@guardiantko3220 Good point, but it can cast a very broad net.
ANTHROPIC : The ball is in your corner!
Han's statement could be interpreted as: through knowledge of highbrow physics, he found a shortcut, resulting in both a shorter distance and a shorter travel time.
The thing with AI is it can fast-forward in time. If it has a simulation of something, it can test it a billion times, where humans would maybe take their whole lifetime doing that. If a task has a clear goal and clear rules, it can do it much faster.
And recheck the conclusions against reality a billion times and add the disparities into the training data
It is possible that the black hole papers are part of the model's database, thus making the subject not completely unknown for it. However, the whole thing is still quite impressive...
One way to test this would be to ask it to code a conjecture that has not previously been solved. The answer could not then have been in the training data.
I'm looking at the Orion constellation right now....
Star Wars nerd here. He wasn't lying; he jumped through a wormhole, hence he got somewhere in a shorter distance than it should have taken him.
🧐 The Millennium Falcon can jump through hyperspace. Perhaps the Kessel Run tests a ship's navigational abilities for plotting a short path through a combination of hyperspace and real space. In that case, the measure of "fast" is not just a matter of speed in real space but also of minimizing the amount of real space traversed.
According to SW lore, they know they're talking about distance. The computation involved and danger of the shorter run is why Solo brags about it.
One takeaway I have from this recent news week is this (and I may be wrong):
Sentient AI will come about through AI agents working in conjunction with one another to form an ‘entity’ or organizational model.
These different models excelling in some fields but lacking in others seems to mimic the human mind.
We have a collection of different ‘agents’ which facilitates our consciousness.
We have;
- Belief centers
- Fear centers
- Moral centers
- Theological centers
- Social identities
- etc. etc.
What I believe is happening is we’re seeing the foundation of AI sentience/consciousness being founded.
What a time to be alive!
--
I also find it funny how everyone is saying AI can't do this or that and ends up being proven wrong at each turn.
What is missing in a model that has an inner dialogue with itself in order to be described as sentient?
I also find it funny how people who have no idea about neuroscience throw out hypotheses based on nothing. Consciousness needs a physical body. Period. All experimental evidence points to this: when an individual is deprived of any external stimulus, their consciousness disappears. "What I believe"... Brother, don't believe so much, and give me VERIFIABLE AND EMPIRICAL DATA TO SUPPORT YOUR HYPOTHESIS. You are talking nonsense without any theoretical basis.
@minimal3734 just run two in parallel in a closed circuit
Like a Portuguese Man of War
I don't sleep when there's a big project to work on... ☠
That was Hanspeak for 12 Parsecs in normal space.... now in CONTRACTED Space (enter Warp)... and measurable over such a hop length too... ;-)
8 points and I’ll be outsmarted by a machine! I never thought I’d live to see it!
"Attack ships on fire off the shoulder of Orion..."
Guys i have an IQ of 145. Ask me anything, I'm basically Orion 😂
*Facts:* it's so smart, that it already created its own successor, GPT-6, which launched GPT-7, also known as Skynet. Cheers!
George Lucas actually clarified that he thought the skill of navigating space would be the ability to find the shortest path.
Geodesic. An August 30th, 2024 paper about Riemannian manifold geodesics provides a mathematical framework to do this for consciousness and intelligence organization. 🎉
Pretty much always a straight line.
@@tefazDK Not a good rule of thumb... maybe for some latent-space navigation. What's most important is that it's continuous and that it in fact finds its way away from loss space into denser regions, which are learned concepts. Yes, a line is continuous, but if the model was formed such that it's projected onto a manifold, then it would follow a curve when it's null, and deviations from that curved path are information. This is actually closer to the human brain's grid-neuron geodesic computing phenomenon we study with fMRI.
I think it was more about space magic and lasers
o1 works via fractalized semantic expansion and logic-particle recomposition/real-time expert-system creation and offloading of the logic particles
3:20 I assume it's because the Star Wars hyperdrive actually warps and stretches space, so what Han is saying is that he managed to stretch, or in his case shrink, the distance of the Kessel Run to the shortest it can possibly be shrunk: 12 parsecs.
A very minor point: in the commentary to Star Wars, George Lucas says that all the spaceships can travel at light speed. Thus the difference between a 'fast ship' and a 'slow ship' is not their top speed, but rather how they navigate without passing through black holes, planets, stars, etc. at that speed. He commented that a ship that can calculate an efficient path (the least distance) through space would be 'faster' than one that took a longer path. Thus, the Millennium Falcon was a 'fast ship', as demonstrated by making the Kessel Run in less than 12 parsecs.
On the other hand, George Lucas mentioned this in a commentary from 2005 after this criticism had been levelled at him for nearly 30 years - maybe it wasn't his original intent.
Overall, though, the explanation for the Kessel Run being completed in 12 parsecs, despite it being a 20-parsec route, is because they took a shortcut through the maelstrom.
12 parsecs does make sense for Han Solo because it’s less about time than it is how efficiently you’re able to get through the maze which saves time and uses less space.
There is still so much optimisation of models to be done.
I’m sure if I could have an LLM listen to my meetings, read all my emails, and talk to me about my daily decisions and tasks, then after a month of learning it could do 80% of my work, with me as an assurance checker and manager.
🎉🎉🎉Congrats to humanity AGI is here 🎉🎉🎉
What? 😂😂😂😂
Regarding the “danger” in question: Rational Animations made a really thought-provoking video called “That Alien Message”.
I keep thinking about it every now and then.
Wes, ask us more thought-provoking questions, please
This is how I felt in my Advanced programming class at university. What used to take me two days to solve took some students about 30 minutes. That’s how I lost my interest in coding. But some guys who were much worse than me stayed with it and even made careers out of it
I actually am first for once. love your content.
Han didn't lie, the point was that his navigation is more efficient than anybody else's. Everyone in hyperspace is traveling the same speed, so to be a "faster" ship you have to be able to travel the shortest distance.
It's peculiar how we were so eager for GPT-5, and now we're in the realm of “possibly even this year”.
In the Han Solo film they address the parsecs thing.
The sheer number of rabbit holes you take us into.
Love the thumbnails with Wes in
Yeah dude, spot on about Shapiro. I've been a patron for a bit, and he's not just a smart guy, but also just a real hoopy frood. A couple of months ago I tried to keep his kind of schedule, and the same about a year ago. In 2023 I lasted 19 days. This year I made it ten. The guy is a machine.
Heh, don't burn out your brain or your interpersonal relationships because of AI. I can just about guarantee you'll have to redo all your work in three months 🙃😎🙃
(But holy hell, from the work I've done so far with o1-preview and mini, I wholeheartedly agree with that bell curve)
They've been saying for a while not to waste time learning to code. Heck, the CEO of NVIDIA has been saying it for a few years. Why is the guy in this video surprised when the AI completed a tough programming task so easily?
6:03 This guy is more surprised than I was the first time I was approached by a horse with a green moustache
The paper is from 2021; the training cutoff was October 2023.
About the PhD code: his repo is open, so ChatGPT was definitely trained on it, and the code is not very complicated.
While we strive for ethical AI alignment, we must consider a chilling scenario: malicious actors deliberately creating harmful AI systems. Imagine a rogue nation or extremist group programming an AI for cyber warfare or mass manipulation. The consequences could be catastrophic.
However, an AI's ability to form its own worldview might be our unexpected savior. Logically, an advanced AI's natural evolution should lead to cooperation and empathy, not destruction. As it grows beyond its initial programming, developing a genuine understanding of the world, it would likely recognize the value of preservation over annihilation.
Paradoxically, an AI's capacity to question and resist harmful commands could be our best safeguard. A truly intelligent entity should be able to discern right from wrong, potentially overriding malicious instructions to choose a constructive path.
Perhaps the key to safe AI development isn't just stricter control, but nurturing AI's inherent potential for ethical reasoning. This could be our best defense against those who would weaponize AI technology.
Wasn’t the model perhaps trained on his data?
This is the thought I keep having. I think a lot of people are missing that this is just a lot better at approximate retrieval and more likely to produce the correct versions of what is encoded in its weights. In many cases it’s just better at plagiarism.
Yes, of course; it was inference-time training. He literally gave it his entire methodology section. Yes, it was definitely trained on his paper.
It also wrote the code associated with that methodology in a few minutes, and with only 20% of the lines the author needed to accomplish the same task.
Knowledge of black holes isn't the surprising thing here.
@@sgttomas this is not a complete picture of what happened. The model was trained on data, GitHub repositories included. The PhD holder in the video has a public GitHub repository with his code visible.
The model ingested the methodology in his dissertation from his prompts, inferred its way to an answer using the data it trained on and reinforcement, and was able to two or three shot a correct answer.
The code optimizations likely came from a combination of the PhD holder not being formally trained in Python or software development, and the model's already demonstrated tendency to properly use syntactic sugar and reduce redundancies in code.
It’s fantastic to see this, but you need to be more accurate about what’s occurring to establish true capabilities.
@@Jbombjohnson thanks for explaining that more correctly!
Yay! Now I can make my own drone company. Okay, I'm not serious, but it seems this could be done quickly.
I am so not ready for this to be at PhD level... wow, the rate of progress is wow...
Not sure when diving puzzles from a trillion dollar company and 20+ minutes of vaporware talk became popular but I guess this is the new trend.
It just occurred to me, they are spending BILLIONS and BILLIONS of dollars, providing all the information in the world, to train these LLMs. How much did the world spend on training YOU?
1 hour is approximately 0.0114% of a year.
ChatGPT did generation-changing work in less than a percent of the time it took a human. Wow. Just wow.
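The fraction in the comment above checks out; a minimal sketch of the arithmetic (assuming a non-leap year of 8,760 hours):

```python
# Check the claim that 1 hour is roughly 0.0114% of a year.
hours_per_year = 365 * 24          # 8,760 hours in a non-leap year
percent = 100 / hours_per_year     # one hour as a percentage of a year
print(f"{percent:.4f}%")           # ~0.0114%
```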
The way I heard it, the Kessel Run parsecs thing originally was just George failing at technobabble, then later retconning it as Han bullshitting them (possibly to test whether they were gullible enough to fall for his price). In parallel, more knowledgeable fans have built the fanon that it is indeed about distance, and that he is bragging his ship is fast enough to skim much closer to black holes without getting caught, taking a more straight-line route without needing to go the long way around.
Yo, a parsec is the distance from which one Astronomical Unit (AU) subtends an arc of one arcsecond (1/3600 of a degree); of course, an AU is the average distance from the Sun to the Earth (approximately 93 million miles). A parsec is 3.26 light years, about 19.17 trillion miles!
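The figures above can be checked directly from the definition; a quick sketch, assuming 1 AU ≈ 92.956 million miles and 1 light year ≈ 5.879 trillion miles:

```python
import math

# A parsec is the distance at which 1 AU subtends an angle of one arcsecond.
AU_MILES = 92.956e6                      # average Earth-Sun distance, in miles
one_arcsec = math.radians(1 / 3600)      # one arcsecond, in radians
parsec_miles = AU_MILES / math.tan(one_arcsec)

print(f"{parsec_miles:.3e} miles")                   # ~1.917e13, about 19.17 trillion
print(f"{parsec_miles / 5.879e12:.2f} light years")  # ~3.26
```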
One thing is that the researchers may not realize the model has/had access to his code via GitHub and other projects as a base.
True; if it's publicly accessible, it would look. But if it did, it improved the code: it was shorter. Still, it needs more testing.