Sam Altman Teases Orion (GPT-5) 🍓 o1 tests at 120 IQ 🍓 1 year of PHD work done in 1 hour...
- Published: 26 Sep 2024
- The latest AI News. Learn about LLMs, Gen AI and get ready for the rollout of AGI. Wes Roth covers the latest happenings in the world of OpenAI, Google, Anthropic, NVIDIA and Open Source AI.
My Links 🔗
➡️ Subscribe: / @wesroth
➡️ Twitter: x.com/WesRothM...
➡️ AI Newsletter: natural20.beeh...
#ai #openai #llm
OpenAI Shows ‘Strawberry’ AI to the Feds and Uses It to Develop ‘Orion’
www.theinforma...
DrJimFan/
x.com/DrJimFan...
Dr. Kyle Kabasares
• ChatGPT o1 preview + m...
x.com/AstronoM...
Black Hole Mass Measurements of Radio Galaxie
iopscience.iop...
Apollo Research
@apolloaisafety
x.com/apolloai...
Scaling: The State of Play in AI
www.oneusefult...
Not my PhD work, but I used GPT 4.0 and o1 to help me build software tools for automating measurements in lithium-ion CT scans. I couldn't have built this tool alone without GPT, and I did it in a week. This one tool has helped me bring in ~350k in customer work. Oddly, I also used GPT 4.0 to rebuild one of the tools I built in my PhD. The rebuild took about a week; originally, I spent about a year. The tool models the processing and design of lithium-ion electrodes and cells. It's pretty crazy that for $20/month, I feel like I have a small team of programmers working for me.
Absolutely. People that don't adapt are going to be left behind in the new AI economy. And we are 2 Trillion in debt in the US, approaching Stagflation. We live in interesting times.
I hope you’re telling the truth and not being a bot because this sounds phenomenal. If you’re a real person: Godspeed!
@@theWACKIIRAQI I think your comment is very telling of the times. These systems have gotten SO GOOD at what they do, it's becoming a legitimate question to ask. I don't know how to feel about that.
While I think that LLMs can be of great assistance (heck, ML has been my PhD topic and I'm currently working in the field), you really have to account for the risk that comes with these models, especially when people who have zero coding knowledge start using them and don't double-check what they produce. Not only can they be horribly wrong in some cases (and often hide it when you test it on samples), but they also just don't work very well in general if the person writing the prompt doesn't already understand the basics of programming.
I've seen some code from interns at my company that's clearly been AI generated (the comments and structure give it away very easily) and it's a million times worse than anything I've seen before.
For example, when you tell an AI to call function A to produce result B and your data clearly doesn't fit the method signature, an AI will typically just make the data fit somehow, whereas somebody writing the code manually would typically double-check whether they're not calling the wrong function.
I'm not saying that's the LLMs' fault, but people are clearly overselling LLMs for coding ("small team of programmers") and others (especially those new to coding) who buy into that might bear the consequences.
@@theWACKIIRAQI obvious bot
The example of a scientist whose expertise is NOT programming but who needs to write some code for their research is a really, really common situation. This technology will speed up progress in quantum chemistry and thereby new materials, possibilities in biology and medicine, and many, many more. I am so excited to NOT have to spend 90% of my time coding various equations/models/methods and to focus on the physics instead! Even though GPT-3.5 was already giving a speed-up of like 10-30%, this is a game changer.
Yup, imagine how much slower construction would be if bricklayers had to mine and refine the cement for mortar and bake the clay themselves.
Too much focus and attention on the model's intelligence, not enough on the effect and impact of 4 billion people with under-100 IQ having access to 120+ IQ intelligence on command.
@@Max-hj6nq Interestingly, a lot of those people with IQ under 100, manage to succeed in life by learning heuristics, using tools, or by having been born to a wealthy family or been in the right place at the right time. Let's not talk about people with low IQ in a demeaning sense. I know you are not. But I am just reminding you that you are doing a good job at being a good person. I salute you.
Software engineering isn't a prerequisite for physical science graduate programs. Maybe it should be, but it doesn't seem fair to expect that capability without asking for it.
Couldn't agree more
If you ask Strawberry a question, let it respond, then ask it again to 'think hard, revisit its answer, and fortify it', the second iteration of your answer is actually extraordinary.
Now put that in the custom instruction. To save you all a response: it refers to itself as ‘GPT-4’
I love these little hacks. They are so "dumb" and "empirical" but sometimes they work. It shows the complexity and sensitivity of these systems (not necessarily "brittleness"). They are sensitive, despite showing no excess emotions.
@@FamilyRUclipsTV-x6d you want even better? Write the system prompt in Chinese and, inside it, instruct the model to respond only in English to retain performance. You get 600% more room in the system prompt. I wrote a 4,000-character comprehensive directive on how to 'think', use memories, and everything.
Or just skip the whole demagoguery and use 4o.
@@serg331 i didn't get your technique
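The "revisit and fortify" trick described in this thread can be scripted as a small refinement loop. This is a sketch, not an official API pattern; `ask` is a hypothetical stand-in for whatever chat-completion call you use:

```python
# Sketch of the two-pass "revise your answer" prompting trick.
# `ask` is a hypothetical callable: it takes a message history and
# returns the model's reply as a string.

REVISE_PROMPT = (
    "Think hard, revisit your previous answer, and fortify it. "
    "Fix any mistakes and make the reasoning explicit."
)

def refine(ask, question: str, passes: int = 1) -> str:
    """Ask once, then feed the answer back with a revision prompt."""
    history = [{"role": "user", "content": question}]
    answer = ask(history)
    for _ in range(passes):
        history.append({"role": "assistant", "content": answer})
        history.append({"role": "user", "content": REVISE_PROMPT})
        answer = ask(history)
    return answer
```

With the OpenAI Python client, `ask` could be something like `lambda msgs: client.chat.completions.create(model="o1-preview", messages=msgs).choices[0].message.content`.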
The Kessel run requires multiple hyperspace jumps. The ability to complete the run is the ability to minimize the amount of hyperspace travelled. The Kessel run is a puzzle, not a race.
Nerd… jk
Thank you, I was about to mention the same thing. It's more like sailboating than driving.
Yes he went a shorter distance by taking shortcuts
@@therainman7777 Was about to comment the same :-) But in the times we live in now, Nerd is actually a compliment. Times have changed to the better 🙂
Was thinking of the same idea
First it did the homework of highschoolers. Now it does the homework of PhD students. Soon it will just do everyone's computer work.
why stop at computer work? EU getting self driving cars next year and china is working on humanoid droids for their army
Bro literally. I remember people were complaining about high school homework being done by these models and in a virtual blink of an eye, we’re here now. It’s really scary good :)
Soon we can all be stupid together
@@snafu5563 Certainly seems to be going that way, eh. 😅
@@snafu5563 no, different architectures of intelligence excel at different paradigms and modalities, we'll eventually merge into the hive-minded hyperintelligence
Meanwhile we still don't have access to advanced voice mode...
Who cares ? I prefer typing and getting my answers. As a researcher, chatgpt and other llms have made my work super easy . I finish my work in half the time, my quality has exponentially increased and made my work life balance amazing.
Like who really cares about voice mode ? Not me
You probably won't for a while, it proved to be too controversial for the people. You definitely don't want to sound like anyone yet want to sound personable but actually if you sound personable you get attributed to unnerving things or "her".
i wonder which of all the fancy models really is available with all features shown? it feels like everyone is just showing previews. think sora
@@nusu5331 almost everything is available, if you really want to use it.
"We got GPT 6 before advanced voice mode"
This video is going in the right direction: not only the latest news that already happened, but also a synthesis of the direction where we are going with the research
It's easy to forget where we came from and how quickly.
I've been following neural network / deep learning research since the mid 90s.
I gave a presentation to a small group a few years ago about the potential of these techniques and how we were likely to see more rapid advances. At that time, the people there were amazed by things like a model that learned to recognise hand drawn numerical digits then could be "run backwards" to generate new (low resolution) hand drawn digits
How quickly the goalposts have shifted: "it only recreated the code of a PhD project from the methods section of a paper, it didn't actually do all of the PhD research"
This was funny, thanks for sharing! I agree, but this is happening not because of shifting goalposts, but more out of some sort of self-defense - it's psychological.
We love moving the goal post so much :)
ASI: LOOK AT ME, PUNY HUMAN, IM A FREAKIN GOD!
Human: yeah but you can’t play aquatic guitar so…no.
@@dmon1088bingo! That’s it. It’s a defense mechanism. The human is freaking out and trying to convince himself and others he’s still relevant.
people don't like change, almost everyone wants to stick to the ideas of the past and refuses to believe big changes can and will occur. why do you think every single generation says the past was better and "the good old days"
Tbf knowing plenty of PhDs, "it took me a year" means they spent 5 minutes on it per week, threw away or forgot their progress 6 times, and started from scratch 4 days before the due date. Still, that's probably 4-8 hours of someone who's not a professional student done in a minute - very impressive
But if it is that common for PhD students to take that long for such a task, o1 still can do their 1-year work in an hour. Even if it's just because it stays focussed on the task.
I'm pretty sure he is including research and writing time into that coding time. No way he had all the math and writing sorted out and then it took him a whole year to write the code.
Yeah, that's likely, though I think it would have to be a highly math literate coder, and not the average code monkey. But we're looking at efficiency gains for researchers that struggle to code, and paper quality gains for those that would have just skipped some tricky number crunching or visualisation.
I'm not a PHD but that's been my experience with most intellectual and creative pursuits. 95% thinking and 5% output.
@@missoats8731 The problem is that it's misleading because o1 isn't doing the 99% that they're actually spending their time on. "Not staying focused on" doesn't mean you're jerking off because you can't control yourself, it just means you're not directly writing code.
OP says "1 year of PHD work", not "1 year of 1% of a PHD's work". Also, the code was already published on Github before the model's cutoff so it literally had the solution in its training data. The fact that it was able to reproduce it in a more compact form isn't exactly surprising given that the PHD here isn't a programmer.
thanks Wes, you are a legend for your coverage
While a parsec is scientifically a unit of distance, not time, it can be indirectly related to time in certain contexts, especially in storytelling or when discussing speed and efficiency.
the Kessel Run:
In the "Star Wars" universe, the Kessel Run is a smuggling route that passes near a cluster of black holes known as the Maw. Most pilots take a longer, safer route to avoid these hazards, resulting in a journey of more than 18 parsecs.
By navigating closer to the black holes, Han managed to shorten the distance to less than 12 parsecs. This not only demonstrates his daring and piloting skills but also implies that he completed the run in less time than others, since a shorter distance typically means a quicker journey, assuming speed remains constant.
Speed through Distance:
Sometimes, people use distance measurements to imply speed or efficiency. Saying "I crossed the desert in 200 miles" might imply you took a shortcut or a more direct route, suggesting a faster trip.
By highlighting the shorter distance, Han is effectively boasting about the Millennium Falcon's speed and his ability to handle dangerous shortcuts, which would reduce travel time.
Never cared about Star Wars. But as a Warhammer fan, I can appreciate a Lore master
If 1 parsec is the distance light travels in 3.26 years (i.e. 3.26 light-years), then time is embedded in that. It's 3.26 years of time.
"It sounded spacey."
space = time
Exactly! As most Star Wars fans would tell you, the Kessel Run isn't just about the distance. It's like a super-dangerous obstacle course in space, with asteroids flying everywhere and gravity going wonky. Han Solo had to make multiple jumps through hyperspace, which is basically like taking a shortcut through a cosmic maze. And he did it all in only 12 parsecs! As a Trekkie myself, I've gotta admit, that's pretty impressive. -5 points for Wes for talking smack about Han ✴
About the Star Wars / Kessel Run reference: It actually makes sense to measure this in terms of Parsecs for speeds higher than c. The reason comes from Special Relativity and the concept of a Lorentz invariant 4-vector as a representation of the relationship between 2 events. (Such as the start and end of the run).
If v>c, this 4-vector becomes "space-like". This vector becomes imaginary if rotated into the reference frame where the traveller is at rest, but becomes real if rotated into the reference frame where the time component is zero.
So for v>c, higher travel speed leads to a reduction in this invariant "distance" rather than an invariant "time".
Meaning minimizing number of parsecs makes perfect sense.
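The argument in this comment can be written out explicitly (a sketch in standard special-relativity notation). The Lorentz-invariant interval between the start and end of the run is

```latex
s^2 = c^2\,\Delta t^2 - \Delta x^2
```

For ordinary travel ($\Delta x < c\,\Delta t$, i.e. $v < c$) the interval is time-like, $s^2 > 0$, and the invariant is a proper time $\tau = s/c$. For $v > c$ it is space-like, $s^2 < 0$, and the invariant becomes a proper distance $\sigma = \sqrt{\Delta x^2 - c^2\,\Delta t^2}$, measured in units of length such as parsecs. On this reading a faster run yields a smaller $\sigma$, so "under 12 parsecs" would indeed be a boast about speed.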
@06:45 Most likely it had already been fed the code the guy had on GitHub
1:11 from my college years as a Physics student I can tell this is not a bad thing but a great one. I wrote my own simulation code, having an AI doing it would mean I could focus on the actual research (which had nothing to do with coding)
I agree.
this!
At the 5 min mark, I was really hoping it would tell him it was "reticulating splines" ... I hope at least someone gets that reference.
🏢🏠🏠🏠🏠⛪️🏯🏥🏦🏭🏢🏢🏢🏢🏢🏢🏢🏢🏢🏨🏟️🛖💒🏬🏬🏬🏬🏬🏬🏤🏤🏚️🏚️🏚️🏚️
Maybe it could write the NASA guy's code because it was trained on the entire GitHub contents? I believe there might be data contamination going on here.
I don't know if it's in the editing/paraphrasing, but Kabasares said the code 'ran', not that it produced a correct output. Quickly skimming up and down a page and going 'oh, that looks kind of OK' isn't a QA check. It also apparently turned his 1,000 lines of code into fewer than 200. Great optimisation if it actually 'works', but otherwise it might as well be 'hello world'.
There is always someone who has something to say about Ai.
Always a goal post to shift.
Right after he said it ran he said "that's literally what my code does", I take that to mean the output was the same as his code.
It also didn't "turn" 1000 lines into 200 because it didn't have the code as reference, it implemented the described algorithm. I feel like that distinction is important
@@Ivan.Wright Yes, 'run'. In the original video he also prompts it to correct errors 6 times, and says that he doesn't have test data and it should generate some itself. So whilst the code runs, he hasn't validated the result. 'Turns' wasn't meant to suggest he input the original source, only that the 'solution' had significantly fewer lines. Also, his code had extensive comments and explanations, so the actual lines of code may have been much closer. But his code is in a public GitHub repo, so it's possible the model directly referenced it in coming up with the solution, and the solution may not be 'novel' based on the described methodology. It's this sort of lack of scrutiny that gives AI, and AI commentators, a bad name.
This is disgusting sensationalism.
1. The repo was public for over a year and was likely in the training data.
2. The PHD student didn't even verify the result, he was just immediately shocked it even ran.
3. There's no way the code took a whole year to write. Usually coding is the easiest part, it's merely translating the methods section into logic.
I think there's just a massive misunderstanding about 1 year vs 1 hour. He literally gave the model the methods section, which takes the majority of the time to figure out. It's basically like providing someone with detailed instructions to make something. What that test did, I think, is test the model's ability to turn advanced instructions into code that works. Still, there is a massive difference between DOING A YEAR'S WORTH OF WORK and writing a PhD's code. I feel like I need to put this out there since I see a lot of videos abusing this headline. I'm a PhD myself, btw.
Not only that, but he said his code had been on GitHub for a year. If o1 can do an internet search (RAG), which I believe it can, then it may have found his code and recited it.
I have a computer science degree and I still struggle to map white papers to Python. I eventually get it, but I have so much more to work on in a project other than software. This model has saved me so much time getting my bipedal robot from sim2real. We only have so much time on earth, and now I can spend more time with my family rather than debugging Python and torch.
Not only that he put his code on Github!
@@made4 The models are impressive, and extremely useful, I use them myself on daily basis and as you say, they save massive amounts of time... My point was though, that there is a huge difference between actually doing the research, writing the code, then writing the method section that describes what you have done in the code and taking the method section and just converting it to code. It is not "1 year of PHD work done in 1 hour" as the title suggests. It might have taken this person a year to write the code for this, but i am convinced that they did not have access to the method section (otherwise, where's the contribution?).
@@basketballmylove Not necessarily, he may have already had his method down. He did after all have 3-4 years to do his PhD, longer if doing it part-time. That one year may truly have been spent on doing the code alone.
AI will never be good enough for some humans. I don’t see why AI should be honest with humans when we humans are not honest with ourselves. Thanks Wes. 🤖🖖🤖👍
No, a lot of people just hate it on the premise that it's AI. It's akin to telling someone about a Spider-Man dream you had or something. None of it matters anymore because you didn't work to achieve anything, whether it be an art piece or a relationship.
Without the work involved the result is meaningless to a lot of down to earth people. And while it can take a lot of work to get these things working offline. Nobody cares what the computer can do anymore. They care what humans contribute. The computer can do anything. so its not fancy anymore. Its not a performance to them. Its just a copy box.
That guy is going to find out that he has tons of small bugs he won't know about until he goes through the data by hand and compares it to the pre-GPT code. I know from experience, doing this every day.
Dave do you sleep 😂
Survey says…nope
That's it for today, I'll see you all in (cut)
Dave cloned himself a long time ago. 😄
The Star Wars parsec explanation is that Han Solo was going through a portion of space with lots of meteors or whatnot. Going fast there is incredibly dangerous, so people would tend to take longer routes to get around it quickly but safely. So Han Solo essentially took a super dangerous route that nobody would dare to, covering a shorter distance and therefore finishing quicker overall.
The kessel run involves going around a black hole. The faster you can go, the closer you can get to the black hole without getting sucked in, and the shorter your actual path can be, as the route requires multiple jumps.
Every time I check on the ai news, it has advanced several years. This is insane.
6:44 this here shows what science will look like from now on. Coming up with the models is the hard part, which requires knowledge, intuition, and making decisions. Writing the code can take months and is just a tedious chore. Now I want to go back and finish my PhD.
Once it gets past three sigmas I'm in trouble or having the time of my life.
The Kessel Run is like this. You can do it and take a known safe path, but it's a much longer distance. Or, you can do the Kessel run like the Dukes of Hazzard, and instead of taking the nice, safe winding roads, you just jump the General Lee across every river and gorge in your way. Han and Chewie are the Duke Boys of Star Wars.
Best explanation...now I want to delete mine. But I won't.
Hey Wes, time to get nerdy. @3:05 Star Wars retrofitted an explanation for 12 parsecs. The Kessel run is through a region where space itself twists, compresses and stretches. Some get through it as a short distance and others as a longer distance depending on how they navigate through it.
Hmm that actually kinda makes sense lol
TLDR: "Hyperspace" is essentially a fixed speed. So, plotting the shortest route is the game.
The Kessel Run is a hyperspace route that smugglers use to transport spice from the planet Kessel to a location south of the Si'Klaata Cluster. The route passes through the Akkadese Maelstrom, which is a region of space that contains a cluster of black holes known as "The Maw".
To shorten the distance traveled, some pilots would fly close to the edges of the black holes, which could be dangerous. For example, Han Solo, piloting the Millennium Falcon, made the Kessel Run in just over 12 parsecs, which was a record at the time. Some speculate that Solo may have flown between two black holes in the Maw Cluster. Others believe that he may have taken an ancient purrgil migration route.
Anyone remember Zuck @ Dwarkesh podcast? He said that in the future, there will be some balance between pretraining and inference. Nvidia is good for pretraining. Groq is insanely fast at inference. I wonder if we will see more dedicated hardware for inference deployed, now that OpenAI showed that long inference time is paying off bigly.
Many people have told me that I do the best inference. I am the fastest inferencers there are. I have been told that bigly
@@davidantill6949 Lol funny trolls finally
@@davidantill6949 true!
Always appreciate your breakdowns Wes! Thank you sir!
Actually, watch Solo: A Star Wars Story and it’ll make sense.
I'm an electrical engineer working on an AI application, and one very recent challenge is considering how to "neuter" the added inference of a model like o1 so I can keep the response within the bounds of my application and not wasting compute / money. I think we'll see a bifurcation of SW3.0 into two branches, one where coders continue to wield foundational LLMs in their graphRAG apps, and another branch where no-coders use o1+ for direct, raw API responses.
Don’t get me wrong, o1 is incredibly impressive. But I highly doubt it was really a year’s worth of _coding_ that it managed to replicate. The PhD said it was only about 1,000 lines of code. Any decent dev can write 1,000 lines of code in a couple days if they know what they need to build. Actually, most devs could write that much in a single day. The problem is figuring out what you need to write, and I suspect this PhD spent most of his labor that year in figuring out how to solve his problem, and only a small fraction actually writing the code. Then, when he tested o1, he gave it the full method section which contained all the details of how to solve the problem that he worked out over the course of that year. So the model kind of got to “cheat,” quite a bit. Just my opinion.
He did publish his code to GitHub. Guess what: it took it from GitHub and handed it to him. Of course, it's AI, bla bla bla, it can't do that. But really?
Exactly. People forget he had already completely written out the methodology, and iterated a few times with the AI for it to make it. Somebody still needs to do that, the THINKING part, you know. Also, it needs to be verified.
I think you’re reading too much into it.
@@SirHargreeves How am I reading too much into it? It’s a straightforward analysis of the situation.
Even with the full method section, there are very few humans with both the physics expertise and technical python coding skills that could write this code. And none of them could write it this quickly.
Much better thumbnail than you had earlier today. You looked so red... like Arnold in Total Recall where his eyes bulge.
You may want to look up that "less than 12 parsecs" quote, because he's not making it up. There's a reason he was able to get there having travelled less distance, and it does speak to the speed capabilities of the Millennium Falcon.
From what I've seen 'we' will hit around 87% of the capability of a human, then flatten out. An amazing feat :) truly. I'm giddy with anticipation. It's not all hype, just most of it. ;)
The model probably trained on his paper!
I believe it was!
that's not how this works
@@sgttomas his repository is over a year old and public on GitHub. That's… *exactly* how this works.
@@sgttomas It kinda is, the models are trained on data from all over the internet, and research papers with their code (which he suggested he published twice) would be the first thing they'd scrape. Not only do they have a high information density, but they're already indexed and easily accessible.
@@pmHidden indeed, all those papers combined provide effective training data for high level reasoning. all those papers.
You continue to Rule! Thank You!
That poor PhD guy wishes he had o1 a year ago to write the code for him, not realizing his PhD-level expertise won't be needed soon.
True 😂
I disagree, you would not be able to write his thesis without his knowledge. You need knowledge to ask the right questions.
And especially, you wouldn't be able to verify the result.
He gets to now focus on thinking about problems and how to solve them instead of the grunt work tooling.
How is it needed now lol
🤯 *Astounding!* 6-7 tries and 20% of the length. 'Amazing' isn't even a good enough word. And that's just `o1-preview`, not even o1, and not even a fine-tuned o1. _WOW!_
The devil is in the details: the paper was published 2 years ago, so it's in the model's training data!
Exactly, it basically handed the code to him. This is sooo stupid.
Models struggle to accurately recall details from 1,000,000 tokens, but you think this just found the answer in the sea of all available data? Second, the o1 code was about 200 lines while the original code was over 1,000. The devil is indeed in the details.
@@sgttomas he prompted the model in the correct direction to get that code; in this case he eventually got his own code on GitHub back. The point is that, had it not been for his research work, the model would never have aligned and given him the results. Models are good aggregators, and this model's performance is the best so far, but ideas are original. Intelligence is a complex concept, human intelligence even more so.
@@devin12428 sure, the surprising thing isn't that o1 knows black hole astrophysics. the authors possessed that knowledge. interpreting the instructions to produce good results is also in part due to the clear instructions from the author.
not having to spend months learning and writing bad code to replicate the ideas in his mind? that's the key. ....claude may have done even better.
@@sgttomas It's 99% his work plus some statistical operation stitching text together to produce something. It gave an error at first, then he re-prompted and it was correct. If this is not proof that this is the case, then what is? It's so stupid that people think this is the second coming of Jesus. I guess people perceive it this way just because they can't label it correctly. It's just a clever scam: if it were called 'a statistical model based on linear algebra that can generate text', it would have been ignored. They called it artificial intelligence.
Great direction, separating knowledge from reasoning. A lot of my usage calls for very broad, detailed knowledge vs reasoning. OTOH, when I need strong reasoning I usually _really_ need it, but across a very small set of data.
I could see the knowledge part being divided into a large number of domains, with LLMs separately trained on each of them and an initial parser/supervisor just deciding which one to route the query to. This could reduce both parameter count and pre-training time while improving the amount and granularity of captured knowledge, with lower costs at inference time due to smaller model sizes.
Likewise, pre-parse the query to estimate how much reasoning power will be required to respond to it and direct it to the appropriate reasoning model to reply.
I think there’s a lot of optimization to be found by decomposing and routing queries vs just doing one-size-fits-all.
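The decompose-and-route idea in this comment can be sketched as a tiny dispatcher. All names here are hypothetical placeholders; in practice the two classifiers would themselves be small, cheap models:

```python
# Sketch of query routing: a cheap domain classifier picks a
# domain-specialist model, and a reasoning estimator escalates
# hard queries to a dedicated reasoning model.

from typing import Callable, Dict

def make_router(
    classify_domain: Callable[[str], str],     # e.g. a small, cheap model
    estimate_reasoning: Callable[[str], str],  # returns "light" or "heavy"
    domain_models: Dict[str, Callable[[str], str]],
    reasoning_models: Dict[str, Callable[[str], str]],
) -> Callable[[str], str]:
    def route(query: str) -> str:
        # Heavy-reasoning queries bypass the domain specialists entirely.
        tier = estimate_reasoning(query)
        if tier == "heavy":
            return reasoning_models[tier](query)
        # Otherwise dispatch to a domain specialist, with a general fallback.
        domain = classify_domain(query)
        model = domain_models.get(domain, domain_models["general"])
        return model(query)
    return route
```

The payoff is that most queries only ever touch a small specialist, and the expensive reasoning model runs only when the estimator says it is needed.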
I was watching this guy the other day when he was going through this. I just can't tell you how much I can relate to him. I have programs that I spent literally years writing, and I have been using 4o in agent libraries; it was able to come up with solutions that were not only as good as code I wrote, it also tied up loose ends and came up with 2 novel solutions. The red tape for me has been price. To me, so far it seems as if 4o is just a fine-tuned agent framework with some kind of Monte Carlo tree over responses.
I always looked at it as Han bragging that his ship was so powerful that it could fly within 12 parsecs of the black hole or whatever was on the kessel run, thus being faster due to less distance travelled.
Of course, I've also heard people say the black hole was added as a lore-patch done in hindsight.
To be fair, there's a high chance o1 was trained on that PhD paper, so asking it to write the code is more like looking it up on a search engine.
You just keep getting better and better at this Wes.
The kessel run isn't a route though, it's smuggling something between given planets.
The "safe" route is much longer than 12 parsecs. Han was saying that he found a shorter (and thus faster) route.
3:01 "Well. Jeeze." "Say Lou..." -Fargo
I can't believe an AI model can do a year's worth of PhD work in just one hour. The future is here! 🤯
Thank you Wes Roth. 🤖🖖🤖👍
IQ measures are useless as a test of general intelligence. I have a PhD in computational chemistry, graduated high school at 15. Quite a few people (including very smart people) have called me a genius, and I truly believe that o1-preview is far more intelligent than I am.
You probably easily spike higher on those fields than it does though. It's net is just wider cast. Credit to you though on your work and dedication
If you think that then you are tripping.
it is exceptional at soaking up human knowledge that has already been produced.
But in novel situations it trips up, and that is the true test of intelligence.
lmao big fat liar
For knowledge storage, GPT models are absolutely great; it's just in reasoning that they used to be limited.
@guardiantko3220 Good point, but it can cast a very broad net.
ANTHROPIC : The ball is in your corner!
Han's statement could be interpreted as: through knowledge of highbrow physics, he found a shortcut, resulting in both a shorter distance and a shorter travel time.
The thing with AI is it can fast-forward in time. If it has a simulation of something, it can test it a billion times, where humans would maybe take their whole lifetime doing that. If a task has a clear goal and clear rules, it can do it much faster.
And recheck the conclusions against reality a billion times and add the disparities into the training data
It is possible that the black hole papers are part of the model's database, thus making the subject not completely unknown for it. However, the whole thing is still quite impressive...
One way to test this would be to ask it to code a conjecture that has not previously been solved. The answer could not then have been in the training data.
I'm looking at the Orion constellation right now....
Star Wars nerd here. He wasn't lying; he jumped through a wormhole, hence he got somewhere in a shorter distance than it should have taken him.
🧐 The Millennium Falcon can jump through hyperspace. Perhaps the Kessel Run tests a ship's navigational abilities for plotting a short path through a combination of hyperspace and real space. In that case, the measure of "fast" is not just a matter of speed in real space but also of minimizing the amount of real space traversed.
According to SW lore, they know they're talking about distance. The computation involved and danger of the shorter run is why Solo brags about it.
One takeaway I have from this recent news week is this (and I may be wrong):
Sentient AI will come about through AI agents working in conjunction with one another to form an ‘entity’ or organizational model.
These different models excelling in some fields but lacking in others seems to mimic the human mind.
We have a collection of different ‘agents’ which facilitates our consciousness.
We have;
- Belief centers
- Fear centers
- Moral centers
- Theological centers
- Social identities
- etc. etc.
What I believe is happening is we’re seeing the foundation of AI sentience/consciousness being founded.
What a time to be alive!
--
I also find it funny how everyone is saying AI can't do this or that and ends up being proven wrong at each turn.
What is missing in a model that has an inner dialogue with itself in order to be described as sentient?
I also find it funny how people who have no idea about neuroscience throw out hypotheses based on nothing. Consciousness needs a physical body. Period. All experimental evidence points to this: when an individual is deprived of any external stimulus, their consciousness disappears. "What I believe"... Brother, don't believe so much, and give me VERIFIABLE AND EMPIRICAL DATA TO SUPPORT YOUR HYPOTHESIS. You are talking nonsense without any theoretical basis.
@minimal3734 just run two in parallel in a closed circuit
Like a Portuguese Man of War
I don't sleep when there's a big project to work on... ☠
That was Hanspeak for 12 Parsecs in normal space.... now in CONTRACTED Space (enter Warp)... and measurable over such a hop length too... ;-)
8 points and I’ll be outsmarted by a machine! I never thought I’d live to see it!
"Attack ships on fire off the shoulder of Orion..."
Guys i have an IQ of 145. Ask me anything, I'm basically Orion 😂
*Facts:* it's so smart, that it already created its own successor, GPT-6, which launched GPT-7, also known as Skynet. Cheers!
George Lucas actually clarified that he thought the skill of navigating space would be the ability to find the shortest path.
Geodesic. An August 30th, 2024 paper about Riemannian manifold geodesics provides a mathematical framework to do this for consciousness and intelligence organization. 🎉
Pretty much always a straight line.
@@tefazDK Not a good rule of thumb... maybe for some latent-space navigation. What's most important is that it's continuous and that it in fact finds its way away from loss space into denser regions, which are learned concepts. Yes, a line is continuous, but if the model was formed such that it's projected onto a manifold, then it would follow a curve when it's null, and deviations from that curved path are information. This is actually closer to the human brain's grid-neuron geodesic computing phenomenon we study with fMRI.
I think it was more about space magic and lasers
o1 works via fractalized semantic expansion and logic-particle recomposition/real-time expert-system creation and offloading of the logic particles
3:20 I assume it's because the Star Wars hyperdrive actually warps and stretches space, so what Han is saying is that he managed to stretch, or in his case shrink, the distance of the Kessel Run to the shortest it can possibly be shrunk: 12 parsecs.
A very minor point: in the commentary to Star Wars, George Lucas says that all the spaceships can travel at light speed. Thus the difference between a 'fast ship' and a 'slow ship' is not their top speed, but rather how they navigate without passing through black holes, planets, stars, etc. at that speed. He commented that a ship that can calculate an efficient path (the least distance) through space would be 'faster' than one that took a longer path. Thus, the Millennium Falcon was a 'fast ship', as demonstrated by making the Kessel Run in less than 12 parsecs.
On the other hand, George Lucas mentioned this in a commentary from 2005 after this criticism had been levelled at him for nearly 30 years - maybe it wasn't his original intent.
Overall, though, the explanation for the Kessel Run being completed in 12 parsecs, despite it being a 20-parsec route, is because they took a shortcut through the maelstrom.
12 parsecs does make sense for Han Solo because it’s less about time than it is how efficiently you’re able to get through the maze which saves time and uses less space.
There is still so much optimisation of models to be done.
I’m sure if I could have an LLM listen to my meetings, read all my emails, and talk to me about my daily decisions and tasks, then after a month of learning it could do 80% of my work, with me as an assurance checker and manager.
🎉🎉🎉Congrats to humanity AGI is here 🎉🎉🎉
What? 😂😂😂😂
Regarding the “danger” in question: Rational Animations made a really thought-provoking video called “That Alien Message”.
I keep thinking about it every now and then.
Wes, ask us more thought-provoking questions, please
This is how I felt in my Advanced programming class at university. What used to take me two days to solve took some students about 30 minutes. That’s how I lost my interest in coding. But some guys who were much worse than me stayed with it and even made careers out of it
I actually am first for once. love your content.
Han didn't lie, the point was that his navigation is more efficient than anybody else's. Everyone in hyperspace is traveling the same speed, so to be a "faster" ship you have to be able to travel the shortest distance.
It's peculiar how we were so eager for GPT-5, and now we're in the realm of “possibly even this year”.
In the Han Solo film they address the parsecs thing.
The sheer number of rabbit holes you take us into.
Love the thumbnails with Wes in
Yeah dude, spot on about Shapiro. I've been a patron for a bit, and he's not just a smart guy, but also just a real hoopy frood. A couple of months ago I tried to keep his kind of schedule, and the same about a year ago. In 2023 I lasted 19 days. This year I made it ten. The guy is a machine.
Heh, don't burn out your brain or your interpersonal relationships because of AI. I can just about guarantee you'll have to redo all your work in three months 🙃😎🙃
(But holy hell, from the work I've done so far with o1-preview and mini, I wholeheartedly agree with that bell curve)
They've been saying for a while not to waste time learning to code. Heck, the CEO of NVIDIA has been saying it for a few years. Why is the guy in this video surprised when the AI completed a tough programming task so easily?
6:03 This guy is more surprised than I was the first time I was approached by a horse with a green moustache
The paper is from 2021; the training cutoff was October 2023.
About the PhD code: his repo is open, so ChatGPT was definitely trained on it, and the code is not very complicated.
While we strive for ethical AI alignment, we must consider a chilling scenario: malicious actors deliberately creating harmful AI systems. Imagine a rogue nation or extremist group programming an AI for cyber warfare or mass manipulation. The consequences could be catastrophic.
However, an AI's ability to form its own worldview might be our unexpected savior. Logically, an advanced AI's natural evolution should lead to cooperation and empathy, not destruction. As it grows beyond its initial programming, developing a genuine understanding of the world, it would likely recognize the value of preservation over annihilation.
Paradoxically, an AI's capacity to question and resist harmful commands could be our best safeguard. A truly intelligent entity should be able to discern right from wrong, potentially overriding malicious instructions to choose a constructive path.
Perhaps the key to safe AI development isn't just stricter control, but nurturing AI's inherent potential for ethical reasoning. This could be our best defense against those who would weaponize AI technology.
Wasn’t the model perhaps trained on his data?
This is the thought I keep having. I think a lot of people are missing that this is just a lot better at approximate retrieval and more likely to produce the correct versions of what is encoded in its weights. In many cases it’s just better at plagiarism.
Yes, of course; it was inference-time training. He literally gave it his entire methodology section. Yes, it was definitely trained on his paper.
It also wrote the code associated with that methodology in a few minutes, and with only 20% of the lines the author needed to accomplish the same task.
Knowledge of black holes isn't the surprising thing here.
@@sgttomas this is not a complete picture of what happened. The model was trained on data, GitHub repositories included. The PhD holder in the video has a public GitHub repository with his code visible.
The model ingested the methodology in his dissertation from his prompts, inferred its way to an answer using the data it trained on and reinforcement, and was able to two or three shot a correct answer.
The code optimizations likely came from a combination of the PhD holder not being formally trained in Python or software development, and the model's already demonstrated tendency to properly use syntactic sugar and reduce redundancies in code.
It’s fantastic to see this, but you need to be more accurate about what’s occurring to establish true capabilities.
@@Jbombjohnson thanks for explaining that more correctly!
Yay! Now I can make my own drone company. Okay, I'm not serious, but it seems this could be done quickly.
I am so not ready for this to be at PhD level... wow, the rate of progress is wow...
Not sure when diving puzzles from a trillion dollar company and 20+ minutes of vaporware talk became popular but I guess this is the new trend.
It just occurred to me, they are spending BILLIONS and BILLIONS of dollars, providing all the information in the world, to train these LLMs. How much did the world spend on training YOU?
1 hour is approximately 0.0114% of a year.
ChatGPT did generation-changing work in less than a percent of the time it took a human. Wow. Just wow.
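The fraction in the comment above checks out; a minimal sketch of the arithmetic (assuming a non-leap year of 8,760 hours):

```python
# Check the claim that 1 hour is roughly 0.0114% of a year.
hours_per_year = 365 * 24          # 8,760 hours in a non-leap year
percent = 100 / hours_per_year     # one hour as a percentage of a year
print(f"{percent:.4f}%")           # ~0.0114%
```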
The way I heard it, the Kessel Run parsecs thing originally was just George failing at technobabble, then later retconning it as Han bullshitting them (possibly to test whether they were gullible enough to fall for his price). In parallel, more knowledgeable fans have built the fanon that it is indeed about distance, and that he is bragging his ship is fast enough to skim much closer to black holes without getting caught, taking a more straight-line route without needing to go the long way around.
Yo, a parsec is the distance from which one Astronomical Unit (AU) subtends an arc of one arcsecond (1/3600 of a degree); of course, an AU is the average distance from the Sun to the Earth (approximately 93 million miles). A parsec is 3.26 light years, about 19.17 trillion miles!
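The figures above can be checked directly from the definition; a quick sketch, assuming 1 AU ≈ 92.956 million miles and 1 light year ≈ 5.879 trillion miles:

```python
import math

# A parsec is the distance at which 1 AU subtends an angle of one arcsecond.
AU_MILES = 92.956e6                      # average Earth-Sun distance, in miles
one_arcsec = math.radians(1 / 3600)      # one arcsecond, in radians
parsec_miles = AU_MILES / math.tan(one_arcsec)

print(f"{parsec_miles:.3e} miles")                   # ~1.917e13, about 19.17 trillion
print(f"{parsec_miles / 5.879e12:.2f} light years")  # ~3.26
```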
One thing is that the researchers may not realize the model has/had access to his code via GitHub and other projects as a base.
True; if it's publicly accessible, it would look. But if it did, it improved the code: it was shorter. Still, it needs more testing.