are we cooked w/ o3?

  • Published: Jan 26, 2025

Comments • 1.9K

  • @ThePrimeTimeagen
    @ThePrimeTimeagen  Месяц назад +314

    I see a lot of people calling cope on prices. Here are the last 20 years of projections for cost per FLOP: it doubles every 2.5 years. This isn't great. Algorithmic progress is a must, but even then it's pretty hard to improve performance 1000x with algos.
    Either way, I do think my premise is correct. Hard skills will be a great advantage over the next decade. Beyond that, who the heck knows
    www.lesswrong.com/posts/c6KFvQcZggQKZzxr9/trends-in-gpu-price-performance
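
    A quick sanity check of the numbers above (a minimal sketch; the 2.5-year doubling time is the figure quoted in the comment, not a confirmed constant):

    ```python
    import math

    # How long would a 1000x improvement in FLOP-per-dollar take if
    # price-performance doubles every 2.5 years (the figure quoted above)?
    doubling_time_years = 2.5
    target_improvement = 1000

    years_needed = math.log2(target_improvement) * doubling_time_years
    print(f"~{years_needed:.0f} years for a {target_improvement}x improvement")  # ~25 years
    ```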

    • @avparadox
      @avparadox Месяц назад +15

      WE LOVE THE APE GUY GUIDING US TOWARDS KNOWLEDGE & REALITY. WE LOVE THE APE GUY. THE NAME IS ❤

    • @lyndonsimpson1056
      @lyndonsimpson1056 Месяц назад +5

      Even after that, I'd prefer to be one of those who have learned the technical skills. I fail to see how they could ever become totally useless. In an AI-driven tech world they're the most important professional edge you can get.

    • @micahgmiranda
      @micahgmiranda Месяц назад +22

      I hear what you're saying, and this could be my imagination, but the tone and approach seemed like you were downplaying your subconscious fear.

    • @MrHuman002
      @MrHuman002 Месяц назад +8

      From the article you linked to: "for models of GPU typically used in ML research, we find a faster rate of improvement (FLOP/s per $ doubles every 2.07 years)." So in roughly ten years you're looking at the cost being 1/16th (x0.0625) of what it is now. And that would just be for running the *current* O3 model... imagine running price-performance optimized coding models 10 years more advanced than what we have now, at x0.0625 the cost of current compute.
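
      Working out that doubling math (a minimal sketch; it only applies the 2.07-year doubling figure from the linked post, nothing else is assumed):

      ```python
      # Relative cost of a fixed amount of compute after `years`, if FLOP/s per
      # dollar doubles every 2.07 years (the Epoch figure quoted above).
      doubling_time_years = 2.07

      def cost_factor(years: float) -> float:
          return 0.5 ** (years / doubling_time_years)

      print(cost_factor(4 * doubling_time_years))  # 0.0625 -> the 1/16 figure, reached after ~8.3 years
      print(cost_factor(10))                       # ~0.035 -> roughly 1/28 after a full 10 years
      ```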

    • @sleepykitten2168
      @sleepykitten2168 Месяц назад +7

      Also misses the fact that o4 will likely be significantly better. Probably not as big of a jump, but there's low hanging fruit left to be picked in the test time compute architecture. That begs the question, what about o4 run on 10% of the compute? Likely it'd have similar performance to o3, but there's a 10x reduction in cost.

  • @mmmmfh
    @mmmmfh Месяц назад +3540

    we achieve AGI 5 times every week

    • @jamesarthurkimbell
      @jamesarthurkimbell Месяц назад

      When we finally do, the AGI will have been trained on so many false claims that it won't believe its own existence

    • @kevin.malone
      @kevin.malone Месяц назад +112

      We achieve AGI ~130 million times a year. We've been at that level since the 1980s, though it's trending downward.
      Places like India and Africa have been pumping them out faster than anywhere in the west.

    • @aguywithaytusername
      @aguywithaytusername Месяц назад +47

      ​@@kevin.malone that's just GI tho not AGI

    • @naveenbattula
      @naveenbattula Месяц назад +56

      ​@@kevin.malone bold of you to assume most humans are intelligent

    • @RamonChiNangWong078
      @RamonChiNangWong078 Месяц назад +7

      And every time, the cost keeps skyrocketing

  • @yumearia3662
    @yumearia3662 Месяц назад +2500

    The day that AI companies fire all their employees would be when AGI has been achieved

    • @Adjarnor
      @Adjarnor Месяц назад +16

      Yeah

    • @Alx-h9g
      @Alx-h9g Месяц назад +40

      They might be the last employees to go. Before an AGI can self improve.

    • @SomeRandomUserChannel
      @SomeRandomUserChannel Месяц назад +67

      @@Alx-h9g Come on. Stop confusing reality with SciFi movies. You might as well believe in magic with this thinking.

    • @Georgggg
      @Georgggg Месяц назад +15

      No, that would only be when AGI robots are allowed to act as legal entities the way employees are.
      An employee can sign a contract; an AI can't.

    • @InternalStryke
      @InternalStryke Месяц назад

      I thought AGI implied self improvement ​@@Alx-h9g

  • @user-lz2oh9zz4y
    @user-lz2oh9zz4y Месяц назад +954

    They are running a nuclear power plant so that AI can spit out 'bananums'

  • @DeusGladiorum
    @DeusGladiorum Месяц назад +453

    Just some napkin math: 684 kg of emissions would mean about 1,800 kWh, so one task consumes as much electricity as a single home does in 2 months
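
    That conversion is easy to reproduce (a minimal sketch; the grid intensity and household figures are assumed US averages, not numbers from the video):

    ```python
    co2_kg = 684                    # claimed CO2-equivalent per high-compute task
    grid_kg_co2_per_kwh = 0.38      # assumed US-average grid carbon intensity
    home_kwh_per_month = 900        # assumed average US household consumption

    kwh = co2_kg / grid_kg_co2_per_kwh      # ~1,800 kWh
    months = kwh / home_kwh_per_month       # ~2 months of household electricity
    print(f"{kwh:.0f} kWh, roughly {months:.1f} months of home use")
    ```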

    • @patricknelson
      @patricknelson Месяц назад +34

      _1.8 megawatts! Great Scott!_

    • @czerwonyniebieski
      @czerwonyniebieski Месяц назад +25

      Sounds like it's very sustainable

    • @jonduke4748
      @jonduke4748 Месяц назад +30

      @@czerwonyniebieski Nuclear power is actually extremely sustainable and produces the lowest emissions per kWh

    • @genericdeveloper3966
      @genericdeveloper3966 Месяц назад +39

      @@jonduke4748 Even so, the question is, is the massive investment in nuclear power worth it for LLM use alone? It will already cost a significant amount to just keep up with grid demands for electric vehicles and replacing power generation for existing demand with cleaner sources.

    • @jonduke4748
      @jonduke4748 Месяц назад +13

      @genericdeveloper3966 it would not be so massive if the regulation was not quite overly as obscene as it is; additionally it wouldn't be so bad if calmer heads and brighter minds had prevailed over the last 50 years and actually invested much more in nuclear infrastructure and technology. It's honestly and literally the only way forward at this point, considering the ever increasing demand for power and the need to reduce price per KWH. Once built nuclear is much cheaper than fossil fuel. It's also the only and I mean ONLY serious solution to climate change aside from mass depopulation.

  • @enkk
    @enkk Месяц назад +817

    I don't know how this will affect jobs, but as a teacher of computer science at university I do know for sure that knowledge is still going to be very valuable. Maybe even more than before. If you are a student never think for one second that what you are learning is going to be useless: the process of learning is the real value.

    • @no-nukez
      @no-nukez Месяц назад +167

      Okay, but is the value of the “process of learning” going to pay my bills and reduce my student debt?

    • @krumbergify
      @krumbergify Месяц назад +51

      The more you learn, the easier it becomes to learn. You start to see patterns and make analogies. ”This new thing is kinda like good old A combined with our friend B plus a little new twist - alright!”

    • @Repeatedwaif
      @Repeatedwaif Месяц назад +19

      AI will inevitably cause bugs that someone will need to fix. Thank God I've been learning assembly; security research will never die

    • @krumbergify
      @krumbergify Месяц назад +25

      @@no-nukez Exactly! You get hired to solve problems that didn’t even exist when you graduated. That takes a mindset of constant learning.

    • @enkk
      @enkk Месяц назад +21

      @@no-nukez that's a difficult one. Full disclosure, I'm from Italy so the whole job scenario may be very different. I'm hearing about layoffs in the USA and a big frenzy. Probably CEOs wanna jump on the whole "cheaper labour" thing... What I'm thinking is that being competitive is still going to matter, and in order to be competitive *hopefully* knowledge is going to matter.

  • @timbauer399
    @timbauer399 Месяц назад +135

    Love this video. Worrying about AI literally keeps me up at night. I can't thank you enough for making this. It doesn't just give me comfort, it gives me a direction. Thank you. 🙂

    • @mackenziebroadbent682
      @mackenziebroadbent682 Месяц назад +7

      Worrying about AI is also keeping me up at night (I literally woke up at 3AM this morning in a panic attack thinking “the robots are coming!”) I’ve found that writing my thoughts down has been helping with the anxiety a lot. There is a lot to be concerned about for the future, but there’s also huge opportunities for both individuals and society. I think there will be growing pains and not everyone will be better off in the short term, but on a longer time horizon AI will probably be the best thing to ever happen to humanity.

    • @T1000Rex
      @T1000Rex Месяц назад +1

      Glad you were able to have some peace

    • @TamilvananB-z4l
      @TamilvananB-z4l Месяц назад +4

      W bro. AI will never replace humans.

    • @WilsonSilva90
      @WilsonSilva90 Месяц назад

      @@TamilvananB-z4l It surely will. The question is "when?". I hoped we had 50 years, but we may just have 5 to 10.

    • @TamilvananB-z4l
      @TamilvananB-z4l 29 дней назад

      @@WilsonSilva90 Well, they need at least a few technical experts when there are notorious bugs that AI might take a while to figure out. I see it increasing the productivity of developers, not replacing them in any way

  • @TomNook.
    @TomNook. Месяц назад +158

    You know it's serious when it's just the green screen

  • @centerfield6339
    @centerfield6339 Месяц назад +524

    Turns out generating images is much easier than writing code above a junior level.

    • @asdqwe4427
      @asdqwe4427 Месяц назад +49

      Yes, I mean the models literally just learn what information they think we want to hear. Not what is "correct"

    • @benjaminblack91
      @benjaminblack91 Месяц назад +114

      If you talk to professional artists, the image generation tooling is still completely useless in professional image creation workflows. More time is spent fixing errors than it would take to generate the image manually.

    • @justsomeredspy
      @justsomeredspy Месяц назад

      @@benjaminblack91 Right. Image generation is great if you want to generate a picture of "Keanu Reeves but he's 650 lbs." for a YouTube thumbnail, but it's pretty much useless if you want to create something more specific. It's still impressive technology, but nowhere near flawless like every AI-bro would have you believe.

    • @Ezcape0
      @Ezcape0 Месяц назад +3

      @@asdqwe4427 humans don't use correct either

    • @J3R3MI6
      @J3R3MI6 Месяц назад +7

      @@benjaminblack91 you’re precious… it will be almost perfect by 2025

  • @ニガービーナー
    @ニガービーナー Месяц назад +99

    > WE HAVE AGI
    okay lets run it!
    > CANT TOO EXPENSIVE
    ....

  • @unknownChungus
    @unknownChungus Месяц назад +408

    In a recent experiment of humans vs ants, ants did better in some scenarios. Does that mean ants achieved AGI?

    • @NihongoWakannai
      @NihongoWakannai Месяц назад +35

      Ants are pretty smart as a hivemind. Nature is the best creator of intelligence.

    • @astral6749
      @astral6749 Месяц назад

      That experiment prohibited humans from communicating with each other, making the experiment inaccurate.

    • @Stepphenn
      @Stepphenn Месяц назад +10

      @@NihongoWakannai best? It takes millions of years. From a computational standpoint, it's not that great. Humans are outpacing it.

    • @danieldorn9989
      @danieldorn9989 Месяц назад +55

      Ant Generated Intelligence

    • @captain_crunk
      @captain_crunk Месяц назад

      No, it means that humans aren't that smart and therefore we should be ready to hand control over to the AI overlords sooner than later.

  • @WecksyRex
    @WecksyRex Месяц назад +167

    at 4:40 you mention that o3 can solve a toy task that it's never seen before, but the ARC-AGI benchmark includes a training set to familiarize models with the format of the problems. I don't think any specific problem types are repeated, but they aren't going in completely cold to the format either (o3 was specifically fine tuned on that training set for this benchmark)

    • @MegaSupermario666
      @MegaSupermario666 Месяц назад +62

      That's kinda the whole problem. Training the model on a format and structure of the test defeats the entire point. It means the results likely aren't generalizable and may be strongly tied to the narrow ARC testing environment. If that's what's happening with o3, then if you take it out of that environment and into a real-world situation, it wouldn't do any better than probably o1 on the same task.
      It's something that has been observed in humans too. When people are trained to perform better on cognitive tests, their scores improve dramatically, even on questions they haven't seen before. But when you give them a different test with a different structure, then they go on to do about the same as pre-training. Training them on the test didn't make them more intelligent in general. It just made them better at taking that specific test. The 'G' in AGI is really what matters here.

    • @playdoughfunrs
      @playdoughfunrs Месяц назад +12

      even for the codeforces score, it is very likely that they hired people to solve common problems and trained on this data

    • @nousquest
      @nousquest Месяц назад +5

      It defeats the point because some sort of self existent AGI without a training set is impossible

    • @theb190experience9
      @theb190experience9 Месяц назад +14

      I would suggest you listen to the guy from ARC describe the results. You cannot ‘train’ a model for these tests. You can make them aware of the format, and the same applies to humans, but each test is novel and the model has not seen it. Same as a human. That the model can beat the human on novel tests is what impresses with this iteration.

    • @WecksyRex
      @WecksyRex Месяц назад +7

      @theb190experience9 that's what my comment said if you reread it carefully, though a human would not need a training set nearly as large as the one offered to understand the goal - at best, one or two examples

  • @Imperial_Squid
    @Imperial_Squid 29 дней назад +40

    9:01 THIS. Speaking as someone who literally did a PhD in machine learning, all this "we achieved AGI" stuff sounds like _pure marketing bs_ to me. People have been predicting that the great golden tech utopia future is only 5 years away for like 60+ years; it's all just about building hype for tech and investment.

    • @Imperial_Squid
      @Imperial_Squid 29 дней назад +9

      Sam Altman is a marketer and grifter, no doubt he's very clever, but he's not AI Jesus or anything.

    • @maxave7448
      @maxave7448 22 дня назад +2

      I also love how every AI bro loves to use that graph (1:28) as some sort of proof that AGI has been achieved when it's literally showing that even with exponentially growing cost, test accuracy barely increases. According to the graph, the score difference between o3 low and o3 high was just 12%, meanwhile the cost difference per task was literally thousands of dollars.

    • @couperino
      @couperino 21 день назад

      @Imperial_Squid what is your prediction for AGI? When do you think we will have it?

    • @vasiapetrovi4773
      @vasiapetrovi4773 13 дней назад

      so you are just going to ignore pure and steady growth of our technological capabilities?

    • @Imperial_Squid
      @Imperial_Squid 13 дней назад +1

      @vasiapetrovi4773 when did I say something like that? I think transformers were a remarkable achievement, I think LLMs were a remarkable achievement, I think ChatGPT was a remarkable achievement, I think o4 was very impressive. I'm fully capable of valuing all these things as being the innovations they are _while still_ being skeptical about claims of AGI or a complete technological revolution. These opinions aren't contradictory _in the slightest..._

  • @SelectHawk
    @SelectHawk Месяц назад +349

    The goalposts have already been moved for using the term Artificial Intelligence, how long before every machine learning tool is called AGI and they have to come up with a THIRD term to mean the same thing the other two used to mean.

    • @SlyNine
      @SlyNine Месяц назад +17

      It's never been a well defined term anyways.

    • @asdfqwerty14587
      @asdfqwerty14587 Месяц назад +57

      AI has never really referred to actual intelligence to be honest. I mean, people were calling what the computer enemy does in video games "AI" for decades even when they had incredibly simple algorithms and nobody ever considered it to be a confusing term to use back then.

    • @tones1161
      @tones1161 Месяц назад +7

      we just started using the term AI for any simple algorithm out there.

    • @elcapitan6126
      @elcapitan6126 Месяц назад +6

      it's a catchall term for stuff we can't solve and still find mysterious. AI is the god of the gaps.

    • @rando29287
      @rando29287 Месяц назад +1

      this is a phenomenon called the maxim of extravagance

  • @AdamJorgensen
    @AdamJorgensen Месяц назад +642

    I have zero trust for AGI benchmarks. AI Kool aid drinkers set their own benchmarks, super trustworthy 😂

    • @DeepTitanic
      @DeepTitanic Месяц назад +23

      Counter point: I can prompt a snake game

    • @jerrymartin7019
      @jerrymartin7019 Месяц назад +107

      ​​@@DeepTitanic
      Indeed, the future is certainly bleak for SSEs (Senior Snake Engineers)

    • @elujinpk
      @elujinpk Месяц назад +9

      @@DeepTitanic My 6 year old can write snake. You are not good at life.

    • @slurmworm666
      @slurmworm666 Месяц назад +41

      ​​@@elujinpk You're old enough to have a 6 year old kid and can't recognize an obvious joke

    • @Karurosagu
      @Karurosagu Месяц назад

      @@DeepTitanic All that compute power for Snake? Make GTA6 and then we can talk about it

  • @SpookySkeleton738
    @SpookySkeleton738 Месяц назад +312

    they just have to make it 8000 times more efficient and then make it able to complete tasks it's not been pre-trained on, that's all

    • @hungrymusicwolf
      @hungrymusicwolf Месяц назад +49

      That second one is the big one. Until it can solve problems it has absolutely no training or knowledge of, AI will just be an assistant. Which, while useful, is not going to take your job. Though how far we are from AI being able to learn independently without being trained on something is anyone's guess; could be tomorrow, could be 100 years from now.

    • @SlyNine
      @SlyNine Месяц назад

      @@hungrymusicwolf humans can't do that either. I'll ask you to do something you have no training or knowledge on and watch you just magically figure it out..

    • @robotron1236
      @robotron1236 Месяц назад

      @@hungrymusicwolf I’m guessing closer to that 100 year mark. Idk, it just seems like it’s plateaued to me. I think they just needed something to fool investors. Open AI is absolutely HEMORRHAGING money and they’re not really putting anything truly valuable forward.

    • @monkemode8128
      @monkemode8128 Месяц назад

      ​It's designed to just be familiar with the format of the problem. Think of it as training a model to understand JSON so that you can pass it a problem in JSON format. For example, if you passed math problems formatted in JSON, pretraining the model on JSON doesn't directly help with understanding the steps to solve the math problem. That is what they say, at least, my explanation isn't an endorsement or me making the claim it actually works that way. I'm just the messenger of this info. ​@@hungrymusicwolf

    • @donventura2116
      @donventura2116 Месяц назад +7

      I haven't read much about o3 but I thought the big breakthrough was that it was solving problems without being pre-trained.

  • @new-bp6ix
    @new-bp6ix Месяц назад +90

    Honestly, AI has made me appreciate humans more than ever before, and despite all the advancements in technology and all the energy consumed, humans are still at a mysterious level of ability compared to AI

    • @20Twenty-3
      @20Twenty-3 Месяц назад +12

      Well, we've had millions of years of headstart to get to where we are now, AI has come so far in just decades. Machines are evolving much faster than we are, and may surpass us one day.

    • @jeffsteyn7174
      @jeffsteyn7174 Месяц назад +1

      Oh god we consumed energy. 😂

    • @ozgurpeynirci4586
      @ozgurpeynirci4586 Месяц назад

      @@20Twenty-3 machines run nuclear plants to get to these results; a human runs on about 100 watts.

    • @Bloooo95
      @Bloooo95 Месяц назад +2

      @@20Twenty-3 Lol

    • @genericdeveloper3966
      @genericdeveloper3966 Месяц назад +4

      @@20Twenty-3 None of our engineering at a nano-scale holds a candle to life itself. The human body is an unfathomably well designed and interconnected nano-micro-macro machine working in harmony at all scales for a common purpose.

  • @wawaldekidsfun4850
    @wawaldekidsfun4850 Месяц назад +4

    Really appreciated this sobering take on O3/GPT-4. The title had me worried this would be another AI hype video, but you nailed the practical limitations that most AI evangelists conveniently ignore - especially those astronomical costs and resource requirements. The fact that even a "tuned" version only hits those percentages while costing $20 per task (or 172x more for high compute!) really puts things in perspective. Your point about the widening gap between deep technical understanding versus surface-level AI dependence is spot-on. It's refreshing to hear someone cut through the investor-driven AGI hype and remind us that solid technical skills and understanding fundamentals are more valuable than ever. Keep these reality checks coming! 👨‍💻

  • @ghostinplainsight4803
    @ghostinplainsight4803 Месяц назад +65

    I started refactoring the entire front end with ramda exclusively a year ago. The code is beautiful, everyone can read and understand what it does at a high level, but no one can change it. They couldn't fire me if they wanted to. Checkmate, boardroom suits.

    • @djkramnik1
      @djkramnik1 Месяц назад

      I hate you

    • @Geohhh
      @Geohhh Месяц назад

      Why can no one change it?

    • @peanutcelery
      @peanutcelery Месяц назад

      Uncle Bob was wrong. Couple everything and depend on everything in your code lol

    • @dhruvmehta4181
      @dhruvmehta4181 29 дней назад

      As far as I know Ramda, if it's the JS library, it is slow; my team removed it and saw performance gains

  • @ArturdeSousaRocha
    @ArturdeSousaRocha Месяц назад +90

    "I see rewrites, and I see rewrites within rewrites." (of freshly hallucinated legacy code)

    • @awesomesauce804
      @awesomesauce804 Месяц назад +6

      Damn. This hits deep. I was half asleep last night and let it go wild. Looked at the code with fresh eyes this morning and I have classes in separate files and nested classes in the main file. It decided it didn't like my project structure so it made a subfolder and created new files without deleting the old files.

    • @hbobenicio
      @hbobenicio Месяц назад

      Average Joe devs will spend too much time prompting something that produces code that "just works, don't touch it".
      They will be Prompters instead of Actual Programmers. When the shit gets real, Prompters will call Actual Programmers to clean up their mess

    • @danieldorn9989
      @danieldorn9989 Месяц назад

      Wake me up when you see the blonde wearing a red dress

    • @tttm99
      @tttm99 Месяц назад

      ​@@awesomesauce804 I'd love to pretend otherwise but so many of my personal projects have been managed like this for over 30 years 😂. When you do anything like this professionally though you refer to different "generations" of solution and call the process "prototyping". 👍😎

    • @blackoutgo2597
      @blackoutgo2597 8 дней назад

      Like in dune, "feints within feints within feints"

  • @haoyeechan3737
    @haoyeechan3737 Месяц назад +38

    The future of AGI is already depicted by the minions in Despicable Me. They kind of get the job done but you don't understand them and you're always wondering what could go wrong.

  • @shivammittal6066
    @shivammittal6066 Месяц назад +220

    my noodles are more cooked, which i'm eating while watching those AGI tweets

    • @kossboss
      @kossboss Месяц назад

      @prime do you use tablet or Wacom? Or do you draw with vim motions too?

  • @namesas
    @namesas Месяц назад +284

    Internal AGI achieved trust me bro

    • @NexusGamingRadical
      @NexusGamingRadical Месяц назад +6

      Trust him bro

    • @SlyNine
      @SlyNine Месяц назад +15

      I'm not convinced humans achieved gi

    • @MattRodriguez-h7j
      @MattRodriguez-h7j Месяц назад +1

      Wait 8 months. Big things are coming out August 2025

    • @DM-pg4iv
      @DM-pg4iv Месяц назад +7

      If the internet said it then it must be true

    • @playdoughfunrs
      @playdoughfunrs Месяц назад +11

      exactly bro, thats why all of our top board members who were with the company from the very beginning left to start their own thing. because they humbly did not want to accept any credit for helping achieve AGI. it is true bro AGI is here

  • @boop
    @boop Месяц назад +16

    It feels similar to those really expensive, high-precision robotic arms: yes, technically one could do the job of factory workers, but it's way too expensive and that's not even what it's made for; it's made either to be a demonstration piece or for some highly specific niche task.

    • @BHBalast
      @BHBalast Месяц назад +2

      It is made as a proof that test time compute works and it seems like it did the job.

  • @MichaelDude12345
    @MichaelDude12345 Месяц назад +3

    This is the kind of stuff I am here for. We got the pros, the cons, some valid insight from Prime's perspective, and he didn't spend time speculating on things that are purely hype-based. Great video!

  • @DanilEmelianov
    @DanilEmelianov Месяц назад +86

    Quote from ARC-AGI that people somehow missed.
    “Furthermore, early data points suggest that the upcoming ARC-AGI-2 benchmark will still pose a significant challenge to o3, potentially reducing its score to under 30% even at high compute (while a smart human would still be able to score over 95% with no training)”
    They also stated their POV: AGI can be considered achieved only when there is no way to create a dataset of simple tasks that the AI fails to carry out while humans do them easily.
    So all this "AGI achieved" talk is the biggest bullshit

    • @DanHartwigMusic
      @DanHartwigMusic Месяц назад +5

      This is actually the best comment

    • @ivanjelenic5627
      @ivanjelenic5627 Месяц назад +4

      And the meaning of "simple tasks" will keep shifting.

    • @where_is_alice
      @where_is_alice Месяц назад

      @@ivanjelenic5627 Any source on that? The tasks for ARC-AGI seems very simple.

    • @yisraelmeirsobel907
      @yisraelmeirsobel907 Месяц назад +1

      @@ivanjelenic5627 If 95% of non-insane adult humans can do it, then it's simple.

    • @Alx-h9g
      @Alx-h9g Месяц назад +1

      No but it's using a different training paradigm, which itself can be iterated on, which may get us closer.
      You luddites change the goalposts all the time, and AI keeps breaking through them within a year.

  • @CarpenterBrother
    @CarpenterBrother Месяц назад +19

    They trained on 75% of the training dataset. It's literally called the "training dataset". If their success were purely attributable to that, as some people make it out to be, then ARC should have already fallen in the past few years to the dozens of companies who did train on the training dataset and still couldn't get above 20% accuracy.
    Yes, you should continue to learn hard skills regardless of AGI because there is literally no downside to having more knowledge, but you should also be able to look at a trend line and realize where it is headed.

    • @CarpenterBrother
      @CarpenterBrother Месяц назад +3

      Also I recommend people to look up the ARC tasks that o3 failed at. You will see that most of the fail cases are very reasonable attempts, very very reasonable attempts.

    • @pedroluiz8019
      @pedroluiz8019 29 дней назад +1

      Ignoring the trend might be okay if you are a netflix engineer, but for most folks working on much simpler problems I think we all should be aware that the bar is rising very quickly

  • @arseniy_viktorovich
    @arseniy_viktorovich Месяц назад +4

    Your "AI won't replace you" videos are like some kind of a psychotherapy for me now. Guilty pleasure.

  • @luckylanno
    @luckylanno Месяц назад +11

    I wonder if AGI will turn out like fusion. Fusion is possible today, it just costs more energy to keep a reaction going than you get out of it. It might take 100 years or more of innovation for it to be useful. Maybe AGI is possible today, or soon, but how long until it costs less than the annual salaries of a full staff combined per problem solved?

    • @Chandlerdied
      @Chandlerdied 28 дней назад

      They’re shooting past AGI for ASI, if it’s possible, you only need to build it once.

  • @Fe22234
    @Fe22234 Месяц назад +31

    Underrated skill: critical thinking. And not freaking out over this.

  • @johndewey7243
    @johndewey7243 Месяц назад +6

    We are quickly heading towards a John Henry (he was a steel driving man) moment. We are competing on cost with a machine. What a time to be alive!

  • @nirmithra1
    @nirmithra1 Месяц назад +72

    It'll make 2x engineers into 8x engineers, 10x engineers into 80x engineers and 100x engineers into 102x engineers

    • @yuriy5376
      @yuriy5376 Месяц назад +29

      And X engineers are still just former Twitter engineers

    • @Leonhart_93
      @Leonhart_93 Месяц назад

      It might make the 100x ones into 99x ones, give it hard enough stuff and it starts hallucinating confidently.

    • @racoon-thespy7062
      @racoon-thespy7062 Месяц назад +5

      1o3x engineers

    • @coomlord5360
      @coomlord5360 Месяц назад +7

      Lowkey im hoping we get to a point where making a game by yourself becomes that way so that small passionate teams can succeed

    • @warrenarnoldmusic
      @warrenarnoldmusic Месяц назад

      Funny thing is that the world will funnily find a way to live with this

  • @DigitalOsmosis
    @DigitalOsmosis Месяц назад +25

    We don't need to worry about o3, we need to worry about o4-mini.
    Every time there has been an increase in performance, it was followed by a budget friendly advance that doesn't match the bleeding edge, but matches the performance of the previous generation.

    • @erkinalp
      @erkinalp 29 дней назад +4

      o3-mini performs better than o1 full

    • @reboundmultimedia
      @reboundmultimedia 25 дней назад

      @@erkinalp exactly, that's why he's saying o4-mini will make this kind of intelligence widely accessible

  • @ISKLEMMI
    @ISKLEMMI Месяц назад +51

    7:20 - It's so much worse than that. That's five full *tanks* of gasoline - not "just" five gallons.

    • @redex68
      @redex68 Месяц назад +7

      I absolutely don't believe this. Looking up how much oil you'd need to produce 1 kWh of electricity, it's about 0.0672 gallons. Assuming an average car tank is 12 gallons, five tanks means you would need to spend almost 900 kWh (5 × 12 gallons / 0.0672 gallons/kWh) of electricity. It said that for the low-efficiency variant it took 13 minutes to run, which means that you'd have to be using 4166 kW of electricity for the entire 13 minutes. An H100 uses 700 W of electricity; that would mean that for 1 task you would be using 6000 H100s running at full tilt the entire time, which is completely absurd.
      Edit: I rewatched that part and, to be more precise, it says 684 kg of CO2-equivalent emissions. Looking at the US average emissions per kWh of electricity (0.37 kg of CO2 per kWh), that would mean you would actually have to spend 1800 kWh of electricity, doubling the math. Either way you look at it, it's completely absurd if you're purely looking at the cost of answering questions and not the training.
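
      The same reverse calculation written out (a minimal sketch that just reruns the commenter's arithmetic with the CO2 figure; the grid intensity and H100 draw are the assumptions stated above, not confirmed numbers):

      ```python
      co2_kg = 684                 # claimed CO2-equivalent per high-compute task
      grid_kg_per_kwh = 0.37       # assumed US-average grid intensity
      task_minutes = 13            # quoted runtime of the low-efficiency variant
      h100_watts = 700             # nominal draw of a single H100

      kwh = co2_kg / grid_kg_per_kwh                  # ~1,850 kWh
      avg_power_kw = kwh / (task_minutes / 60)        # ~8,500 kW sustained for 13 minutes
      h100_count = avg_power_kw * 1000 / h100_watts   # ~12,000 GPUs at full tilt
      print(f"{kwh:.0f} kWh -> {avg_power_kw:.0f} kW -> ~{h100_count:.0f} H100s")
      ```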

    • @vaolin1703
      @vaolin1703 Месяц назад +1

      @@redex68 yeah it's complete bs

  • @Nodsaibot
    @Nodsaibot Месяц назад +250

    it generates wrong code, FASTER

    • @umberto488
      @umberto488 Месяц назад +6

      Lol...no buddy

    • @hypnaudiostream3574
      @hypnaudiostream3574 Месяц назад +59

      lol … yeah buddy

    • @umapessoa6051
      @umapessoa6051 Месяц назад +18

      Honestly nowadays they are generating code better than 90% of the "Senior Developers", but programmers have a big ego and will always try to insult these tools.

    • @SomeRandomUserChannel
      @SomeRandomUserChannel Месяц назад +31

      @@umapessoa6051 It is not the code that matters but the actual logic and experience of programmers.

    • @TheHorse_yes
      @TheHorse_yes Месяц назад

      You mean STRONK code, right?

  • @diegoignacioalvarezespinoz3965
    @diegoignacioalvarezespinoz3965 29 дней назад +23

    theprimeagen 2024: "AI is dogshit"
    theprimeagen 2025: "AI is expensive"

    • @maxave7448
      @maxave7448 22 дня назад

      Correction: AI is expensive dogshit

    • @petrkinkal1509
      @petrkinkal1509 21 день назад +1

      I mean that probably does reflect the state of the AI at the time he was talking about it.
      theprimeagen 2026: "AI is competitive"
      theprimeagen 2027: "AI is super-intelligent"
      theprimeagen 2028: "AI is nice and I totally support our new AI overlords"

    • @vasiapetrovi4773
      @vasiapetrovi4773 13 дней назад +1

      @@petrkinkal1509 2028:*noises of tumbleweed on nuclear wasteland*

  • @rmidifferent8906
    @rmidifferent8906 Месяц назад +36

    The fact that there is no data point for o3 untuned is the biggest stink to me - especially since all of the other points are not showing that they are tuned. Tuning the model to such a task is like giving someone all of the questions and answers before having them take an IQ test - I don't think that the test results are all that impressive after that

    • @almightysapling
      @almightysapling Месяц назад +19

      Also the concept of "tuning" to a test that is supposed to measure the ability to generalize is itself entirely farcical. I can't believe anyone takes this result seriously.

    • @watsonwrote
      @watsonwrote Месяц назад

      ​@@almightysapling Devil's in the details. I suspect a lot of people are not looking at the details, just the hype messages.

    • @joeysung311
      @joeysung311 Месяц назад +5

      @@almightysapling Is that literally just so the graph "looks better" to people who aren't gonna think about what the text actually says?? Knowing what I'm looking at, the graph is just so silly lol

    • @JavedAlam-ce4mu
      @JavedAlam-ce4mu Месяц назад

      We have fine-tuned AI that can beat any human at chess or Go, why is anyone surprised that tuning an AI at a specific kind of task allows it to do well?

    • @rmidifferent8906
      @rmidifferent8906 Месяц назад +7

      @@joeysung311 To me it all points to one thing: AI companies are struggling.
      An "AGI test" that measures whether the AI is an AGI, scored in percent, yet 100% does not necessarily mean AGI?
      A comparison between a tuned new AI and old untuned ones to manufacture the sense of huge progress?
      People see AI improving from 30% on the AGI test to 70% and think "oh, we just need one more small upgrade to 100% and AGI will already be there", when the reality is not that simple.
      OpenAI achieved precisely what they wanted to: they made so many people scream about the incoming AGI because of the layers of misleading stats

  • @richcole157
    @richcole157 Месяц назад +6

    You gotta experience it to know if it's any good. All these things are potentially accelerators, e.g. write boilerplate for you, find bugs, write tests for you. At the moment the LLMs make so many mistakes that they help little for production code, but they're OK for experiments and one-off automation tasks where the stakes are not that high. For production code you have to do so much fixing and adjusting and rewriting that it is hardly worth using them at all.

    • @RM-xl1ed
      @RM-xl1ed Месяц назад +4

      This. Literally this 100000 times over. Nobody seems to talk about this for some reason. It's gonna be a long time before AI is writing code that is good enough to go directly to production

  • @Edoardo396channel
    @Edoardo396channel Месяц назад +78

    If anything, the deep learning project in university taught me not to trust any benchmark for AI

    • @Ezcape0
      @Ezcape0 Месяц назад +2

      Yeah only the arena

    • @petersansgaming8783
      @petersansgaming8783 Месяц назад +9

      Exactly. We had public leaderboards for our assignments and what ended up happening is that most students were optimizing for the private test sets.

    • @AlanMitchellAustralia
      @AlanMitchellAustralia Месяц назад +12

      Goodhart's Law - “When a measure becomes a target, it ceases to be a good measure"

    • @yisraelmeirsobel907
      @yisraelmeirsobel907 Месяц назад +3

      AI developer's seem to take Goodhart's law of “When a measure becomes a target, it ceases to be a good measure” as a challenge, instead as a caution. When we see a benchmark we always think we have found a way to measure AGI, only to find the model creators pass them by rules lawyering!

  • @myyoutubeprofile-c3u
    @myyoutubeprofile-c3u Месяц назад +36

    They are trying to draw blood from a stone at this point.

  • @VictorHugoGermano
    @VictorHugoGermano 29 дней назад +1

    Man, I really like your channel! Old dog here - you are a real breath of fresh air on developer YouTube! Down to earth, skeptic and kinda all over the place geek from the 90s!

  • @ThalisUmobi
    @ThalisUmobi Месяц назад +26

    Here's a hot take: the reason the general public can have access to these AI models is that they're dumb and "useless". The moment they become intelligent and useful, only big corporations and governments will have access. So, as long as you can purchase a subscription to it, it's total garbage.

    • @PaulanerStudios
      @PaulanerStudios Месяц назад +2

      The AIs will remain fundamentally stupid and we will run out of things to prove that they are in fact stupid. Douglas Adams "I'll have to think about that [6 million years later] I don't think you're going to like it" style.

    • @NihongoWakannai
      @NihongoWakannai Месяц назад +1

      It's not garbage, chatGPT is useful just not for replacing entire jobs.

    • @ThalisUmobi
      @ThalisUmobi Месяц назад +3

      @@NihongoWakannai Pardon the strong take, but my point is, GPT-like AIs haven't brought any advancements/improvements to the field of programming. All I see is shittier code coming mainly from JR DEVS, and what pisses me off the most, they are clueless about how their code came to be. I want to retire some day. How TF would I do that if new generations of programmers only know how to copy and paste?

    • @ThalisUmobi
      @ThalisUmobi Месяц назад +2

      @@PaulanerStudios A good indicator of A.I's becoming more intelligent, is when we stop talking about using it for something so trivial and mundane like programming.

    • @NihongoWakannai
      @NihongoWakannai Месяц назад +1

      @@ThalisUmobi The usefulness of ChatGPT is in being able to ask it questions that you don't have the proper keywords to google effectively. I don't use ChatGPT for writing code unless it's like AHK or batch scripts.

  • @donjaime_ett
    @donjaime_ett Месяц назад +20

    We still need skilled people to understand and deploy the AI output. AIs can’t be held accountable for bad business and technical decisions. Prime’s take at the end is so so good. Totally spot on. The people that understand how things work will be the best equipped to wield AI. And I think approaches like Cursor are going to be the winning paradigm - using AI as an integrated tool and not an outright replacement - but again best wielded by skilled engineers that understand how things work already and can use these tools to automate busy work, craft the best prompts, and verify and deploy outputs the fastest.

    • @RomeTWguy
      @RomeTWguy Месяц назад +4

      The issue is that for any non-trivial task it is just easier to write out the code than to create a prompt that is precise enough.

    • @NihongoWakannai
      @NihongoWakannai Месяц назад

      So you're saying lower rung jobs will be cut even more and to even get a single job in CS you will need at least 10 years experience?
      Wow so optimistic.

    • @donjaime_ett
      @donjaime_ett Месяц назад +2

      @@NihongoWakannai the main point is that learning computer science and software engineering and developing your hard skills is not a pointless endeavor. In some senses, it might become even more important.

    • @BHBalast
      @BHBalast Месяц назад +2

      AI can be liable for its actions; someone just needs to create AI insurance. It sounds insane, but we do live in insane times after all, it might work.
      Also, insurance scams on AI agents pulled off by rogue AI agents in a few years? 😂 Such Cyberpunk 2077 vibes.

    • @archardor3392
      @archardor3392 Месяц назад

      @@BHBalast Nobody in their right mind is creating AI insurance. The risk is simply too high.

  • @ayron419
    @ayron419 Месяц назад +13

    it financially benefits these companies to create such hype about their product.

  • @young9534
    @young9534 Месяц назад +32

    Man those YouTube videos with "AGI Achieved!!" in the titles are so cringey. Unfortunately that level of sensationalism works on YouTube if you want views

    • @blubblurb
      @blubblurb Месяц назад +6

      I started hiding those channels. They fooled me like 3 times, that was enough.

    • @2290961
      @2290961 29 дней назад

      True!

  • @samuelp7847
    @samuelp7847 22 дня назад +3

    Honestly the early models really help me get started when I code, that alone helps speed me up. I don’t expect or want it to do everything. Just write me some boilerplate or UI with decent accuracy and I’ll happily pay $20/month

  • @anistissaoui
    @anistissaoui Месяц назад +5

    It would be interesting to have o3 vs a team of software engineers. The input can be incremental requirements for a web app for example.

  • @BowBeforeTheAlgorithm
    @BowBeforeTheAlgorithm Месяц назад +17

    We might be experiencing peak AI hype. The findings and the state of the art may be advancing, but the energy differentials are multiple orders of magnitude in the wrong direction.

    • @archardor3392
      @archardor3392 Месяц назад +1

      But this is the sentiment every time OpenAI releases a model. We are always experiencing the peak.

    • @BowBeforeTheAlgorithm
      @BowBeforeTheAlgorithm Месяц назад +3

      @@archardor3392 In the world of finance and money, this might be called the irrational exuberance phase before a bust. The valuations don't make sense given the profit and losses, we're left wondering if "they" know something we don't know. The AI market must be differentiated from the AI technology. There's clearly a market for some forms of AI technology, but the market doesn't properly reflect the state of the technology.

  • @voidreact
    @voidreact 28 дней назад +2

    This was a great take, and a refreshing one. Even YouTubers and influencers that I trusted went the clickbait and scare route.

  • @layer4down
    @layer4down Месяц назад +7

    The rate of technology change is no excuse to not learn coding. Even if you don’t have a job in coding like 10-20 years from now, you won’t get a job in playing word games but how much time do we invest in that as well? Coding is an excellent exercise for the brain in addition to teaching us new ways to think and reason about the world around us.

  • @__kvik
    @__kvik Месяц назад +100

    Just the other day I had a conversation with my boss who said he's willing to "invest" in "AI": "if it can help us with even 50% of the tasks". I had to then explain that I've been using this thing myself for two years already and that it's closer to 10% helpful when it's actually helpful, and entirely detrimental when it's wrong, which is most of the time.

    • @danieldorn9989
      @danieldorn9989 Месяц назад +14

      Your boss sounds like boss from Dilbert

    • @erik....
      @erik.... Месяц назад +9

      It depends a lot on what you do. Some coding tasks are just insane, like when I ported SQL code and models from C# to Python in a few minutes. Probably saved a day there. Or the other day I needed a simple function that generates an SVG from a list of points, with a few options, and it just did it in a few seconds, ready to paste into the code.
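
      The SVG helper described above is the kind of task these models tend to one-shot; a minimal sketch of what such a function might look like (the polyline approach and parameters are assumptions, not the commenter's actual code):

      ```python
      def points_to_svg(points, width=200, height=200, stroke="black", stroke_width=2):
          """Render a list of (x, y) points as a single SVG polyline."""
          pts = " ".join(f"{x},{y}" for x, y in points)
          return (
              f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
              f'<polyline points="{pts}" fill="none" stroke="{stroke}" stroke-width="{stroke_width}"/>'
              "</svg>"
          )

      print(points_to_svg([(10, 10), (50, 80), (120, 40), (190, 150)]))
      ```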

    • @foreignaaa
      @foreignaaa Месяц назад +3

      My boss has unironically mentioned the idea of using cursor and I’m like “well… I won’t say nothing I’ll just laugh when it becomes a catastrophe”. I’ll be ready to clean up the mess though. 🤷‍♂️

    • @neithanm
      @neithanm Месяц назад +4

      I can't believe you just ignored >at what task< in your argument. That's the whole point for AI and brains alike. How good are you at doing what. For it to be useful to you, you use it on the things you know it's good at and not otherwise.

    • @mattymerr701
      @mattymerr701 Месяц назад +3

      ​@@neithanm if you have a tool at hand that seems easy and good, you will begin to use it for everything and quickly find out why you can't.

  • @edevPedro
    @edevPedro Месяц назад +9

    after any ai update, tech influencers need to convince devs to not kill themselves

    • @GTASANANDREASJOURNALS
      @GTASANANDREASJOURNALS Месяц назад +1

      Most people here are privileged af, "AI Can't do my job" is such a privileged answer fr, I want AI to take my job

    • @DrDeaddddd
      @DrDeaddddd 29 дней назад

      @@GTASANANDREASJOURNALS You wanna be homeless?

    • @timothyjjcrow
      @timothyjjcrow 29 дней назад +2

      @@DrDeaddddd i want everyone to be homeless then we figure out how to communism with the help of ai

  • @MateoRattagan
    @MateoRattagan 28 дней назад +1

    lol. what a fresh, great video. thanks. non tech founder here looking to learn to code. been funding a team of juniors and sketchy seniors for 3 years til I ran out of money and patience. now using cursor, bolt, lovable, etc. all of those just can't get to the bone as I need. that's why I know the only way out is to finally learn to code. thank you for your video. subscribed. will be looking forward to new ones.

  • @matthew1791
    @matthew1791 Месяц назад +5

    The ARC public data set has been available since 2019. This means every frontier model, including GPT-4 and o1, will have been trained on that data. And this is understood by the ARC foundation; they made it public after all. Also, look at the FrontierMath benchmark by Epoch AI. Those are novel questions that are tough for Fields Medalists, and o3 scored 25%; the previous SOTA was 2%!

  • @alanfender123
    @alanfender123 Месяц назад +5

    You know that Primeagen has accepted AI as a thing when he calls his master's degree one in AI rather than machine learning

    • @ThePrimeTimeagen
      @ThePrimeTimeagen  Месяц назад +1

      When I was getting my master's degree it was just called a master's in artificial intelligence, not machine learning; this was 12 years ago

    • @alanfender123
      @alanfender123 Месяц назад

      @@ThePrimeTimeagen Holy crap, you actually responded! really like your videos! didn't realize that schools were using the AI label that long ago, I just mistakenly assumed that you had given up and started referring to machine learning as AI like many others at this point. I'll still be referring to radial basis functions as being part of machine learning (this will get easier and easier as they become more forgotten with time)!

  • @christianhilburger5026
    @christianhilburger5026 Месяц назад +2

    As a beginner self-teaching developer, I find LLMs are a fast way to learn about tools or libraries that fit a specific use case I describe. Also, when I struggle to understand the terminology or concepts in documentations I use LLM’s to get things simplified. Still, many times if I get stuck on a problem long enough the temptation to ask LLM to provide a solution is great. And when I do ask I feel dirty afterwards.

  • @TheSnero3
    @TheSnero3 Месяц назад +8

    permanent code reviewer!! After 26 years that is pretty much my job now!

  •  Месяц назад +1

    One of the best videos I've ever seen about this topic! Thanks from Brazil!

  • @mattd8725
    @mattd8725 Месяц назад +15

    AGI is almost as expensive as it was to use dial-up internet to play Everquest.

    • @SlyNine
      @SlyNine Месяц назад +6

      Considering what we could do 10 years after that. People should be more concerned.

  • @austinhuntley5169
    @austinhuntley5169 Месяц назад +2

    I remember o1 initially taking between 15s to 45s per prompt. Now that it's released. It's better and instant in most cases. Most of that apparently was tuning to make sure simple questions weren't overthought. I'm curious to see how much they can get the cost down by the time it's actually released.

  • @Jubijub
    @Jubijub Месяц назад +3

    I love "Based Primeagen" videos, you give really good insights. The issue is trying to bring reasonable arguments to a crowd of people who just "want to believe", most of whom have never used ML at any depth (I am amazed by how many people give authoritative opinions on AI and AGI with 0 technical background). I will say this : my team has been using similar models for 1.5 year. I can easily manufacture a set of examples that will blow you mind and make you think my team is redundant in 6 months. I can equally easily cherry pick example that will make AI look stupid and will make you doubt we should even continue. My take ? It's a tool, it's good a certain things, let's use it for that. Oh, and apply a modicum of critical thinking : of course the snake oil seller will tell you their snake oil is the best, that you need snake oil and [insert here compelling reasons]. But that is still someone profiting from selling snake oil, and that requires to sift through what they say critically.

  • @adnankhalil9640
    @adnankhalil9640 Месяц назад +2

    Man, I have dug into Cursor recently and with their Composer agent I am astonished by how much I can do in so little time. I think the future of programming is here in Cursor-like solutions, where you as a programmer will act as a project manager and the Cursor agent is like your 10x developer team

    • @my_online_logs
      @my_online_logs Месяц назад +1

      I have an LLM agent in my Neovim. For starters, I just tell it what I want to create, with the features, and tell it to save it to a file name; it will generate the code, save it in the file with the filename I want, and then run it. If there is an error, I just give it a keyword for how to solve this, @terminal; the @terminal will include all the code in the current file plus all the error messages in the terminal output, and it will keep editing the file 😅 until there is no error. If there is still an error I just review and change a little bit, and voila, the prototype is done
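
      The loop described above boils down to "generate, run, feed the errors back" (a minimal sketch; ask_llm is a hypothetical stand-in for whatever the editor plugin actually calls, not a real API):

      ```python
      import subprocess
      import sys

      def ask_llm(prompt: str) -> str:
          """Hypothetical helper: send a prompt to the LLM backend and return its code."""
          raise NotImplementedError

      def generate_and_fix(spec: str, filename: str, max_rounds: int = 5) -> bool:
          """Ask for code, run it, and feed errors back until it runs cleanly (or we give up)."""
          code = ask_llm(f"Write a Python script that does the following:\n{spec}")
          for _ in range(max_rounds):
              with open(filename, "w") as f:
                  f.write(code)
              result = subprocess.run([sys.executable, filename], capture_output=True, text=True)
              if result.returncode == 0:
                  return True  # runs without errors; still needs a human review pass
              # like the "@terminal" keyword above: current file + error output go back to the model
              code = ask_llm(f"This code:\n{code}\n\nfailed with:\n{result.stderr}\nReturn a corrected version.")
          return False
      ```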

    • @CabbageYe
      @CabbageYe Месяц назад

      @@my_online_logs that's not a good way to use AI for programming. You won't know anything about the code. It's better to use something like Cursor or Copilot chat

  • @LifeLoggingAI
    @LifeLoggingAI Месяц назад +4

    It’ll be AGI for SWE when it can self verify using arbitrary tools and computer use.
    I had Claude 3.5 add a download button to a complex page. It gets it in the first go. That was pretty impressive. So kudos. But I still needed to QA the feature. I had to rebuild the app, open a browser, navigate to the right place in the app, create the history in order to generate non-trivial download content, look for the download button, make sure it's in the right place, make sure that the styling is legible, test the hovering operation, press the download button to see if it responds at all, know where to look and what to look for to see if it is downloading, find the downloaded file, open it, inspect the contents and make sure that they match what's on the screen and formatted in the way that was requested in the prompt.
    We’re getting there but I’m still having to do a lot. I want it to do all this before it presents its solution to me.
    That’s what the G in AGI means to me.
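
    Most of that QA loop can in principle be scripted today; a minimal sketch using Playwright's Python API (the URL, selector, and expectations are made-up placeholders, not the commenter's app):

    ```python
    from playwright.sync_api import sync_playwright

    # Verify that the download button renders and actually produces a non-empty file.
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(accept_downloads=True)
        page.goto("http://localhost:3000/history")          # placeholder URL
        assert page.is_visible("#download-btn"), "download button not rendered"
        with page.expect_download() as download_info:
            page.click("#download-btn")                     # placeholder selector
        download = download_info.value
        content = open(download.path(), encoding="utf-8").read()
        assert content.strip(), "downloaded file is empty"
        browser.close()
    ```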

    • @kietphamquang9357
      @kietphamquang9357 Месяц назад

      seems like you're babysitting an AGI 🤣

    • @RM-xl1ed
      @RM-xl1ed Месяц назад

      @@kietphamquang9357 Yeah because non-AGI couldn't write a basic download button at all. lmaoooo. get real you fuckin AI koolaid drinker

  • @drilkus1312
    @drilkus1312 Месяц назад +10

    I use gippity daily and it's more useful as a rubber duck than anything. Sometimes it's a good reminder of something basic you didn't want to google. But it's wrong so often that there's no way you trust most of it

    • @NihongoWakannai
      @NihongoWakannai Месяц назад

      You don't need to trust it. A lot of times it's easier to ask ChatGPT and then check its facts than it is to do research or read docs manually.

    • @Nnm26
      @Nnm26 Месяц назад

      That’s laughably wrong

  • @lxn7404
    @lxn7404 Месяц назад +13

    I used to work in localization. They started to introduce machine translation, then they started to lay off language specialists. Now you see bad translations everywhere but low quality is not an issue for most businesses

    • @mate8115
      @mate8115 Месяц назад +1

      yeah, i dont really understand these sentiments at all. Yeah, AI is very shitty at developing. Do they have to be good at it for retarded higher ups to start replacing humans with it? Not really. Already happening with marketing and graphic design, see all the clearly ai generated billboards. Looks like shit, do they care though? Not really. And people are out of their job, just like that. Same thing will happen with programming, maybe already have, with how the junior market has been for years now

    • @AlanMitchellAustralia
      @AlanMitchellAustralia Месяц назад +1

      Good point, it's perfectly acceptable to have low quality for some tasks. Most buildings have sloppy insides because the aesthetics don't matter

  • @thebearded4427
    @thebearded4427 Месяц назад +48

    You know what I find funny? It's that companies look forward to seeing AI do things they want done and create something by interpreting the user... which is what programmers do today.
    Also, companies live in an imaginary world when they think buying an AI tool in a few years will be cheaper than hiring experienced programmers.
    Humans are IRL AI-agent equivalents, but companies aren't able to treat humans humanely or give them time to educate themselves, so they need insanely expensive AI to sink their ships instead.
    Watch a single company that understands the value of knowledge swoop up insane amounts of talent and just outrun the market at light speed.

    • @iz5808
      @iz5808 Месяц назад +21

      I don't see even the companies being happy about AI. If AI is capable of writing a good working program and maintaining it reliably, the value of an IT product will be the same as the cost of running the AI. The whole IT sector will be a no-money place dominated by a couple of AI companies, or by companies whose apps need large sums of money to run their services. Everything you can now produce digitally will be worthless in the future if a good AI is shipped

    • @stukyCZ
      @stukyCZ Месяц назад

      ​@@iz5808 Interesting point to think about

    • @AlanMitchellAustralia
      @AlanMitchellAustralia Месяц назад

      ​@@iz5808This assumes perfect efficiency, which is unrealistic. In the same way that a retailer with the lowest prices doesn't get 100% of sales

  • @orterves
    @orterves Месяц назад +6

    We've built a Mr Meeseeks box. Just don't give it to a Jerry and we'll be fine

    • @hck1010
      @hck1010 Месяц назад +1

      Teach me golf !

  • @StefanoMalagrino
    @StefanoMalagrino 17 дней назад +1

    Maybe I'm misunderstanding something, but as far as I understand, the term AGI implies the ability to surpass human knowledge. To achieve this, it must be capable of continuously learning on its own, as this is the only way for AGI to exceed human capabilities - a goal that is certainly still a long way off. This will probably only become possible once we can operate LNNs on neuromorphic chips at an industrial scale.

  • @NicoCoetzee
    @NicoCoetzee Месяц назад

    Great take and I completely agree with your assessment and assumptions.
    Somewhere in the near future the pool of those with the technical depth to understand beyond the AI suggestion will dwindle to a critical point, at which time humanity would lose the abilities we take for granted right now.

  • @Qew77
    @Qew77 1 month ago +35

    5:55 bruh that 62 can itself be an AGI test

  • @rmtbc1
    @rmtbc1 1 month ago +4

    There is a good scene on this in Stanislaw Lem's Fiasco, between an old spaceman and a super AI computer.

  • @arastoog
    @arastoog 1 month ago +3

    Also, IIRC, there is another ARC AGI test which o3 was put up against, and it got like 30%, which wasn't much of an improvement over o1. It really does seem like marketing hype, kind of like what Google did with the Willow quantum processor.

  • @artificialartlab
    @artificialartlab 1 month ago

    Thanks for always being the most honest in the business! Love your work!

  • @drakewinwest9888
    @drakewinwest9888 1 month ago +80

    This just proves how fucking ruthlessly efficient biology has become. I guess a few billion years of guessing beats a few thousand years of technological advancement.

    • @dansplain2393
      @dansplain2393 1 month ago +11

      I think this is where we'll end up. An AI that can program COBOL at great expense, after everyone else has died.

    • @bobcousins4810
      @bobcousins4810 1 month ago +12

      Thinking the same thing. Brains may be limited in lots of ways, but they do pretty well on 20 Watts. Pre-training can take years though.

    • @astledsa2713
      @astledsa2713 1 month ago +4

      @@bobcousins4810 But think about it: pre-train for muscle coordination, vision, audio, smell, and then finally a few years to fully develop the prefrontal cortex.

    • @hamm8934
      @hamm8934 1 month ago +2

      I agree with the sentiment, but we only really began AI work 50 years ago. Relatively speaking, within the time span of humanity as a species, AI is far outpacing us.
      That said, I fundamentally think brute forcing with distributional semantics is a fool's errand.

    • @Watershed09
      @Watershed09 1 month ago

      @@bobcousins4810 How long does it take to train every single human, though?

  • @lebleathan
    @lebleathan 1 month ago +15

    7:30 Not 5 gallons. 5 full tanks. That's 50-100 gallons.

    • @louiscouture9139
      @louiscouture9139 1 month ago +3

      We worried about the wrong cooking

    • @TanerH
      @TanerH 1 month ago +1

      Came to point this out, too.

  • @rosstocher
    @rosstocher 1 month ago +8

    One should be shorting all overvalued AI equities. Which is all of them.

  • @Ashutosh-ahhkgdo
    @Ashutosh-ahhkgdo 15 days ago +2

    Why don't you understand that compute cost comes down? The GPT-4 model is 100x cheaper than it was when it was launched.

  • @oussama7132
    @oussama7132 1 month ago +10

    Doesn't this happen with every new model?

    • @Alx-h9g
      @Alx-h9g 1 month ago

      One day it will be right

  • @uxel-g9y
    @uxel-g9y 1 month ago +2

    90% of my time fixing a bug is figuring out in which branches it must be fixed, filing possibly 5 pull requests, restarting flaky tests on CI for the 5 branches, requesting a review for 5 PRs, nagging my boss until he finally presses the approve button, and realizing that on 2 of the 5 branches someone has merged first and now I have to rebase those commits (not allowed to merge...), so rinse and repeat. In effect, the LOC are usually written 2-3 times. The AI would save me no time doing most of my tasks since they are not even coding tasks.

    • @timothyjjcrow
      @timothyjjcrow 29 days ago

      What happens when they scrap your whole codebase and ask the AI to make the app from scratch?

    • @uxel-g9y
      @uxel-g9y 28 days ago

      @@timothyjjcrow I would be happy if I could do that, as my job would shift to prompting the AI to do things, leaving me more time to enjoy drinking tea.
      Half of my job is doing office politics anyway. When you are in a project of this size, there are so many layers of management that you have to play bullshit bingo with... Imagine PM #1 comes to you and asks you to do a feature, but that feature would be inconvenient for PM #2; you know this due to context you have, while the AI would have just implemented it. So what would I do? That's right: set up a meeting and have PM 1 and 2 fight it out, or nudge PM 1 in a direction that's likely to be acceptable to PM 2. I highly doubt that the AI would, in a reasonable amount of time, have the social skills necessary to perform these tasks. Coding is really a very small part of my job. It's one I like; I just hate the codebase I have to work with because it's older than me and several million LOCs.
      Also, I don't make "apps". I make highly complex vendor-neutral microscope software that also happens to interface with a plethora of other systems via very arcane APIs (not the HTTP kind of API). This is so niche that even if you wrote down a full requirement sheet for the software, which is nearly impossible in my opinion because we don't even fully understand what our software does, I doubt that current AIs, or those of the next 3 years, would be able to comprehend what it has to do. In the industry we make this software for, failures/bugs lead to reasonably terrible consequences, so confidence in an iterative AI approach would be low. Will it happen in 20 years? Maybe. Will it happen in the next 10 years? Very unlikely.

  • @toasterenthusiast6188
    @toasterenthusiast6188 1 month ago +13

    Analog computing or in-memory computing are potential solutions to the energy inefficiency problem; Boolean logic just takes way too many precise steps to get to a conclusion.

    • @malekith6522
      @malekith6522 1 month ago +8

      Analog computing is effective for differential equations and not much else, as it tends to lack precision and the hardware isn't very scalable. There's a reason why this approach wasn't chosen. What does 'computing in memory' even mean? Boolean logic requires too many precise steps to reach a conclusion? What?

    • @steve_jabz
      @steve_jabz 1 month ago +2

      Thermodynamic compute is much better and coming quite soon. It works with the same standardized CMOS silicon architecture we've used for decades. 10,000x on first release would be child's play.

    • @iz5808
      @iz5808 1 month ago

      @@steve_jabz It's the nuclear fusion reactor 2.0. I will accept it only when I see it really working. I believe current computers are good enough for AI; it's just a matter of getting a proper foundation for the model.

    • @steve_jabz
      @steve_jabz 1 month ago +2

      @@iz5808 Not really comparable. Fusion is hard. Thermodynamic processors are super easy and we can start mass-producing them right away; we just didn't have a reason to have intentionally noisy circuits until it became obvious that scaling modern techniques like diffusion and attention was extremely effective. In 99.9% of cases, the last thing we want is noise in a circuit.

    • @mattymerr701
      @mattymerr701 1 month ago

      @@steve_jabz Super easy, and yet not a product that exists.
      We must have different definitions of "super easy" where yours is "something extremely hard and cost-prohibitive to release and full of bugs".

  • @lichtundliebe999
    @lichtundliebe999 1 month ago +1

    What do you think of spiking neural nets and other approaches closer to natural neurons to lower energy consumption and maybe speed up a few things?

  • @David-ty6my
    @David-ty6my 1 month ago +14

    I achieved AGI Internally (my brain)

    • @RM-xl1ed
      @RM-xl1ed 1 month ago

      BUT HOW DO YOU DO ON THE BENCHMARKS

  • @jorgelopez3896
    @jorgelopez3896 1 month ago +2

    12:11 He just explained pretty much all of tech. Just a small, really small fraction of tech is really useful; the more thought you put into it, the more sense it makes.

  • @quantumastrologer5599
    @quantumastrologer5599 1 month ago +3

    Let's first see if this internet thing takes off.

  • @GAD-mn5wn
    @GAD-mn5wn 1 month ago +1

    OK, puzzles, but can it refactor this 15-year-old legacy codebase with crazy 1127-line-long for-loops without producing bugs?

  • @OperationXX1
    @OperationXX1 23 days ago +3

    I think you're missing the bigger picture here. The o3-mini model offers performance equal to or better than the o1 model at a fraction of the cost. Sure, the full o3 model is expensive, but if you consider how much better o10 could be in just two years, it's only a matter of time before it surpasses nearly all coders at a fraction of the cost... coders need to plan for a world where human coding is not needed anymore!

    • @ThePrimeTimeagen
      @ThePrimeTimeagen  19 days ago

      I know people are saying things like this, but there are no official o3-mini comparisons. If o3-mini were that good and better than o1, they would have shown it. They didn't. I think this is just a hand-wavey thing they are saying until they can figure it out.

  • @marcokoegler750
    @marcokoegler750 1 month ago +1

    If there's a known test to prove you are AGI, then you can probably train for it. Not trying to diminish the progress which OpenAI has been able to achieve. I like viewing LLMs as 'general trained models' which you can condition to a 'predictable' function via prompting to prove out the viability of training a custom ML model. They are fantastic for throwing stuff at the wall and seeing what sticks!

  • @pritonce6562
    @pritonce6562 1 month ago +26

    The CO2 impact of this is actually insane.

    • @schtormm
      @schtormm 1 month ago +3

      yeah that amount of CO2 is fucking craaaaaaazy

    • @7th_CAV_Trooper
      @7th_CAV_Trooper 1 month ago

      Nah, they're building nukes to power these.

  • @thomaslecoz8251
    @thomaslecoz8251 27 days ago +1

    I think you missed the point.
    o1 & o3 will mainly be used to produce a LOT of very high quality synthetic data in order to train the next models MUCH MORE efficiently (it's been shown that a small LLM can work as well as a big one depending on the quality of the training data).
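
    A minimal sketch of the synthetic-data idea the comment above describes: an expensive "teacher" model writes question/answer training examples that a smaller model would later be fine-tuned on. The call_teacher_model helper, the prompt wording, and the JSONL record format are illustrative assumptions, not any particular vendor's API.

    ```python
    # Sketch, assuming a hypothetical call_teacher_model() backed by a large
    # reasoning model; the placeholder here just returns canned JSON so the
    # script runs end to end.
    import json

    def call_teacher_model(prompt: str) -> str:
        # Hypothetical stand-in for an expensive model such as o1/o3.
        return json.dumps({"question": f"(placeholder question for: {prompt[:40]}...)",
                           "answer": "(placeholder worked answer)"})

    def generate_synthetic_dataset(topics, examples_per_topic=3, out_path="synthetic.jsonl"):
        """Ask the teacher for worked examples and save them as fine-tuning records."""
        with open(out_path, "w") as f:
            for topic in topics:
                for _ in range(examples_per_topic):
                    raw = call_teacher_model(
                        f"Write one hard question about {topic} and a detailed, correct answer. "
                        "Return JSON with keys 'question' and 'answer'."
                    )
                    record = json.loads(raw)  # assumes the teacher returns valid JSON
                    f.write(json.dumps({"prompt": record["question"],
                                        "completion": record["answer"]}) + "\n")

    if __name__ == "__main__":
        generate_synthetic_dataset(["graph algorithms", "SQL query planning"])
    ```

    The resulting JSONL file is the kind of high-quality synthetic dataset the commenter has in mind; a smaller model would then be fine-tuned on it far more cheaply than running the teacher at inference time.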

  • @draken5379
    @draken5379 1 month ago +5

    The thing is, it's not worth the compute right now. That doesn't mean that in a year, with compute costs lowering and changes/improvements to the model, we won't see speed-ups/cost reductions, etc.
    What I don't get is how programmers can't grasp that we are only scratching the surface of transformers, let alone any sort of future neural network architectures. Nearly EVERY DAY there are new papers/studies showing aspects/discoveries about transformer models.

    • @celex7038
      @celex7038 1 month ago +2

      Compute costs aren't lowering. Computers can't get much cheaper because of the laws of physics.

    • @draken5379
      @draken5379 29 days ago

      @@celex7038 Sora took an hour to create a one-minute video, a year ago.
      We now have OPEN SOURCE video transformer models that are 1000x faster.
      It's wild how everyone wants to talk about AI, yet they know literally fuck all about it.
      Compute costs lower because models are reworked, retrained, and optimized to use insanely less compute.
      But I get it. It's 2024, we have people using junk like JavaScript to host backends; I get that you youngsters wouldn't know what optimization was if it tried to suck your cock.

  • @nrg4285
    @nrg4285 28 days ago

    The thimble of water comparison was spot on; every CIO needs to see this.

  • @alecsbizarrememes7862
    @alecsbizarrememes7862 1 month ago +5

    Thank God for making me so environmentally efficient

  • @musicalvisions
    @musicalvisions 1 month ago

    Regarding qualifying cost at 3:30:
    The pricing structure resonates with AWS and the exorbitant bills some companies run up with many AWS products.

  • @Willifordwav
    @Willifordwav 1 month ago +3

    But can it push to master?

  • @RuinedLiberty
    @RuinedLiberty 27 days ago +1

    I'm coding a game with ChatGPT, paying $200/month for the pro subscription, and what you said couldn't be more true. I don't know much about coding, only the basics, but as the code gets bigger with more files, it starts to hallucinate: difficulties solving simple issues, giving code with bugs.
    I started coding with AI from the beginning; the improvements are crazy and still going. AI is an amazing tool, but it's far from the point of replacing other developers, that's for sure. As for the future, I have no clue.

  • @siamesestormtrooper
    @siamesestormtrooper 1 month ago +4

    2:17 “WE DID IT GUYS WE MADE AN AI AS SMART AS PEOPLE” “okay could it solve this picture puzzle an actual 1st grader would consider trivial?” “No” 😂 every single fucking time

    • @alex-rs6ts
      @alex-rs6ts 1 month ago +1

      It's far smarter than most people in many areas, but lacking in other areas. We just have to improve those less developed areas. Not that complicated.

    • @Zatarra69
      @Zatarra69 1 month ago +1

      @@alex-rs6ts Congratulations on completely, COMPLETELY, missing the point of generalized intelligence.

    • @siamesestormtrooper
      @siamesestormtrooper 1 month ago +1

      @ exactly, MOST people, just not the people who would actually be hired for those positions 😭

    • @timothyjjcrow
      @timothyjjcrow 29 days ago

      It will beat you at coding though, so what is your point?

    • @siamesestormtrooper
      @siamesestormtrooper 29 days ago +1

      @ Lol no it won’t. I’ll be better at programming than any AI agent for at least the next 50 years.

  • @sebbytrial
    @sebbytrial 29 days ago

    Love the way you wrapped this up. Thank you, sir.

  • @williamjuang5793
    @williamjuang5793 1 month ago +3

    The fact that they trained their model on AGI stuff seems extremely unfair. That’s like telling a robot it’s a human before the Turing test

  • @tonybowen455
    @tonybowen455 1 month ago

    Love it. Preaching to the choir. I've started taking college math classes online toward a comp sci degree, largely because of you. If that doesn't work out, there is electrical engineering, which I do want to do even if comp sci does work out.

  • @mattki-y9y
    @mattki-y9y 1 month ago +79

    Oh we are so cooked bro, I've got like 200 years of full stack software engineering web 3.0 development experience and I just got laid off from my job because the CEO will use AI on the legacy code base. I can literally code in binary and this wasn't good enough. Cooked.

    • @JorgetePanete
      @JorgetePanete 1 month ago +41

      Try specializing in trinary

    • @yumearia3662
      @yumearia3662 1 month ago +11

      You need to master coding in qubits and you should be safe until quantum AGI is achieved.

    • @Repeatedwaif
      @Repeatedwaif 1 month ago +3

      I feel like they'll want you back when it doesn't work; then you come back for a premium.

    •  1 month ago +5

      In about a month, when they ask you if you'd like your old job back, you should be ready with your terms as a very highly paid Winston Wolf type of character. Also wear a suit, at least for the first two weeks.

    • @Maxo-bh5ni
      @Maxo-bh5ni 1 month ago +1

      They will be missing you. I highly suggest sending them an email making it clear that you are willing to be rehired.