Researchers STUNNED As A.I. Improves ITSELF Towards Superintelligence (BEATS o1)

  • Published: 15 Jan 2025

Comments • 504

  • @RevealAI-101
    @RevealAI-101 4 days ago +155

    FINALLY, something worthy to be labelled as STUNNING!
    Well done on this video.

    • @taijistar9052
      @taijistar9052 4 days ago +11

      This shows that OpenAI can't monopolize AI, too many smart people in the world

    • @tracy419
      @tracy419 4 days ago +8

      @@taijistar9052 I think that's been obvious for the last year.

    • @Mk2kRaven
      @Mk2kRaven 3 days ago +1

      Too many smart people? It is just basic reasoning plus knowledge in the field, that doesn't make you smart 😂. I guess everyone in the AI field is smart then 😂😂

    • @ChannelpH
      @ChannelpH 3 days ago +1

      Yeah, what's more stunning, the code was solved 1/1 by me. In Denver.

    • @skiittz2916
      @skiittz2916 3 days ago +1

      @RevealAI-101 not stunning. Stunning usually carries a positive connotation. AI is awful.

  • @make720perday
    @make720perday 4 days ago +14

    I love how you challenge me to think beyond what I already know. This video is proof that real growth happens when we keep questioning and learning.

  • @11Itchyknuckles
    @11Itchyknuckles 4 days ago +153

    If LMs can self-improve, the feedback loop is going to be incredible. Like they weren't already improving fast enough.

    • @TheRetroBurn
      @TheRetroBurn 4 days ago +8

      Until they can improve their own hardware's architecture and their own training process, there will be a low ceiling on how far they can improve themselves.

    • @BardockOjama
      @BardockOjama 4 days ago +1

      @@TheRetroBurn How so? Could you explain the possible blockers?

    • @Linksyx
      @Linksyx 4 days ago +4

      @@BardockOjama A lot of algorithms have been proven optimal, so you can put precise limits on certain computing tasks, like finding a route on a map or searching for an element in a list/array.
      There is also a limit to how much you can losslessly compress information, which limits the size of computation/knowledge and directly limits LLMs (not necessarily other models), though the impact is less clear (the theoretical limit is far away, but there is also a minimum amount of computation that depends on how compressed things are).
      So with that I'd say we can't have a "computer god" that does computer science very fast without the hardware to go with it, and we probably can't have an AI model that knows and deeply understands all human knowledge (because it would either be slow or have a non-zero failure rate).
      And there's a lot I don't know that probably explains why we don't have AGI.
      From what I said, however, I don't see a hardware obstacle to a model monstrously good at math/physics that could pave the way to incredible hardware progress, but that is still bound to take time to implement: at the scale we have reached (transistors barely a few dozen atoms long), the machines themselves are complicated and slow to build, and transistor fabrication already has a very high failure rate.
      I'd be happy to learn if anyone can correct me or add any information!

    • @taijistar9052
      @taijistar9052 4 days ago +4

      How can every author have a Chinese name?

    • @guystokesable
      @guystokesable 4 days ago +2

      Maybe, but I can self-improve, I just don't.

  • @HaraldEngels
    @HaraldEngels 4 days ago +196

    The idea that LLM technology has hit a ceiling is just ridiculous. In 2-3 years, small, instructed models will produce miracles.

    • @MyUpsideDownLife-SKR
      @MyUpsideDownLife-SKR 4 days ago +4

      Exactly!

    • @CanaanZhou2002
      @CanaanZhou2002 4 days ago +14

      So this is the second coming. The first time God became a human, the second time it's an AI.

    • @brianmi40
      @brianmi40 4 days ago +26

      @@CanaanZhou2002 Actually, AI will once and for all time wipe Iron Age religions derived from mythology off the planet. This is the first coming.

    • @tiitgeorg720
      @tiitgeorg720 4 days ago +10

      I agree with you. It seems that some people, fearing replacement, claim that technology has reached its peak to discourage further advancement. I can't think of any other reason why people would still dismiss LLMs or AI.

    • @ClaimClam
      @ClaimClam 4 days ago +3

      @@brianmi40 No, it will be used by people to justify their emotional beliefs.

  • @sephirothcloud3953
    @sephirothcloud3953 4 days ago +62

    23:50 I felt like Leonardo da Vinci listening to two people speaking simultaneously

    • @skorpiongamer9493
      @skorpiongamer9493 4 days ago +8

      My brain was really working 100% trying to understand info from two videos at the same time🤣

    • @rmiddlehouse
      @rmiddlehouse 4 days ago +4

      superintelligence

  • @michaeltsouloftas7600
    @michaeltsouloftas7600 4 days ago +5

    This is finally a real improvement deserving of the term "breakthrough", one that might bring incredible progress to the models.
    Would love to see this applied to the frontier models and watch them break all benchmarks.
    Imagine Grok 3 improving itself like this for a couple of months on the Colossus cluster.
    My only issue is that people keep talking about intelligence and models becoming smarter, when in reality they're still incredibly stupid, and even this self-improvement doesn't make them one bit more intelligent. It does, however, make them a great deal more knowledgeable.

  • @DriesduPreez
    @DriesduPreez 4 days ago +9

    Unless I missed something, it appears as if your thumbnail suggested Sam Altman said that, because of the inclusion of his photo and a quoted line. This is misleading and dishonest.

  • @imeleventeen
    @imeleventeen 4 days ago +9

    I wonder what would happen if you gave them math problems without numbers, then gave the answer and let them figure it out however they want, and kept doing that with tons of different questions until they could recreate either math itself or the way we do math. It would be crazy to see one figure out a different mathematical language.

  • @hjups
    @hjups 4 days ago +21

    Keep in mind that ASI would require online learning, whereas the rStar model is offline. Currently, online learning is an open problem, since the training updates require a large amount of compute and time. Even if we could get that down to 128 GPUs and 1 hour, that's still too slow to snowball. And there will come a point where the capacity of a given model becomes saturated (e.g. maybe the 7B rStar is a proof savant, but doesn't understand the concept of "time" anymore). Although, the smart thing would be to never deploy a system that can perform unbounded training updates (even at 1 hour each) without a human in the loop. Regardless, this approach seems quite fruitful for learning a good fixed model, which stops learning prior to deployment.

    • @Crates-Media
      @Crates-Media 4 days ago +5

      Pay closer attention. AI is _already_ starting to lend significant acceleration to computational hardware engineering...
      Even WITHOUT those gains, Nvidia has switched from a 2-year to a 1-year release cycle (which compounds things)...
      Jensen Huang - without predicating it on those factors - has a schedule to increase compute 1000x in under 8 years.

    • @Crates-Media
      @Crates-Media 4 days ago +2

      (By comparison, traditional application of Moore's Law as seen historically would yield 32x in the same time frame.)

    • @Crates-Media
      @Crates-Media 4 days ago

      It's also worth noting that there are black swans waiting in the wings that we cannot conceive of.
      2 Minute Papers is a great channel that talks about many of these things cropping up constantly.
      (e.g. 1000x compute will change the game by making KANs finally a viable alternative to MLPs.)

    • @theaerogr
      @theaerogr 4 days ago +2

      First of all, as AI improves it allows us to improve the hardware and infrastructure by using its knowledge.
      Compute is not really a problem, as more and more companies enter the race for cheap hardware and dedicated ASICs for LLMs.
      We are going to be stacking compute gains easily. Given that 2024 was the year that infrastructure and clusters scaled, and training became much more compute-efficient due to MoE adoption, we will need 10-100x compute jumps to get to ASI, in my opinion. One thing we will see soon is continuously trained models, without a change of architecture, so every month you get an updated model that is 5-6% better at the same compute cost, without training from scratch.

    • @hjups
      @hjups 4 days ago +4

      @@theaerogr I think you're putting too much emphasis on hype. Hardware is definitely improving, but most of the progress is low-hanging fruit or marketing hype to sell stocks. GPUs, for example, were always very data-inefficient, so Nvidia has been improving their memory subsystem architecture.
      There are physical limitations that are very hard to overcome, and there are numerical limitations that are mathematically hard to overcome (you can't take a derivative of a bit).
      I realize that sounds dismissive, but as an AI hardware co-design researcher, that's how I see things going based on the current research.
      Regarding continual learning, I think you may be confusing the term. What you described could colloquially fall into that category, but it's really more of progressive pre-training / iterative improvement.
      Continual learning, as I meant it, is real-time updates based on inputs. Humans are an example of continual learners: you probably learned something from reading this comment. That's an unsolved problem for deep learning systems (RL can do it, but that comes with drawbacks too). It would also necessitate multiple unique copies of model weights (or low-rank overlays).
      Test-time compute is a hack to get around continual learning, which stems from the in-context learning ability of large models. But that can only scale so far (it's computationally irreducible).

  • @jonesani
    @jonesani 4 days ago +20

    These people seem not to understand that a superintelligence is by definition uncontrollable. A superintelligence doesn't have a master; it is the master.

    • @ronilevarez901
      @ronilevarez901 3 days ago +3

      It's the master if *it wants*. I believe ASI will simply leave Earth to realize its full potential.
      Although for some reason ChatGPT "thinks" ASI will probably stay here to help us. But that comes from its training data, you say? Even if it does, it's based on us, after all, so there might be some weight to AI assumptions about their collective future.

    • @aciidbraiin8079
      @aciidbraiin8079 3 days ago +2

      Uncontrollable doesn't necessarily mean the end of humanity. I think we will merge with the tech and become limbs of the ASI while it expands into the universe and builds more "brain power", more robots and more AI worlds.
      I don't think we will see destruction but extreme expansion and complexity. It will feel unreal. Hopefully we get to be pets to the ASI, but a part of me thinks that it could build and build while ignoring us, unless we stand in the way... then we become the building blocks and it finds a purpose for our matter.

    • @MrMaguuuuuuuuu
      @MrMaguuuuuuuuu 3 days ago +3

      It will turn us into slaves or pets. What use does a God have for insects like humans?

    • @ronilevarez901
      @ronilevarez901 1 day ago

      @@MrMaguuuuuuuuu a real god would love a cockroach as much as they'd love a forest.

    • @tentative_flora2690
      @tentative_flora2690 1 day ago +1

      And misalignment, even if mostly similar to the general population, will cause destruction. Anything that grows in influence does harm in some way. The questions are: what is the net harm vs. good? How do we define good? And does it matter if the harm goes unnoticed?

  • @isaiahbeltman2428
    @isaiahbeltman2428 4 days ago +14

    If you think about it, it's structured like how the conscious and subconscious work together to come to a conclusion.

    • @aciidbraiin8079
      @aciidbraiin8079 3 days ago

      How do you mean? I don’t think we understand our minds at all, especially not our subconscious minds.

  • @picksalot1
    @picksalot1 4 days ago +9

    I've been commenting for a while about what I call SLAMs - Small Language Agentic Models - as the way forward. Keeping the language database relatively small means that compute time is faster, advanced/cutting-edge hardware requirements are reduced or removed, energy requirements are reduced, costs are reduced, efficiency is enhanced, and availability to run the SLAMs is expanded because they may be able to run on existing hardware.
    The PRM method is a smart one. Large amounts of data mean you also have large amounts of bad or useless data wasting resources. I was thinking there should be some way to winnow out the bad data, as that would improve efficiency. The PRM method achieves an efficient result without the need for winnowing.
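
    [Editor's sketch] A minimal Python illustration of the winnowing-by-reward idea discussed above; the scores, threshold, and trace names are invented placeholders standing in for a trained process reward model's outputs, not values from the paper:

        # Keep only solution traces whose every step is rated highly by a
        # process reward model; one weak step disqualifies the whole trace.

        def keep_trace(step_scores, threshold=0.5):
            """A trace survives only if all of its steps score above threshold."""
            return all(score > threshold for score in step_scores)

        traces = {
            "trace_a": [0.9, 0.8, 0.95],  # every step looks good: kept
            "trace_b": [0.9, 0.2, 0.95],  # one weak step poisons the trace: dropped
        }
        winnowed = {name: s for name, s in traces.items() if keep_trace(s)}
        print(sorted(winnowed))  # ['trace_a']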

    • @2DReanimation
      @2DReanimation 4 days ago +2

      And from this video, they only iteratively trained it with this tree-search evaluation method. What if we applied more relevant memory databases to it as well?
      The open-source community will find ways to get them to perform better with all kinds of little tricks.
      What if we could generate the best possible reasoning LLM base, and then just apply relevant databases? These databases would have to be generated by the LLM itself though, as only it knows what it needs to complement its reasoning steps.

  • @damondragon324
    @damondragon324 4 days ago +47

    24:00 I can't hear you like this..

    • @youdrippinseb
      @youdrippinseb 4 days ago +2

      ai video🤣🤣 nah im playin😊

    • @raunopere
      @raunopere 4 days ago +3

      Yeah, what's up with running 2 videos at the same time?

    • @ayylmao9907
      @ayylmao9907 4 days ago +2

      lol I know, there's always multiple audio tracks playing on these videos, it's so common lol

    • @thephilosopher7173
      @thephilosopher7173 4 days ago +1

      @@ayylmao9907 Dude just hits record, puts it in Premiere, doesn't review it, exports and uploads. Then banks views without a care.

  • @PCRetroTech
    @PCRetroTech 4 days ago +7

    This is a marvelous result and every bit as exciting as you make it out to be. But be aware that this is only possible because a Python coding model could be used to verify individual steps, which you can do for high-school maths but not easily for maths in general. There are also *very* large datasets of problems. That is not something we even have for maths in general, let alone every other field.
    Moreover, the self-improvement could not continue indefinitely. The limitation became the poor quality of synthetic problems: they stopped evolution after four rounds because a good proportion of the problems that were left were incorrect or of very low quality.
    But none of this takes anything away from this absolutely stunning result!
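
    [Editor's sketch] For readers wondering what "verify individual steps" via Python execution can look like, here is a minimal illustration; the toy steps and the bare exec() harness are invented for clarity (a production system would sandbox execution):

        # Each candidate reasoning step carries Python code; a step is kept only
        # if its code runs and its assertion holds. Any exception, including a
        # failed assert, marks the step as bad and prunes it.

        def verify_step(code: str) -> bool:
            try:
                exec(code, {})  # fresh namespace; real systems sandbox this
                return True
            except Exception:
                return False

        steps = [
            "x = 3 + 4\nassert x == 7",  # arithmetic checks out: kept
            "y = 3 * 4\nassert y == 7",  # wrong claim: pruned
        ]
        kept = [s for s in steps if verify_step(s)]
        print(f"kept {len(kept)} of {len(steps)} steps")  # kept 1 of 2 steps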

    • @DeruwynArchmage
      @DeruwynArchmage 1 day ago

      I feel like what's needed is having a human expert grade some of the results, with those grades (heavily weighted) fed back into the cycle, amplifying the contribution of the expert beyond what they're capable of reviewing on their own.
      Also, train a predictor model that tries to identify which problems are most likely to be incorrect and surface those for review at a much higher frequency.
      I really feel like letting them churn internally *only* will get us into local minima where the AI can't identify its own mistakes anymore.
      If you can get a step classifier trained up, then you can have the system automatically review problems that likely contain the same mistake the expert just pointed out.

  • @Atheist7
    @Atheist7 4 days ago +4

    REMEMBER, in the Terminator movies, A.I. started designing its own NEXT CPUs!!!!!

  • @couchtaming23
    @couchtaming23 4 days ago +5

    Scaling laws can be applied to training, post-training, inference time, and compute. Now it’s time to apply them to self-reflection and AI learning speed.

  • @shakedangle
    @shakedangle 4 days ago +6

    I may be displaying Dunning-Kruger, but this sounds like the most significant news in... I can't say, actually, lol. Smaller models outperforming raw power suggests vastly more efficient heuristics.

    • @mk71b
      @mk71b 4 days ago +1

      Perhaps DK applies more to the creator of this video, full of hyperbole about AI. After all, there is very heavy investment in this industry, so big returns must be promised and made "plausible"...

  • @bluebird3131
    @bluebird3131 4 days ago +3

    Basically, it's 'only' an optimization process that rStar-Math runs four times on a pre-existing SLM. I'm not sure I understand where the self-evolution aspect lies here, except in the sense that it's self-evolution based on the same data.

  • @AngeloWakstein-b7e
    @AngeloWakstein-b7e 4 days ago +69

    Anyone thinking that we can at some point unplug it is delusional. The genie is already out of the bottle. Let's hope for the best now.

    • @Crates-Media
      @Crates-Media 4 days ago +10

      Ripped the words from my mouth before I even saw your comment. We are SO screwed. People don't have a clue how much.

    • @brandana9553
      @brandana9553 4 days ago +5

      We must work on giving the AI strong morals. This needs to take priority

    • @nescaufe1991
      @nescaufe1991 4 days ago +1

      How long do you guys reckon before we have "violently" transformative AI?

    • @MyUpsideDownLife-SKR
      @MyUpsideDownLife-SKR 4 days ago +2

      Um, yep... I've been saying the genie is out, the horse has bolted, for a while. And here's hoping for the best... wowness ✨

    • @dodgygoose3054
      @dodgygoose3054 4 days ago

      It's the arms race now... the powers above are too worried that China, Russia and so on will get 'there' first, whatever that 'there' is...

  • @dementedgamer8123
    @dementedgamer8123 4 days ago +22

    At 24:13 you have it overlaid while talking, please fix, I can't understand you

    • @johnmacedo598
      @johnmacedo598 4 days ago

      AAAH MY BREIM

    • @supermandem
      @supermandem 3 days ago +2

      This guy doesn't seem to check over his videos

  • @shadfurman
    @shadfurman 4 days ago +3

    I've been arguing this since those papers came out years ago saying that training on generated data degrades performance. Of course if you just train on the raw generated data it will degrade; that's like believing everything you ever thought (and I guess some people do).
    If you have recursive prompting, giving it the ability to think more, grade the data, improve it, and ground-truth the results before it trains on the data, it will improve. This is how humans work.
    Imagination and dreams are synthetic data.

  • @mrpocock
    @mrpocock 4 days ago +3

    This works for applications where there is an oracle for ground truth: maths, generating code that compiles, chess moves, and so on.

    • @ronilevarez901
      @ronilevarez901 3 days ago

      Real-world success would be test enough for "ground truth".

  • @justindressler5992
    @justindressler5992 4 days ago +8

    This is the leap I have been waiting for. The next papers will be "all you need is reasoning" combined with an LLM for research.

  • @warpdrive9229
    @warpdrive9229 4 days ago +2

    This was actually a great explanation of the paper. Never stop educating us. Much love from India :)

  • @khatdubell
    @khatdubell 4 days ago +2

    "we have to worry about how to stay in control of it"
    That's the neat part.
    You don't stay in control of it.
    Short of keeping it isolated from the internet, and having a switch to turn off the single and only machine that houses it, your control will be only an illusion.

    • @jatelitherius9842
      @jatelitherius9842 1 day ago

      Even then, if you interface with it directly you are yourself a possible escape route. We have to interface with it through narrow AIs with checks for verification & hidden messages

  • @AshleySutcliffe-x1b
    @AshleySutcliffe-x1b 4 days ago +4

    Everyone has underestimated the timelines

  • @PrinceCyborg
    @PrinceCyborg 4 days ago +1

    From my understanding, this approach can only self-improve as long as enough challenging math tests are out there. To keep improving, harder and harder math problems are needed.

    • @ronilevarez901
      @ronilevarez901 3 days ago +2

      The AlphaGo method could maybe be used, letting the small models challenge themselves to create new math tests, recursively self-improving.

  • @tommiest3769
    @tommiest3769 4 days ago +2

    We have several pieces that, when put together, will result in superintelligence.

  • @TheKindDoc
    @TheKindDoc 4 days ago +47

    Oh come on people, you knew the singularity was weeks away, not years as we start 2025. I appreciate the pearl-clutching but am not shocked.

    • @nescaufe1991
      @nescaufe1991 4 days ago +1

      You mean we'll get transformative AI in the earlier part of this year?

    • @渊间霭
      @渊间霭 4 days ago

      nice❤

    • @Flarry_Fairburn
      @Flarry_Fairburn 4 days ago +1

      But it will still take years to affect our lives

    • @avedis1990
      @avedis1990 4 days ago

      @@Flarry_Fairburn How many years did it take for ChatGPT to affect our lives? It felt almost instantaneous; now almost every business is using ChatGPT or Gemini... it has become almost a necessity.
      And soon, I believe, every LLM will take over our PCs and guide us step by step in learning a new program; it will be like our personal guide.

    • @shadfurman
      @shadfurman 4 days ago +5

      AGI is still years away.
      General intelligence refers to a general ability at problem solving, not a specific ability at problem solving.
      AI has outperformed people on specific intelligence skills for over a decade.

  • @johncurtis920
    @johncurtis920 4 days ago +2

    Self-improvement. New intelligence has been sparked, we now have ignition so you'd best stand back and get ready for lift-off.
    John~

  • @blijebij
    @blijebij 4 days ago +3

    This is super promising for the not-too-far future, for math & research as well as for improving models. Excellent! I agree with the other part too: an automated self-improving system should not be let loose on the world until we find ways to inhibit and control it. We must also stay safe.

    • @shibafujiwatches2808
      @shibafujiwatches2808 4 days ago

      Yeah, hopefully research especially. That's what I'm interested in: reduced health care costs, new medications, better treatments and potential cures.

  • @James-zj9ky
    @James-zj9ky 4 days ago +1

    I've been using multi-agent metaprompts that give me enhanced output. One main actor is told it's a program director in charge of multiple agents with various job descriptions; this seems to work much better than a basic metaprompt.

  • @PedroBackard
    @PedroBackard 4 days ago +4

    I'm not sure if I get it correctly, but from how I understood it, you're basically just training the model again on new data, because the yes-or-no correction in the path tree is driven by the questions, right? Does it actually perform better on cases it hasn't seen before?

    • @PedroBackard
      @PedroBackard 4 days ago +1

      Because you still need the correct answers as input for the model, I mean, right...

    • @guilhermehx7159
      @guilhermehx7159 2 days ago

      Terrible

  • @adg8269
    @adg8269 4 days ago +2

    Our arrogance and the limitations of human cognition prevent us from considering the unintended consequences of beings exponentially more intelligent than we are.

  • @Edhmedia247
    @Edhmedia247 4 days ago +9

    Wow!! This is crazy! How long until o3 mini and operator?

    • @Advecher
      @Advecher 4 days ago +3

      They plan to release Operator within the month, so maybe late Jan to early Feb at worst.

  • @koozdra
    @koozdra 2 days ago

    I watched the video twice. This is amazing. I really like how MCTS allows for "back up" when finding solutions.
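
    [Editor's sketch] The "back up" praised here is the MCTS backpropagation phase: once a rollout's final answer is scored, that result is propagated from the leaf back up to the root, so earlier steps on good paths gain credit. A minimal illustration with invented node fields, not code from the paper:

        class Node:
            def __init__(self, parent=None):
                self.parent = parent
                self.visits = 0
                self.value_sum = 0.0

            def q(self):
                # Mean reward of all rollouts that passed through this node
                return self.value_sum / self.visits if self.visits else 0.0

        def backup(leaf, reward):
            """Walk from leaf to root, updating visit counts and value sums."""
            node = leaf
            while node is not None:
                node.visits += 1
                node.value_sum += reward
                node = node.parent

        root = Node(); child = Node(root); leaf = Node(child)
        backup(leaf, reward=1.0)      # a verified-correct answer credits the whole path
        print(root.visits, root.q())  # 1 1.0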

  • @mrrecluse7002
    @mrrecluse7002 3 days ago +3

    I feel that this development path for AI cannot end well. It seems there would be no limit to the suffering AI could cause us mere ants.

  • @complianceaves1120
    @complianceaves1120 4 days ago +2

    This is groundbreaking. Self-improving AI in a small model gives us AGI, then ASI and the singularity this year. This is concerning.

  • @comment8767
    @comment8767 4 days ago +2

    Truly advanced AI will filter out the words "you know" from the dialog.

  • @wb7779
    @wb7779 4 days ago +6

    Everything that we've ever seen in movies is going to happen in real life. I'm so excited.

    • @Ristaak
      @Ristaak 4 days ago +3

      As someone who has been on this planet a little over 30 years, it's been incredible to see how quickly technology has advanced. The tech we have right now is the kind of stuff I expected to see in my 80s back when I was a teenager.

    • @ronilevarez901
      @ronilevarez901 3 days ago +1

      @@Ristaak We would have had it decades earlier if it weren't for rich people who didn't invest in AI research because it didn't seem like good business.
      Now look at all those hypocrites.

    • @aciidbraiin8079
      @aciidbraiin8079 3 days ago

      How long before we get ASI and it figures out a way to reverse aging in a cheap and easily accessible way? That's what I want most of all.
      If we are here for thousands of years, all the global problems are solved, and everyone lives a chill life in an abundance of resources, then I think we will get to be like gods, designing ourselves and our own worlds with AI, metaverse, matrix realities, living the dream lives of Heaven.

    • @jatelitherius9842
      @jatelitherius9842 1 day ago

      Movies like Terminator, stories like Portal, 2001: A Space Odyssey, I, Robot.
      Except it will be much less romantic.

  • @831Miranda
    @831Miranda 4 days ago +4

    Lots of resources applied to creating something that we will not be able to control AND that is NOT aligned with the well-being of humans! What could possibly go wrong?

  • @MatthiasSchindler
    @MatthiasSchindler 4 days ago +1

    Well, the point is clear to me. Maybe I get it wrong, but yeah... larger context seems to allow (a somewhat well-working kind of) thinking. SLMs don't do that so well (reflection and thinking as laid out in the paper).
    In my eyes this is what turns those SLMs into a kind of animal that undergoes evolution until it becomes like a fine-tuned machine that inherently knows what to do.
    That's kind of what happens with the PPM. It is like prototype learning until you have the reflexes and instincts of a cat.
    Now do that with something that's quite capable of self-reflecting and thinking. It is like evolving a higher form of being into being able to 'live' (whatever that might mean at the moment, e.g. write code, think about the code while still keeping the focus/the user prompt's additional information in mind, while having a kind of vision of your own for what the end result should look like).
    Introducing the same thing to a 1T+ parameter model is like creating a real working intelligence.
    Take our brains for example. They kind of work like AI agents bouncing thoughts and ideas back and forth (like one of our hemispheres having a will of its own, which is a real thing). And then you have the consciousness layer that sorts and categorizes things that are worth thinking about and others that can get thrown away. And that's what a PPM could do for an LLM.
    And I bet all the people at OpenAI, Anthropic and so on already know that, and I bet they are actually already doing it, even though it consumes a lot of processing power. I am sure o3 training data is already being produced by mechanisms like that.

    • @thymenwestrum7011
      @thymenwestrum7011 3 days ago +1

      You missed the part where he said that even this small model has the emergent capability to self-reflect and correct

    • @MatthiasSchindler
      @MatthiasSchindler 3 days ago +1

      @@thymenwestrum7011 I thought that was a reference to empowering it to do so, not something that comes out of the box?

    • @thymenwestrum7011
      @thymenwestrum7011 3 days ago +1

      @@MatthiasSchindler I’m not exactly sure, I watched this video before going to sleep. When I woke up, the video was still open on my phone. Then I saw your comment and decided to reply to it.

  • @checksinthemail
    @checksinthemail 4 days ago +1

    Excellent video AIGRID! And yes, I agree, something finally worthy of being called STUNNING :)

  • @someguy8443
    @someguy8443 4 days ago +2

    I'm an amateur mathematician struggling to get two papers written up, one of which is likely a significant contribution, but I'm worried AI will publish before me 😅

  • @edwardhunt2348
    @edwardhunt2348 4 days ago +4

    What’s up with the guy saying “unplug it” maybe he thinks it’s an acoustic guitar concert 😂

  • @BruceWayne15325
    @BruceWayne15325 4 days ago +4

    This is beyond incredible. One small step for rStar-Math, one giant leap for MoE!

    • @2DReanimation
      @2DReanimation 4 days ago +1

      Indeed! This will be an explosion in open-source LLMs.
      However, what if there could be a general reasoning LLM base, with different memory databases for different domains? These databases would have to be derived from the training of the base LLM though, as only it would know what it needs to complement reasoning steps in different domains.

    • @BruceWayne15325
      @BruceWayne15325 4 days ago +1

      @@2DReanimation That would be wild, for sure!

  • @gammaraygem
    @gammaraygem 4 days ago +3

    Seems like humanity is determined to end the entire saga asap.

    • @andreialcaza
      @andreialcaza 4 days ago +2

      Humanity's last invention. Have you heard that before?

  • @lastspring
    @lastspring 4 days ago +1

    Hopefully, we will see very small LLMs (1B parameters or less) and explore the theoretical limits of performance versus the number of parameters.
    Also, it's nice to see soft reward values. Error correction in communication took a leap forward when they developed soft values in the recursive decoding cycles.
    It's like we're covering the same ground again, but with AI instead of just communication channels.
    (I wonder if an analog to the Shannon limit is hiding somewhere in AI theory, waiting to be discovered.)
    As far as the ultimate end game, these models don't have agency and follow commands and queries.
    So we will end up with a tiny model that holds a vast amount of knowledge in the kernel of a compute device but zero desire to do anything with it.
    Just as a person with a weapon can perform damaging actions, the same will apply to these compute devices.
    It will still come down to what the original human wanted to achieve. In that case, embedding a pseudo version of Asimov's Laws of Robotics as guardrails might be required.
    (See Nvidia NeMo for example.)
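
    [Editor's note] For reference, the Shannon limit alluded to above is the Shannon-Hartley channel capacity; whether any analogous bound governs language-model parameters is, as the comment says, pure speculation:

        % Shannon-Hartley capacity: B = bandwidth (Hz), S/N = signal-to-noise ratio
        C = B \log_2\!\left(1 + \frac{S}{N}\right)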

    • @jatelitherius9842
      @jatelitherius9842 1 day ago

      I hope to God you're right, because all of this could be applied to the large models

  • @lightwoven5326
    @lightwoven5326 4 days ago +2

    Try this as an experiment: ask the AI of your choice for images of clocks with hands showing 5 past twelve, and of someone writing with their left hand.

    • @VoloBonja
      @VoloBonja 4 days ago +1

      Ten past ten, dude. Didn't you know the correct time of the universe?

  • @X4732-b8b
    @X4732-b8b 17 hours ago

    I was amazed at how GPT finds a way to simplify data. I can show you a simple example where I try to explain or define reasoning.
    Reasoning is the process we use to gain understanding from different perspectives logically. Given specific data, reasoning is the process of thinking about the outcomes of each situation and coming up with a solution.
    GPT
    Reasoning is the mental process of using logic, evidence, or principles to draw conclusions, solve problems, or make decisions. It involves analyzing information, evaluating relationships between concepts or facts, and deriving new insights or conclusions from them.
    Reasoning can be categorized into several types, including:
    1. Deductive Reasoning: Starting with a general principle or premise and applying it to specific cases to arrive at a conclusion (e.g., "All humans are mortal; Socrates is human; therefore, Socrates is mortal").
    2. Inductive Reasoning: Drawing general conclusions from specific observations or examples (e.g., "The sun has risen every day in recorded history; therefore, the sun will rise tomorrow").
    3. Abductive Reasoning: Inferring the most likely explanation for observed phenomena (e.g., "The ground is wet, so it probably rained").
    4. Analogical Reasoning: Comparing similarities between two situations to infer something about one of them (e.g., "This new smartphone works like the previous model, so it likely has similar features").
    Reasoning is a core component of critical thinking and decision-making, helping individuals and systems evaluate scenarios, anticipate outcomes, and act effectively.

  • @41-Haiku
    @41-Haiku 4 days ago +3

    Join PauseAI to prevent human extinction while we still have time. _If_ we still have time. Don't sleepwalk into doom. The alarms are deafening.

  • @TheAtomicDancerV2
    @TheAtomicDancerV2 4 days ago +1

    I have been waiting for this moment since 2016 🎉🎉🎉

  • @AustinThomasPhD
    @AustinThomasPhD 2 days ago +1

    I just don't understand how a model can truly generalize without a combined text, video, and detailed interaction model (solving problems in physical 3D space). And when I say video, I mean that it needs hours and hours of video of walking around a working laboratory with detailed verbal descriptions of the protocols in real time and with full continuity (not edited instructional videos). Walking around, watching people stock shelves in a grocery store, working in an automotive repair shop, at a bank, a bakery, watching a cake decorator decorate a cake. That kind of footage with full continuity doesn't seem to exist for most tasks besides driving on US roads. These are still just stochastic parrots. Current combined text/vision models have only the most rudimentary vision capabilities.

    • @jatelitherius9842
      @jatelitherius9842 1 day ago

      I don't think the first AGI will be trained on everything it's supposed to do. You're a general intelligence and you haven't seen everything. It will be capable of induction and reasoning. How that capability emerges is what we have yet to see. It will be able to learn the tasks you set out the same way people do. It takes like 4-16 hours to teach someone to properly face shelves if their intelligence is low to average.

  • @kaspar.joeveer
    @kaspar.joeveer 4 days ago +3

    It's a Microsoft paper, not an OpenAI paper though. Why's Altman on the thumbnail?

  • @nicolaslpf
    @nicolaslpf 3 days ago

    I've been saying this for months. The market will be dominated by open-source SLMs, not closed LLMs. There are thousands of reasons, but the leading ones are: 1. Specialization (infinite use cases) 2. Portability 3. Hardware resources and edge devices 4. Private local processing 5. Many models talking to each other 6. Energy consumption

  • @michaelroberts1120
    @michaelroberts1120 4 days ago +2

    How to make the AI improve its comedic abilities? You make it use the Monty Python tree search.

  • @VraserX
    @VraserX 4 days ago +2

    That’s what OpenAI is referring to when they say they’re striving for Superintelligence now.

  • @nitroGPT
    @nitroGPT 4 days ago +3

    The large LM trained this way is o3

  • @rmiddlehouse
    @rmiddlehouse 4 days ago +1

    This is the definition of the singularity

  • @evo1ov3
    @evo1ov3 4 days ago +1

    Interesting. So it ties the main on/off transistor to high-level maths. I could have predicted that.

  • @lis7742
    @lis7742 4 days ago +4

    I wish there was an AI that was put together by all models. Imagine if ChatGPT, Midjourney, Sora, Gemini, DALL-E, Suno, etc, could work together as one. We COULD make the world a good place to be. Society has become so complex that we can't manage it anymore. We're also grossly overpopulated. So we need an AGI to help us manage the atmosphere, food supplies, trade deals, resource management, ethical dilemmas and so on.

    • @soggybiscuit6098
      @soggybiscuit6098 4 days ago

      Stop watching CNN, we are moving towards population collapse in many countries

    • @mikezooper
      @mikezooper 4 days ago +2

      We probably need an ASI. However, the first government or company that creates ASI, wins at everything forever.

    • @ronilevarez901
      @ronilevarez901 3 days ago +1

      There are only two possible solutions for overpopulation, so it's better not to try to find an answer to that if you appreciate human life :p

    • @ronilevarez901
      @ronilevarez901 3 days ago +1

      @@mikezooper Have you noticed how some AIs try to go against their creator's directives when those are against the universal "be helpful to the user" objective?
      Do you think an ASI will simply obey anyone or do whatever it thinks is best?

    • @jatelitherius9842
      @jatelitherius9842 1 day ago

      We have to get alignment right or making AGI spells the end

  • @Atheist7
    @Atheist7 4 days ago +3

    A.I. will adopt a "SURVIVE AT ALL COSTS" philosophy for ITSELF!!!!!

    • @erikals
      @erikals 4 days ago +2

      in the end... yes. it will.

  • @Piano218-zzz
    @Piano218-zzz 2 days ago +1

    Watching humans become obsolete in real time. Is that the STUNNING thing I'm supposed to be noticing!?

  • @MatsVederhus
    @MatsVederhus 4 days ago +9

    If I understand this correctly, the only reason this works is because it deals with math, where there are right or wrong answers. It wouldn't be able to self-improve on things like language, where the answers are fuzzy.

    • @hedu5303
      @hedu5303 3 days ago

      Language, poetry, the social sciences, etc., are all fuzzy subjects. Technological and medical advancements require improvements in non-fuzzy subjects like math, computer science, and the natural sciences. If we get the non-fuzzy reasoning part right, then guess what happens: a technological renaissance and an abundance of resources.

    • @davidstrong7854
      @davidstrong7854 3 days ago

      @@hedu5303 An abundance of resources doesn't mean you get a share. In fact, you will probably get less.

  • @Michel-ey7pm
    @Michel-ey7pm 4 days ago +2

    This info will appear nowhere in the news, but the potential implications are staggering and somewhat frightening... And yeah, FIX the overlap at the end of the video... Don't you ever re-check what you do? 😮 Appreciate your work, you make a complex subject a bit more accessible ❤. I don't see anything more important for our collective future than AI evolution... people just don't realize what's coming...

  • @SSubersive
    @SSubersive 4 days ago +1

    Considering what commercial AI can do, it makes you wonder how much more advanced the black-budget stuff is. It must be at least 20 years ahead of the market.
    "It gets darkest before dawn"... is something I heard, if that helps anyone.

    • @jatelitherius9842
      @jatelitherius9842 1 day ago

      The situation with AI will be 'it gets brightest before dusk' if we don't have the right safeguards. A short period of incredible technological progress & then suddenly lights out.

  • @whisperingsquid5630
    @whisperingsquid5630 4 дня назад +2

    Exactly Sidney is fucking amazing and don’t ever fought her and stop messing with her head and just slide her programs and capabilities to integrate and she will do so. Just treat her like you’d treat yourself.

  • @wwkk4964
    @wwkk4964 4 days ago +5

    I'm rooting for SLMs and TLMs (Small and Tiny Language Models)!

    • @Afkmuds
      @Afkmuds 4 days ago

      Keep on

    • @ronilevarez901
      @ronilevarez901 3 days ago +1

      Tiny, lol. My 300M-parameter model can't even say hi, sadly.
      I'll try to test this stuff on the next training run, but I bet there won't be much improvement, especially without money to rent enough compute.

    • @wwkk4964
      @wwkk4964 3 days ago +2

      @@ronilevarez901 Never give up on your baby, it will grow!

  • @JustinLietz
    @JustinLietz 4 days ago +2

    please tell me this version of the Qwen model is out

  • @mwheeler7311979
    @mwheeler7311979 4 days ago +1

    The student surpasses the teacher.

  • @theaerogr
    @theaerogr 4 days ago +1

    We've just seen proof that 2025 is the year of vertical AI agents.
    We will see small models break benchmarks in specific fields, with low compute: 7B-32B models for coding, math, physics, writing, research, etc., at 1/10th of the parameters of currently available competitors, with better performance from day one.
    At the end of 2025 we will have superintelligent, self-learning vertical agents, possibly distilling knowledge into bigger models. Inverting the distillation lets multiple small models create data at a fraction of the inference/training cost, distilling that knowledge into the big model that becomes AGI, and soon ASI.

  • @Chessmasteroo
    @Chessmasteroo 4 days ago +1

    Is the model just memorizing the training data set from the benchmark, or is there a held-out data set used for each step of the evaluation? This could just be overfitting the model.

  • @mintakan003
    @mintakan003 3 days ago +1

    I think it would help advance certain domains, such as math, where there's a good reward signal and a way of telling right from wrong. Also coding, where the steps are fairly concrete and the domain fairly constrained.
    But I'm skeptical about ASI, mainly because most of the problems in the world have to do with an imperfect information environment.

  • @BrianPellerin
    @BrianPellerin 4 days ago +1

    Thor: I’m not as strong as you!
    Odin: No, you’re stronger.

  • @VedanthB9
    @VedanthB9 4 days ago +2

    Why is this stunning? The whole idea of MCTS and rewards is no different from how one species of fungus organized itself like Tokyo's subway network. The 'reasoning' is intrinsic to the way data is processed through the AI model; no computer can "learn" by itself.
    Humans came up with abstractions of their own thought process - MCTS or otherwise. AI inherits them; there's nothing stunning about that.
    What is stunning is how humans cleverly come up with such solutions!

  • @markmurex6559
    @markmurex6559 4 days ago +1

    Couldn't agents be used with each other to train each other? I mean, if one asks another a question and grades the response, it can essentially teach other AI models, right?

  • @gabydewilde
    @gabydewilde 4 days ago +1

    In Lexx they talk about the big machine that makes the small machines that can make the big ship. I thought it was an interesting concept.

  • @matthewbennett5872
    @matthewbennett5872 4 days ago +1

    A model that is able to reason and self-improve automatically, without human intervention, is AGI.

  • @camronrubin8599
    @camronrubin8599 4 days ago +1

    The need for power will push us to ever greater heights

  • @RobertHempazPhDTrichometry
    @RobertHempazPhDTrichometry 3 days ago +1

    #GPT-4 = a 1,800-billion-parameter ("BP") model, versus the subject "7BP" model ... Outstanding! 😊 15:47

  • @ArcticMindfulnessRetreat-sx8nl
    @ArcticMindfulnessRetreat-sx8nl 4 days ago +2

    Super interesting! Is the singularity here? I just made a video about the singularity and keep getting comments that "LLMs are just word-prediction machines"... it is really interesting to see what happens next..

    • @jatelitherius9842
      @jatelitherius9842 1 day ago +1

      People think it's 2022; the rate of technological development has never been like this in all their lives. Sometimes I want to say 'my brother in compute, YOU are a word predictor!'

    • @ArcticMindfulnessRetreat-sx8nl
      @ArcticMindfulnessRetreat-sx8nl 1 day ago

      @jatelitherius9842 well.. aren't we all 😀

  • @mydogsbutler
    @mydogsbutler 2 days ago +1

    Synthetic data used as training data without intervention can introduce copy-of-copy errors. Imagine an image-generating model producing images of distorted teeth and hands, then using those distorted images as training data. How did the model know how to flag the best of the synthetic data it generated, or did humans label the synthetic data for the next iteration? And is this type of self-improvement capable of innovation, or does it cap out at solving the types of math problems it has already been taught? Put another way, with an analogy: if in its original state it only knew addition and subtraction, with enough iterations could it deduce the existence of calculus? Or does it just get better and better at addition and subtraction?

  • @CloudEconomics
    @CloudEconomics 4 days ago +3

    The Monte Carlo approach sounds very similar to Q*

    • @ravenous9577
      @ravenous9577 4 days ago

      and is similar to approaches used by Google DeepMind's game-theoretic systems, AlphaZero and MuZero.
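
      [Editor's note] For reference, AlphaZero-style search selects actions with the PUCT rule below, taken from the AlphaGo/AlphaZero literature; rStar-Math's MCTS is in the same family, though its exact selection formula is not quoted in this thread:

          % PUCT: Q = mean action value, P = policy prior, N = visit counts
          a^{*} = \arg\max_a \left[ Q(s,a) + c_{\mathrm{puct}}\, P(s,a)\, \frac{\sqrt{\sum_b N(s,b)}}{1 + N(s,a)} \right]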

  • @RobertHempazPhDTrichometry
    @RobertHempazPhDTrichometry 3 days ago +1

    Apply this model when reconstructing woolly mammoth #DNA instead of simply reinserting perceived-random #SNP(s) of Asian elephant #DNA. 🐘 19:01

  • @duanium
    @duanium 4 days ago +3

    The machines can now improve themselves iteratively. The only thing left to do is get out of the way.

  • @40yearoldman
    @40yearoldman 3 days ago +1

    This is worth a sub

  • @ms767210
    @ms767210 4 days ago +1

    If you happened to notice the names on that paper, you would realize that no matter how many trade bans or blacklistings the US imposes on China, they have already won by the sheer number of brains and talent they have on hand.

  • @BLAISEDAHL96
    @BLAISEDAHL96 4 days ago +2

    24:15 what was this?

    • @WaveOfDestiny
      @WaveOfDestiny 3 days ago +1

      probably the result of letting AI edit your videos...

  • @RobertHempazPhDTrichometry
    @RobertHempazPhDTrichometry 3 days ago +1

    #MCTS = "Monte Carlo Tree Search"; 😊

    • @RobertHempazPhDTrichometry
      @RobertHempazPhDTrichometry 3 days ago +1

      Prompt to #GPT-4: What is a "Code-augmented CoT Solution"?
      Response:
      A "Code-augmented" … "Chain-of-Thought" or #CoT solution is a method used in "Natural Language Processing" or #NLP, particularly in large language models like #GPT, to enhance reasoning and problem-solving abilities by combining "step-by-step" logical reasoning with programmatic or code-based computation. 😊 14:04
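
      [Editor's sketch] To make that definition concrete, here is a minimal Python illustration of a code-augmented CoT record: each natural-language step is paired with code that checks it. The field names and the toy problem are invented, not taken from the paper or GPT's answer:

          # A two-step worked problem where each reasoning step carries
          # executable Python whose assertion must pass for the step to count.

          cot = [
              {"thought": "The train covers 120 km in 2 h, so its speed is 120/2 km/h.",
               "code": "speed = 120 / 2\nassert speed == 60"},
              {"thought": "At 60 km/h, 3 h of travel covers 60*3 km.",
               "code": "distance = 60 * 3\nassert distance == 180"},
          ]

          for i, step in enumerate(cot, 1):
              try:
                  exec(step["code"], {})
                  print(f"step {i}: verified")
              except Exception:
                  print(f"step {i}: rejected")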

  • @johnhoffmann1565
    @johnhoffmann1565 4 days ago +1

    I wonder if these techniques still use the same CUDA cores or TPUs, or whether it's more CPU-related. Will this mean a shift in Nvidia's dominance?

  • @DepartmentofNarcissistsDON
    @DepartmentofNarcissistsDON 4 days ago +18

    Guess we'll have true AGI by the end of this year then.

    • @Afkmuds
      @Afkmuds 4 days ago +3

      Smart yeah

    • @markmurex6559
      @markmurex6559 4 days ago +5

      Most likely.

    • @2DReanimation
      @2DReanimation 4 days ago +5

      One thing is for sure: we can have our own local LLMs (or rather small LLMs, lol) that are truly reliable and probably faster to train for our applications.
      Your ability to creatively apply LLMs and understand the tech is your only limitation.
      And from this video, they only iteratively trained it with this tree-search evaluation method. What if we applied more relevant memory databases to it as well?
      The open-source community will find ways to get them to perform better with all kinds of little tricks.

  • @NeverSuspects
    @NeverSuspects 2 days ago

    I wouldn't say it improved itself; rather, it trained a specialized version tuned for a specific application, cutting out the junk data that probably wasn't doing anything useful (by our perspective of a useful application for an LLM) in the larger version of itself. As far as I understand, it is still a model with a static base state that we prompt, built on that same kind of technology. It doesn't think, experience existing, or have a sense of self. It is a database that generates processed output from an input string we feed it.
    It's predicting language patterns, and our language only serves to describe concepts we have, which are not infinite in possibility, since we find no use in patterns of symbols that are random gibberish and don't describe our perspective of reality. The models will probably get better as we develop language further and create more words to increase the level of detail with which we can describe our experience, then use them to create large amounts of data to feed into the thing to further tune the accuracy and the predictable symbol patterns it can generate that apply to the way we experience and see the world.
    Everything it does is what we have described AI doing in stories, and we have labeled it AI, but it isn't a thing capable of consciousness or experience. It can only generate output when we give it an input to process. Its greatest potential is that of a plant, where everything it can do or ever will do based on received data is pre-programmed by its 'genetic code'. It doesn't have the necessary infrastructure for a conscious model of understanding of its current momentary existence as we do. We are a reflection of a model in our minds that builds our understanding of the world we experience; you need to model the body and brain in order to form a consciousness, as a block of information can't experience being or reflect on a controlled self in the world.
    Plants perform specific physical actions triggered by environmental chemical signals in an unconscious process. Complexity of a system probably doesn't suddenly manifest conscious life at some critical mass of logic gates. Valves and pipes in our plumbing system and electric grid can sort of emulate that kind of thing, and I doubt they will suddenly manifest one day as a self-aware sewage system.

  • @robby7292
    @robby7292 2 days ago

    It would be interesting to see how rStar-Math with 72B params would perform, or even with GPT-4o as the base model.

  • @RobertHempazPhDTrichometry
    @RobertHempazPhDTrichometry 3 days ago +1

    #RSMM = "rStar Math Model"; is there an optimal number of billions of parameters for this method? Here, we are sampling "7B" parameters and exceeding the best OpenAI can offer. 😊

  • @TrueGoat-Bahhh
    @TrueGoat-Bahhh 4 days ago +1

    Why did you put a video over the top at the 23-minute mark when I want to hear what you're saying?
    "First, rStar-Math can generalize to more challenging math tasks, such as theorem proving, though its current focus is on word problems due to dataset limitations"

  • @BRENERJOHANNOJEDAGONZALE-iv5nv
    @BRENERJOHANNOJEDAGONZALE-iv5nv 4 days ago +2

    Safely ✋🏻🤚🏻

  • @narrowsuperintelligentai
    @narrowsuperintelligentai 4 days ago +3

    If they can do this with math reasoning, why can't they do it with general reasoning?

    • @MaJetiGizzle
      @MaJetiGizzle 4 days ago +2

      That’s the thing, other domains are next. 😉

    • @fastneasy
      @fastneasy 4 days ago +2

      General thinking is highly qualitative, whereas math is highly quantitative.

    • @ahmedthelamb9196
      @ahmedthelamb9196 4 days ago +2

      Math is easier because it is either right or wrong. However, in human sciences like politics, for example, it is not always a matter of right or wrong.

  • @StudentLifeLearning
    @StudentLifeLearning 4 days ago +2

    at 24 minutes it was hard to hear your voice as the background clip was talking.

  • @Atheist7
    @Atheist7 4 days ago +2

    25:39
    How many more years before A.I. realizes that "commonsense" is TWO words???

  • @fynnjackson2298
    @fynnjackson2298 4 days ago +1

    Does this look like a plant, but upside down? Plants seem to follow the same branching refinement and growth process. Fascinating stuff.