The Right Way To Train AGI Is Just GOOD Data?

  • Published: 20 Nov 2024

Comments • 133

  • @bycloudAI
    @bycloudAI  21 days ago +17

    To try everything Brilliant has to offer, free, for a full 30 days, visit brilliant.org/bycloud/ . You’ll also get 20% off an annual premium subscription!
    (I reuploaded this video cuz there was a pretty big mistake at 6:13, sorry notifications!)

    • @ghulammahboobahmadsiddique8272
      @ghulammahboobahmadsiddique8272 21 days ago +5

      What was the mistake? I watched the original one so I don't want to rewatch this just to know what's changed. So could you please say what the mistake was?

    • @heys3th
      @heys3th 21 days ago

      @@ghulammahboobahmadsiddique8272 They had the wrong graphic/text for Class 4 Complex

    • @bycloudAI
      @bycloudAI  21 days ago +3

      @@ghulammahboobahmadsiddique8272 6:13 I showed class 3 chaos twice, I need to catch some sleep lol

    • @jamesgreen.3271
      @jamesgreen.3271 21 days ago

      @@ghulammahboobahmadsiddique8272 6:16 Here class 3 and 4 had the same captions and images. But now he's fixed it

    • @venadore
      @venadore 21 days ago

      Your editor still missed it in like two spots lmao

  • @weirdo8435
    @weirdo8435 21 days ago +160

    Such beautiful knowledge; I will not use it anywhere or talk about it with anybody.

    • @rmt3589
      @rmt3589 20 days ago +9

      I will! Gotta make my own AGI somehow.

    • @rawallon
      @rawallon 20 days ago

      @@rmt3589 that would be easier than actually making friends

    • @weltonbarbosa206
      @weltonbarbosa206 17 days ago +1

      Ironically, I feel the same... There is absolutely no one I know who would even understand the beauty of this concept compared to human brain functions, the sheer realization of how we perceive our own reality in the middle of chaos. It's not total chaos, it's just complex order.

  • @heys3th
    @heys3th 21 days ago +44

    These videos are so nice for someone like me with no technical background in machine learning. Thank you and please keep making more!

    • @GaiusAnonymous
      @GaiusAnonymous 20 days ago +1

      Careful, last time I said "pls never change" he changed a week later.

  • @Ikbeneengeit
    @Ikbeneengeit 21 days ago +168

    I fell asleep while listening to this video and dropped my phone on my wife's head and now she's mad.

  • @OperationDarkside
    @OperationDarkside 21 days ago +32

    This feels somehow similar to how physics is based on a seemingly simple set of rules, yet creates impossibly complex situations/states.
    There must be a limited set of core rules a base model needs to learn to become an effective reasoner.

    • @hugh8709
      @hugh8709 10 days ago +2

      It is very similar. These rules are called elementary cellular automata, and the naming scheme was developed by the physicist Stephen Wolfram. He has a theory that looks for something analogous to explain complex physical phenomena. It has something to do with hypergraphs (I think Sabine Hossenfelder has a video on it). The connections between complexity theory, physics, machine learning, and intelligence are extremely interesting.
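The elementary cellular automata mentioned here are small enough to sketch directly. In Wolfram's numbering, the rule number's 8 binary digits give the next state for each of the 8 possible 3-cell neighborhoods; a minimal Python sketch (the grid width, step count, and periodic boundary are arbitrary choices of this example, not anything from the video):

```python
def step(cells, rule):
    """One update of an elementary cellular automaton.

    Bit i of `rule` is the next state for the neighborhood whose
    (left, center, right) bits encode the integer i.
    """
    n = len(cells)
    table = [(rule >> i) & 1 for i in range(8)]
    return [
        table[(cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n]]
        for i in range(n)
    ]

# Rule 110 (the Turing-complete one) grown from a single live cell:
cells = [0] * 31
cells[15] = 1
for _ in range(12):
    print("".join("#" if c else "." for c in cells))
    cells = step(cells, 110)
```

Swapping 110 for rule 30 gives the chaotic Class 3 texture instead of rule 110's structured growth.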

  • @fnytnqsladcgqlefzcqxlzlcgj9220
    @fnytnqsladcgqlefzcqxlzlcgj9220 20 days ago +5

    Wolfram would love this

  • @zandrrlife
    @zandrrlife 20 days ago +9

    Bro, cover "Fourier Heads" or the belief state transformer. The Fourier head research is interesting; I see a lot of value in integrating Gaussian mixture model principles into LMs to better handle complex distributions.
    To be honest, one of my core principles is disentanglement. There's a reason we don't see the expected performance gains with multimodal data and reasoning in general: the model treats it all as a single continuous sequence. The solution I've been working on is multivariate next-token prediction, where each modality is considered separately; and yes, everything can be treated as a distinct modality, even reasoning via structured reasoning tokens. Instead of T = sequence length, it would be N x T, where N is the modality count, almost like a time series problem. It obviously increases memory for the sequence, but I've seen clear benefits and think it's the future. This is why I don't expect legit breakthroughs from any of the top players: no new ideas, or rather, no divergent ideas. AGI will be created by divergent thinkers. Someone already released Entropix, I believe it's called, which recreates o1-preview style outputs lol; it just needs DPO to really get that juice out. We need to fund our divergent thinkers.
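For what it's worth, the N x T layout the commenter describes can be sketched at the shape level. Everything below (the modality count, vocab size, and the random stand-in for a model head) is invented for illustration and comes from no released system:

```python
import math
import random

random.seed(0)
N, T, V = 3, 8, 20  # modalities, timesteps, vocab size -- all illustrative

# One aligned token stream per modality: an N x T grid rather than a single
# length-T sequence that interleaves every modality.
streams = [[random.randrange(V) for _ in range(T)] for _ in range(N)]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_next(streams):
    """Stand-in for a model head: one next-token distribution PER modality.

    A real model would condition jointly on all N streams; random logits
    are used here because only the shapes matter for the sketch.
    """
    return [softmax([random.gauss(0, 1) for _ in range(V)]) for _ in streams]

dists = predict_next(streams)                       # N distributions per step
next_tokens = [max(range(V), key=d.__getitem__) for d in dists]
print(next_tokens)  # N tokens emitted per timestep instead of 1
```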

  • @CYI3ERPUNK
    @CYI3ERPUNK 20 days ago +3

    very carefully , and with compassion and wisdom

  • @adamrak7560
    @adamrak7560 20 days ago +15

    You did not mention that rule 110 is Turing complete. It may not be because of the edge of chaos, but because of the Turing completeness.
    All Turing complete systems generally behave similarly to what they define as the edge of chaos, although you can construct some that hide this under apparent noise.

    • @rmt3589
      @rmt3589 20 days ago +4

      GPT-3 is already Turing complete. It's a bad test.
      Edit: I mixed up the Turing Test with Turing completeness. The above post makes no sense in its context.

    • @owenpawling3956
      @owenpawling3956 20 days ago +8

      @@rmt3589 I believe you are confusing the Turing test aka the imitation game with Turing completeness. Turing completeness refers to whether something can be used to simulate a Turing machine, which makes it computationally universal.

    • @4.0.4
      @4.0.4 20 days ago +1

      @@owenpawling3956 nothing "passes" the Turing test, as it depends on the participants. But LLMs are somewhat Turing complete if you assume infinite context to use as "tape".

    • @rmt3589
      @rmt3589 20 days ago +2

      @@owenpawling3956 I was. You are correct. I'm also technically not wrong, but did not convey what I wanted to.
      I will go fix my post.

    • @rmt3589
      @rmt3589 20 days ago +1

      @@4.0.4 LLMs passed the Turing Test long ago, back when there was a widespread conspiracy theory that Replika, which ran on GPT-3, was real people pretending to be AI.
      Now, with so many fake AIs, people literally cannot tell what's human and what's AI, and keep getting surprised when one turns out to be the other. This is a perfect and natural Turing Test, and it's being passed with flying colors.

  • @DeniSaputta
    @DeniSaputta 20 days ago +12

    2:31 Lack of words like "skibidi"

  • @poipoi300
    @poipoi300 20 days ago +15

    Multi-step prediction has been known for a while to perform poorly. It's best to either predict probabilities and sample, or predict a single timestep and recurse for more. LLMs do both.

    • @ZeroRelevance
      @ZeroRelevance 20 days ago +3

      It makes a lot of sense: to predict five steps in advance you'd need to predict one step 5 times in a row, but you only run the model once, so it'd have to take more shortcuts with each step's prediction, given that it has to fit five predictions in the same space, accumulating errors in the process.

    • @poipoi300
      @poipoi300 20 days ago

      @@ZeroRelevance What you said is true, but I think it's still possible to implement multi-step prediction in a performant manner. It depends on the specific problem, but generally I can see a lot of instances where timestep 5 does not rely on timestep 4, or maybe even 3, so there is no error to accumulate from those steps.
      Currently, one of the big drawbacks of predicting multiple steps (this is true of predicting multiple values for one step as well) is that the loss associated with each predicted value is only weakly accounted for, and chasing the gradient for an average increase in performance is likely to make some of the predicted values worse.
      What we need are better feedback mechanisms and more channels. MoEs are a sort of rudimentary solution to channels, but we're still relying on SGD over the whole network and, in some instances, manual freezing, which I don't really like from a technical standpoint. We need to be able to decompose problems into multiple losses, but that might not even be possible depending on the problem.
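The error-accumulation point in this thread is easy to see on a toy system. The dynamics and the slightly-wrong "learned" one-step model below are both made up for illustration:

```python
def rollout(step_fn, x0, k):
    """Forecast k steps ahead by recursing a one-step predictor."""
    x = x0
    for _ in range(k):
        x = step_fn(x)
    return x

true_step = lambda x: 0.9 * x    # ground-truth dynamics
model_step = lambda x: 0.91 * x  # hypothetical learned model, slightly off

x0 = 100.0
for k in (1, 3, 5):
    err = abs(rollout(model_step, x0, k) - rollout(true_step, x0, k))
    print(f"{k}-step forecast error: {err:.3f}")  # compounds with horizon
```

A direct k-step head faces the same compounding implicitly: it has to compress those k applications of the map into a single forward pass.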

  • @4.0.4
    @4.0.4 20 days ago +7

    The "long tail" really explains why AI slop is so mid: it is literally the middle of the distribution of language. And you can see it in most models, even if different wording is used.

  • @ConnoisseurOfExistence
    @ConnoisseurOfExistence 20 days ago +1

    Great video! I first heard about the brain being on the edge of chaos from Artem Kirsanov, who has a great channel (of the same name) on computational neuroscience. I'm thinking those models that try to predict 5 steps at once might ultimately be better, but they would require much longer training (and maybe greater size), and therefore more computational resources, to start learning some complex patterns. It could probably be tested with models that try predicting 2 steps ahead...

  • @tisfu17
    @tisfu17 18 days ago

    It seems almost obvious that chasing complexity horizons will also lead to increasingly complex output potentials, but seeing how this can be done in practice, and related back to OG cellular automata, is very cool.

  • @cagedgandalf3472
    @cagedgandalf3472 19 days ago

    What you mentioned reminds me of curriculum learning from RL: start off training easy, then gradually make it harder.
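A curriculum in that sense is mostly a sampling schedule. A minimal sketch, where the task generator (multi-digit addition) and the linear difficulty ramp are placeholder choices of this example:

```python
import random

random.seed(0)

def make_task(difficulty):
    """Placeholder task: addition with more digits as difficulty rises."""
    hi = 10 ** (difficulty + 1)
    a, b = random.randrange(hi), random.randrange(hi)
    return (a, b, a + b)  # (input, input, target)

def curriculum(total_steps, max_difficulty):
    """Yield tasks whose difficulty ramps linearly from 0 to max_difficulty."""
    for step in range(total_steps):
        d = step * (max_difficulty + 1) // total_steps
        yield make_task(d)

tasks = list(curriculum(100, 3))
print(tasks[0], tasks[-1])  # early tasks stay single-digit; late ones can reach four digits
```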

  • @MaxBrix
    @MaxBrix 20 days ago

    If you train a model to predict sine waves from discrete data points, it will approximate sine; the more training it gets, the closer the approximation. The model does not learn the sine function. The benefit is that, with missing or incorrect data, the model can still approximate the correct sine wave where an exact calculation would be completely wrong.
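The graceful-degradation claim above can be illustrated without a neural net, using a median over neighboring noisy samples as the stand-in for a learned approximation (the sample count, noise level, and injected bad point are all arbitrary choices of this sketch):

```python
import math
import random

random.seed(1)

# Discrete, noisy samples of sin(x); one sample is grossly corrupted.
xs = [i * 2 * math.pi / 200 for i in range(201)]
ys = [math.sin(x) + random.gauss(0, 0.05) for x in xs]
ys[100] = 5.0  # the "incorrect data" case

def smooth_estimate(i, half_window=10):
    """Approximate sin(xs[i]) from nearby samples; the median shrugs off the outlier."""
    window = sorted(ys[max(0, i - half_window): i + half_window + 1])
    return window[len(window) // 2]

raw_error = abs(ys[100] - math.sin(xs[100]))             # huge: trusts the bad point
smoothed_error = abs(smooth_estimate(100) - math.sin(xs[100]))
print(raw_error, smoothed_error)
```

An exact calculation that consumed ys[100] directly inherits the full error; the estimate fitted from many samples stays near the true curve.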

  • @ryanengel7736
    @ryanengel7736 20 days ago

    very cool and interesting. I had some similar intuition and I'm glad you discussed this paper. Great work.

  • @ashish54713
    @ashish54713 16 days ago

    Would love for you to cover the things going on around Entropix!

  • @ginqus
    @ginqus 20 days ago

    why do i perfectly understand some of your videos, and at the same time get absolutely confused by others?? 😭😭

  • @GodbornNoven
    @GodbornNoven 20 days ago

    Essentially, because the model learns to incorporate past states into its decision making, it becomes capable of better reasoning. AKA, this is just another case where transfer learning is truly an important key. Transfer learning, aka generalization, is also the reason why sample efficiency improves with training.

  • @MathewRenfro
    @MathewRenfro 21 days ago +11

    Stephen Wolfram's A New Kind of Science and its computational theory applied to training models is the way to go, methinks

    • @stephen-torrence
      @stephen-torrence 19 days ago +2

      Read it almost 13 years ago and felt I was accessing some seriously arcane shit that I was not ready to know. Amazing seeing it applied to transformers.

  • @sahilx4954
    @sahilx4954 20 days ago

    Thank you for your hard work. ❤ 🙏

  • @redthunder6183
    @redthunder6183 21 days ago +23

    This is basically just information theory…

    • @rmt3589
      @rmt3589 20 days ago +2

      Looking it up. Ty.

    • @fnytnqsladcgqlefzcqxlzlcgj9220
      @fnytnqsladcgqlefzcqxlzlcgj9220 20 days ago

      I have a playlist on my channel with info about this; there are some good lecture sets linked in it. It's not in order, so look through it to find stuff; it's more of a directory. It's called advanced apaSiddhanta @@rmt3589

    • @fnytnqsladcgqlefzcqxlzlcgj9220
      @fnytnqsladcgqlefzcqxlzlcgj9220 20 days ago

      @@rmt3589 I made a playlist about information theory, complexity, emergence etc

    • @warguy6474
      @warguy6474 17 days ago

      not rly

    • @redthunder6183
      @redthunder6183 17 days ago

      @ yeah, it's exactly the same thing; read through the original information theory paper from like 70 years ago

  • @Lexxxco1
    @Lexxxco1 20 days ago

    Great video as always, edge of chaos where understanding ends)

  • @tiagotiagot
    @tiagotiagot 19 days ago

    Kinda reminds me of how some fighter jets are designed to be on the edge of instability, allowing for more extreme maneuverability by controlling when to lose control

  • @Game99Boss
    @Game99Boss 20 days ago +2

    It's all about the "day daa" 😂

  • @ckq
    @ckq 20 days ago

    Need someone to train an LLM on NBA scores. Pretty simple, but it also has trends like scoring effects, comeback runs, and momentum

  • @Zbezt
    @Zbezt 14 days ago

    The collective human endeavor needs to mature before AGI can emerge from classical data; it's a simple fundamental that people don't understand

  • @timeflex
    @timeflex 20 days ago

    I think AI should constantly assess the state of play and pick an appropriate strategy, but in order to do that, it should be able to self-reason. Something like o1, but with an extra degree of freedom.

  • @retrofuturism
    @retrofuturism 20 days ago

    Intelligence is the edge of chaos in a map territory feedback loop

  • @6AxisSage
    @6AxisSage 21 days ago

    Been there, done that

  • @sgttomas
    @sgttomas 20 days ago +1

    fascinating ideas 💡

  • @KingKogarasumaru
    @KingKogarasumaru 16 days ago

    So in the end, training from small problems to big problems, like us humans do, is the best way to get better? Would the patterns be the equivalent of personality in human terms?

  • @michelprins
    @michelprins 20 days ago

    great video thx

  • @APozzi
    @APozzi 20 days ago +1

    One extra point for the "critical brain hypothesis" on its way to becoming a factual theory.

  • @kryptobash9728
    @kryptobash9728 17 days ago

    great vid

  • @smartduck904
    @smartduck904 20 days ago +9

    The proper way to train AGI is to put it in the square hole

  • @blinded6502
    @blinded6502 14 days ago

    I believe AIs should be taught similarly to humans. First they need to understand simpler artificial cases to perfection, sometimes giving them real-world cases. Like give them spinning cubes, then switch to concave meshes, and only then to compound scenes

  • @pajeetsingh
    @pajeetsingh 20 days ago

    Whichever supports my interests.

  • @drlordbasil
    @drlordbasil 20 days ago

    I always say AI is evolving fully backwards: vision is one of the basics, after the brain but before language and logic. We are doing it completely in reverse.

  • @setop123
    @setop123 20 days ago

    fascinating

  • @termisher5676
    @termisher5676 20 days ago

    What about an infinite language model,
    working on the entire output at once and constantly improving it?
    Like it was looking at the Library of Babel with all possible word combinations and searching for the best one using some search algorithm or something, but it'll need a beefy evaluation AI
    You could then use debuggers or language engines to prune combinations that are certain to fail

  • @md6886
    @md6886 20 days ago +1

    I have not seen a single AGI yet.

    • @tendriel
      @tendriel 20 days ago

      How so

    • @waterbloom1213
      @waterbloom1213 20 days ago +1

      No one has, and those who claim so are deluded.
      Mistaking advanced algorithms for reasoning or sentience is a problem among tech bros and data scientists alike, although for very different reasons.

  • @JaredQueiroz
    @JaredQueiroz 20 days ago

    Well, AGI folks, that's our future, doodling chaotically like a baby... Good news: chaos was the answer and emergence is awesome. Bad news: that embryo can beat you at chess...

  • @JinKee
    @JinKee 20 days ago

    The year is 2029 and the first AGI was raised Catholic

  • @io9021
    @io9021 21 days ago

    LLMs meet game of life 🤯

  • @Napert
    @Napert 20 days ago

    Garbage in, garbage out

  • @Nimifosa
    @Nimifosa 20 days ago

    Give me a pen and paper and I'll teach AGI anything and everything.

  • @warsin8641
    @warsin8641 19 days ago

    Intelligence is fragile; that's why it took so long to emerge

  • @problemsolver3254
    @problemsolver3254 17 days ago

    I'm hella sus about Intelligence at the Edge of Chaos: somehow they make a violin plot for complex rules despite having only 2 data points.

  • @robertburton432
    @robertburton432 19 days ago

    The Pandora's box of programming?

  • @telotawa
    @telotawa 20 days ago

    RLHF is the thing that makes them slop. Base models are still way better at good writing

  • @kryptobash9728
    @kryptobash9728 17 days ago

    What did training on rule 30 do?

  • @oonaonoff4878
    @oonaonoff4878 1 day ago

    Not Sierpinski's triangle

  • @eRuTAmMi_gNieB
    @eRuTAmMi_gNieB 20 days ago

    Where's the 34th rule?

    • @chelol208
      @chelol208 20 days ago

      Kazuma licks Aqua holes, by the way Colette from brawl stars have tasty legs 😋

  • @DuckGia
    @DuckGia 20 days ago

    Or you can just understand intelligence down to a fundamental level.

  • @shodanxx
    @shodanxx 21 days ago +1

    IN PUBLIC

  • @jamesgphillips91
    @jamesgphillips91 20 days ago

    Self-reflexive knowledge graphs... LLMs are one piece; the reasoning needs to be in a separate, non-black-boxed system. Neuro-symbolic ftw

  • @S8N4747
    @S8N4747 17 days ago

    What does rule 46 do? 🐾

  • @ainet8415
    @ainet8415 19 days ago

    I can't understand

  • @NLPprompter
    @NLPprompter 20 days ago +1

    whoa whoa dude slow down.... need another video please this time slower.... please

  • @sirtom3011
    @sirtom3011 18 days ago

    You don’t…😂

  • @Илья-у9в5с
    @Илья-у9в5с 20 days ago

    dream

  • @bokuboke482
    @bokuboke482 19 days ago

    Am I alone in noticing that A.I. learns a lot like bio brains? Study, reflect and connect, repeat incrementally to make REAL gains in comprehension and knowledge!

  • @vigoworkchannel1681
    @vigoworkchannel1681 20 days ago

    Keep trying to break the wall set by God. I can't wait to see the next magic

  • @panzerofthelake4460
    @panzerofthelake4460 21 days ago +8

    why does this video sound like it's AI generated tho

    • @float32
      @float32 21 days ago +3

      AI reaches bycloud level of intelligence. - 2024

  • @demonicedone
    @demonicedone 20 days ago

    🐢🐢🐊🦖

  • @Supermayne555
    @Supermayne555 21 days ago +2

    Thumbs down for the singing voice. It's annoying. Ending with a high pitch as if Becky were gossiping with Amanda all day long