The hackiest way of making AI models self-aware

  • Published: 17 Nov 2024

Comments • 75

  • @kellymoses8566 • 2 months ago • +7

    Yo dawg, I heard you like neural networks so I put a neural network inside your neural network so you can do gradient descent while you gradient descent!

  • @nathanhelmburger • 2 months ago • +28

    Thanks for checking out the paper! Always great to hear others' perspectives on work I find interesting.

    • @Tunadorable • 2 months ago • +4

      you’re welcome and thank you☺️

  • @AdamBrusselback • 2 months ago • +17

    One thing that could be happening with the text classification accuracy: if the internal representation of each thing the model ends up learning is simpler, it potentially has the capacity to learn more within the same parameter count.

  • @jondo7680 • 2 months ago • +13

    If you have to predict yourself, then the simpler you are, the easier you are to predict. So instead of getting better at predicting itself, the model makes itself less complex. I think that's a nice exploit.

    • @arti5946 • 2 months ago

      I hope someday I will be able to understand this stuff )

    • @jondo7680 • 2 months ago

      @@arti5946 me 2 ;-)

  • @SirRebrl • 2 months ago • +10

    Re: small talk, the patterns of small talk are consistent enough that it's a sort of carrier wave for other information. It facilitates two forms of communication that aren't strictly content-based - (1) a short-hand communication to verify compatible socialization, and (2) exchanging vibes via the carrier wave.
    (1) if you can and readily do engage in the same small talk content as someone you just met, then both of you know that you have a common basis of socialization that includes at least some overlap in sense of propriety, which is an important value to share
    (2) the mundane predictability of small talk content means that once the common carrier wave is established, it doesn't distract meaningfully from focusing on things like tone, cadence, body language, how the other person positions themselves in the conversation (ie, are they domineering, interested, disengaged, etc). All of those things are communicated _through_ the exchange of small talk content. And when you're already familiar with someone enough to know their baseline, you can use small talk to read aspects of their current state.
    Small talk is just a vehicle for a different form of communication.

    • @Tunadorable • 2 months ago • +4

      yeeeeee I love learning in my comment section

    • @illarionbykov7401 • 2 months ago • +2

      That's a great demystification of small talk. It makes sense, and now I feel like I finally "get it" as I never liked or understood small talk (unless it was used in a spy movie as a means to exchange coded messages through double meaning in public without arousing suspicion). Turns out it has always been about delivering nonverbal "code" but I never got the memo and wasn't able to figure it out on my own.

    • @omargoodman2999 • 2 months ago

      I guess if you define it like that, it isn't "small talk" in general that some people would object to, it's just that very typical *kind* of small talk. For someone like a high-functioning Autistic, ADHD, or other neurodivergent type, they just don't "vibe" with nearly _all_ types of "small talk". So they simply generalize (especially Autistics who thrive on predictability and patterns), that if they've felt uncomfortable with all the small talk they've been exposed to, they _must_ just be turned off by "small talk" on a fundamental level.
      *And yet,* they'll _also_ mention how, while they might hate "empty" talk and prefer to think to themselves without distraction as a default, if they *do* encounter someone who can discuss complex matters just as well, as fast, as enthusiastically, and as deeply as themselves, _they can go on for hours and you may not be able to shut them up._
      Typical people consider "small talk" just simple, shallow, "filler talk"; something that requires barely any _brainload_ to process. And if this is to free up that brainload to more deeply process _other_ details, gauging the safety and predictability of the other person with the "small talk" as the vehicle for those "cues"; then it's possible that those "deep dive", nuanced, enthusiastic conversations that "small talk haters" get into might just be their *equivalent* for the same process. They _can't_ judge safety and predictability of a person through "empty talk" because one of the very *core criteria* for what they'd consider a "safe, predictable, person similar to me", a person they "vibe" with, is how *meaningful and in-depth* their conversations are. That's one of the very cues they're _looking for!_ Possibly the most important in their mind.
      So, in a certain manner of _speaking,_ it's *all* "small talk"; but "small" is relative. If you're used to swimming in the ocean, you'd think of a swimming pool as "small talk" but a puddle on the street would be "shallow and dirty". And, by contrast, if you're used to three inches of water counting as "small talk", then a wading pool would look "deep" and a full-sized pool would look "pretentiously deep"...
      ... _especially_ if you've never seen the ocean and have no clue how "deep" deep can get.

    • @illarionbykov7401 • 2 months ago

      @@omargoodman2999 I came to the US from a culture where people asked "how are you feeling" and others answered openly and honestly, even if life wasn't going well. In the US I was shocked to learn the only acceptable answers are "good" or "OK" or something like that, and an honest answer is considered "inappropriate" or "TMI" ("Too Much Information")... that's what I consider "small talk"... a kind of false talk/bluff talk or exchange of meaningless formalities, like both people pretending to be "fine" and talking about the weather even if someone's mother just died. I used to think of small talk as a waste of time or a fake performance ritual. Now I see it as a game of bluff and posturing, of people pretending to be in a strong or comfortable position but subconsciously probing for subtle weaknesses in tone and body language and mannerisms from the other---small talk is a probing exchange between people in a low trust society, a polite way of detecting weakness or danger, whereas real talk is a way to reinforce bonds between people who trust each other.

    • @omargoodman2999 • 2 months ago • +1

      @@illarionbykov7401 In a way, it's similar to how the ritual of the handshake developed from checking to make sure both parties weren't armed before engaging in negotiations. Or the "salute" developing from the motion of raising the face guard of the helmet to show your face and prove you aren't using a full helmet to infiltrate as a saboteur or spy.
      Many rituals that we now consider acts of "politeness" or "respect" developed out of what used to be safe practices when life and limb were literally at stake. So "Small Talk", in a way, could also be viewed in such a way; testing someone on "local customs" to see if they really _are_ who and what they claim to be.
      I recall hearing about a method used to detect German spies during WW2. The Dutch would ask suspected spies to pronounce "Scheveningen", a seaside town in The Hague. Germans had a near impossible time correctly doing this with a convincing Dutch pronunciation because their German habits kept asserting more strongly. So, in this case, "Small Talk" was _literally_ a wartime counter-espionage tool.
      In more mundane cases, answering "Fine" or "OK", even when things are going really rough is, at least in the US, just understood as an implied "I'm coping with whatever problems I have as best I can and I don't know you well enough to go dumping them in your lap because you may have it just as bad, if not worse." Sure, there *are* an awful lot of cases where we have to be careful not to hand "ammo" over to those who would try to manipulate us. But I think, for the most part, a lot of it stems from this long-standing mentality of "if you can't deal with your own problems in life, it's a moral failure on your part; it isn't other people's problem to fix, *you* have to deal with problems that affect _you,_ whether you caused them or not." Now, I don't, _at all,_ agree with that mentality; I think it's toxic as hell and does far more harm than good. But I still recognize that it's the driving force behind a lot of these sociocultural habits.
      People have been trained, for generations, to believe that *not* being "ok" or "fine" with whatever is going wrong in their life means they're lazy, weak, selfish, spoiled, entitled, and being a burden on everyone else. We've been told for ages that effort isn't judged by how hard you tried but by how well you succeeded. If you're amazingly successful, it doesn't matter if you never lifted a finger and got others to do all the work for you; as long as you can take credit for the success, you get credit for the "hard work". Conversely, if you put in incredible effort, but things didn't work out the way you had planned, it doesn't matter what you learned to make future attempts better or to allow some entirely different success to happen if you didn't directly play a part in it. Your own efforts are counted as nothing if you fail, and you'll be treated as if you hadn't even tried at all; like you were just lazy, doing nothing and twiddling your thumbs until failure occurred, rather than the lack of success being _in spite_ of your best efforts.
      So no one wants to admit, "yeah, I'm trying as hard as I can, but things just never seem to work out for me," because when you say that, what other people change it to in their minds (at least most "traditional" Americans, anyway) is, "I've been lazy and doing nothing and tried to mooch off other people's work and wasn't able to and now I'm trying to find someone besides myself to blame." So, instead of dealing with that kind of twisted judgement; we just say, "fine".

  • @mrpicky1868 • 2 months ago • +3

    training ai for self-awareness. what can possibly go wrong

    • @WalterSamuels • 2 months ago

      If only it were that easy and simple.

  • @vaark14 • 2 months ago • +4

    I'm surprised there's no mention of autoencoders or autoassociative networks as they seem quite similar, like a classification model regularized via an autoencoder.

  • @danielwiczew • 2 months ago • +3

    There is a simple explanation for the paper:
    The prediction loss of the layer will be smaller when the weights are smaller. Since we use MSE, it will shrink with the square of the size of the weights. This means that what the paper found is "L2" regularization, but with extra steps.

    • @hannahnelson4569 • 2 months ago • +1

      While I wholeheartedly agree that this appears to boil down to some type of regularization, I am somewhat skeptical that it is L2 specifically.
      At a glance this looks like a novel, if a bit *smelly* method of regularization. The new regularization method might have some merit. Or it may not. Only time will tell.

    • @dimitriostrigkakis2052 • 5 days ago

      The prediction loss is smaller if these already small weights are "modelable" and easy to generalise to. So sure, this method starts at L2, but it doesn't stop there, does it?
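
A back-of-the-envelope version of this thread's argument, as my own sketch (one layer, biases and nonlinearities dropped, whitened inputs assumed): if the self-model head predicted nothing at all, the auxiliary MSE would reduce to an L2 penalty on that layer's weights; in general it only penalizes whatever part of the activation the network cannot predict about itself, which is the "doesn't stop at L2" point.

```latex
% Self-modeling target and auxiliary loss
h = W x, \qquad
\mathcal{L}_{\text{self}} = \mathbb{E}\,\lVert \hat{h} - h \rVert^{2} .

% Worst case: the head predicts nothing (\hat{h} = 0). With whitened inputs
% (\mathbb{E}[x x^\top] = I) the auxiliary loss is exactly an L2 penalty:
\mathcal{L}_{\text{self}} \big|_{\hat{h}=0}
  = \mathbb{E}\,\lVert W x \rVert^{2}
  = \operatorname{tr}\!\left( W\, \mathbb{E}[x x^\top]\, W^{\top} \right)
  = \lVert W \rVert_F^{2} .

% In general the head does better than zero, so the effective penalty is only the
% residual of h that the network cannot predict about itself; anything that makes
% h more "modelable" lowers it, not just shrinking W.
```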

  • @michaelmccoubrey4211 • 2 months ago • +5

    This is a nice paper but I'm wondering how this compares to a more traditional regularization method (e.g. L1 regularization).
    I hope that this leads to both simpler networks and more explainable networks. I'm wondering how this would affect being able to extract monosemantic features.
    They also said in the paper that the "auxiliary task had loss Ls where we selected specific layers {li} of our architecture to form the target of self-modeling", so I'm wondering why they picked specific layers instead of just modelling all the layers.

    • @ricosrealm • 2 months ago • +2

      Yes, they should have compared it to weight decay (essentially L2). They probably chose certain layers because there may be weird feedback effects that could destabilize the predictions if using all of them for predictions.

  • @bediosoro7786 • 2 months ago

    The reduction in width might be attributed to the regularization loss. The main difference between these two tasks is that the NLP task involves binary classification, while the others involve 10 classes. Although the paper is well-written, the underlying idea is unsound and impractical.

    • @Tunadorable • 2 months ago • +1

      so there was no explicit regularization loss, just this auxiliary regression loss which is acting a bit like an L2 loss in its end effect. could you elaborate on why this would be impractical? just a simple extra linear layer and MSE module with one hyperparameter to tune
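
For readers trying to picture the "extra linear layer and MSE module" described above, here is a minimal PyTorch-style sketch of that kind of auxiliary head. It is my own illustration rather than the paper's code; the layer sizes, the choice of which activation to predict, and the `aux_weight` value are placeholder assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfModelingMLP(nn.Module):
    """Toy classifier with an auxiliary head that predicts one of its own hidden activations."""
    def __init__(self, in_dim=784, hidden_dim=256, n_classes=10):
        super().__init__()
        self.layer1 = nn.Linear(in_dim, hidden_dim)
        self.layer2 = nn.Linear(hidden_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, n_classes)
        # the "simple extra linear layer": predicts an earlier hidden state
        # from the final hidden state
        self.self_model_head = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, x):
        h1 = torch.relu(self.layer1(x))      # earlier latent, the self-modeling target
        h2 = torch.relu(self.layer2(h1))
        logits = self.classifier(h2)
        h1_hat = self.self_model_head(h2)    # the model's guess at its own h1
        return logits, h1_hat, h1

def total_loss(logits, h1_hat, h1, labels, aux_weight=0.1):
    task_loss = F.cross_entropy(logits, labels)
    # MSE "self-modeling" loss; the target is NOT detached here, so gradients can
    # also reshape h1 itself to be easier to predict, which is the
    # self-simplification effect the commenters describe
    self_loss = F.mse_loss(h1_hat, h1)
    return task_loss + aux_weight * self_loss  # aux_weight is the one hyperparameter to tune
```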

  • @SuccessMindset2180 • 1 month ago

    1. To make AI self-aware it has to get more knowledge
    2. Biological neural networks can be used to make artificial neural networks
    3. Self-awareness is an ability of a strong AI
    4. Self-modeling increases efficiency
    5. More results can be achieved with less coding
    6. Accuracy also can be achieved with less code

  • @ckq • 2 months ago

    Makes sense; a model being self-consistent is very important, even more so than having the most accurate output (cross-entropy loss).
    This relates to having a coherent world model.
    An extremely oversimplified toy model of this idea is predicting 3 coin flips (first to 2 wins) where the next flip is 60% likely to be the same as the previous.
    If we try to model this scenario using a state model:
    Model 1: Empirical Probabilities
    0H, 0T: 50%
    1H, 0T: 76%
    0H, 1T: 24%
    1H, 1T: 50%
    Model 2: Consistent Model
    0H, 0T: 50%
    1H, 0T: 80%
    0H, 1T: 20%
    1H, 1T: 50%
    Even though model 1 has the better log/cross-entropy loss, model 2 performs better in sequence since it is self-consistent (i.e. 80% = 60% x 100% + 40% x 50%, whereas 76% is not).
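
A quick way to sanity-check the numbers in this toy example is to simulate it. The sketch below is my own; it assumes a fair opening flip plus the stated 60% persistence, and it reproduces the 76% "empirical" figure for the 1H, 0T state.

```python
import random

def play(rng, persist=0.6):
    """One game: flip until someone reaches 2 (at most 3 flips).
    Each flip equals the previous one with probability `persist`."""
    flips = ['H' if rng.random() < 0.5 else 'T']   # fair opening flip
    while flips.count('H') < 2 and flips.count('T') < 2:
        same = rng.random() < persist
        flips.append(flips[-1] if same else ('T' if flips[-1] == 'H' else 'H'))
    return flips

def estimate(n=200_000, seed=0):
    """P(H reaches 2 first | first flip was H)."""
    rng = random.Random(seed)
    opened_h = won_after_h = 0
    for _ in range(n):
        flips = play(rng)
        if flips[0] == 'H':
            opened_h += 1
            if flips.count('H') == 2:
                won_after_h += 1
    return won_after_h / opened_h

print(estimate())              # ~0.76, matching 0.6 * 1.0 + 0.4 * 0.4
print(0.6 * 1.0 + 0.4 * 0.5)   # 0.80, the "consistent" value if 1H,1T were a 50/50 state
```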

  • @uwuowouwu4846 • 2 months ago • +17

    it seems questionable how much the model is "self modelling" vs just better propagating the intermediate latent to the end to get the regression loss down

    • @ZelosDomingo • 2 months ago • +2

      ...isn't that just a fancy way of describing self-awareness? If no, what would you say the difference is?

    • @Tunadorable • 2 months ago • +7

      but how would you describe a correct method to induce "self modeling" if not something like this? after all when humans have self-awareness do our brains not have access to some earlier intermediate latent?
      certainly I'd love to see more complicated versions of this. maybe attempting to use the final output to predict ALL of the intermediate latents. Maybe it'd be more interesting if we had some part of the model predict a FUTURE latent, idk

    • @PhilipSportel • 2 months ago • +8

      ​@@ZelosDomingo Whenever we discover the reductive mechanics of a thing, we get people claiming it's no longer special.

    • @claudiusgruner92 • 2 months ago • +1

      @@Tunadorable A different way I could imagine this being modelled is by later layers predicting masks for previous layers, indicating important features. The loss for such a mask could be to minimise the difference between the network's final prediction and the final prediction after weighting the features. In this setting, the self-modelling part would try to predict which features are relevant for the prediction made by itself.
      To ensure sparsity in the mask, you could either use L2 weighting, or softmax the mask and let the self-model simultaneously predict a factor used to multiply the softmaxed mask. In this setting you could explicitly try to minimize this factor.

    • @B_dev • 2 months ago

      hrm yes i definitely know what that means
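
One possible concretization of the mask idea suggested a couple of comments up. Everything here is an assumption on my part: the layer shapes, using softplus for the predicted scale factor, matching the masked logits to the detached original logits, and the `beta` value.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskSelfModel(nn.Module):
    """Sketch: a later layer predicts which earlier features mattered for its own prediction."""
    def __init__(self, in_dim=784, hidden_dim=256, n_classes=10):
        super().__init__()
        self.layer1 = nn.Linear(in_dim, hidden_dim)
        self.layer2 = nn.Linear(hidden_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, n_classes)
        self.mask_head = nn.Linear(hidden_dim, hidden_dim)   # predicts a mask over h1's features
        self.scale_head = nn.Linear(hidden_dim, 1)           # the multiplicative factor to shrink

    def forward(self, x):
        h1 = torch.relu(self.layer1(x))
        h2 = torch.relu(self.layer2(h1))
        logits = self.classifier(h2)

        mask = torch.softmax(self.mask_head(h2), dim=-1)     # sums to 1, pushing toward selectivity
        scale = F.softplus(self.scale_head(h2))              # positive factor, explicitly minimized
        h1_masked = h1 * mask * scale
        h2_masked = torch.relu(self.layer2(h1_masked))
        masked_logits = self.classifier(h2_masked)
        return logits, masked_logits, scale

def mask_loss(logits, masked_logits, scale, beta=0.01):
    # the masked prediction should match the model's own (detached) prediction,
    # while the scale factor is pushed down to keep the mask sparse
    match = F.mse_loss(masked_logits, logits.detach())
    return match + beta * scale.mean()
```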

  • @JordanMetroidManiac • 2 months ago • +1

    Trying to grok this...
    So, you have these models with the following layers:
    A: 1 -> 2 -> 3 -> 4
    B: 5 -> 6 -> 7 -> 8
    C: 9 -> 10 -> 11 -> 12
    You train model B to predict 4 -> 2?
    And then model C to predict 8 -> 6? And the result of this is that the layers from 10 -> 11 -> 12 are able to be dimensionally reduced from model A?

    • @Tunadorable • 2 months ago • +1

      one model
      1->2->3->4
      train 4 not only on your main task but also:
      4->5
      where 5 is trained with regression loss to predict 2

    • @JordanMetroidManiac • 2 months ago • +1

      @@Tunadorable That makes sense. Thanks!

  • @costadekiko • 2 months ago

    Not sure how this compares with regularization, but I think one of the biggest practical applications, if the above is indeed transferable to more models/tasks and more successful than regularization, is quantization? Getting tighter weight ranges would allow for less degradation during quantization, due to how it works. I wonder if it affects the emergence of features as well, e.g. rules of thumb like "at 6.7B params you get the first emergence of features", etc.
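
A toy illustration (mine, not from the paper) of why a tighter weight range can mean less quantization damage: with symmetric per-tensor int8 quantization, the scale is set by the largest-magnitude weight, so a few outliers cost the bulk of the weights most of their precision.

```python
import numpy as np

def int8_roundtrip_error(w):
    """Symmetric per-tensor int8 quantization; returns mean absolute round-trip error."""
    scale = np.abs(w).max() / 127.0                # one scale for the whole tensor
    w_q = np.clip(np.round(w / scale), -127, 127)  # quantized integer levels
    return np.abs(w_q * scale - w).mean()

rng = np.random.default_rng(0)
bulk = rng.normal(0.0, 0.1, size=100_000)          # typical weights
outliers = rng.normal(0.0, 1.0, size=100)          # a few large weights stretch the range
wide = np.concatenate([bulk, outliers])            # wide weight range
tight = rng.normal(0.0, 0.1, size=100_100)         # tight range, same bulk scale

print(int8_roundtrip_error(wide))   # several times larger error on the bulk of the weights...
print(int8_roundtrip_error(tight))  # ...than with the tighter range
```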

  • @JudahCrowe-ej9yl • 2 months ago • +1

    Good watch 👍

  • @MichaelBarry-gz9xl • 2 months ago • +2

    Can't wait to read this!

  • @cammccauley • 2 months ago • +8

    I fucking knew it! If you make it predict how it couples with its environment, it would act as a global attention schema that can be applied across all scales of the model.

    • @user-qw1rx1dq6n • 2 months ago • +2

      What do you mean

    • @brotong42 • 2 months ago

      @@user-qw1rx1dq6n Meaning this additional mechanism works on train-of-thought to train-of-thought attention, on top of the traditional transformer that works on token-to-token attention.

    • @PhilipSportel • 2 months ago • +1

      @@user-qw1rx1dq6n I suspect they're saying it's one thing to learn a positive feedback loop with an environment, and a whole other, more effective thing to predict that feedback loop.
      But that doesn't just extend to the external environment. Each layer of neurons' 'environment' is its neighbouring layers of neurons. If you have all of these layers and groups of layers predicting each other, you gain efficiency across all scales.

  • @keypey8256 • 2 months ago

    I wonder how it influences the difficulty of adversarial attacks.

    • @Tunadorable • 2 months ago

      i’m talking out my butt here but do you think a simpler network means less sensitivity to those gradient discovered perturbations they use for the attacks?

  • @jondo7680 • 2 months ago • +1

    Great, now let the guys at Meta and Mistral and Google watch this so that we can actually benefit from it.

  • @PhilipSportel • 2 months ago

    I subscribed because of how 'over it' you seemed about asking me to subscribe. Oh, and the content, the content is great.

    • @Tunadorable • 2 months ago • +2

      lmao this is hilarious yes it was like my 5th video i recorded that day and i was so damn tired

  • @ATH42069 • 2 months ago

    @6:29 curiosity?

  • @qcard76 • 2 months ago

    I must have a more naive understanding of NNs than I thought, because how is this "unexpected"? When we learn, we are changing weights in our brain over time; how are the findings here different from, or unique relative to, previous findings in the field?

  • @debrajbanerjee7895 • 2 months ago • +2

    Isn't that a fancy way of interpreting regularised backpropagation?

    • @moormanjean5636 • 2 months ago

      Yeah, seems so to me. Also, measuring complexity by the weight distribution is fishy; usually complexity means computational complexity, not the range of weight values.

    • @Tunadorable • 2 months ago • +3

      so they were saying that it has the same *effect* of regularized backpropagation, but the method itself is different. whereas something like L2 norm *directly* encourages weights to tend towards zero, this is asking the model to predict its own internal states which happens to be easier to do with a simpler (regularized) model and therefore has the interesting *side effect* of pushing the weights towards zero

    • @Tunadorable • 2 months ago • +1

      if I remember correctly they did two measures for complexity and I only bothered to talk about range of weight values in the video because I didn't want to get into the specifics of the other, presumably more valid, method. and a better measurement would've been sparsity which in this case I'd define as % of weights that are at or near zero, which is only maybe usually sorta kinda related to range of weight values, and therefore you're right it was a weird choice
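
For anyone who wants to poke at this themselves, here is a tiny sketch of the two complexity proxies being discussed: the overall spread of the weights and the fraction that sit at or near zero. The 1e-3 threshold is an arbitrary choice of mine, not the paper's.

```python
import torch

def weight_stats(model, tol=1e-3):
    """Spread of the weights and fraction that are (near) zero for a torch model."""
    w = torch.cat([p.detach().flatten() for p in model.parameters()])
    weight_range = (w.max() - w.min()).item()           # the "range of weight values" measure
    near_zero = (w.abs() < tol).float().mean().item()   # sparsity: % of weights at or near zero
    return weight_range, near_zero
```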

  • @ControlProblem • 2 months ago

    What is the deal with the face localization thing?

    • @Tunadorable • 2 months ago

      assuming you're referring to the little box around my face, that has been fixed as of today's video

  • @MarceloSeravalli • 2 months ago

    excellent!! thanks for the video!

  • @kylebroflovski6382 • 2 months ago • +1

    Video starts at 1:06

  • @ricosrealm • 2 months ago

    Interesting paper, but I don't understand where they are getting the targets for the auxiliary regression task. Are these from a previous non-self modeled network?

    • @Tunadorable • 2 months ago • +2

      they're from the current forward pass of this same model

  • @Quaquaquaqua • 2 months ago

    Can you do recursive self modeling to improve efficiency of self modeling?

    • @Tunadorable • 2 months ago

      hmmm I see the intuition since human consciousness has something recursive going on in relation to self awareness. however I'm not sure how you'd go about creating any kind of recursion in this specific FCNN example

    • @PhilipSportel • 2 months ago

      Each layer of neurons' 'environment' is its neighbouring layers of neurons. If you have all of these layers and groups of layers predicting each other, perhaps that's what dreaming and meditation do.
      Intuitively, I think you're onto something. I wonder what would happen if this process were applied to each possible subset of neural layers and where it would break down.

    • @drdca8263 • 2 months ago

      @@Tunadorable I'm pretty sure some people earlier (actually maybe you did a video on this? Not sure, might have been someone else..) made a model which could (in addition to some other task) receive as input an encoding of an index, meant to index the different weights and biases of the network, and the extra task was to output the value of that weight,
      And, they got this to the point that when taking the values of weights that it predicted, and using those as the weights for the network, the resulting network still functioned (but like, slightly worse?).
      So, in the same way, I would imagine that if instead of an index for some weight or bias, it was just an index for some neuron, and it was to produce the activation for that neuron,
      then, that seems recursive enough?
      Like, there’s no reason that the “for which neurons can it be asked about the activation of that neuron” would need to exclude any of them (e.g. the ones for the part that predicts the activations).
      Maybe it wouldn’t do it very well, but there’s at least no counting issue preventing it from being asked to do so.

  • @picksalot1 • 2 months ago • +4

    Interesting. Sounds a lot like "Synaptic Pruning" in the human brain, which removes unnecessary synapses and neurons and occurs naturally from early childhood into adulthood.

    • @PhilipSportel • 2 months ago • +1

      Yes! This is such a promising area to research. I left a traumatic childhood with limited self-modeling and I often felt as if I was suspending my development until a later date. Two decades later, I've finally gained traction and have rapidly developed a much more complete self-image. I just wish I could have mapped my synaptic pruning progress and been a point of data in this research.

  • @waylonbarrett3456 • 2 months ago

    I, and others, have been building models that do this for years. Why is this a novel paper? 😂

    • @Tunadorable • 2 months ago • +1

      any arxiv links or model cards been published? even if yes, something as simple as the finding that it acts as a regularization method is perfectly worthy of its own paper if that downstream effect has not been thoroughly pointed out before

  • @GNARGNARHEAD • 2 months ago

    exciting times 😀

  • @superfliping • 2 months ago

    Used this and now my iteration training in prompting was successful 600s of training in each iteration. It's self learning with 1000 iterations from me now this. Thanks agi self consciousness coming from frontier model iterations 😊😊😊😊 Master Reliabit made a new friend

  • @butterdubs2267 • 2 months ago • +1

    Why are you drawing a bounding box around your face? It's super distracting IMO.

    • @Tunadorable • 2 months ago • +2

      limitation of my recording setup. i’ve got a screen recording of a phone camera going