LLMs: Where Did They Come From? Where Are They Going? Will They Become Sentient? | C3 Transform 2024

  • Published: 17 Oct 2024

Comments • 7

  • @huveja9799
    @huveja9799 7 months ago +2

    What the speaker does not say is that it is only a hypothesis that, by feeding the training process trillions of words and asking the model to predict the masked word, the model will come to "understand" the Language and the World. That's not a fact, it's a hypothesis, probably a product of our anthropomorphization of the model (or dehumanization of the human).
    Increasing the scale, both of the amount of data and of the size of the model, increases the ability to capture more complex statistical patterns latent in our expression of Language (that is, in what we actually express using the capacity for Language).
    Obviously those statistical patterns will reflect some aspect of that Language capacity and of the World as encoded through that particular statistic (the World's aspect in turn encoded through the expression of our Language capacity). But to say that, by doing this, the model "understands" the Language capacity or the World is like saying that it is possible to understand how ants communicate, and also the World, by exhaustively analyzing the tracks they leave on the ground.
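
    A minimal sketch of the masked-word objective described in this comment, assuming the Hugging Face transformers library and the public bert-base-uncased checkpoint are available; the example sentence is invented for illustration:

        # Score candidate words for a hidden position purely from the
        # statistical patterns the model absorbed during pre-training.
        from transformers import pipeline

        fill_mask = pipeline("fill-mask", model="bert-base-uncased")

        for candidate in fill_mask("Ants follow a [MASK] trail back to the nest."):
            print(f"{candidate['token_str']:>12}  p={candidate['score']:.3f}")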

    • @420_gunna
      @420_gunna 6 months ago

      🤷‍♂
      "Now, to say that by doing that, it "understands" the Language ability or the World, well, it is like saying that it is possible to understand how ants communicate and also the World by analyzing in an exhaustive way the tracks they leave on the ground .."
      I'm guessing there is some amount of information in ant communication in the way that they move around on the ground, but you're right in that it doesn't encode anything about (eg) stuff like phermone information. There are also some things that you _can_ infer about the world by analyzing the tracks of ants, assuming you didn't a priori know anything about the word other than ants and their objectives! Information isn't a "yes" or "no" thing, it's a spectrum! (Like you said)
      And human language certainly encodes a lot of information about _The World_ in it!
      "Understand" (as far as I know) is a fuzzy word though, like "intelligence" 😄 -- I wouldn't try to argue as to whether even _I_ have either of those things 😜

    • @huveja9799
      @huveja9799 6 months ago

      @@420_gunna
      I can see that you are missing a crucial part of the analogy. You are approaching it as a human: you assume that analyzing the ants' tracks, with all the information you already have as a human, and with the intelligence you have proven to have, will give you some information (which is true, although it will also be very partial information).
      But the correct way to think about the analogy is that you know absolutely nothing about anything, and your only source of information is those tracks. In addition, you don't do this analysis with a brain like ours (which, by the way, we don't know how it works); you only have at your disposal an extremely basic statistical technique.

    • @420_gunna
      @420_gunna 6 months ago

      @@huveja9799
      Sure, yeah -- I guess the way I almost think about it is that I'm a language model, and I exist in a narrow, featureless hallway where the only thing I can see is the next token slowly approaching me out of the fog at the end of the hallway, and the only thing I have to do in life is predict which token will fuzzily materialize from the fog as it approaches (I realize I'm not including attention-style look-backs, etc.).
      Certainly the only thing I'm learning there is the distribution of tokens. (Or, in your version, just the tracks and their distribution.)
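
      A toy version of the "hallway" picture above: a bigram model that only ever sees the previous token and learns the empirical distribution of what follows it (the corpus and tokenization are invented for illustration):

          from collections import Counter, defaultdict

          corpus = "the ant follows the trail and the ant finds the food".split()

          # Count, for each token, what follows it in the corpus.
          next_counts = defaultdict(Counter)
          for prev, nxt in zip(corpus, corpus[1:]):
              next_counts[prev][nxt] += 1

          def predict_next(token):
              """Return the learned distribution over the next token."""
              counts = next_counts[token]
              total = sum(counts.values())
              return {word: n / total for word, n in counts.items()}

          print(predict_next("the"))  # {'ant': 0.5, 'trail': 0.25, 'food': 0.25}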

    • @420_gunna
      @420_gunna 6 months ago

      Comparing an ant-footstep sequence model and a language sequence model, though... surely the embedding space of the latter is a richer representation of The World than the ant footprints? Nearly everything you and I interact with physically (and much of what we don't) is described in language.
      Whereas I'm curious what sort of semantically meaningful things you could learn from an ant-footprint language model -- "fear, hunger, attack, retreat"? Whereas language has (let's ignore subwords) tokens like Schadenfreude.
      If we consider a model that only processes "one thing" (ant footprints, language tokens)... the distribution of those things can carry more or less information, right? As we slide that scale up and down, does your answer re: "understanding" of the world change?
      Do you have any pitch as to what's required for a model to have an "understanding" of the world? Thanks for the replies.

    • @huveja9799
      @huveja9799 6 months ago

      @@420_gunna
      I like your use of the word "described": language "describes", that is, it writes down, it puts a mark. Written language would be the equivalent of the trails (footprints) left by humans. The statistical relationships among those marks (the spatial position of each symbol relative to the others, and the frequencies of the different spatial configurations) would be the information latent in those trails. An embedding is a mechanism to represent that information efficiently (orders of magnitude more efficiently than stochastic matrices, i.e. Markov matrices, although the principle is basically the same).
      The LLM is a machine capable of computing these embeddings (capturing statistical information in a vector space) and performing operations with them (transforming one set of embeddings into another using operators that are restricted by, and manipulate, that statistical information). What we call "emergence" is the final architecture of the LLM that implements those operators, and we call it "emergence" in order to label (to mark), through language, a phenomenon we don't understand: we don't know what those operators are, we don't know how they are produced by the training of the LLM, we cannot replicate their implementation outside of that training, and we cannot use them directly either. The configuration of that vector space, and the operators that are derived from (and restricted by) it, is what we call a model's "understanding" of the world.
      Basically, human intelligence is being compared with those operators; that is, it is assumed that human intelligence is itself restricted to some vector space and based on operators that are constrained by, and manipulate, that space. Therefore, all we have to do is improve the LLMs to approximate that vector space and obtain operators with capabilities similar to those of humans, or even better (super-intelligence).
      If that assumption, i.e. human intelligence restricted to a vector space with a set of operators, is accepted as true (which would be in line with the thinking of pragmatists like Rorty or Dewey), then we can perfectly well make the comparison with ants. The only difference would be that ants are simpler robots than humans ("simpler LLMs"): their vector space is simpler and therefore so are the operators restricted by it.
      But even within the mental model where this assumption is true, a model based exclusively on marks (language) and their trails begins to show its deficiencies, just as the case of the ants shows. The trails are only a mirror of the ants' activity, and you can extrapolate many things from that mirror (i.e. the trails), but ants don't operate on the basis of those marks; they operate, among other things, by communicating with pheromones, the marks being a reflection of that. A model based only on those trails will ignore the pheromones, and will predict that an abandoned path is the way to go, instead of heading onto terrain that still has no trails but has fresh pheromones (the right way to go from the perspective of how real ants work).
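
      A small numeric sketch of the embeddings-versus-stochastic-matrices point above: the same co-occurrence statistics can be stored as a full Markov transition matrix or compressed into low-dimensional vectors, here via a truncated SVD in NumPy (the toy corpus is invented for illustration):

          import numpy as np

          corpus = "the ant follows the trail the ant leaves the trail".split()
          vocab = sorted(set(corpus))
          index = {word: i for i, word in enumerate(vocab)}

          # Full stochastic (Markov) matrix: P[i, j] = P(next = j | current = i).
          counts = np.zeros((len(vocab), len(vocab)))
          for prev, nxt in zip(corpus, corpus[1:]):
              counts[index[prev], index[nxt]] += 1
          transition = counts / counts.sum(axis=1, keepdims=True)

          # Low-rank "embeddings": keep only the top-k singular directions,
          # a compressed carrier of (approximately) the same statistics.
          k = 2
          U, S, Vt = np.linalg.svd(transition)
          embeddings = U[:, :k] * S[:k]      # one k-dimensional vector per word
          approx = embeddings @ Vt[:k, :]    # reconstruction from k dimensions

          print(np.round(transition, 2))
          print(np.round(approx, 2))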

  • @420_gunna
    @420_gunna 6 months ago

    The white suit goes crazy