AI Reading List (by Ilya Sutskever) - Part 4

  • Published: 4 Feb 2025

Comments • 5

  • @datamlistic
    @datamlistic  7 months ago +1

    Link to the full AI reading list series: ruclips.net/p/PL8hTotro6aVGtPgLJ_TMKe8C8MDhHBZ4W&si=u9Gk38MaQ7VLH3lf
    Important note: As some of you pointed out, Ilya never confirmed this list. I would like to apologize to those of you whom I misinformed by saying it was the official list; I was under the impression that he had confirmed it. I'm very sorry for that! I promise I will do a better job researching the topics I present.

  • @unclecode
    @unclecode 7 months ago

    Waiting for the fourth part, and this part's papers are a hell of a read haha. To me, "Neural Turing Machines" is the top paper in your series. Back in 2021, during the COVID lockdown, I worked on defining a "finite-state automaton" using a transformer: essentially, an autoregressive model where the next token is the next state of the machine. It was an extremely fun experience, and the results were amazing. I intended to publish a paper but got caught up with other things. I actually built a library called ASM (Autoregressive-State-Machine). This video has motivated me to revisit it and publish it! If I do, I will ask you to review it :)) Anyway, the Turing machine is the foundation of all computation, and if it can be redefined in such a manner, then we might have hope for AGI. Sometimes I feel that if we are going to have AGI, it will need such a state machine, and then everything else will follow inevitably. Anyway, kudos; looking forward to the rest of this series and to reviews of recent papers.
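
    A minimal sketch of the autoregressive-state-machine idea described above: treat state IDs as tokens and train a small causal transformer by plain next-token prediction, so the learned "language model" is the automaton's transition function. This is an illustration under assumed details, not the commenter's ASM library; all class and variable names are hypothetical.

```python
# Toy "autoregressive state machine": the vocabulary is the set of FSM
# states, and the model learns the transition function via next-token
# prediction. Positional encodings are omitted for brevity; the current
# state is available in each position's own embedding.
import torch
import torch.nn as nn

NUM_STATES = 4  # toy automaton with states 0..3

class AutoregressiveStateMachine(nn.Module):
    def __init__(self, num_states, d_model=32, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(num_states, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, num_states)

    def forward(self, states):
        # states: (batch, seq) of integer state IDs
        seq_len = states.size(1)
        # causal mask: each position attends only to earlier states
        mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")),
                          diagonal=1)
        h = self.encoder(self.embed(states), mask=mask)
        return self.head(h)  # logits for the next state at every position

# Trajectories from a toy cyclic automaton 0 -> 1 -> 2 -> 3 -> 0 -> ...
trajectories = torch.tensor([[0, 1, 2, 3, 0, 1, 2, 3]] * 16)
model = AutoregressiveStateMachine(NUM_STATES)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for _ in range(200):  # standard next-token training loop
    logits = model(trajectories[:, :-1])
    loss = loss_fn(logits.reshape(-1, NUM_STATES),
                   trajectories[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```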

    • @datamlistic
      @datamlistic  7 months ago +1

      Haha, glad to hear that this video has reignited your passion for Neural Turing Machines! Maybe one day they will make a comeback and people will start using them again. Would you mind sharing more details about ASM at datamlistic@gmail? It sounds like a really interesting idea. :)
      On another note, I believe we still need additional pieces to reach AGI, and that Transformer-based language models are not enough. They basically act as compressors and retrievers of past data, and in my opinion a true AGI has to be able to add new knowledge on the fly, either through something like a neural long-term memory (no RAGs!), as in Neural Turing Machines, or by modifying its own weights to incorporate the new data.
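
      A minimal sketch of the content-based read/write mechanism from Neural Turing Machines (Graves et al., 2014), the kind of neural long-term memory mentioned above: addressing by cosine similarity, a weighted read, and an erase-then-add write. Shapes and function names are illustrative assumptions, not from any particular library.

```python
# Content-addressed memory in the style of Neural Turing Machines:
# a controller would emit key, beta, erase, and add vectors; here we
# just exercise the memory operations directly with toy tensors.
import torch
import torch.nn.functional as F

def address(memory, key, beta):
    # memory: (N, M) with N slots of width M; key: (M,); beta: scalar > 0
    sim = F.cosine_similarity(memory, key.unsqueeze(0), dim=1)
    return F.softmax(beta * sim, dim=0)  # attention weights over slots

def read(memory, weights):
    return weights @ memory  # (M,) read vector, a blend of slots

def write(memory, weights, erase, add):
    # Each slot is partially erased then updated, scaled by its weight.
    w = weights.unsqueeze(1)                   # (N, 1)
    return memory * (1 - w * erase) + w * add  # erase, add: (M,)

memory = torch.zeros(8, 16)  # 8 slots, width 16
key = torch.randn(16)
beta = torch.tensor(5.0)

w = address(memory, key, beta)
memory = write(memory, w, erase=torch.ones(16), add=key)
print(read(memory, address(memory, key, beta)))  # recovers ~key
```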

    • @unclecode
      @unclecode 7 months ago

      @@datamlistic I completely agree with you. That's very similar to how I view language models. To me, aspects of information theory, particularly Shannon's theory, contribute to this understanding: a language model is like a data compression algorithm that allows sampling from the compressed data, essentially sampling from a distribution. What made it stand out was the significant investment in scaling up model and data size. The autoregressive nature of these models is impressive, but it's not the sole component needed for AGI. It will be a part of AGI, but additional components are necessary.
      Recently, there's been a paper on combining Monte Carlo tree search with large language models to boost their reasoning power. Basically, they try to fill the absence of search in transformers; it brings the flavor of the A* algorithm into the realm of autoregressive models (a rough sketch of the search idea follows this comment). This kind of focus is what we need.
      That's why, even two years ago, I was interested in seeing how I could combine it with finite-state machines. I realized this wasn't the whole solution for AGI. I'd love to collaborate and share my work with you; I just need to revisit my codebase and refresh my memory. 😅 I will definitely be in touch soon, thanks for the offer.
      Thank you for this series and the excellent papers you share. They are invaluable for anyone wanting to dive deeper into these concepts, and for newcomers seeking foundational knowledge.
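
      A rough sketch of the search-plus-autoregressive-model idea mentioned above: propose continuations token by token and keep expanding the highest-scoring ones. For brevity this uses plain best-first search with a toy scorer rather than full Monte Carlo tree search or a real language model; every name here is hypothetical.

```python
# Best-first search over token sequences: a stand-in policy proposes
# children and a stand-in value function scores them. In the papers the
# comment refers to, the proposals come from an LLM and the scores from
# rollouts or a reward model.
import heapq

VOCAB = ["a", "b", "c"]

def score(seq):
    # Toy value function: rewards alternating tokens.
    return sum(x != y for x, y in zip(seq, seq[1:]))

def search(start, max_len=6, budget=50):
    frontier = [(-score(start), start)]  # max-heap via negated scores
    best = start
    while frontier and budget > 0:
        budget -= 1
        neg, seq = heapq.heappop(frontier)
        if -neg > score(best):
            best = seq
        if len(seq) < max_len:
            for tok in VOCAB:  # "policy" proposals for the next token
                child = seq + [tok]
                heapq.heappush(frontier, (-score(child), child))
    return best

print(search(["a"]))  # finds an alternating sequence within the budget
```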

  • @adammobile7149
    @adammobile7149 7 months ago

    A bit advanced, but it gives a good overview of what's going on in the field 👍