The Hidden Life of Embeddings: Linus Lee

  • Published: Nov 6, 2023
  • We love text embeddings as a critical pillar of LLM applications, but there's so much to text embeddings beyond their value in vector search. This talk will be a grand tour through a series of experimental projects from my last two years of research for visualizing, manipulating, and interpreting embeddings. We'll start with the basics (t-SNE, UMAP, and PCA), talk about how language models can be used to manipulate and interpret embeddings, and end by using a new tool I've built that lets us directly observe which features popular embedding models like to encode into their embeddings.
    Recorded live in San Francisco at the AI Engineer Summit 2023. See the full schedule of talks at ai.engineer/summit/schedule & join us at the AI Engineer World's Fair in 2024! Get your tickets today at ai.engineer/worlds-fair
    About Linus Lee
    Linus is a Research Engineer at Notion prototyping new software interfaces for augmenting our collaborative work and creativity with AI. He has spent the last few years experimenting with AI-augmented tools for thinking, like a canvas for exploring the latent space of neural networks and writing tools where ideas connect themselves. Before Notion, Linus spent a year as an independent researcher, during which he was Betaworks's first Researcher in Residence.
  • Science

Comments • 10

  • @kevon217 9 months ago +1

    Awesome demo and great work/tools.

  • @endlessvoid7952 9 months ago +1

    Wow, that was fascinating! Awesome demo 👏

  • @gopikrishna8063 8 months ago +2

    just wow...fascinating demo.👍

  • @swyxTV 9 months ago +7

    someone said “this was the first talk where there were audible gasps in the audience”. amazing demos… and all on the side too
    all the mentions of Latent Space are music to my ears!

    • @Star-rd9eg 4 months ago

      did he say when he would release code?

  • @twoplustwo5 9 months ago

    🎯 Key Takeaways for quick navigation:
    00:01 🎵 Introduction and Background
    - Introduction to the speaker, Linus Lee, who works on AI at Notion.
    - Brief overview of his past work with language models and embedding models.
    - Mention of Notion AI's progress and features since its launch in November 2022.
    01:30 🧠 Discussing Latent Spaces
    - Explanation of the concept of latent spaces in AI models.
    - Comparison of controlling language models to steering a car from the back seat.
    - Discussion on the potential of gaining more control by looking inside the model.
    03:09 📊 Understanding Embeddings
    - Explanation of how embeddings represent the most salient features of a text or image.
    - Discussion on the potential of disentangling meaningful attributes from embeddings.
    - Suggestion of building more expressive interfaces by intervening inside the model.
    05:20 🛠️ Manipulating Embeddings
    - Demonstration of how to manipulate embeddings to generate different versions of a text.
    - Explanation of how to project texts into meaningful directions in the embedding space.
    - Discussion on the potential of mixing embeddings to generate new texts.
    09:43 🔄 Adapting Embedding Models
    - Explanation of how to adapt an embedding model to read out text from other embedding spaces.
    - Demonstration of recovering text details from OpenAI's embedding space.
    - Discussion on the potential of manipulating image embeddings.
    13:58 📚 Models Used and Research
    - Overview of the custom text model used in the demonstrations.
    - Mention of the recent research in the field of latent spaces.
    - Announcement of the release of the models used in the demonstrations on Hugging Face.
    17:01 🎓 Conclusion and Takeaways
    - Emphasis on the importance of making complex models tangible and interactive.
    - Discussion on the potential of generative models as a laboratory for knowledge.
    - Encouragement to build more human interfaces to knowledge.
    Made with HARPA AI
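
The "Manipulating Embeddings" points above (projecting texts onto a meaningful direction, and mixing embeddings) can be illustrated with a small, hedged sketch. This is not the speaker's code or model: the three-dimensional vectors are made-up toy data, every function name is hypothetical, and turning an edited vector back into text would need a decoder model that is not shown here. The sketch only assumes that a "direction" can be estimated as the normalized difference between the mean embeddings of two contrasting groups.

```python
import math

def mean(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale v to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def direction(group_a, group_b):
    """Unit vector pointing from the mean of group_b to the mean of group_a."""
    mean_a, mean_b = mean(group_a), mean(group_b)
    return normalize([x - y for x, y in zip(mean_a, mean_b)])

def project(v, d):
    """Scalar position of v along the unit direction d."""
    return dot(v, d)

def shift(v, d, amount):
    """Move v along direction d by the given amount."""
    return [x + amount * di for x, di in zip(v, d)]

def mix(a, b, t):
    """Linear interpolation between two embeddings (t=0 gives a, t=1 gives b)."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]

# Toy "embeddings" (assumed data): two groups that differ mainly on axis 0.
formal = [[1.0, 0.0, 0.2], [0.8, 0.1, 0.0]]
casual = [[-1.0, 0.0, 0.2], [-0.8, 0.1, 0.0]]

formality = direction(formal, casual)      # approximately [1, 0, 0]
text_vec = [0.5, 0.3, 0.1]

score = project(text_vec, formality)       # position along the direction
more_formal = shift(text_vec, formality, 1.0)
blended = mix([0.0, 0.0, 0.0], [2.0, 2.0, 2.0], 0.25)
```

With real embeddings the vectors would have hundreds or thousands of dimensions and the direction would be noisier, so the mean-difference estimate is a starting point, not a guarantee that the attribute is cleanly separable.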

  • @johntanchongmin 8 months ago +2

    Any code available? Great talk!

  • @Star-rd9eg 6 months ago +1

    What program is he running?

    • @DanDascalescu-dandv 2 months ago

      Probably a custom-built UI (maybe with Streamlit) for his Contra model.

  • @alexchiang2617 8 months ago

    tokens everything, then Fourier transform everywhere?