The Hidden Life of Embeddings: Linus Lee
- Published: Nov 6, 2023
- We love text embeddings as a critical pillar of LLM applications, but there's so much to text embeddings beyond their value in vector search. This talk will be a grand tour through a series of experimental projects from my last two years of research for visualizing, manipulating, and interpreting embeddings. We'll start with the basics (t-SNE, UMAP, and PCA), talk about how language models can be used to manipulate and interpret embeddings, and end by using a new tool I've built that lets us directly observe which features popular embedding models like to encode into their embeddings.
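The dimensionality-reduction basics the talk opens with (t-SNE, UMAP, PCA) all project high-dimensional embeddings down to a plottable 2-D view. A minimal PCA sketch in plain NumPy, using random vectors as stand-ins for real sentence embeddings (any embedding model would produce a similarly shaped array):

```python
import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 384))  # 100 texts, 384-dim vectors (stand-ins)

# Center the data, then take the top-2 principal axes via SVD.
centered = embeddings - embeddings.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
coords = centered @ vt[:2].T  # 2-D coordinates suitable for a scatter plot

print(coords.shape)  # (100, 2)
```

t-SNE and UMAP replace the linear projection with nonlinear neighbor-preserving maps, but the input/output shapes are the same.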
Recorded live in San Francisco at the AI Engineer Summit 2023. See the full schedule of talks at ai.engineer/summit/schedule & join us at the AI Engineer World's Fair in 2024! Get your tickets today at ai.engineer/worlds-fair
About Linus Lee
Linus is a Research Engineer at Notion prototyping new software interfaces for augmenting our collaborative work and creativity with AI. He has spent the last few years experimenting with AI-augmented tools for thinking, like a canvas for exploring the latent space of neural networks and writing tools where ideas connect themselves. Before Notion, Linus spent a year as an independent researcher, during which he was Betaworks's first Researcher in Residence.
Awesome demo and great work/tools.
Wow, that was fascinating! Awesome demo 👏
just wow...fascinating demo.👍
someone said “this was the first talk where there were audible gasps in the audience”. amazing demos… and all on the side too
all the mentions of Latent Space is music to my ears!
did he say when he would release code?
🎯 Key Takeaways for quick navigation:
00:01 🎵 Introduction and Background
- Introduction to the speaker, Linus Lee, who works on AI at Notion.
- Brief overview of his past work with language models and embedding models.
- Mention of Notion AI's progress and features since its launch in November 2022.
01:30 🧠 Discussing Latent Spaces
- Explanation of the concept of latent spaces in AI models.
- Comparison of controlling language models to steering a car from the back seat.
- Discussion on the potential of gaining more control by looking inside the model.
03:09 📊 Understanding Embeddings
- Explanation of how embeddings represent the most salient features of a text or image.
- Discussion on the potential of disentangling meaningful attributes from embeddings.
- Suggestion of building more expressive interfaces by intervening inside the model.
05:20 🛠️ Manipulating Embeddings
- Demonstration of how to manipulate embeddings to generate different versions of a text.
- Explanation of how to project texts into meaningful directions in the embedding space.
- Discussion on the potential of mixing embeddings to generate new texts.
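The "meaningful directions" and "mixing" ideas above have a simple arithmetic core. The sketch below is an illustration, not the talk's actual pipeline: the attribute direction is computed the classic way, as the difference of mean embeddings between two contrasting example sets (here random stand-ins for, say, "formal" vs. "casual" texts), and mixing is plain linear interpolation:

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v)

rng = np.random.default_rng(1)
dim = 64
formal = rng.normal(size=(10, dim))  # stand-ins for embeddings of formal texts
casual = rng.normal(size=(10, dim))  # stand-ins for embeddings of casual texts

# A "meaningful direction": difference of the two group means.
direction = normalize(formal.mean(axis=0) - casual.mean(axis=0))

text_emb = rng.normal(size=dim)
# Push an embedding along the direction; alpha controls edit strength.
alpha = 2.0
edited = text_emb + alpha * direction

# "Mixing" two embeddings: linear interpolation between them.
other = rng.normal(size=dim)
mixed = 0.5 * text_emb + 0.5 * other
```

In the demos, the edited or mixed vector is then decoded back to text with a model trained to reconstruct text from its embedding; that decoder is what makes the arithmetic visible as rewritten prose.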
09:43 🔄 Adapting Embedding Models
- Explanation of how to adapt an embedding model to read out text from other embedding spaces.
- Demonstration of recovering text details from OpenAI's embedding space.
- Discussion on the potential of manipulating image embeddings.
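The adapter in the talk is a trained model that reads text out of a foreign embedding space (e.g. OpenAI's). As a toy stand-in for the general idea of mapping between two embedding spaces, here is a linear adapter fit by least squares on paired embeddings of the same texts (all data is synthetic; real spaces are related only approximately linearly):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d_src, d_tgt = 200, 48, 32
# Paired embeddings of the same n texts in two different spaces (synthetic).
src = rng.normal(size=(n, d_src))
true_map = rng.normal(size=(d_src, d_tgt))
tgt = src @ true_map + 0.01 * rng.normal(size=(n, d_tgt))  # target space + noise

# Fit W minimizing ||src @ W - tgt||_F: a tiny linear adapter between spaces.
W, *_ = np.linalg.lstsq(src, tgt, rcond=None)

pred = src @ W
rel_err = np.linalg.norm(pred - tgt) / np.linalg.norm(tgt)
```

A learned nonlinear adapter (as in the talk) plays the same role as `W`, just with far more capacity, which is what lets it recover readable text details from someone else's embedding space.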
13:58 📚 Models Used and Research
- Overview of the custom text model used in the demonstrations.
- Mention of the recent research in the field of latent spaces.
- Announcement of the release of the models used in the demonstrations on Hugging Face.
17:01 🎓 Conclusion and Takeaways
- Emphasis on the importance of making complex models tangible and interactive.
- Discussion on the potential of generative models as a laboratory for knowledge.
- Encouragement to build more human interfaces to knowledge.
Made with HARPA AI
Any code available? Great talk!
What program is he running?
Probably a custom-built UI (maybe with Streamlit) for his Contra model.
tokenize everything, then Fourier transform everywhere?