- Videos: 86
- Views: 201,035
London Machine Learning Meetup
United Kingdom
Joined 31 May 2016
The London Machine Learning Meetup is the largest machine learning community in Europe. Previous speakers include Juergen Schmidhuber, David Silver, Yoshua Bengio and Andrej Karpathy.
Come to our next meetup live: www.meetup.com/London-Machine-Learning-Meetup/
Sponsors: Evolution AI - intelligent data extraction from corporate and financial documents (evolution.ai)
Hugo Laurençon | What Matters When Building Vision-Language Models?
Organised by Evolution AI - AI data extraction from financial documents - www.evolution.ai/
Abstract: The growing interest in vision-language models (VLMs) has been driven by improvements in large language models and vision transformers. Despite the abundance of literature on this subject, we observe that critical decisions regarding the design of VLMs are often not justified. We argue that these unsupported decisions impede progress in the field by making it difficult to identify which choices improve model performance. To address this issue, we conduct extensive experiments around pre-trained models, architecture choice, data, and training methods. Our consolidation of findings includes ...
Views: 184
Videos
Harish Tayyar Madabushi | Emergent Abilities in Large Language Models
200 views · 2 months ago
Organised by Evolution AI - AI-powered data extraction from financial documents: www.evolution.ai/ Sponsored by Man Group and Arctic DB: www.man.com/ and arcticdb.io/ Title: Emergent Abilities in Large Language Models: Do they pose an existential threat? Speaker: Harish Tayyar Madabushi (University of Bath) Abstract: Large language models, comprising billions of parameters and pre-trained on ex...
Yong Jae Lee | Next Steps in Generalist Multimodal Models
467 views · 2 months ago
Organised by Evolution AI - AI data extraction from financial documents - www.evolution.ai/ Title: Next Steps in Generalist Multimodal Models Abstract: The field of computer vision is undergoing another profound change. Recently, “generalist” models have emerged that can solve a variety of visual perception tasks. Also known as foundation models, they are trained on huge internet-scale unlabele...
Brett Larsen | The Importance of High-Quality Data in Building Your LLMs: Lessons from DBRX
381 views · 4 months ago
*NOTE* Due to a recording error, the first minute of the Meetup isn't available. Organised by Evolution AI - AI data extraction from financial documents - www.evolution.ai/ Abstract: Pretraining datasets for large language models (LLMs) have grown to trillions of tokens composed of large amounts of CommonCrawl (CC) web scrape along with smaller, domain-specific datasets. However, it’s expensive...
Ziming Liu | KAN: Kolmogorov-Arnold Networks
3.9K views · 4 months ago
Organised by Evolution AI - AI data extraction from financial documents - www.evolution.ai/ Abstract: Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights"). KAN...
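The edge-level idea is concrete enough to sketch. Below is a toy, hedged illustration of a single learnable KAN-style edge activation, using a Gaussian radial basis as a stand-in for the paper's B-spline parameterisation (the basis count and grid range are arbitrary assumptions, not the paper's settings):

```python
import torch
import torch.nn as nn

class KANEdge(nn.Module):
    """One learnable edge activation -- the core KAN idea.
    Sketch only: a Gaussian radial basis stands in for B-splines."""
    def __init__(self, n_basis: int = 8):
        super().__init__()
        # Fixed basis centres on an assumed [-2, 2] grid.
        self.register_buffer("centers", torch.linspace(-2.0, 2.0, n_basis))
        # Learnable mixing coefficients: the activation's shape is trained.
        self.coef = nn.Parameter(torch.randn(n_basis) * 0.1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Expand each scalar input over the basis, then mix.
        phi = torch.exp(-(x.unsqueeze(-1) - self.centers) ** 2)
        return phi @ self.coef

edge = KANEdge()
print(edge(torch.linspace(-1, 1, 5)))  # a trainable nonlinearity on one edge
```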
Meng Fang | Large Language Models Are Neurosymbolic Reasoners
385 views · 8 months ago
Organised by Evolution AI - AI extraction from financial documents - www.evolution.ai/ Sponsored by Man Group - www.man.com/ Abstract: A wide range of real-world applications is characterized by their symbolic nature, necessitating a strong capability for symbolic reasoning. This paper investigates the potential application of Large Language Models (LLMs) as symbolic reasoners. We focus on text...
Yuxiong Wang | Bridging Generative & Discriminative Learning in the Open World
573 views · 10 months ago
Sponsored by Evolution AI: www.evolution.ai Abstract: Generative AI has emerged as the new wave following discriminative AI, as exemplified by various powerful generative models including large language models (LLMs) and visual diffusion models. While these models excel at generating text, images, and videos, mere creation is not the ultimate goal. A grand objective lies in understanding and ma...
Brenden M. Lake | Addressing Two Classic Debates in Cognitive Science with Deep Learning
540 views · 11 months ago
Sponsored by Evolution AI: www.evolution.ai Abstract: How can advances in machine learning best advance our understanding of human learning and development? In this talk, I'll describe two case studies using deep neural networks to address classic debates in cognitive science: What ingredients do children need to learn early vocabulary words? How much is learnable from sensory input with relati...
Yuandong Tian | Efficient Inference of LLMs with Long Context Support
1K views · 1 year ago
Sponsored by Evolution AI: www.evolution.ai Abstract: While Large Language Models (LLMs) demonstrate impressive performance across many applications, how to run inference with long context remains an open problem. There are two issues. First, current pre-trained LLMs may experience perplexity blow-up when the input length goes beyond the pre-trained window; second, inference with long context is b...
Baptiste Rozière | Code Llama: Open Foundation Models for Code
1.1K views · 1 year ago
Sponsored by Evolution AI: www.evolution.ai Abstract: We release Code Llama, a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks. We provide multiple flavors to cover a wide range of applications: foundation...
Jean Kaddour & Joshua Harris | Challenges and Applications of Large Language Models
755 views · 1 year ago
*NOTE* Unfortunately, due to a recording error, the first two minutes of the introduction are not available. Sponsored by Evolution AI: www.evolution.ai Link to paper: arxiv.org/abs/2307.10169 Abstract: Large Language Models (LLMs) went from non-existent to ubiquitous in the machine learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify the remaini...
Ofir Press | Complementing Scale: Novel Guidance Methods for Improving Language Models
201 views · 1 year ago
Sponsored by Evolution AI: www.evolution.ai Abstract: This talk will cover a few of my recent papers, and will discuss my current views on the field and what future directions excite me. First I'll provide a quick overview of ALiBi, and talk about how to build LMs that can process longer sequences than those they were trained on. I'll talk about how to evaluate such models and my thoughts about...
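For readers unfamiliar with ALiBi: it drops positional embeddings and instead adds a per-head linear penalty to the attention logits, which is what lets models extrapolate to longer sequences. A minimal sketch of the bias computation (the 2^(-8h/n) slope schedule follows the paper; assuming n_heads is a power of two):

```python
import torch

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    """Sketch of ALiBi's attention bias for causal attention.
    Head h gets slope m_h = 2^(-8h/n_heads); the bias on the logits
    is -m_h * (query_pos - key_pos)."""
    slopes = torch.tensor([2.0 ** (-8.0 * (h + 1) / n_heads)
                           for h in range(n_heads)])
    pos = torch.arange(seq_len)
    rel = (pos[:, None] - pos[None, :]).clamp(min=0)  # how far back the key is
    return -slopes[:, None, None] * rel               # shape (heads, query, key)

# Usage (illustrative): logits = q @ k.transpose(-1, -2) / scale
#                       logits = logits + alibi_bias(n_heads, seq_len)
```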
Joon Sung Park | Generative Agents: Interactive Simulacra of Human Behavior
9K views · 1 year ago
Sponsored by Evolution AI: www.evolution.ai Abstract: Believable proxies of human behavior can empower interactive applications ranging from immersive environments to rehearsal spaces for interpersonal communication to prototyping tools. In this paper, we introduce generative agents: computational software agents that simulate believable human behavior. Generative agents wake up, cook breakfast,...
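The architecture's memory retrieval step is simple to sketch: each stored memory is scored by a weighted sum of recency, importance, and relevance, and the top-scoring memories are fed to the language model. A hedged illustration (the decay constant and equal weights are assumptions; the paper tunes these):

```python
import numpy as np

def retrieval_score(memory: dict, query_emb, hours_since_access: float,
                    w=(1.0, 1.0, 1.0)) -> float:
    """Sketch of the generative-agents retrieval score.
    The 0.995/hour decay and equal weights are illustrative assumptions."""
    recency = 0.995 ** hours_since_access            # exponential decay
    importance = memory["importance"] / 10.0         # LLM-rated 1-10, normalised
    emb, q = np.asarray(memory["embedding"]), np.asarray(query_emb)
    relevance = emb @ q / (np.linalg.norm(emb) * np.linalg.norm(q))
    return w[0] * recency + w[1] * importance + w[2] * relevance
```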
Tim Dettmers | QLoRA: Efficient Finetuning of Quantized Large Language Models
6K views · 1 year ago
Sponsored by Evolution AI: www.evolution.ai Abstract: Recent open-source large language models (LLMs) like LLaMA and Falcon are both high-quality and provide strong performance for their memory footprint. However, finetuning these LLMs is still challenging on consumer and mobile devices, with a 32B LLaMA model requiring 384 GB of GPU memory for finetuning. In this talk, I introduce QLoRA, a tech...
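The gist of QLoRA is finetuning through a frozen 4-bit base weight plus small trainable low-rank adapters. A hedged sketch of the layer shape (a plain frozen matrix stands in here for the dequantised NF4 weight; rank r=8 is an arbitrary assumption):

```python
import torch
import torch.nn as nn

class QLoRALinear(nn.Module):
    """Sketch of the QLoRA layer shape: frozen base + trainable low-rank
    update B @ A. Not the real kernel: a frozen fp32 matrix stands in
    for dequantised NF4 weights."""
    def __init__(self, d_in: int, d_out: int, r: int = 8):
        super().__init__()
        self.register_buffer("w_base", torch.randn(d_out, d_in))  # frozen
        self.lora_A = nn.Parameter(torch.randn(r, d_in) * 0.01)   # trainable
        self.lora_B = nn.Parameter(torch.zeros(d_out, r))         # zero-init

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Only lora_A and lora_B receive gradients during finetuning.
        return x @ self.w_base.T + (x @ self.lora_A.T) @ self.lora_B.T
```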
Meta AI | Language Models Can Teach Themselves to Use Tools
704 views · 1 year ago
Sponsored by Evolution AI: www.evolution.ai Abstract: Language models (LMs) exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale. They also, paradoxically, struggle with basic functionality, such as arithmetic or factual lookup, where much simpler and smaller models excel. In this paper, we show that LMs can teach themselves to us...
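Mechanically, the model learns to emit inline API-call markup that is executed and spliced back into the text. A toy illustration of that execute-and-splice step, borrowing the [Calculator(...)] markup from the paper's examples; the regex post-processor here is an assumption for illustration, not the released pipeline:

```python
import re

def splice_tool_results(text: str) -> str:
    """Toy Toolformer-style post-processor: find [Calculator(expr)] spans
    the LM emitted, run the tool, and insert the result."""
    def run(match: re.Match) -> str:
        expr = match.group(1)
        return str(eval(expr, {"__builtins__": {}}))  # toy arithmetic only
    return re.sub(r"\[Calculator\(([^)]+)\)\]", run, text)

print(splice_tool_results("400 people per boat, so [Calculator(1200/400)] boats."))
# -> "400 people per boat, so 3.0 boats."
```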
Shikun Liu | Vision-Language Reasoning with Multi-Modal Experts
343 views · 1 year ago
Lukas Lange | SwitchPrompt: Learning Domain-Specific Gated Soft Prompts
303 views · 1 year ago
Jing Yu Koh | Grounding Language Models to Images for Multimodal Generation
1.8K views · 1 year ago
Meta AI | Human-level Play in Diplomacy Through Language Models & Reasoning
1.6K views · 1 year ago
Andrew Lampinen | Language models show human-like content effects on reasoning
694 views · 2 years ago
Lucas Beyer | Learning General Visual Representations
1.1K views · 2 years ago
Gwanghyun Kim | DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation
695 views · 2 years ago
David Ha | Collective Intelligence for Deep Learning: A Survey of Recent Developments
601 views · 2 years ago
Stéphane d'Ascoli | Solving Symbolic Regression with Transformers
1.2K views · 2 years ago
Ting Chen | Pix2Seq: A New Language Interface for Object Detection and Beyond
931 views · 2 years ago
Drew Jaegle | Perceivers: Towards General-Purpose Neural Network Architectures
1.5K views · 2 years ago
Martha White | Advances in Value Estimation in Reinforcement Learning
434 views · 2 years ago
Alexey Bochkovskiy | YOLOv4 and Dense Prediction Transformers
1.1K views · 2 years ago
Anna Rogers | BERTology nuggets: what we have learned about how BERT works
684 views · 2 years ago
Anees Kazi | Graph Convolutional Network for Disease Prediction with Imbalanced Data
835 views · 3 years ago
Do they really have emergent abilities (e.g. reasoning), or do they just curve-fit to the training data? I thought it's been shown that LLMs are very bad at dealing with stuff outside their training data.
Wow, let's refocus all mathematicians from string theory to this, just for 5 years, and we will have AGI
Good!
I've been watching y'all for years. This is a very exciting video to me…
Hello Tim, this is such a great explanation, but there is one thing that still confuses me a little bit. In the paper, you stated that you used 2^(k-1) quantiles for the negative part and 2^(k-1) + 1 for the positive, then concatenated both sides and removed one zero; but in this video, you showed that you used 2^(k-1) - 1 quantiles for the negative and 2^(k-1) for the positive, then concatenated and added one zero to the final result. I wonder which is the correct way to create NF4, and which approach you used in your code? Thank you
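For anyone comparing the two constructions above, here is a sketch of the recipe as the paper describes it (2^(k-1) negative quantiles, 2^(k-1)+1 positive ones, concatenate and drop the duplicated zero). The offset keeping the outer quantiles finite is an illustrative choice, not the released constant:

```python
import numpy as np
from scipy.stats import norm

def nf4_levels(k: int = 4, offset: float = 0.96) -> np.ndarray:
    """Sketch of the paper's NF4 recipe. The offset value is illustrative."""
    neg = norm.ppf(np.linspace(1 - offset, 0.5, 2 ** (k - 1)))      # ends at 0
    pos = norm.ppf(np.linspace(0.5, offset, 2 ** (k - 1) + 1))      # starts at 0
    levels = np.concatenate([neg, pos[1:]])                         # drop one zero
    return levels / np.abs(levels).max()                            # into [-1, 1]

print(nf4_levels().round(4))  # 16 levels, one of which is exactly 0.0
```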
@24:38 Extending to MCMC is a surprise
wtf happened at @22:39 ? :P
I am a little confused by the verbal embedding example at 24:25. I think you said t-SNE. My understanding is that if t-SNE is used for distance, UMAP would be better. I'm sure I'm just not understanding.
Incredible work!
interesting!!
Amazing talk! I'm looking forward to more breakthrough research on LLMs and the like!
Enormously useful for my social psych research. A step up from purely mathematically based scripted simulations. Congratulations to Joon.
I think we will need a Digivice in the not-so-distant future :D What amazing research and what a project. It's also so aesthetically pleasing for us millennials. Kudos to the team!!! Your passion is contagious!!!
I’m very excited to run this simulation but with my own local AI model.
Been playing around with Smallville for a couple of days - the emergent behaviors are fascinating to observe. Thanks for putting this out; it was an incredible watch.
Free Guy getting real omg!
🎯 Key Takeaways for quick navigation:
00:00 🎉 Introduction to the machine learning meetup, introducing speakers.
01:31 🧠 Researchers aim to simulate human behavior for interactive applications using generative models.
03:18 🤖 Large language models hold potential to simulate human behavior for various applications.
04:15 🏙️ Generative agents can populate an open world, remember, reflect, plan, and coordinate based on growing memories.
05:17 🎮 "Smallville" environment demonstrates generative agents' interactions, actions, and social dynamics.
08:30 🤝 Users can influence, interact, and even control generative agents in Smallville.
09:55 🏠 Example: The Lin family's daily routine showcases individual generative agent behaviors.
11:08 👥 Emergent behaviors in Smallville include information diffusion, new relationships, and agent coordination.
13:18 📚 Architecture of generative agents includes memory stream, retrieval, reflection, and planning modules.
20:12 ⏰ Planning module generates detailed schedules by recursively decomposing plans.
23:37 📊 Evaluation of agents' believability using natural language interviews and comparison with human authors.
25:31 ✅ Components of the generative agent architecture (observation, plan, reflection) significantly contribute to believability.
26:14 🤖 Agents share and remember information about events and experiences.
26:43 💼 Agents' behaviors reflect realistic human behavior, including social conflicts and politeness.
27:27 🛡️ Agents' language model instructions guide them to be overly polite and cooperative.
28:13 💡 Interest in generative agents spans multiple fields, showing potential for various applications.
29:22 🤝 "Social Simulacra" is a new approach using generative agents to prototype social computing systems.
31:21 🔄 Social Simulacra generates synthetic social interactions for prototyping system designs.
33:03 🚀 "Generate" feature helps designers envision a broad range of interactions within a subreddit community.
34:49 🧩 "What if" feature allows designers to explore alternative paths and interventions in simulated conversations.
36:16 🌌 "Multiverse" feature displays multiple possible outcomes, fostering more comprehensive design exploration.
37:43 🧐 Social Simulacra's designer evaluation demonstrates its value in proactive design and security testing.
39:26 🤝 Social Simulacra offers a new approach for designers to prototype social computing systems and explore social dynamics.
51:57 🕒 Agents develop inductive biases over time, changing their perspectives and interactions with other agents based on past experiences.
52:50 🗣️ Agents can adapt plans due to changing circumstances, and even small changes might require adjustments to plans for all involved parties.
53:46 📆 Conversations are time-defined in chunks of hours, and the length of a conversation affects plans and activities for that time period.
54:55 🗣️ Conversation duration in the simulation is determined by counting characters, translating to seconds of in-game time.
57:28 🌐 Simulation of online social platforms to study behaviors like content popularity, likes, and network dynamics is being explored.
58:11 🌐 Agents can simulate the creation of a synthetic web, exploring the emergence of network structures like scale-free networks.
59:21 🧠 Investigating the impact of introducing highly intelligent agents to the system to study societal dynamics and interventions is of interest.
01:00:46 🤖 Agents' intelligence and roles can be varied within the system, raising questions about societal robustness and interventions.
01:03:08 🧠 Revisiting early AI ambitions with modern techniques can lead to combining neural networks with historical philosophical approaches.
Made with HARPA AI
6:00 Why does int4 start from -7, not -8?
Because there are 16 four-bit values, and you'd like to have 1..8 on the positive side. Zero takes a slot, so -1..-7 is all that fits on the negative side.
@MartinAndrews-mdda thanks
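A worked count of the grid described in the reply above (illustrative only):

```python
# 16 four-bit codes: zero takes one slot, so with 1..8 on the positive
# side only -1..-7 fits on the negative side (hence the grid starts at -7).
levels = list(range(-7, 0)) + [0] + list(range(1, 9))
print(len(levels), levels)  # -> 16 values from -7 to 8
```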
BASED
Fascinating! Some kind of building your own universe.
Great preso. Are the slides posted anywhere?
Well, I guess. Caught a random duck in the wild. Thanks for the video and paper.
May I suggest an interior designer :)
no *bonk
thank you.
Great talk!
Wow this was a really interesting presentation
thank you so much. This is just what I needed :)
Thank you!
I can't find you on Audea - can you post audio versions of your videos there? Would love to listen to them! Thanks again for the great content!
Where can I get the AI for Pineapple Poker?
Wow, very clear, slowly but surely guiding the audience. Keep up the good work.
Perceiver is the next step in AI; we just haven't seen the really complex stuff built with it yet.
Thank you! I enjoyed it and your review paper. Learning graphs from data is an interesting area.
it was so clear, thank you
I found this video among the links in your video!
Pretty Cool.
Does "One" agent57 model covers all 57 atari games? Or, we train 57 models for each game?
40:37 One agent per game
An outstanding talk from a native Chinese scholar!
This is a superb talk. Kudos!
Great talk, thank you
Very interesting talk!
SUCCESS
I would like to add: did you ever think about putting two programs in the same arena, with or without human players? 😎🤓 I'm just asking as a human
Betancourt is an unusually good educator. His writing (from his website) is excellent
What would happen if two machines with the same version of AlphaStar are made to play against each other? Can we predict whether the first one that moves will always win? Assume they both use the same random weights with the same seed.
Transformers are amazing