Representation Learning and Information Retrieval -SIGIR 2024, Keynote Speaker, Yiming Yang

  • Published: 23 Aug 2024
  • Abstract: How to best represent words, documents, queries, entities, relations, and other variables in information retrieval (IR) and related applications has been a fundamental research question for decades. Early IR systems relied on independence assumptions about words and documents for simplicity and scalability, which were clearly sub-optimal from a semantic point of view. The rapid development of deep neural networks in the past decade has revolutionized representation learning technologies for contextualized word embedding and graph-enhanced document embedding, leading to the new era of dense IR. This talk highlights such impactful shifts in representation learning for IR and related areas, the new challenges that come with them and their remedies, including our recent work in large-scale dense IR, in graph-based reasoning for knowledge-enhanced predictions, in self-refinement of large language models (LLMs) with retrieval augmented generation (RAG) and iterative feedback, in principle-driven self-alignment of LLMs with minimum human supervision, etc. More generally, the power of such deep learning goes beyond IR enhancements, e.g., for significantly improving the state-of-the-art solvers for NP-Complete problems in classical computer science.
    Bio: Yiming Yang is a professor with a joint appointment at the Language Technologies Institute (LTI) and the Machine Learning Department (MLD) in the School of Computer Science, Carnegie Mellon University (CMU). She joined CMU as a faculty member in 1996, and her research has focused on machine learning paradigms, algorithms and applications in a broad range, including her influential early work in large-scale text classification and information retrieval, and more recently on cutting-edge technologies for large language models (e.g., XL-Net), neural-network architecture search (e.g., DARTS), reasoning with graph neural networks, reinforcement learning and diffusion models for solving NP-complete problems (e.g., DIMES and DIFFUSCO), AI-enhanced self-alignment of LLMs, knowledge-enhanced information retrieval, LLMs with RAG (Retrieval Augmented Generation), large foundation models for scientific domains, etc. She became a member of the SIGIR Academy in 2023, in recognition of her contributions at the intersection of Machine Learning and Information Retrieval.

Comments • 1

  • @yimingyang4254
    1 month ago +2

    Due to some AV issues, I could not hear the questions clearly on the stage. So, let me clarify some of the answers retrospectively.
    Question 1. Why did the easy-to-hard voting strategies perform worse than the SFT baseline with greedy decoding when the candidate-pool size is less than 10?
    Answer: The voting strategies used a non-deterministic sampling process, with the temperature set to 0.7 (as shown in the slide), while the baseline had the temperature set to 0 (deterministic). This means that the baseline always picked its top candidate per problem instance, but the voting strategies may miss it when the pool size is rather small.
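    The effect described above can be reproduced with a toy simulation (not the actual experiment from the talk): the candidate distribution below and the pool sizes are illustrative assumptions, chosen only to show how majority voting at temperature 0.7 can miss the top candidate when the pool is small.

    ```python
    import math
    import random
    from collections import Counter

    random.seed(0)

    # Toy logits over 5 candidate answers for one problem instance.
    # Candidate 0 is both correct and the model's top choice, so
    # greedy decoding (temperature 0) is always right here.
    LOGITS = [2.0, 1.6, 1.5, 1.4, 1.3]

    def softmax(logits, temperature):
        """Temperature-scaled softmax over a list of logits."""
        scaled = [l / temperature for l in logits]
        m = max(scaled)
        exps = [math.exp(s - m) for s in scaled]
        z = sum(exps)
        return [e / z for e in exps]

    def sample(probs):
        """Draw one candidate index from a categorical distribution."""
        r = random.random()
        acc = 0.0
        for i, p in enumerate(probs):
            acc += p
            if r < acc:
                return i
        return len(probs) - 1

    def majority_vote_accuracy(pool_size, temperature=0.7, trials=2000):
        """Fraction of trials where the majority vote over `pool_size`
        sampled candidates lands on the correct answer (index 0)."""
        probs = softmax(LOGITS, temperature)
        hits = 0
        for _ in range(trials):
            votes = Counter(sample(probs) for _ in range(pool_size))
            winner = votes.most_common(1)[0][0]
            hits += (winner == 0)
        return hits / trials

    # Greedy baseline: deterministic, always picks candidate 0 here.
    # Voting catches up only as the candidate pool grows.
    for k in (1, 5, 10, 40):
        print(f"pool size {k:2d}: vote accuracy = {majority_vote_accuracy(k):.2f}")
    ```

    With a small pool the sampled votes frequently scatter across near-tied candidates, so the vote winner can differ from the greedy top choice; as the pool grows, the vote converges to the mode of the sampling distribution.
    
    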
    Question 2. How does the proposed method differ from GAN?
    Answer: The discriminator in GAN is trained to label each instance as a natural-language output (yes) or un-natural (no), while the evaluator in easy-to-hard generalization is trained to tell whether a math solution is correct (yes) or wrong (no) for a given math problem. Even if an answer is perfect in English, it still can be wrong mathematically.
    Question 3. Can we use this idea to improve LLM pre-training?
    Answer: Maybe not. When we have enough (unlabeled) data for pre-training an LLM, we may not benefit much from easy-to-hard generalization. On the other hand, if we do not have enough data for pre-training an LLM, we may also not be able to train the evaluator well. One may ask: what if we could train the evaluator on rare patterns that the current LLM cannot handle well? Perhaps yes, but this is a big “if”. That is, the challenge then shifts to how to obtain annotated data on the rare patterns.
    I hope the above answers help.