You Can't Have AI Safety Without Inclusion

  • Published: 10 Jul 2024
  • Speaker: Dylan Hadfield-Menell, Assistant Professor, MIT
    Abstract: The challenge of specifying goals for agents has long been recognized; Kerr's seminal 1975 paper 'On the Folly of Rewarding A while Hoping for B' highlights the unintended consequences of misaligned reward systems. This issue lies at the heart of AI alignment research, which seeks to design incentive structures that reliably guide AI systems to achieve our intended objectives. In this talk, I will explore how brittle alignment emerges as an inherent result of incomplete goal specification. Using a theoretical model, I will identify sufficient conditions under which the unconstrained optimization of any goal that fails to capture all features of value ultimately leads to worse outcomes than forgoing optimization. Furthermore, I will extend this theoretical framework to address the importance of inclusivity in value specification. Reinterpreting the model so that different features of value represent the diverse perspectives of individuals shows that optimizing an incomplete goal can be expected to adversely impact those whose values are not taken into account. Consequently, technology that aligns an agent with the values of a single person or organization poses significant risks. I will conclude by outlining potential research avenues for multi-stakeholder alignment and emphasizing the necessity of decentralized value learning and specification. (An illustrative sketch of this overoptimization effect follows the listing below.)
    Bio: I am the Bonnie and Marty (1964) Tenenbaum Career Development Assistant Professor of EECS at MIT. I run the Algorithmic Alignment Group in the Computer Science and Artificial Intelligence Laboratory (CSAIL), and I'm also a Schmidt Sciences AI2050 Early Career Fellow. My research develops methods to ensure that the behavior of AI systems aligns with the goals and values of their human users and society as a whole, a concept known as 'AI alignment'. My group and I work to address alignment challenges in multi-agent systems, human-AI teams, and societal oversight of machine learning. Our goal is to enable the safe, beneficial, and trustworthy deployment of AI in real-world settings.
  • Science
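
The overoptimization result summarized in the abstract can be made concrete with a small numerical experiment. The sketch below is illustrative only, not the speaker's actual model: it assumes logarithmic (diminishing-returns) utilities over three features of value, a shared resource budget that couples them, and plain gradient ascent on a proxy goal that measures only two of the three features.

```python
# Illustrative sketch (assumptions, not the speaker's model): true utility
# depends on three features of value, but the proxy goal measures only two.
# A shared resource budget couples the features, and log utilities give
# diminishing returns, so pushing the measured features up starves the
# unmeasured one.
import numpy as np

def true_utility(s):
    """Value accrues from ALL features, with diminishing returns."""
    return np.sum(np.log(s))

def proxy_utility(s, measured):
    """The specified (incomplete) goal counts only the measured features."""
    return np.sum(np.log(s[measured]))

budget = 3.0                    # resource constraint: s.sum() stays at budget
s = np.full(3, budget / 3)      # start from an even allocation
measured = np.array([0, 1])     # feature 2 is omitted from the goal
u0 = true_utility(s)            # true utility before any optimization

for t in range(301):
    # Gradient ascent on the proxy: d/ds log(s) = 1/s on measured features,
    # zero on the unmeasured one; then renormalize onto the budget.
    grad = np.zeros_like(s)
    grad[measured] = 1.0 / s[measured]
    s = np.clip(s + 0.01 * grad, 1e-9, None)
    s *= budget / s.sum()
    if t % 100 == 0:
        print(f"step {t:3d}: proxy={proxy_utility(s, measured):+.3f}  "
              f"true={true_utility(s):+.3f}  (started at {u0:+.3f})")
```

Under these assumptions the proxy objective climbs monotonically while the true utility falls below its unoptimized starting value, because the unmeasured feature is driven toward zero: the 'worse outcomes than forgoing optimization' the abstract describes.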
