I like Diana's lectures, very detailed derivations, learned a lot. Thanks!
At 1:16:15, in the evaluation table for the new policy, the q-value for a1 (the right action) at s0 should be 0 and not 0.9, I guess.
The reason it should be zero is that at state s0, under the greedy policy pi_(k+1), the agent never reaches the terminal state. Is that what you're thinking? That's my thought too.
I still find the A in V_(k+1) = A T* V_k very hard to understand. She mentions that "the A means we are going to do this iteration step at k approximately." So what is A? Is it an approximation function? An update operator? What is it?
I feel like Diana intentionally put a buzzer in the theorem proof part to get us to wake up :D
Haha, that's true
What is T*?
I seeee, it's the value function
@@kyouichilogpose8059 It's actually an operator that acts in function space: it takes a value function and returns a new value function. Specifically, T* is the Bellman optimality operator (the operator form of the Bellman optimality equation), which was touched on in her previous lecture.
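To make "operator in function space" concrete, here's a minimal sketch of T* and value iteration on a toy 2-state, 2-action MDP. The MDP here (transitions P, rewards R) is made up for illustration and is not the one from the lecture:

```python
import numpy as np

n_states, n_actions = 2, 2
gamma = 0.9

# P[s, a, s'] = transition probability; R[s, a] = expected reward.
# Toy dynamics (assumed for this example): in s0, a0 stays, a1 moves
# to s1 with reward 1; in s1, a0 stays, a1 moves back to s0.
P = np.zeros((n_states, n_actions, n_states))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
P[1, 0, 1] = 1.0
P[1, 1, 0] = 1.0
R = np.array([[0.0, 1.0],
              [0.0, 0.0]])

def T_star(V):
    """Bellman optimality operator:
    (T*V)(s) = max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) V(s') ].
    Input: a value function (vector over states).
    Output: a new value function -- hence 'operator in function space'."""
    Q = R + gamma * (P @ V)   # Q[s, a], shape (n_states, n_actions)
    return Q.max(axis=1)

# Value iteration is just repeated application: V_{k+1} = T* V_k.
V = np.zeros(n_states)
for _ in range(200):
    V = T_star(V)
print(V)  # converges to the fixed point V* satisfying V* = T* V*
```

The "A" question from the thread corresponds to *approximate* value iteration: when the state space is too big to store V exactly, each T* step is followed by a projection/fitting step (the A), so the update becomes V_(k+1) = A T* V_k.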
quite a bad teacher