My vision for mountain car (no coding in this video!)

  • Published: 13 Sep 2024

Comments • 13

  • @MobyMotion
    @MobyMotion 11 months ago

    I got that far! What I’d love more of is videos like the one that showed us the benefits of incorporating entropy regularisation. I want to learn all the tips and tricks that you can throw at RL when it’s not behaving as you want, and you’re one of the only channels I’ve seen that talks about that. Really helpful :)

  • @merv893
    @merv893 1 year ago

    Great vid, hope this makes your day.

  • @Drwildy
    @Drwildy 5 months ago +1

    I liked this video because it made me really think... OK, why do Policy Iteration and Value Iteration NOT work for the mountain car problem?
    But it also made me look into Q-learning, because many others have solved this problem with Q-learning, so why DOES Q-learning work?
    Well, Q-learning updates its estimates based not only on the current reward but also on the estimated future rewards. This makes it adept at handling environments where rewards are sparse and delayed, as it can effectively propagate value back from the rare occurrences of actually receiving a reward.
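
    A minimal tabular Q-learning sketch that illustrates this backup, assuming gymnasium's MountainCar-v0 and a simple grid discretization of position and velocity (the bin counts and hyperparameters are illustrative assumptions, not taken from the comment):

    import numpy as np
    import gymnasium as gym

    env = gym.make("MountainCar-v0")
    n_bins = (18, 14)  # position bins, velocity bins (arbitrary choice)
    low, high = env.observation_space.low, env.observation_space.high

    def discretize(obs):
        # Map the continuous (position, velocity) observation to integer grid indices
        ratios = (obs - low) / (high - low)
        idx = (ratios * (np.array(n_bins) - 1)).astype(int)
        return tuple(np.clip(idx, 0, np.array(n_bins) - 1))

    Q = np.zeros(n_bins + (env.action_space.n,))
    alpha, gamma, eps = 0.1, 0.99, 0.1

    for episode in range(5000):
        s, _ = env.reset()
        s = discretize(s)
        done = False
        while not done:
            # Epsilon-greedy action selection
            a = env.action_space.sample() if np.random.rand() < eps else int(np.argmax(Q[s]))
            obs, r, terminated, truncated, _ = env.step(a)
            s2 = discretize(obs)
            done = terminated or truncated
            # Q-learning backup: current reward plus discounted estimate of future return
            target = r + (0.0 if terminated else gamma * np.max(Q[s2]))
            Q[s + (a,)] += alpha * (target - Q[s + (a,)])
            s = s2

    The key line is the target that bootstraps on gamma * max Q[s2]: once any episode reaches the flag, that value leaks backwards through neighbouring states on later visits, which is the "backpropagation" of sparse reward described above.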

  • @elijahberegovsky8957
    @elijahberegovsky8957 1 year ago +2

    I’m trying to solve it right now using proximal policy optimisation with random network distillation (no reward shaping, just the curiosity module), and my goodness is it hard. It might just be an issue of hyperparameters, but it took around 1.5M timesteps just to get to the flag once, and after 7.5M it was still taking on average ~150 steps to get to the flag. The agent quickly explores everything easily accessible, the predictor network just sort of learns everything and stops giving rewards, and then the agent is pretty much back at square one. I want to build Never Give Up on top of it, but I wonder if RND is enough with appropriate parameter tweaking.
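
    For reference, the core of random network distillation is just an intrinsic reward equal to the prediction error between a fixed, randomly initialised target network and a predictor trained to imitate it. A minimal PyTorch sketch of that piece, assuming the 2-dimensional MountainCar observation (the network sizes, and how the bonus would be scaled and combined with the extrinsic reward in PPO, are assumptions rather than the commenter's actual setup):

    import torch
    import torch.nn as nn

    def make_net(obs_dim, out_dim=64):
        # Small MLP used for both the fixed target and the trained predictor
        return nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, out_dim))

    obs_dim = 2                      # MountainCar observation: (position, velocity)
    target = make_net(obs_dim)       # fixed, randomly initialised network
    predictor = make_net(obs_dim)    # trained to imitate the target
    for p in target.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(predictor.parameters(), lr=1e-4)

    def intrinsic_reward(obs_batch):
        # Curiosity bonus = predictor's error on these states; novel states give large error
        with torch.no_grad():
            y = target(obs_batch)
            err = ((predictor(obs_batch) - y) ** 2).mean(dim=-1)
        return err                   # scale and add to the extrinsic reward for PPO

    def update_predictor(obs_batch):
        # As the predictor learns a region of state space, the bonus there shrinks
        y = target(obs_batch)
        loss = ((predictor(obs_batch) - y) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

    The failure mode described above corresponds to intrinsic_reward shrinking everywhere the agent can already reach: once the predictor matches the target on those states, the bonus vanishes and so does the pressure to explore further.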

  • @dmsovetov
    @dmsovetov 6 months ago

    Hi, so how is it going? Did you manage to solve this problem with A2C?

  • @2theorists
    @2theorists 2 years ago

    Hi, fantastic content :) I am a bit new to the field of RL and would appreciate it if we could go through some project-based videos (explanation + code)

    • @rlhugh
      @rlhugh  2 years ago

      Yes, definitely. Any particular preferences for what kind of projects you'd be most interested in?

    • @2theorists
      @2theorists 2 years ago

      @@rlhugh Like some videos in which you solve mountain car with code explanations side by side. I really like your depth of knowledge, being a deep learning practitioner myself :)

    • @rlhugh
      @rlhugh  2 years ago

      @@2theorists awesome. Sounds great :)

  • @elijahberegovsky8957
    @elijahberegovsky8957 1 year ago

    By the way, have you actually coded it up after this video? I’d be very interested to hear what method ended up working

    • @rlhugh
      @rlhugh  1 year ago

      No, I started trying more mainstream videos, which crashed and burned, to be honest :P Anyway, I'm kind of dabbling in using Unity as an environment for now, and seeing where that takes me.

    • @rlhugh
      @rlhugh  1 year ago

      It's possible I should come back to the mountain car idea. I seem to be getting a fair few comments on this video (relative to my other videos :P )

    • @rlhugh
      @rlhugh  1 year ago +1

      Making YouTube videos is actually like a kind of RL, tbh. Very sparse rewards. The reward assignment problem is very challenging. Huge, combinatorially large action space...