Loved it. Keep up the work. It was amazing.
You are awesome, dude.
Idea - Could you do reinforcement learning where you learn MNIST digits? A random path is optimized for the best fit when viewing the pixels in each sample, like moving a camera or moving an MNIST digit in the camera view: move the digit so a simple model gets the best recognition. Maybe a spring-constant function, one that resets its value but takes the right value when there is input to be processed.
That's a good idea, though I don't think it needs reinforcement learning; spatial transformer networks already work on this principle, I believe.
I'm really having a bad time understanding how the DRL model needs to be trained with regard to the Bellman equation. Simple Q-learning is easy to comprehend, with the Q-table and continuous updates to the Q-values per action. But in deep RL I can't seem to grasp how the model needs to be updated. My first thought was to use DRL like a normal supervised learning problem: first create a Q-table using Q-learning, then train a deep model on that data, so the model generalizes to data the Q-learning has not seen yet. But obviously this is not the case. Do you have any reading recommendations so I can understand in more detail what is supposed to happen in the model-update phase of training?
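The difference from fitting a precomputed Q-table can be sketched in a few lines. This is a hypothetical toy setup (a plain table standing in for the network's outputs, made-up state/action/reward values): the point is that DQN training looks like supervised regression, but the "label" is the Bellman target, rebuilt on the fly from the network's own current estimates rather than taken from a fixed dataset.

```python
import random

# Toy stand-in for a Q-network: a table of "predicted" Q-values.
# In a real DQN these numbers come out of a neural net, and the final
# subtraction below becomes one gradient step on an MSE loss.
random.seed(0)
n_states, n_actions = 4, 2
Q = [[random.uniform(-0.1, 0.1) for _ in range(n_actions)]
     for _ in range(n_states)]

gamma, lr = 0.99, 0.1
# One made-up transition (s, a, r, s', done) from a replay buffer:
s, a, r, s_next, done = 0, 1, 1.0, 2, False

# Bellman target: r + gamma * max_a' Q(s', a').
# This IS the supervised "label", and it changes as Q changes.
target = r + (0.0 if done else gamma * max(Q[s_next]))

# Squared TD error; update only the action actually taken.
pred = Q[s][a]
td_error = pred - target
Q[s][a] -= lr * td_error  # one SGD step on 0.5 * (pred - target)^2
```

So instead of "train Q-learning first, then fit a net to the finished table", each minibatch's labels are recomputed from the current network and one supervised step is taken; the two interleave forever.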
Next time maybe you should create a model with the Subclassing API, which lets you create any kind of custom losses, metrics, etc., instead of a plain Sequential model wrapped in one function lol. And for such models, have you tried Batch Normalization, or the SELU activation with LeCun kernel initialization, to get much faster convergence?
Unsure if your comment is about the old Keras API or the new post-TensorFlow-2 API. TF2 makes this a breeze (and I will continue to use Sequential models, as they continue to be useful).
Really useful, and I'm also struggling with DQN for solving mazes. Currently I'm not using a target network, since a lot of tutorials don't include one either. Does that really matter? I will try to add it now. Thanks a lot.
It matters a great deal. Check out my more recent streams where I do DQN from scratch in PyTorch.
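The reason it matters can be sketched with the same toy-table stand-in as above (hypothetical values throughout): without a target network, every gradient step also moves the labels you are regressing toward, which destabilizes training. The fix is to compute Bellman targets from a frozen copy of the weights, synced only every N steps.

```python
import copy
import random

# Toy stand-in for online and target Q-networks (tables of Q-values).
random.seed(1)
n_states, n_actions = 4, 2
q_online = [[random.uniform(-0.1, 0.1) for _ in range(n_actions)]
            for _ in range(n_states)]
q_target = copy.deepcopy(q_online)  # frozen copy, used ONLY for labels

gamma, lr, sync_every = 0.99, 0.1, 100

for step in range(300):
    # Made-up transition; a real agent would sample a replay buffer here.
    s = random.randrange(n_states)
    a = random.randrange(n_actions)
    r, s_next = 1.0, random.randrange(n_states)

    # Label comes from the TARGET copy, not the network being updated,
    # so the regression target stays fixed between syncs.
    label = r + gamma * max(q_target[s_next])
    q_online[s][a] -= lr * (q_online[s][a] - label)

    if step % sync_every == 0:
        q_target = copy.deepcopy(q_online)  # hard sync every N steps
```

Some implementations use a soft sync instead (target ← tau * online + (1 - tau) * target each step); both serve the same purpose of keeping the labels quasi-stationary.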
@@JackofSome Thanks a lot, will check it out :)
I was wondering if you completed the robot docking problem. I like robotics a lot.
Unfortunately no. I haven't had enough time to continue my RL work, sadly.
@@JackofSome Thanks a lot for your reply, I will give it a try. Is it okay if I ask you for some suggestions if I get stuck?
Yeah that's fine
What does this line do?
trainable_model.compile(optimizer=Adam(), loss=lambda yt, yp: yp)
What is lambda yt, and what is yp?
lambda is the keyword in Python to define a function inline. I recommend reading up on it; it's pretty useful.
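Concretely, `lambda yt, yp: yp` is an anonymous function taking the two arguments Keras passes to every loss (y_true and y_pred) and returning y_pred unchanged. That identity-loss trick is commonly used when the model's final output already is the loss value, computed inside the graph, so the "loss function" has nothing left to do. A quick sketch:

```python
# The lambda from the compile call, pulled out on its own:
loss_fn = lambda yt, yp: yp

# It is equivalent to this named function:
def loss_fn_named(yt, yp):
    return yp  # ignore the labels; the prediction IS the loss

# yt (y_true) is ignored entirely; yp (y_pred) passes straight through:
result = loss_fn(0, 3.5)  # -> 3.5
```

So minimizing this "loss" simply minimizes whatever scalar the model emits.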
@@JackofSome thanks!
Where is livestream number 2? I can't find it, and I NEED IT.
ruclips.net/video/_7D8W-uUSxw/видео.html
I made it unlisted because of the unmitigated disaster it was. Trust me you don't need it 😅
Why is this [DEPRECATED]? Is there a new method of maze solving with deep RL?