[DEPRECATED] Visual Maze Solving with Deep Reinforcement Learning in Keras | Detailed Explanation

  • Published: 8 Sep 2024

Comments • 20

  • @ashishbhong5901
    @ashishbhong5901 1 year ago

    Loved it. Keep up the work. It was amazing.

  • @yashbhinwal2801
    @yashbhinwal2801 3 years ago

    you are awesome dude

  • @perlindholm4129
    @perlindholm4129 4 years ago

    Idea - Could you do reinforcement learning where you learn MNIST digits? A random path is optimized for the best fit when viewing the pixels of each sample, like moving a camera, or moving an MNIST digit within the camera view. Move the digit so that a simple model gets the best recognition. Maybe a spring-constant function: one that resets its value but takes the right value when there is input to be processed.

    • @JackofSome
      @JackofSome  4 years ago

      That's a good idea, though I don't think it needs reinforcement learning. Spatial transformer networks already work on this principle, I believe.

  • @fjolublar
    @fjolublar 3 years ago

    I'm really having a bad time understanding how a DRL model needs to be trained with regard to the Bellman equation. Simple Q-learning is easy to comprehend, with its Q-table and continuous updates to the Q-values per action. But in deep RL I can't seem to grasp how the model needs to be updated. My first thought was to use DRL like a normal supervised learning problem: first create a Q-table using Q-learning, then train a deep model on that data, hoping it generalizes to data the Q-learning has not seen yet. But obviously this is not the case. Do you have any recommendations on what to read so I can slowly understand what is supposed to happen in the model-update phase of training?
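    The model-update question above can be sketched in a few lines. This is a minimal illustration only, with a numpy array standing in for a real network (all names here are illustrative, not from the video): in deep RL there is no explicit Q-table; instead, each transition becomes a supervised-style regression step whose label is the Bellman target.

```python
import numpy as np

n_states, n_actions = 5, 2
gamma, lr = 0.9, 0.1

# Tiny "network": Q(s, a) = W[s, a]. With one-hot states this is the
# smallest possible function approximator, but the update below is exactly
# the deep-RL recipe: regress Q(s, a) toward the Bellman target.
W = np.zeros((n_states, n_actions))

def dqn_step(W, s, a, r, s_next, done):
    # Label = r + gamma * max_a' Q(s', a'); no bootstrapping at terminal states.
    target = r + (0.0 if done else gamma * W[s_next].max())
    td_error = target - W[s, a]   # squared-error gradient on one (s, a) entry
    W[s, a] += lr * td_error
    return td_error

# Replaying the same transition shrinks the TD error, i.e. the model is
# being fit to its regression targets, with no Q-table anywhere.
err_before = dqn_step(W, 0, 1, 1.0, 2, False)
err_after = dqn_step(W, 0, 1, 1.0, 2, False)
```

    In a real DQN the gradient step runs over a minibatch sampled from a replay buffer, but the target computation is the same.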

  • @rp88imxoimxo27
    @rp88imxoimxo27 3 years ago

    Next time maybe you should create a model with the Subclassing API, which lets you create any kind of custom losses, metrics, etc., instead of using a poor Sequential model wrapped in one function lol. And for such models, have you tried using Batch Normalization, or SELU activation with LeCun kernel initialization, to get much faster convergence?

    • @JackofSome
      @JackofSome  3 years ago

      Unsure if your comment is about the old Keras API or the new post-TensorFlow-2 API. TF2 makes this a breeze (and I will continue to use Sequential models as long as they continue to be useful).

  • @zhaoyang6964
    @zhaoyang6964 4 years ago

    Really useful, and I'm also struggling with DQN for solving mazes. Currently I'm not using a target network, since a lot of tutorials don't include one either. Does that really matter? I will try to add it now. Thanks a lot.

    • @JackofSome
      @JackofSome  4 years ago

      It matters a great deal. Check out my more recent streams where I do DQN from scratch in PyTorch.

    • @zhaoyang6964
      @zhaoyang6964 4 years ago

      @@JackofSome Thanks a lot, will check it out :)
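    The target-network idea this thread refers to can be illustrated briefly. A hedged sketch, with numpy arrays standing in for the online and target networks (names are illustrative): Bellman targets are computed from a frozen copy of the weights that is only periodically synced, so the regression labels don't chase the constantly moving online network.

```python
import numpy as np

gamma = 0.9

# Stand-ins for the two networks' weights: Q(s, a) = q[s, a].
q_online = np.zeros((5, 2))   # updated every training step
q_target = q_online.copy()    # frozen copy, used only for targets

def bellman_target(r, s_next, done):
    # Targets come from the frozen copy, not the live online network.
    return r + (0.0 if done else gamma * q_target[s_next].max())

def sync_target():
    # Called periodically (e.g. every N steps) to refresh the frozen copy.
    q_target[:] = q_online

# The online network can change without moving the targets...
q_online[2, 0] = 5.0
t_before = bellman_target(1.0, 2, False)   # still based on the old copy
sync_target()
t_after = bellman_target(1.0, 2, False)    # ...until the sync happens
```

    Without the frozen copy, every gradient step would also move the labels, which is one reason naive DQN training can be unstable.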

  • @rajroy2426
    @rajroy2426 3 years ago

    I was wondering if you completed the robot docking problem; I like robotics a lot.

    • @JackofSome
      @JackofSome  3 years ago

      Unfortunately no. I haven't had enough time to continue my RL work sadly.

    • @rajroy2426
      @rajroy2426 3 years ago

      @@JackofSome Thanks a lot for your reply, I will give it a try. Is it okay if I ask you for some suggestions if I get stuck?

    • @JackofSome
      @JackofSome  3 years ago +1

      Yeah that's fine

  • @mike_o7874
    @mike_o7874 2 years ago

    What does this line do?
    trainable_model.compile(optimizer=Adam(), loss=lambda yt, yp: yp)
    What is lambda yt, and what is yp?

    • @JackofSome
      @JackofSome  2 years ago +1

      lambda is the keyword in Python for defining a function inline. I recommend reading up on it; it's pretty useful.

    • @mike_o7874
      @mike_o7874 2 years ago

      @@JackofSome thanks!
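    The lambda in that compile() call can be unpacked with plain Python. This is an illustration of the reply above; the Keras-specific reading at the end is a common pattern, not something stated in the thread.

```python
# lambda builds a small anonymous function inline; these two are equivalent:
def ignore_first(yt, yp):
    return yp

same_thing = lambda yt, yp: yp

# Keras calls a loss function as loss(y_true, y_pred). Returning yp
# unchanged means "whatever the model outputs already *is* the loss value",
# a trick used when the actual loss is computed inside the model graph.
value = same_thing("labels", 3.14)
```

    So yt is the (ignored) labels argument and yp is the model's output, which here doubles as the loss.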

  • @michaelbittar964
    @michaelbittar964 4 years ago

    Where is livestream number 2? I can't find it and I NEED IT.

    • @JackofSome
      @JackofSome  4 years ago +1

      ruclips.net/video/_7D8W-uUSxw/видео.html
      I made it unlisted because of the unmitigated disaster it was. Trust me you don't need it 😅

  • @doonamkim7593
    @doonamkim7593 1 year ago

    Why is this [DEPRECATED]? Is there a newer method for maze solving with deep RL?