Q-Learning Tutorial 3: Train Gymnasium MountainCar-v0 on a Continuous Observation Space

  • Published: 27 Nov 2024

Comments • 33

  • @johnnycode
    @johnnycode  1 year ago +4

    Ready for Deep Q-Learning? ruclips.net/video/EUrWGTCGzlA/видео.html

  • @JordanMetroidManiac
    @JordanMetroidManiac 10 months ago +5

    This is a great video for people who don't really want to bother with traditional Q-learning, since it's almost always impractical due to the size of the Q-table, but who still want to see it actually solve a problem with a simple, step-by-step write-up and explanation of the algorithm. That's why I think this video is one of the best (applied) machine learning videos out there. Straight to the point, no fluff like all the popular videos have. Simply writing the algorithm out and explaining what each part does. You don't explain how and why Q-learning works, but that's obviously not the point of the video. The point is to connect the theory to an application of it. Thanks for doing that so well!

  • @BillPark-ey6ih
    @BillPark-ey6ih 3 months ago +2

    What I learned in this video:
    1. Discretizing a continuous state space
    2. Cleanly organizing code with training and rendering flags
    3. Displaying the learning results
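For reference, the discretization idea in point 1 can be sketched like this (the bucket counts and Q-table layout here are assumptions, not necessarily the video's exact values):

```python
import numpy as np

# MountainCar-v0 observations are continuous: [position, velocity].
# Position spans roughly -1.2..0.6 and velocity -0.07..0.07 (per the env spec).
pos_bins = np.linspace(-1.2, 0.6, 20)    # assumed bucket count
vel_bins = np.linspace(-0.07, 0.07, 20)  # assumed bucket count

def discretize(obs):
    """Map a continuous observation to integer Q-table indices."""
    return np.digitize(obs[0], pos_bins), np.digitize(obs[1], vel_bins)

# Q-table indexed by (position bucket, velocity bucket, action);
# MountainCar-v0 has 3 discrete actions (push left, no push, push right).
q_table = np.zeros((len(pos_bins) + 1, len(vel_bins) + 1, 3))
```

Every continuous state now maps to one Q-table cell, which keeps the table small enough for tabular Q-learning.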

  • @ApexArtistX
    @ApexArtistX 1 year ago +4

    Thanks for using Gymnasium, most outdated YouTube tutorials are stuck with Gym.

    • @revimfadli4666
      @revimfadli4666 1 year ago

      What are the major improvements compared to old gym?

    • @johnnycode
      @johnnycode  11 months ago

      @@revimfadli4666 I don't think there are major improvements between Gymnasium and Gym. Support and bug fixes were transferred to another team, which renamed Gym to Gymnasium.

    • @revimfadli4666
      @revimfadli4666 11 months ago

      @@johnnycode oh OK thanks, so outdated tutorials aren't that outdated after all?

    • @johnnycode
      @johnnycode  11 months ago

      @@revimfadli4666 That is correct😀

  • @Sitotaw101
    @Sitotaw101 1 month ago

    I have been using a different coding structure with DQN to solve the MountainCar-v0 problem with discrete actions. The main problem is that every time I run the same code, I get different results, sometimes good and sometimes bad. What do you advise? This problem is really embarrassing me. Thank you for this video and keep up the good work, brother. I wish I could show you my code to discuss it, but I don't think YouTube allows that. Anyway, I can't wait to see your response.

    • @johnnycode
      @johnnycode  1 month ago

      I have a video using DQN on MountainCar-v0 using discrete actions:
      ruclips.net/video/oceguqZxjn4/видео.html
      If you wrote your own DQN code, you can compare it to mine and check for mistakes:
      ruclips.net/p/PL58zEckBH8fCMIVzQCRSZVPUp3ZAVagWi
      You can use Stable Baselines3's DQN as well:
      ruclips.net/video/OqvXHi_QtT0/видео.html
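On the run-to-run variance question above: some variance in DQN is normal (random weight initialization, ε-greedy exploration, environment randomness). Fixing the random seeds makes runs reproducible; a minimal sketch (the PyTorch and Gymnasium lines are shown as comments and assume those libraries):

```python
import random

import numpy as np

SEED = 0
random.seed(SEED)     # Python's own RNG (e.g. epsilon-greedy action choices)
np.random.seed(SEED)  # NumPy RNG (e.g. replay-buffer sampling)
# If using PyTorch:   torch.manual_seed(SEED)
# If using Gymnasium: env.reset(seed=SEED); env.action_space.seed(SEED)
```

With every RNG seeded the same way, repeated runs of the same code produce the same trajectory of results, which makes debugging far easier.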

  • @gotjucom
    @gotjucom 9 months ago

    Thanks dude, this was very helpful. Subbed.

  • @YoungjuPark86
    @YoungjuPark86 10 months ago

    Hi, thank you for producing this fantastic video! I'm curious which versions of Python and the libraries you used. I'm currently facing issues installing the pickle library due to environment-solving failures :(

    • @YoungjuPark86
      @YoungjuPark86 10 months ago +1

      Never mind, I solved it

  • @NeonThorax
    @NeonThorax 1 year ago +1

    Good concise video

  • @ElisaFerrari-q5i
    @ElisaFerrari-q5i 4 months ago

    Is there a possibility that, due to the slippery flag, the agent chooses the best action (knowing the best path) but still falls into the hole?

    • @johnnycode
      @johnnycode  4 months ago +1

      Yes, the agent can fall into the hole when slippery is on even if it knows the best path. Think of "best path" as the path with the highest chance of success.
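FrozenLake's documented slippery dynamics make this concrete: the intended action executes with probability 1/3, and the agent slides to one of the two perpendicular directions with probability 1/3 each. A minimal simulation of that rule (not using the env itself):

```python
import random

# FrozenLake actions: 0=left, 1=down, 2=right, 3=up.
PERPENDICULAR = {0: (1, 3), 1: (0, 2), 2: (1, 3), 3: (0, 2)}

def slippery_step(intended):
    """Return the action actually executed under is_slippery=True."""
    return random.choice((intended,) + PERPENDICULAR[intended])

random.seed(0)
moves = [slippery_step(2) for _ in range(9000)]  # always intend "right"
frac_intended = moves.count(2) / len(moves)      # close to 1/3
```

So even the optimal policy only maximizes the probability of reaching the goal; any single episode can still end in a hole.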

  • @luiscaamano
    @luiscaamano 1 month ago

    He just goes up and down pasting code, not teaching a thing, but the code seems to work. I need to pause this guy's ChatGPT voice a hundred times.

  • @AliRaza-mc1ob
    @AliRaza-mc1ob 1 year ago +1

    Can you do the code using a NN instead of a Q-table, in TensorFlow or PyTorch, please?

    • @johnnycode
      @johnnycode  11 months ago

      Hi, in case you're still looking for a Deep Q-Learning video, I've recently released a detailed one, but it's on the FrozenLake environment: ruclips.net/video/EUrWGTCGzlA/видео.html

  • @rajnikushwaha1459
    @rajnikushwaha1459 1 year ago

    How do you use the pygame window here?

  • @funnymoments6874
    @funnymoments6874 1 year ago

    What version of the gym library is used in this video? Is it the latest?

    • @johnnycode
      @johnnycode  1 year ago

      Yup, it’s the latest version of the Gymnasium library (as of recording, a few weeks ago). I have a link in the description for installation if you run into trouble. Note that the original Gym library is no longer maintained, the support has been moved to Gymnasium.

    • @funnymoments6874
      @funnymoments6874 1 year ago

      @theavgdev I have a file-not-found error on line 12

    • @funnymoments6874
      @funnymoments6874 1 year ago

      When I change it to wb instead of rb, a permission error appears
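For anyone hitting the same errors: the usual pattern is to write the Q-table with mode `'wb'` after training and read it with `'rb'` when evaluating; a FileNotFoundError on `'rb'` just means training hasn't produced the file yet. A sketch (the filename here is an assumption):

```python
import pickle

import numpy as np

q = np.zeros((20, 20, 3))  # stand-in for a trained Q-table

# After training: 'wb' (write binary) creates or overwrites the file.
with open("mountain_car.pkl", "wb") as f:
    pickle.dump(q, f)

# When evaluating: 'rb' (read binary) needs the file to already exist.
with open("mountain_car.pkl", "rb") as f:
    q_loaded = pickle.load(f)
```

A PermissionError on `'wb'` usually means the script can't write to the current directory (or the file is locked by another program), not a pickle problem.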

  • @samarthkrishnamuthy9086
    @samarthkrishnamuthy9086 1 year ago

    how much time did it take to complete 5000 episodes?

    • @johnnycode
      @johnnycode  1 year ago

      I think it only took one minute.

    • @cr4zygleb621
      @cr4zygleb621 1 year ago

      @@johnnycode
      Can you tell me how to speed up training? I'm trying CarRacing-v2 and can't speed up the car. All races run in real time, so I will never finish training the model)

    • @johnnycode
      @johnnycode  1 year ago +2

      @@cr4zygleb621 You must turn off the animation during training by setting render mode to None. For example: env = gym.make("CarRacing-v2", render_mode=None)

  • @ozgurgeylani6717
    @ozgurgeylani6717 11 months ago

    Where is the pkl file?

    • @johnnycode
      @johnnycode  11 months ago

      Run the code with training turned on, then you'll see a new pkl file.

  • @HaulatuDAhiru
    @HaulatuDAhiru 8 months ago

    Your email, please?

    • @johnnycode
      @johnnycode  7 months ago

      Please drop your questions here.