Q Learning Intro/Table - Reinforcement Learning p.1

  • Published: Feb 5, 2025

Comments • 317

  • @aravindsuresh8157 · 5 years ago +166

    When I think about a topic, he posts it. Awesome!

  • @marsf7089 · 1 year ago +5

    Great video and such an energetic presentation. I was learning reinforcement learning this week, and this is the only material that doesn't just talk about vague and abstract concepts. Such concrete and deliberate content!

  • @hobby_coding · 4 years ago +5

    I watched this months ago and didn't understand a thing. Now, after watching David Silver's course, I can finally understand what he's talking about. If you are like me, don't get frustrated, just read more on the subject.

    • @Fire6 · 4 years ago

      Yeah I think this is not really for complete beginners aha

  • @vickymar3836 · 5 years ago +14

    There is an acute lack of good reinforcement learning study materials on the net (especially videos). I literally jumped from my seat.
    I want to binge watch this series.

  • @Stinosko · 5 years ago +141

    I'm more than happy to support this awesome channel! Keep up the great work, I love your tutorials :-D

    • @Ilya_Sss · 4 years ago +6

      Thank you so much for supporting free education! You are a great man!

  • @ManuelMendez1 · 4 years ago

    People clicking on the "skip ad" button: these people make money out of those ads also, avoiding this is like saying: "Thank you for taking your time to teach people like me, who otherwise would have to pay for this".

  • @saisritejakuppa9856 · 5 years ago +22

    The wait is over... the only reason I came into this AI field from electrical engineering is just by watching your videos instead of taking some random courses. Keep going... Thanks a lot sentdex.

    • @AdityaGupta-je9vh · 6 months ago

      Hey dude, it’s been 5 years now and a lot has changed, are you still doing AI?

  • @ammarshahzad9627 · 1 year ago +3

    If someone is following this tutorial with the new gymnasium update, you need new_state, reward, term, trun, _ = env.step(action) instead of new_state, reward, done, _ = env.step(action). This should be followed by:
    if term or trun:
        done = True
    This will ensure that the env runs fine.

  • @andreydev2132 · 5 years ago +31

    One of the most interesting topics for me. Please, continue! It would be very interesting to see self-driving car with Q-Learning (table / deep)

  • @Artificial_Intelligence_AI · 5 years ago +11

    I have completed several machine & deep learning courses over these months (from Udemy, YouTube, Coursera, etc.), and I have even read some famous books on this field. I think your courses are easily in the top 3, because they are a perfect combination of a well-conducted intuitive approach and a fundamental programming part, executed even better.
    Congratulations on these amazing videos, you deserve our gratitude. I really hope you get more subscribers in the following years; your content is still underrated.
    Regards from Spain.

    • @sentdex · 5 years ago +2

      That's really awesome to hear!

  • @MrDan2512 · 5 years ago +11

    Just what I needed for my master thesis.

  • @ahmedhany5037 · 5 years ago +3

    I can't thank you enough for these awesome tutorials you give us. It is the most practical reinforcement learning guide I have ever seen. Please keep up this AWESOME work.

    • @sentdex · 5 years ago

      Happy to do it!

  • @lunapopo8415 · 2 years ago +37

    For the latest gym package, to avoid backward-compatibility warnings:
    1) define env = gym.make("MountainCar-v0", new_step_api=True, render_mode='human')
    2) remove env.render()

    • @atharvachouhan474 · 2 years ago +1

      bro you literally saved my life thanks a lot

    • @alansabok7462 · 2 years ago +20

      As of November 2022, env.step() also produces more variables, so you have to replace line 10 with:
      new_state, reward, done, truncated, info = env.step(action)

    • @onlyshorts6837 · 1 year ago +2

      Has anyone else encountered a problem? When I write his project, it does run but nothing is shown. I am using Python version 3.10.

    • @lommoberry7312 · 1 year ago

      @onlyshorts6837 You need to import pygame.

    • @Nxck2440 · 1 year ago +6

      The code working for me was (also required $ pip install pygame):
      import gym  # OpenAI gym
      env = gym.make("MountainCar-v0", render_mode='human')
      env.reset()
      done = False
      while not done:
          action = 2  # go right
          new_state, reward, done, truncated, info = env.step(action)
          env.render()
      env.close()

  • @hyperistica · 5 years ago +3

    I just got started with reinforcement learning and your tutorial is really helpful. On a side note, I also love the way you laugh (that deep inhale gets me every time).

  • @jayhu6075 · 5 years ago +2

    Hi, first the switch from JavaScript to Python, and then a topic about reinforcement learning. That is amazing...
    The learning curve you explain makes the life of a developer so easy and simple. Thank you, Mr. Sentdex.

  • @ayaanp · 5 years ago +1

    I LOVE THIS! I have been wanting to learn reinforcement learning and this is the start. Your videos are NEVER bad. You are teaching this 9-year-old (me) with your website and YouTube channel. I now know all the Python basics, AI, robotics, almost all because of YOU!

  • @Totial · 5 years ago +3

    Man, you make learning so easy. I think you have no idea how much you are changing this world for good! So many tutorials out there link to you, and so many people are becoming able to reach their dreams because of you. Respect!! Keep up the amazing job.

  • @abhinavpy2748 · 5 years ago +1

    Most awaited topic. And it comes from the one and only Sentdex!! Thanks a lot. Please make as many tutorials as possible.

  • @jorostuff · 5 years ago +2

    I feel like this guy knows everything. Whatever I google, he has a tutorial on that.

  • @scavallarin · 2 years ago +2

    Incredible work, the best explanation I have found. It makes these concepts so easy to understand compared to many books on this topic that I have been studying. Thanks for your awesome work!!!

  • @michaelfrangos8587 · 5 years ago

    You're the best. My simple networks are just not doing the job well enough. Perhaps this series will be what's needed.

  • @s16ray_ · 5 years ago +3

    Learned a lot from you.... Started machine learning from your channel only...

  • @priyankrajsharma · 5 years ago

    Q-learning is difficult to understand. I read so many blogs before coming to your channel. You made it easy.

  • @renanbuchan1633 · 2 years ago

    “We just do this! *shows big complicated equation* duh!”
    Earned a subscriber lol

  • @loukask.9111 · 5 years ago +52

    Dude, how do you always know what kinds of videos I need?! This is perfect!

    • @GabrielCarvv · 4 years ago +5

      He's made a secretive and expansive AI that monitors every single viewer

  • @varmhund · 2 years ago +12

    For others coming here in late 2022 struggling with the rendering due to module updates:
    import gym
    env = gym.make("MountainCar-v0", render_mode="human")
    observation, info = env.reset()
    done = False
    while not done:
        action = 2
        observation, reward, done, truncated, info = env.step(action)
        if done or truncated:
            observation, info = env.reset()
    env.close()

    • @danielma2824 · 2 years ago

      thank you

    • @shreyashsinha933 · 1 year ago

      Hi, could you point to a resource where I could find an updated version of this?

    • @onlyshorts6837 · 1 year ago

      How on god's green earth did you find the answer?
      Please

    • @dabunnisher29 · 9 months ago

      Thank you so much.

  • @ahmedelsayedabdelnabyrefae1365 · 3 years ago

    You are a great man actually, you are my mentor now.

  • @fuuman5 · 5 years ago +5

    Uhh, just sitting on the toilet and the notification comes in. Some nice ML quality content from my favorite python buddy

  • @thesue112-v2r · 4 years ago +4

    "How do we do that? , We just do this *Shows the Q Function* , DUH!"
    That cracked me up xD

  • @ernestassimutis6239 · 5 years ago +2

    Nice topic! Hope it will have at least 100 episodes. Thank you!

  • @prathamprasoon2535 · 4 years ago +1

    Yay! Thank you Sentdex for these brilliant tutorials.

    • @temukza · 3 years ago

      Hey, I know you, Twitter guy

  • @viktorkuzmanov3086 · 4 years ago

    Number one AI channel on yt by far

  • @MrDan2512 · 5 years ago

    I'm trying to use DQN to plan an agent's route in a dense moving crowd. My tools are UE4 and TF + CUDA. Can't wait for the deep Q-learning video.

  • @mockingbird3809 · 5 years ago +3

    Wow.....This is Video I Was Waiting For.....Thanks, Harrison.

  • @gunjanmimo · 4 years ago +1

    Your RL videos helped me a lot in my research work. Thank you. Make some videos on the Unity machine learning agents; I hope the audience will benefit from those videos.

  • @aradarbel4579 · 5 years ago +2

    I'm so excited about this new series! Good luck, will be looking for the next episodes :D

  • @vibekdutta6539 · 5 years ago +3

    The thing I've been waiting for, you're awesome!

  • @Phateau · 5 years ago

    Finally, I have been waiting for this. Please do a long series! Thank you

    • @sentdex · 5 years ago +1

      It will be JUST the right length :D

  • @cruelworld4732 · 5 years ago +3

    Please be quick with the next videos, I am working on a project and I am gonna need your help.
    Keep up the good work!

  • @ebimonaca · 3 years ago

    Thank you for the nice deep "Q" learning video

  • @k-bala-vignesh · 5 years ago +1

    Yes! I have the hard copy of Sutton & Barto. Now is the time to open it :)

  • @ruantwice · 3 years ago

    You are an absolute boss. Thank you for the quality content!

  • @harkishansinghbaniya2784 · 5 years ago

    Just love your videos and explanations. I was just waiting for the Q-Learning Tutorial Series.

  • @vulthuryol8051 · 5 years ago +2

    15:34
    _"That's gotta be the best table I've ever seen"_
    _"So it would seem..."_

  • @arshshah1871 · 4 years ago +4

    "paint is love, paint is life" -sentdex 2019

  • @franzweitkamp · 5 years ago +4

    Thank you very much for this. It would be cool to see Q-learning applied to some little game like Connect 4. Keep up the good work!

  • @Mvobrito · 5 years ago +1

    Was waiting for this!

  • @martinprinceton9858 · 3 years ago

    This is really a great explanation. I love this

  • @gamzeetuncay · 4 years ago

    It is so helpful for my thesis, thanks a lot

  • @BarkanUgurlu · 1 year ago +3

    As of December 2023 (Python 3.12):
    import gym
    env = gym.make("MountainCar-v0", render_mode='human')
    new_state, info = env.reset()
    #print(env.observation_space.high)
    #print(env.observation_space.low)
    #print(env.action_space.n)
    done = False
    while not done:
        action = 2
        new_state, reward, done, truncated, info = env.step(action)
        print(new_state)
        if done or truncated:
            new_state, info = env.reset()
    env.close()

  • @thomaswoo6276 · 5 years ago

    Can't wait for the next episode! Great work, and ofc thank you.

    • @sentdex · 5 years ago

      Next one is out :D

    • @thomaswoo6276 · 5 years ago

      @sentdex Watched!!! And love you as always!!!!!!!

  • @JustThomas1 · 5 years ago +1

    I've been waiting for this for quite some time, as I found trying to get into Q-learning from purely reading documentation to be a complete mess.
    Also, on a different note, I can't be the only one who listens to the tutorials while driving just because it's almost relaxing.

    • @sentdex · 5 years ago

      I've heard from quite a few ppl that they just listen to listen lol.

  • @Sporkredfox · 5 years ago

    Oh, this is funny! I am currently going through your python-sc2 tutorial and might be attempting to include Q-Learning once I learn about it (I know what you said about it in the video about why you didn't use it)
    Looking forward to this tutorial! Thank you for the content!

  • @ahmedbenyoucef3238 · 4 months ago

    Thank you very much, amazing work

  • @vigeshmadanan · 5 years ago

    Excited for this tutorial series :D

  • @st00ch · 5 years ago +2

    Omg! RL I'm so excite!

  • @AbhishekKumar-mq1tt · 5 years ago +1

    Thank u for this awesome video and series

  • @ftmftm7627 · 3 years ago

    You are a legend man! Thank you

  • @berkc5323 · 4 years ago

    Amazing channel man, keep doing this!!!

  • @varuntotakura8139 · 5 years ago +2

    I guess you will be showing us many of the real-time examples which have a broad scope. Thank you..! :)

  • @Swiethart7 · 5 years ago +4

    Was really looking forward to you doing more RL stuff :)

  • @szajbon · 5 years ago +11

    Great video! I have one suggestion though: consider mentioning that gym uses numpy arrays and not basic Python lists. It might be confusing for someone that you basically divide a list by a list and get another list; it's the numpy.ndarray implementation that gives you that high-level convenience. I just stumbled on your video, so maybe you pointed that out in some other videos, but hey, for a newcomer it can be mind-bending after getting some weird bug later on.
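[Editor's note] To illustrate the commenter's point: division of two numpy arrays is elementwise, which is what makes the one-line discretization arithmetic work, while plain Python lists don't support `/` at all. The values below are purely illustrative:

```python
import numpy as np

# Elementwise arithmetic works on numpy arrays, not on plain lists.
state = np.array([0.5, 0.03])      # illustrative (position, velocity)
window = np.array([0.09, 0.007])   # illustrative bucket widths

print(state / window)  # divides element by element, returns another array

try:
    [0.5, 0.03] / [0.09, 0.007]   # the same expression with plain lists...
except TypeError as e:
    print("lists:", e)             # ...raises a TypeError
```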

  • @ivmIndi · 5 years ago

    Thanks for the tutorials! (Eagerly waiting for DQN because I am kinda stuck there!)
    Loved your thinking about the education system!

  • @ahmedgabr8009 · 5 years ago

    Thanks for the great tutorial ! Can't wait for the next video !!!!!!!

  • @pujanagarwal7316 · 5 years ago +21

    Can you upload a series on GANs? I really need to know the intuition behind them.

  • @siamakvakili6349 · 5 years ago

    I really enjoy your lessons. Thank you very very much.

  • @RahulSoni-xv4cz · 4 years ago +3

    Is there any good video that doesn't use gym? Searching around leads to gym only.

  • @ВасилийПупкин-ю6п2щ

    Thank you for the useful lessons, sentdex.
    It is very interesting to understand the problem of learning based on time series: when there is some record of a battle, and you need to train the algorithm on it to choose the best action. I would also like to understand how to prepare such time series for passing to the algorithm. Something like that. Have a good day!

  • @CodeWithDerrick · 5 years ago

    Great, well explained intro. Thanks!

  • @lukerhoads · 5 years ago

    Awesome content that is always new to me. Thanks!

  • @ELarivie · 5 years ago

    Sentdex you're the best!

  • @fuadkhan3449 · 3 years ago

    Love your mug

  • @nagLostInEntropy · 4 years ago

    Great video! Thank u so much!

  • @DonaldTrump101-o7d · 4 years ago

    Thanks a lot 😊 , very helpful

  • @girishkumar2759 · 5 years ago

    That's what I was waiting for

  • @AmbarishGK · 5 years ago +6

    "Paint is love, Paint is life"

  • @MultiWolfxxx · 5 years ago

    Love this channel.

  • @Ruddradev · 5 years ago +1

    Thank you for this tutorial. I knew the theory but your tutorial helped me put it to practice. Also for anyone looking for theoretical background into RL, check out David Silver's 10 lecture series on Reinforcement Learning.

  • @iliasp4275 · 3 years ago

    Hello, at 4:50, aren't position and velocity vector values? Therefore shouldn't the state be 2 tuples, 4 values, instead of two ints?
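[Editor's note] For anyone with the same question: in MountainCar the observation is one flat array of two floats, (position, velocity), which the tutorial then discretizes into integer bucket indices for the Q-table. A minimal sketch of that discretization; the bucket count and the exact bounds here are illustrative assumptions, not necessarily the values used in the video:

```python
import numpy as np

# Illustrative bounds for MountainCar's two state variables:
# position in [-1.2, 0.6], velocity in [-0.07, 0.07].
OBS_LOW = np.array([-1.2, -0.07])
OBS_HIGH = np.array([0.6, 0.07])
BUCKETS = np.array([20, 20])  # 20 buckets per dimension (a common choice)

def discretize(state):
    """Map a continuous (position, velocity) pair to integer bucket indices."""
    window = (OBS_HIGH - OBS_LOW) / BUCKETS              # width of one bucket
    idx = ((np.asarray(state) - OBS_LOW) / window).astype(int)
    return tuple(np.clip(idx, 0, BUCKETS - 1))           # stay inside the table

# The raw state is one flat 2-vector, not two separate vectors:
state = [-0.5, 0.0]
print(discretize(state))  # a pair of bucket indices usable as a Q-table key
```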

  • @whothefisyash · 5 months ago +1

    Use this if you are facing trouble:
    import gym
    env = gym.make("MountainCar-v0", render_mode="human")
    observation, info = env.reset()
    done = False
    while not done:
        action = 2
        observation, reward, done, truncated, info = env.step(action)
        if done or truncated:
            observation, info = env.reset()
    env.close()

  • @douglasferreira3506 · 5 years ago

    Finally!! You are the best

  • @dr.mikeybee · 5 years ago

    This is very clear. Thank you.

  • @rdwansrhan3209 · 5 years ago

    Great video, as always.

  • @user-or7ji5hv8y · 5 years ago

    amazing how you know so many things!

  • @mannycalavera121 · 5 years ago

    Love the videos and a series, thanks for putting these out

  • @alazahir · 5 years ago

    I was waiting for this... RL taught by you!! And I have commented even before seeing the video.

  • @jackflynn3097 · 5 years ago +1

    Finally RL is here. I've been stuck with A3C recently. Hope one day you will cover it.

    • @gianistatie207 · 5 years ago +1

      If you are looking for a continuous action space control, then you may also want to look into DDPG. Otherwise Q-learning and variations on the topic may be a good starting point.

  • @wahab487 · 5 years ago

    I think you might be confusing the discount factor with the learning rate. The discount factor can be based on how the reward is distributed across time.
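[Editor's note] For context on the distinction raised here, the standard tabular Q-learning update keeps the two quantities separate: the learning rate (often written alpha) controls how far a new sample moves the old estimate, while the discount factor (gamma) shrinks future reward. A minimal sketch with illustrative default values:

```python
# Tabular Q-learning update; alpha is the learning rate, gamma the discount.
def q_update(q_current, reward, max_future_q, alpha=0.1, gamma=0.95):
    """Return the new Q-value for the (state, action) pair just taken."""
    # gamma discounts the best value reachable from the next state;
    # alpha controls how far we move toward that target.
    target = reward + gamma * max_future_q
    return (1 - alpha) * q_current + alpha * target

# With alpha=0 nothing changes; with alpha=1 we jump straight to the target.
print(q_update(0.0, -1.0, 0.0, alpha=1.0, gamma=0.95))  # -1.0
```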

  • @sachinaugustine9023 · 3 years ago

    This is gold

  • @ntchindagiscard3870 · 5 years ago

    You are awesome, man. I love your channel.

  • @rouhollahabolhasani1853 · 5 years ago +1

    wwwwwwwhat is goin on everybody!!
    love it!

  • @flosset9640 · 5 years ago

    this is super cool

  • @annguyenquy1212 · 8 months ago

    Can you suggest some other videos/channels about RL like this? This is the funniest and easiest-to-understand video I have ever watched.

  • @gautamj7450 · 5 years ago +2

    YEEESSSS!!!!

  • @tejasshah9881 · 5 years ago

    Man, Thank you so much. I love you for this.

  • @Swiethart7 · 5 years ago +4

    Any chance you could do a tutorial on an actor-critic or PPO algorithm after the DQN tutorial? ;) Maybe in the long term a tutorial on combining these algorithms with the Unity environment.

  • @RandomShowerThoughts · 5 years ago

    Man, I haven't seen Q-learning videos at all before this.

  • @hjchew9810 · 5 years ago

    Great job!

  • @stanleychen6710 · 2 years ago +1

    Also, we need pygame installed for gym to show us the actual environment.

  • @bobsamuelson8130 · 5 years ago

    Excellent!