Deep Reinforcement Learning Tutorial for Python in 20 Minutes

  • Published: 2 Oct 2024
  • Worked with supervised learning?
    Maybe you’ve dabbled with unsupervised learning.
    But what about reinforcement learning?
    It can be a little tricky to get set up with RL. You need to manage environments, build your DL models and work out how to save your models so you can reuse them. But that shouldn't stop you!
    Why?
    Because they’re powering the next generation of advancements in IOT environments and even gaming and the use cases for RL are growing by the minute. That being said, getting started doesn’t need to be a pain, you can get up and running in just 20 minutes working with Keras-RL and OpenAI.
    In this video you’ll learn how to:
    1. Create OpenAI Gym environments like CartPole
    2. Build a Deep Learning model for Reinforcement Learning using TensorFlow and Keras
    3. Train a Reinforcement Learning model with Deep Q-learning using Keras-RL (a rough sketch of these steps is included below)
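
    A minimal end-to-end sketch of those three steps, assuming the versions used in the video (gym 0.17.x, TensorFlow 2.3.x, keras-rl2); the exact layer sizes are illustrative:

      import gym
      from tensorflow.keras.models import Sequential
      from tensorflow.keras.layers import Dense, Flatten
      from tensorflow.keras.optimizers import Adam
      from rl.agents import DQNAgent
      from rl.policy import BoltzmannQPolicy
      from rl.memory import SequentialMemory

      # 1. Create the CartPole environment
      env = gym.make('CartPole-v0')
      states = env.observation_space.shape[0]   # 4 state values
      actions = env.action_space.n              # 2 actions: push left / push right

      # 2. Build a small dense network that predicts one Q-value per action
      model = Sequential()
      model.add(Flatten(input_shape=(1, states)))
      model.add(Dense(24, activation='relu'))
      model.add(Dense(24, activation='relu'))
      model.add(Dense(actions, activation='linear'))

      # 3. Wrap the model in a DQN agent, train it, then test and save the weights
      dqn = DQNAgent(model=model, nb_actions=actions, policy=BoltzmannQPolicy(),
                     memory=SequentialMemory(limit=50000, window_length=1),
                     nb_steps_warmup=10, target_model_update=1e-2)
      dqn.compile(Adam(lr=1e-3), metrics=['mae'])
      dqn.fit(env, nb_steps=50000, visualize=False, verbose=1)
      dqn.test(env, nb_episodes=10, visualize=True)
      dqn.save_weights('dqn_weights.h5f', overwrite=True)
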
    Github Repo for the Project: github.com/nic...
    Want to learn more about it all:
    OpenAI Gym: gym.openai.com...
    Keras RL: keras-rl.readt...
    Oh, and don't forget to connect with me!
    LinkedIn: / nicholasrenotte
    Facebook: / nickrenotte
    GitHub: github.com/nic...
    Happy coding!
    Nick
    P.s. Let me know how you go and drop a comment if you need a hand!
    Music by Lakey Inspired
    Chill Day - • LAKEY INSPIRED - Chill...

Comments • 458

  • @coded6799
    @coded6799 3 года назад +34

    For your content, 6.5k subs are too little. I have been scouring the internet for reinforcement learning courses ever since AlphaGo beat the world champion, and today I found your video. And I'm glad I did.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +10

      Yooo, thanks so much! I've got a bunch more RL stuff coming soon!

    • @coded6799
      @coded6799 3 года назад

      @@NicholasRenotte Cool!

    • @freydunthanos3155
      @freydunthanos3155 3 года назад +3

      Seriously, I'm recommending this channel to my data science class

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +2

      @@freydunthanos3155 yesss, thanks so much!

    • @rmt3589
      @rmt3589 3 года назад +1

      A Go fan. Didn't expect to see another person of culture here.

  • @andreapalladino7999
    @andreapalladino7999 Год назад +1

    The best tutorial on how to start with reinforcement learning that I have ever seen!

  • @sagnikroy6405
    @sagnikroy6405 5 месяцев назад

    I watch your videos and feel like you taught us a very important topic like no one did. I do believe this is how no one shouldn't. Better to follow written documentations!!!

  • @prhmma
    @prhmma 3 года назад +54

    nice pace and simple work through, love it man.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +3

      Thanks so much 🙏! Got another run of RL tutorials coming up soon!

  • @raihankhanphotography6041
    @raihankhanphotography6041 3 года назад +2

    I am so glad I stumbled across your channel. Best tutorial ever! THANK YOU!!!

  • @mohammedbasheer581
    @mohammedbasheer581 3 года назад +1

    Thank you Nic! Very helpful of you to make such informative videos for all! Wish you lots of success and joy!

  • @BRUNO12059
    @BRUNO12059 3 года назад +1

    I am from Brazil and your video was very useful for me !!! I hope you to continue to make more videos like that. Great video !!

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Glad you liked it @Bruno! Definitely, got a special one on Reinforcement Learning coming up!

  • @BlinkDrive555
    @BlinkDrive555 2 года назад +2

    In 2022, the code might not work well
    Instead of :
    from tensorflow._api.v1.keras import Sequential
    from tensorflow.keras.layers import Dense, Flatten
    from tensorflow.keras.optimizers import Adam
    you need to use :
    from tensorflow.python.keras import Sequential
    from tensorflow.python.keras.layers import Dense, Flatten
    from tensorflow.python.keras.optimizers import adam_v2

  • @supankanlavanathan463
    @supankanlavanathan463 Год назад +9

    Video Summary (Made with HARPA AI):-
    00:30 🧠 Core concept: "Area 51" summarizes Action, Reward, Environment, and Agent in reinforcement learning.
    01:00 🐍 Python setup: Use OpenAI Gym, TensorFlow, Keras to create and train reinforcement learning models.
    04:52 🏞 Gym environment setup: Import dependencies, set up the environment, and extract states and actions.
    08:27 🧠 Build deep learning model: Construct a model with TensorFlow and Keras.
    13:48 🤖 Train the model: Compile and train with KerasRL, monitor progress.
    15:06 🎯 Test the model: Evaluate performance in the Gym environment.
    17:30 💾 Save and reload weights: Save and reload model weights.
    19:49 🔃 Reuse the model: Rebuild, load weights for further testing or deployment.

  • @BrunoVasco
    @BrunoVasco 3 года назад +2

    Thanks man! Nice pace and objectiveness.

  • @omarsinno2774
    @omarsinno2774 3 года назад +3

    Really nice and simple explanation. Cheers!

  • @_FLOROID_
    @_FLOROID_ 2 года назад +16

    As far as I can tell this tutorial is sadly already outdated, since some of the API has changed and some functions may require different arguments. An updated version of this tutorial would be great!

  • @richard_franks
    @richard_franks 2 года назад +12

    tl;dr if you're watching this in 2022, make sure you pip install gym==0.17.1.
    I'm sure this is due to the age of this video/updated code being released, but I had the following errors in case anyone else comes across this.
    First was - ValueError: too many values to unpack (expected 4) - for the line n_state, reward, done, info = env.step(action). For some reason adding a 5th parameter so it looked like this - n_state, reward, done, info, test = env.step(action) - made it pass.
    Next was ValueError: Error when checking input: expected flatten_input to have shape (1, 4) but got array with shape (1, 2) on line dqn.fit(env, nb_steps=50000, visualize=False, verbose=1).
    I was able to fix this by downgrading to python 3.8, downgrading protobuf to 3.9.2, and explicitly installing the versions of all traces found in the pip install trace of the jupyter notebook. When I changed the gym version to the one found in the video, it allowed env.step(action) to actually take 4 parameters, instead of the 5th I had to add in to make it pass, and the code ran.
    After all that I went back to python 3.10, explicitly installed gym 0.17.0, then installed keras, keras-rl2, and tensorflow, and it worked again.
    Thanks for the video, the issues obviously aren't your fault, just wanted to pass this info off. I learned a ton about pip, library versions, and all kinds of other stuff in this process.

    • @CrossyChainsaw
      @CrossyChainsaw Год назад +1

      This worked for me as well.
      I downgraded protobuf, which downgraded TensorFlow as well.
      After that I upgraded TensorFlow to the correct version and everything worked.
      I think the origin of the problem is not having the correct version of TensorFlow in the first place.

    • @lotus.css_IV
      @lotus.css_IV Месяц назад +1

      TYSM YOU'RE SUCH A G

    • @nahiyanalamgir7056
      @nahiyanalamgir7056 Месяц назад +1

      If you can, try upgrading to gymnasium, a drop-in replacement for gym. gym is no longer maintained.
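
      For anyone following the thread above, a minimal sketch of the same random-action loop on the newer API (late gym releases and gymnasium, suggested here as a drop-in replacement, both return five values from step()):

        import gymnasium as gym   # or `import gym` (0.26+); the API is the same

        env = gym.make('CartPole-v1', render_mode='human')
        obs, info = env.reset()
        done, score = False, 0
        while not done:
            action = env.action_space.sample()                            # random action
            obs, reward, terminated, truncated, info = env.step(action)   # five return values
            done = terminated or truncated
            score += reward
        print('Score:', score)
        env.close()
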

  • @Officialnorio
    @Officialnorio 3 года назад +2

    Hey there!
    I am having the same issue with *'Sequential' object has no attribute '_compile_time_distribution_strategy'*, but in my case *del model* doesn't help at all. If I try to delete it before *model = build_model(states, actions)* I get an error about referring to a variable before declaring it (which makes total sense to me xD).
    Any ideas how to fix this? :)
    btw. this video is amazing! Keep up the good work :)

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Heya @Norio! Try deleting it then running the cell that creates the model again. I show it here: ruclips.net/video/hCeJeq8U0lo/видео.html

    • @Officialnorio
      @Officialnorio 3 года назад +1

      @@NicholasRenotte didn't work for me. But thanks for your help :/
      I now use Tensorforce and don't have any problems :D

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      @@Officialnorio awesome work! What did you think of Tensorforce, I checked it out earlier on but switched to stable baselines a little later on!

    • @Officialnorio
      @Officialnorio 3 года назад +1

      @@NicholasRenotte Sometimes my code threw some weird output but changing the agent-type fixed it.
      Tensorforce is pretty easy to use and does its job (so far) pretty well. I am using Tensorforce for my bachelor thesis about MTSP solutions :D

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      @@Officialnorio awesome, will need to give it a second chance!

  • @whataday3910
    @whataday3910 3 года назад +5

    Hey! Thanks for the video. I would love to see how I can solve a problem with my own environment. Or how to build a specific environment and an agent with specific actions. I am at the moment not familiar with OpenAI but I think it would be interesting to see something more custom. :)

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      Heya @WhataDay, check this out: ruclips.net/video/bD6V3rcr_54/видео.html

  • @muditrustagi5775
    @muditrustagi5775 3 года назад +2

    great job man
    love from India !

  • @oleksiy2090
    @oleksiy2090 3 года назад +7

    I do not know what to say. I think the closest words that can describe my feelings right now are "wow, that was amazing and very, very simple, even I understood what is going on there". Going to play with the code and try to solve more problems. I wish I had found your channel earlier. 👍🏻

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +3

      Thanks so much @Alex, you've found it now 😊! I've got way more reinforcement learning and game AI coming in the coming weeks!

  • @文泽宇-x6p
    @文泽宇-x6p 2 года назад +5

    Hi Nick. Thanks for your tutorial, it really helps me a lot. However, I am getting an error saying: "ValueError: Error when checking input: expected flatten_input to have shape (1, 4) but got array with shape (1, 2)". So I am wondering why this error didn't happen in your case.

    • @vietle6099
      @vietle6099 2 года назад

      I'm having the same issue

    • @evolutionXXVII
      @evolutionXXVII Год назад

      Did you ever find a solution to this issue? I'm having the same problem.

    • @GeraLdario
      @GeraLdario Год назад

      Install the package 'rl-agents==0.1.1'. It works for me.

    • @sebastianrada4107
      @sebastianrada4107 8 месяцев назад

      @@GeraLdario It worked!

  • @MK-ol9gv
    @MK-ol9gv 3 года назад +8

    I usually don't write comments on RUclips videos but wow! I've watched some of your videos and they are extremely helpful.
    The number of views on this video and subscribers on your channel is way too low. Thanks for the great content, and I hope you keep making good videos like this one!

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      Thanks so much for your kind words @M K! Truly appreciate it!

  • @xinanwang9379
    @xinanwang9379 2 года назад +6

    Hi Nick,
    Thanks for your tutorial, it really helped me kick off the field of RL.
    There is an issue with the keras-rl2 package you used, specifically the NAFAgent, which fails all the time even with the example given in the official repo. Could you please spare some time to take a look at it? Many thanks, and I wish your channel keeps getting better and better!
    Best, Tony

    • @neelkanthbhavnagarwala6001
      @neelkanthbhavnagarwala6001 Год назад +2

      When I try to run "dqn.fit(env,nb_steps...)" command I am getting ValueError : Error when checking input : expected flatten_2_input to have shape (1,4) but got array with (1,2)
      can you please help me out??

  • @oneredpanda9933
    @oneredpanda9933 Год назад +1

    I keep running into errors because I don't have the right things downloaded. I've been trying to fix it for about an hour now and I can't figure it out! If anyone has done this more recently than 2020 and would be willing to help me, I would greatly appreciate it. Thanks so much!

  • @dineshkrishnasamy1628
    @dineshkrishnasamy1628 Год назад

    Nice content. We're waiting for ML trader series... thank you

  • @navketan1965
    @navketan1965 10 месяцев назад

    SEEKING YOUR WISDOM, SIR, I have 2 choices--1) buy one given forex pair every new day at the open at market price wth take profit(TP) 50 pips and stop loss(SL)50 pips on one ticket as one order 2) second choice is to buy same pair but order is placed as pending order at the open of every new day--buy the same forex pair 100 pips below the open price as pending order with take profit of 50 pips and stop loss of 50 pips all on the same ticket.And after one year of every day trading which strategy is more likely to make any money?And what answer SUPERCOMPUTER would give to my question?

  • @bogdanoleinikov8002
    @bogdanoleinikov8002 3 года назад +8

    Thanks for explaining the code, I saw this example online already but with the step by step explanation of this scenario it was much better for learning while running the code alongside the video :)

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Heya @Bogdan, thanks so much! I'm building up to more sophisticated examples of RL. I'll be doing a lot more with different environments in the coming months!

  • @NingAaron
    @NingAaron Год назад +2

    Your video is amazing! You make learning RL fun! However, I have some questions, maybe some rookie-type questions, about the best strategy in reinforcement learning. Can I extract just one part of it, for example a vehicle turning right at an intersection and then turning left as its best path? Can I extract only that one path among the many paths? Or is it possible to convert the results of RL into text? Does the RL training log include the actions selected during training? Please take the time to take a look, thank you very much!

    • @salmankhalildurrani
      @salmankhalildurrani Год назад

      Can you please guide me to solve the problem I am getting while working on this example:
      TypeError:
      _set_agent() missing 1 required positional argument: 'agent'

  • @khalidbinhida
    @khalidbinhida 2 года назад

    Excellent sir!

  • @MaximeAntoine97
    @MaximeAntoine97 4 года назад +4

    Awesome video! I just started my master in AI and seeing your videos helps a lot to remember a couple “key” things before the start of the semester!
    I also just started a YT channel, if you’re down we could maybe see how we could create something together, might be fun!
    Have a good day 👋🏼

    • @NicholasRenotte
      @NicholasRenotte  4 года назад

      Hey, thanks so much @Maxime, glad you enjoyed the video!

  • @manishalifestyle7863
    @manishalifestyle7863 3 года назад +1

    Sir, can you please help me with MountainCar-v0 and FrozenLake as well, because these don't have the same properties as CartPole.

  • @GeraLdario
    @GeraLdario Год назад +1

    If you get "ValueError: Error when checking input: expected flatten_input to have shape (1, 4) but got array with shape (1, 2)", Install the package 'rl-agents==0.1.1'. It works for me.

  • @rezagolipour9821
    @rezagolipour9821 2 года назад +2

    Hi, thank you for the video.
    My question is:
    is there any specific reason you installed TensorFlow 2.3.0?
    Can version 2.9.0 work without errors?

    • @jakes-dev1337
      @jakes-dev1337 9 месяцев назад

      Did ya try it? Try things.

  • @RaviKumar-ub2ng
    @RaviKumar-ub2ng 3 года назад +4

    Hi, what an amazing video! You are a great teacher and you make learning RL fun! However, I have some questions, and they might be rookie-type questions because I am not that experienced with Python. You said that we can reload the trained model, but how can I do it in VS Code? Create a new Python file and import the one we created? And also, when I run " _ = dqn.test(env, nb_episodes=15, visualize=True)" and want to change the number of episodes (just for testing), it has to go through the whole process all over again, but in your case it just used the rewards already generated and printed them right away. These questions might be so easy that maybe someone in the comments can provide an answer. Thanks :)

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Should be able to reload the weights by running this when you open it up again: dqn.load_weights('dqn_weights.h5f')
      Then to change the number of episodes just change the number set to nb_episodes,
      e.g.
      For 30 episodes run this: _ = dqn.test(env, nb_episodes=30, visualize=True)
      For 40 episodes run this: _ = dqn.test(env, nb_episodes=40, visualize=True)
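
      Spelled out end to end, a minimal sketch of the reload flow, assuming the build_model/build_agent helpers and the dqn_weights.h5f file saved earlier in the video:

        import gym
        from tensorflow.keras.optimizers import Adam

        env = gym.make('CartPole-v0')
        states = env.observation_space.shape[0]
        actions = env.action_space.n

        model = build_model(states, actions)     # same architecture as during training
        dqn = build_agent(model, actions)        # same agent setup as during training
        dqn.compile(Adam(lr=1e-3), metrics=['mae'])
        dqn.load_weights('dqn_weights.h5f')      # restore the trained weights
        _ = dqn.test(env, nb_episodes=30, visualize=True)
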

  • @raencarve4
    @raencarve4 3 года назад +1

    Awesome job. Thanks

  • @varunsharma7706
    @varunsharma7706 3 года назад

    It brings tears to my eyes😂😂
    Awesome

  • @abulfahadsohail466
    @abulfahadsohail466 2 года назад +1

    ERROR: Could not find a version that satisfies the requirement tensorflow (from versions: none)
    ERROR: No matching distribution found for tensorflow
    This error shows up while installing.

  • @evolutionXXVII
    @evolutionXXVII Год назад +1

    FYI the pygame package needs to be installed for env.render() to work. Took me a little while to figure that one out.

    • @mmc3790
      @mmc3790 Год назад

      thank you!!!

  • @randomizer272
    @randomizer272 3 года назад +3

    Hello, Thanks for the great tutorial step by step video. Quick question. When I run dqn.fit(env, nb_steps = 50000, visualize = False, verbose = 1), I get this error: "'Sequential' object has no attribute '_compile_time_distribution_strategy'". How do I overcome this? and why did this happen?
    Thanks again

    • @randomizer272
      @randomizer272 3 года назад +3

      I checked your other video. Deleting the model and reloading the kernel works. This comment is for anyone with the same issue.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +2

      Awesome work @Sriram, yep that's the solution the majority of the time!

    • @ts717
      @ts717 3 года назад

      @@randomizer272 thank you for your answer, i have the same problem. But my knowledge is still quite limited so i don't know how i delete my model and reload the kernel. Would be nice if you could explain it a little bit more. Thanks in advance!

    • @randomizer272
      @randomizer272 3 года назад +2

      @@ts717 You can just do a new line
      del model
      and create the model again. It worked for me fine. I will attach the video in which he explained about this error.
      ruclips.net/video/hCeJeq8U0lo/видео.html

    • @rodolpheredoute809
      @rodolpheredoute809 3 года назад +3

      in the section "def build_model" you have to change the line :
      model = Sequential()
      into : model = tensorflow.keras.models.Sequential()
      i checked and it seems that's because python can misinterpret it with keras, and not tensorflow's keras (but i have no clue why)
      this worked for me

  • @bwbs7410
    @bwbs7410 Год назад +3

    gym gui wont render 🙄

    • @ibrahimghaddar7877
      @ibrahimghaddar7877 Месяц назад

      try this "env = gym.make("FrozenLake-v1",render_mode="human")"
      substitute frozenlake by you env or any

  • @Mokaigo
    @Mokaigo Год назад +1

    nice try David Goggins ;)
    Thx Alot !

  • @saurrav3801
    @saurrav3801 3 года назад +1

    Bro can you make a detailed video and explanation of making chatbot using reinforcement learning

  • @Rose-ro7wz
    @Rose-ro7wz 3 года назад +1

    Thank you for the video, would you please make a video about DDPG?

  • @nanoluisi
    @nanoluisi 3 года назад +5

    AttributeError: 'Sequential' object has no attribute '_compile_time_distribution_strategy'
    when i try dqn.compile, any idea?
    i tried copying the code itself but the error continues.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Heya @nn aa, definitely can help out!! Quick one, what version of tensorflow are you using? and how are you importing keras/tf.keras?

    • @hninpannphyu8567
      @hninpannphyu8567 3 года назад +1

      @@NicholasRenotte First of all, thanks a lot for your great tutorial videos. i have got the exact same error as @nn aa. I am importing as below. my TensorFlow version is 2.3.1. Could you please take a look into it? Thanks.

    • @hninpannphyu8567
      @hninpannphyu8567 3 года назад +1

      It would be great if you created more RL tutorials with custom environments rather than using OpenAI Gym.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +5

      @@hninpannphyu8567 anytime! Can you try dropping the tensorflow. from your imports like so:
      OLD CODE:
      from tensorflow.keras.models import Sequential
      from tensorflow.keras.layers import Dense, Flatten
      from tensorflow.keras.optimizers import Adam
      TEST CODE:
      from keras.models import Sequential
      from keras.layers import Dense, Flatten
      from keras.optimizers import Adam

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      @@hninpannphyu8567 I'm actually planning some more RL stuff soon. Anything in particular you'd like to see?

  • @islam6916
    @islam6916 3 года назад +4

    Thank you so much ❤
    searched a lot for that kind of video and finally found a good one 👏

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +2

      Thanks sooo much! There's some more reinforcement learning stuff coming this week, hopefully a video on Atari and (assuming my GPU doesn't catch fire) one on CARLA!

    • @islam6916
      @islam6916 3 года назад +1

      @@NicholasRenotte looking forward to seeing that

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      @@islam6916 awesome stuff!!

  • @InteliDey
    @InteliDey 2 года назад +2

    Hi Nicholas, why did you use "linear" as the activation function in your last layer instead of "softmax"? How would it differ if I choose "softmax" as activation function instead of linear for this case? Will it be possible to mention this, please? Or may be make a video on it? (When to choose linear and softmax activation function for what type of target cases)

    • @xnyu254
      @xnyu254 2 года назад +1

      Softmax is great for classification, but the experiment shown in the video is more of a regression problem. In this case, it makes more sense to use linear. That doesn't mean you can't use softmax, but your DQN will most likely not work as you would expect.

    • @70ME3E
      @70ME3E 3 месяца назад

      @@xnyu254 here you have two states as the output (either go left or go right). It _is_ a classification problem, and not a regression one at all.
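
      For context on the exchange above, a rough sketch of why the tutorial's DQN head uses a linear output (assuming the two-action CartPole setup from the video): the network estimates one unbounded Q-value per action, whereas a softmax head would squash the outputs into probabilities, which is what policy-based methods produce.

        from tensorflow.keras.models import Sequential
        from tensorflow.keras.layers import Dense, Flatten

        actions = 2   # CartPole: push left / push right
        model = Sequential([Flatten(input_shape=(1, 4)), Dense(24, activation='relu')])

        # value-based DQN head: one unbounded Q-value estimate per action
        model.add(Dense(actions, activation='linear'))      # e.g. outputs [31.2, 28.7]

        # a softmax head would instead output action probabilities summing to 1,
        # which is what policy-based methods produce:
        # model.add(Dense(actions, activation='softmax'))   # e.g. outputs [0.92, 0.08]
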

  • @adrian46647
    @adrian46647 8 месяцев назад

    Had issues with rendering on an M2 chip. These pip installs helped:
    !pip install tensorflow==2.13.0
    !pip install gym==0.25.2
    !pip install numpy==1.24.4
    !pip install keras==2.10.0
    !pip install keras-rl2==1.0.5
    !pip uninstall protobuf -y & pip install protobuf==3.20.0
    !pip install "gym[classic-control]"
    !pip install tflearn
    !pip install ipywidgets
    !pip install matplotlib pyglet
    !pip install pygame
    !pip install numpy --upgrade

  • @abdelmalekdjamaa7691
    @abdelmalekdjamaa7691 3 года назад +3

    Hi 👋
    Can you make a Q learning agent with just Keras and Tensorflow ?
    Creating the agent seems more interesting ⚡

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +4

      Definitely, I've got the code 80% of the way there! Should be out in the coming weeks!

  • @imedcherif8134
    @imedcherif8134 3 года назад +1

    Thank you for nice videos

  • @sarnathk1946
    @sarnathk1946 Год назад

    Thanks , awesome 🙏🙏

  • @krvignesh6323
    @krvignesh6323 2 года назад +1

    Great tutorial Nic. I was trying to implement this and encountered an error when I run the line "dqn.compile(Adam(lr=1e-3), metrics=['mae'])".
    Error: 'Sequential' object has no attribute '_compile_time_distribution_strategy'.
    Can someone help me resolve this?

  • @paulburnett1963
    @paulburnett1963 Год назад +1

    Sweet.. love the explanation.. That was a lot to take in but what a clean explanation...
    Thanks for the video.
    paul.

  • @Bobstrer
    @Bobstrer 3 года назад +2

    Hi Nicholas, Thank you so much for the great content! I'm running into an error "AttributeError: 'Sequential' object has no attribute '_compile_time_distribution_strategy'" I couldn't really find anything online to help me solve it, do you have any idea where this is from? thank you!

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +4

      Heya @Olivier, try running del model, then rerunning the cell that creates your model.

    • @adrianchervinchuk5632
      @adrianchervinchuk5632 3 года назад +2

      @@NicholasRenotte It worked for me, but why does it act in such a strange way?

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      @@adrianchervinchuk5632 I think there is a conflict between TensorFlow and Keras. Seems to happen pretty frequently.

  • @ApexArtistX
    @ApexArtistX 3 года назад +1

    Searching for a web browser AI gaming bot with OpenCV image recognition, but everyone is using the same old Python game environments, which aren't practical for real-life use cases.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Heya, yeah true, this is sort of the beginnings of game AI using OpenCV. I'm planning stuff for next year that uses custom environments that can eventually be rolled out to larger scale games! These guys did it for Starcraft: github.com/deepmind/pysc2

  • @01bit
    @01bit 3 года назад +1

    this is perfect!!!

  • @Andy-rq6rq
    @Andy-rq6rq 3 года назад +3

    great tutorial! keep making more

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Thanks so much @Andy, definitely plenty more coming!

  • @jumiduss
    @jumiduss 2 года назад +2

    Commenting for the algorithm. Started looking into deep learning recently and eventually got here, great intro and explanations. Looking forward to the other videos

  • @sangeeth77
    @sangeeth77 2 года назад +2

    ValueError: too many values to unpack (expected 4) can you please help

    • @joaobentes8391
      @joaobentes8391 2 года назад

      I have the same exact error, can someone please help! Thanks.

    • @hocgh
      @hocgh Год назад

      You can unpack only the first 4 values from the returned tuple and ignore the rest using a star: n_state, reward, done, info, *_ = env.step(action)

    • @joaobentes8391
      @joaobentes8391 Год назад

      @@hocgh i already got it thanks a lot!

  • @powerHungryMOSFET
    @powerHungryMOSFET 4 месяца назад

    Which udemy courses you would suggget to learn DRL?

  • @niomartinez
    @niomartinez 5 месяцев назад

    By any chance, with the recent AI advancements over the past year, is this still relevant, or are there newer, much easier ways now? (I mean, this is easy to follow through your video, but there might be new technologies now that let us do this better?)

  • @btkb1427
    @btkb1427 3 года назад +4

    Hey thanks for the vid! it's great, I get the error AttributeError: 'Sequential' object has no attribute '_compile_time_distribution_strategy' when trying dqn.compile(...) I have the same version of tensorflow as you

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +2

      Heya @Bartlomiej, delete your model (i.e. del model) and then rerun the code and it should clear it up!

    • @vagnermartin4356
      @vagnermartin4356 3 года назад

      @@NicholasRenotte I also have this same problem. But when you say to delete your model, do you mean adding the code "env.close()"? Because that didn't work.

    • @dralbertus
      @dralbertus 3 года назад

      Hello! I still don't know why, but I solved this issue by writing `del model` in a cell below `def build_agent(model, actions):` and then `model = build_model(states, actions)`. Regards!!

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      @@vagnermartin4356 nope, use 'del model'; here's an example where I do it: ruclips.net/video/hCeJeq8U0lo/видео.html I think there is a conflict between Keras and tf.keras versions perhaps, but this seems to resolve the error.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Thanks @@dralbertus!

  • @tommclean9208
    @tommclean9208 3 года назад +1

    If anyone had the same issue as me, with keras-rl saying that the model has no attribute __len__, I just modified the model code to:
    def build_model(states, actions):
        model = Sequential()
        model.add(Flatten(input_shape=(1, states)))
        model.add(Dense(23, activation='relu'))
        model.add(Dense(23, activation='relu'))
        model.add(Dense(actions, activation='linear'))
        model.__len__ = actions
        return model
    and it worked (notice the additional line model.__len__ = actions).
    Probably not the best practice, but it worked without having to downgrade TensorFlow.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Thanks so much for helping out the fam @Tom!

  • @goktugozleyen9766
    @goktugozleyen9766 3 года назад +1

    Super video, but I have a question. First I tried it myself and got an error, then I copied your code and pasted it, but I still got an error.
    Error: FailedPreconditionError: Could not find variable dense_5/kernel. This could mean that the variable has been deleted. In TF1, it can also mean the variable is uninitialized. Debug info: container=localhost, status=Not found: Resource localhost/dense_5/kernel/N10tensorflow3VarE does not exist.

    • @goktugozleyen9766
      @goktugozleyen9766 3 года назад +1

      I solved it. Use tensorflow.keras instead of keras

  • @vitotonello261
    @vitotonello261 3 года назад +1

    AttributeError: 'Sequential' object has no attribute '_compile_time_distribution_strategy' ???
    but great tutorial !!!

    • @vitotonello261
      @vitotonello261 3 года назад +1

      I used virtualenv (Python 3.8.2) with all installed dependencies on my local iMac, but same error occurs on the Google Colab notebook.
      On stackoverflow some say there's a difference between Keras and tf.keras - but I'm not experienced enough with TensorFlow to fix the problem.
      Error in line:
      dqn.compile(Adam(lr=1e-3), metrics=['mae'])

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      Oh heya @Vito, you can solve this by deleting your model and rerunning your code. I show how to do this here: ruclips.net/video/hCeJeq8U0lo/видео.html I think there is a conflict between Keras and tf.keras versions perhaps, but this seems to resolve the error!

    • @vitotonello261
      @vitotonello261 3 года назад +1

      @@NicholasRenotte Thank you very much Nicholas. To me your channel is a real enrichment.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      Anytime@@vitotonello261, glad you're enjoying it!!

    • @ashkanforootan
      @ashkanforootan Год назад

      Run the `import rl` lines before the Sequential line.

  • @ASDFAHED1985
    @ASDFAHED1985 3 года назад +1

    Thanks, it is a great video.

  • @McRookworst
    @McRookworst 3 года назад +9

    Great video! Got it up and running in no time. One question though: what exactly does the value of 4 from env.observation_space.shape[0] represent? Isn't the state supposed to be a pixel vector? Or is this some kind of abstraction OpenAI makes?

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +11

      Heya @McRookworst, for CartPole we don't use a pixel vector (that's more common in the Atari envs). In CartPole the four values in the observation space are: [position of cart, velocity of cart, angle of pole, rotation rate of pole].
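
      A quick way to see this for yourself (assuming the gym version from the video, where reset() returns just the observation array):

        import gym

        env = gym.make('CartPole-v0')
        print(env.observation_space.shape)   # (4,)
        print(env.reset())                   # e.g. [ 0.02 -0.01  0.03  0.04]
        # -> [cart position, cart velocity, pole angle, pole angular velocity]
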

  • @frankgiardina205
    @frankgiardina205 3 года назад +1

    I tried del model but I am getting an error in step 3: Keras symbolic inputs/outputs do not implement '__len__'.
    I researched on Stack Overflow and the answer was to downgrade to TF 1.14; I don't want to do that. Any help greatly appreciated, thanks.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Heya can you try 2.3.1, that's the version I'm using in the video!

  • @pranavraut3998
    @pranavraut3998 Год назад +1

    Hello sir, your video is really helpful, but when I try to run the code, at the time of pip install tensorflow it shows the error: ERROR: Could not find a version that satisfies the requirement tensorflow (from versions: none)
    ERROR: No matching distribution found for tensorflow
    Please help.

    • @juliangiles8130
      @juliangiles8130 9 месяцев назад

      Try using an older version of Python; it worked for me.

  • @BilalKhan-sx9eu
    @BilalKhan-sx9eu 3 года назад

    Best crash course ever :D

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      Thanks sooo much @Bilal, check this out as well :) ruclips.net/video/Mut_u40Sqz4/видео.html

  • @MrDaFuxae
    @MrDaFuxae Год назад +2

    Hi Nicholas, you did a great job there, thanks for sharing your knowledge! I would like to mention that in my case I had a problem running the code, because I got a value error on the line "n_state, reward, done, info = env.step(action)". Adding a fifth value "observation" on the left side (so that it looks like "n_state, reward, done, info, observation = env.step(action)") got the code up and running :-)
    Nevertheless your videos are really helpful, please keep going! You're doing an amazing job!

    • @FrancescoPalazzo26
      @FrancescoPalazzo26 Год назад

      dude I can't import the agents and policies, basically keras doesn't have rl.policy or even rl.agents, what should I do?

    • @Nobske
      @Nobske Год назад +2

      # new version with terminated and truncated
      episodes = 10
      for episode in range(1, episodes+1):
          state = env.reset()        # initial state for each episode
          terminated = False
          score = 0
          while not terminated:
              env.render()                     # render the CartPole
              action = random.choice([0, 1])   # 0, 1: left or right
              observation, reward, terminated, truncated, info = env.step(action)
              score += reward                  # based on our step we get a reward till it's done
          print('Episode:{} Score:{}'.format(episode, score))
      Docs
      observation (object) - this will be an element of the environment’s observation_space. This may, for instance, be a numpy array containing the positions and velocities of certain objects.
      reward (float) - The amount of reward returned as a result of taking the action.
      terminated (bool) - whether a terminal state (as defined under the MDP of the task) is reached. In this case further step() calls could return undefined results.
      truncated (bool) - whether a truncation condition outside the scope of the MDP is satisfied. Typically a timelimit, but could also be used to indicate agent physically going out of bounds. Can be used to end the episode prematurely before a terminal state is reached.
      info (dictionary) - info contains auxiliary diagnostic information (helpful for debugging, learning, and logging). This might, for instance, contain: metrics that describe the agent’s performance state, variables that are hidden from observations, or individual reward terms that are combined to produce the total reward. It also can contain information that distinguishes truncation and termination, however this is deprecated in favour of returning two booleans, and will be removed in a future version.

    • @MrDaFuxae
      @MrDaFuxae Год назад

      @@FrancescoPalazzo26 I had the same problem. In my case I could solve it by importing the modules from a different path, which is 'tensorflow.python', so the import commands look like 'from tensorflow.python.keras.models import Sequential'.
      Hope this solves your problem!

    • @ashkanforootan
      @ashkanforootan Год назад

      episodes = 10
      for episode in range(1, episodes+1):
          state = env.reset()
          terminated = False
          score = 0
          while not terminated:
              env.render()
              action = random.choice([0, 1])
              n_state, reward, terminated, truncated, info = env.step(action)
              score += reward
          print('Episode:{} Score:{}'.format(episode, score))
      Run these for new versions.

  • @frankgiardina205
    @frankgiardina205 3 года назад +1

    Nicholas, the !pip list output from Google Colab is too long to paste in the comments; are there any modules I should zero in on? I reset back to TF since the downgrade to 2.3.1 threw the __len__ error. Thanks again.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Heya @Frank, qq, what was the error you were receiving again. Lost the original chain. Also, just a heads up it'll be a bit of a pain to visualise the environment in Colab.

  • @zahrafathollahi1968
    @zahrafathollahi1968 3 года назад +1

    Why is the number of params in each layer more than (units of layer n-1 multiplied by units of layer n)? Doesn't each unit have just one weight per input?

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      As in, why are there more parameters than neurons? You'll have the weights per input feature plus the bias value included in each neuron.
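
      A worked example (assuming the 4-value CartPole state and the 24-unit Dense layers used in the video):

        from tensorflow.keras.models import Sequential
        from tensorflow.keras.layers import Dense, Flatten

        model = Sequential([
            Flatten(input_shape=(1, 4)),    # 4 CartPole state values, 0 params
            Dense(24, activation='relu'),   # 4*24 weights + 24 biases = 120 params
            Dense(24, activation='relu'),   # 24*24 weights + 24 biases = 600 params
            Dense(2, activation='linear'),  # 24*2 weights + 2 biases = 50 params
        ])
        model.summary()   # prints exactly these per-layer parameter counts
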

  • @ahmedwaly9073
    @ahmedwaly9073 3 года назад +2

    Wooow this is an awesome tutorial

  • @Someone-iq6lx
    @Someone-iq6lx Год назад +1

    Question: what is le-2 and le-3 in the part where you build the dqn: target_model_update=le-2???

    • @fardouk
      @fardouk 4 месяца назад

      So sad i got an error "is not defined" and i see that you don't have any answer ...
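
      In case it helps anyone else in this thread: that character is the digit one, not the letter L. A quick check, with the roles the video gives these values noted as comments:

        assert 1e-2 == 0.01 and 1e-3 == 0.001   # Python scientific notation
        # target_model_update=1e-2  -> how quickly the target network tracks the online network
        # Adam(lr=1e-3)             -> the optimizer's learning rate
        # typing the letter L instead (le-2) raises: NameError: name 'le' is not defined
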

  • @salahhayajneh8059
    @salahhayajneh8059 7 месяцев назад

    Salam >>> Peace be upon you >> I want to learn RL.

  • @pranavprasad5661
    @pranavprasad5661 3 года назад +2

    @Nicholas Renotte This is such a well-explained video! Thanks for making it, I was looking for something exactly like this. I wanted to know whether you can make a video on custom environments using different types of observation_spaces and action_spaces (Discrete, Box, Dict, MultiDiscrete). I am trying this for a problem and I'm struggling a bit to understand how to use Dict and MultiDiscrete, most examples use Box and Discrete.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Thanks @Pranav, definitely will do! Got it on the list!

  • @EliteCubingAlliance
    @EliteCubingAlliance Год назад

    Im just tryna find out where the 51 comes from in Area 51

  • @mingyucai6559
    @mingyucai6559 3 года назад +1

    Thanks for the greate video. When I run the code 'dqn.fit(env,nb_steps=50000, visualize=False, verbose=1)', I got the error "get_recent_state() missing 1 required positional argument: 'current_observation'"

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Heya @Mingyu, what do you get if you print out env?

  • @Tprakh-iw6qt
    @Tprakh-iw6qt 8 месяцев назад

    Very helpful but could not understand how to visualize the Cart Pole animation. Please let me know how to visualize it

  • @alishafii9141
    @alishafii9141 2 месяца назад

    I like you, your video and your teaching. Keep it up.

  • @najibnajib772
    @najibnajib772 3 года назад +1

    epic

  • @tzhern
    @tzhern 3 года назад +2

    short and clear! thanks a lot!

  • @Skull-Waste-Gaming
    @Skull-Waste-Gaming 4 месяца назад

    wtf am i suposed to code this on i have zero idea

  • @jaydevsinhzankat8872
    @jaydevsinhzankat8872 3 года назад +1

    I am having an error in the training part:
    """AttributeError: 'Sequential' object has no attribute '_compile_time_distribution_strategy'"""

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      Heya, try deleting the model and then rerunning the cell e.g. del model then rerun the model creation cell.

    • @jaydevsinhzankat8872
      @jaydevsinhzankat8872 3 года назад

      @@NicholasRenotte It resolved itself after some time, but thanks for the reply. 🤗🤗

  • @MaximePerrain
    @MaximePerrain 3 года назад +1

    Hi Nicholas, thanks for all your great videos.
    I have a problem with this line of code:
    model = build_model(states, actions)
    I get "only integer scalar arrays can be converted to a scalar index"
    and
    "Error converting shape to a TensorShape: only integer scalar arrays can be converted to a scalar index."
    Do you have an idea what the issue could be?

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      If you sample your states what does it look like? Are they non-integer values?

  • @fabrizioantonazzo3113
    @fabrizioantonazzo3113 3 года назад +1

    Well done, a simple explanation.

  • @goktugozleyen9766
    @goktugozleyen9766 3 года назад +1

    Greetings. I have a question again. How can i contact you ?

  • @Patiencelad
    @Patiencelad 3 года назад +2

    Great video. Thanks for explaining everything with the step by step. Excellent Job!

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      Anytime! A heap more rl videos coming, it's going to be a big focus this year!

  • @kiyoutakaayanokouji3701
    @kiyoutakaayanokouji3701 2 месяца назад

    bro is saving me

  • @sanjubasnayake7429
    @sanjubasnayake7429 3 года назад +1

    I got this error when compiling the model:
    AttributeError: 'Sequential' object has no attribute '_compile_time_distribution_strategy'
    Please help me!
    Thank you

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      Heya @Sanju, try deleting the model and then rerunning the cell e.g. del model then rerun the model creation cell.

    • @sanjubasnayake7429
      @sanjubasnayake7429 3 года назад

      Thank you, it worked.

  • @frankgiardina1360
    @frankgiardina1360 3 года назад +1

    Thanks Nicholas i use colab and will try TF 2.3.1

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Awesome @Frank, let me know how you go!

    • @hardikkamboj3528
      @hardikkamboj3528 3 года назад +1

      Hey Frank, I am also using Colab but facing a lot of difficulties in rendering gym. can you please help me with that @Frank

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      @@hardikkamboj3528 hmmm, I normally avoid colab for that reason. Rendering anything outside of the notebook is problematic. You can try training on Colab then rendering on your local machine.

    • @hardikkamboj3528
      @hardikkamboj3528 3 года назад

      @@NicholasRenotte thanks a lot mate. I have started reinforcement learning, following your videos. Awesome work mate, really appreciate it

    • @hardikkamboj3528
      @hardikkamboj3528 3 года назад +1

      @@NicholasRenotte I will try it

  • @professorparadox9826
    @professorparadox9826 3 года назад +2

    Which Python version do you use?

  • @sdtcuce
    @sdtcuce 3 года назад +1

    Wow! Such a wonderful lesson with practical examples. I loved it. I want to learn more about self-controlled action mechanisms for multivariate industrial control using RL. Kindly shed some light on it.

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Nice, got more RL stuff coming in the coming weeks @Suvankar!

  • @siddharthmanumusic
    @siddharthmanumusic 4 месяца назад

    Thank you!!! You rock! Such a well made video! Short and fully informative.

  • @yalcinimeryuz5414
    @yalcinimeryuz5414 2 года назад +1

    Great video! However, I am getting an error saying "TypeError: only integer scalar arrays can be converted to a scalar index" on the line "model.add(Flatten(input_shape=(1, states)))". How do I solve this?

    • @thiagobastani6663
      @thiagobastani6663 Год назад

      The variable states is probably not an integer scalar array. Are you testing the same environment?

  • @jorisbonson386
    @jorisbonson386 Год назад

    What's the point of a 'tutorial' where you explain almost NOTHING in detail of what you're doing?? Think you've missed the point buddy...

  • @fatemehkiaie7612
    @fatemehkiaie7612 3 года назад +1

    Hi. Great video. I am wondering if it is possible to create a custom environment?

    • @NicholasRenotte
      @NicholasRenotte  3 года назад +1

      Sure is! Check this out @Fatemeh: ruclips.net/video/bD6V3rcr_54/видео.html

  • @jorgepacheco3407
    @jorgepacheco3407 3 года назад +1

    I have a little problem running this. I tried it in Jupyter the same as you, but when I clicked run an error appeared saying "GL NOT FOUND", and it happened running the first step of this video. Help please :c

    • @NicholasRenotte
      @NicholasRenotte  3 года назад

      Heya @Jorge, was there a larger error you can share?

    • @jorgepacheco3407
      @jorgepacheco3407 3 года назад

      @@NicholasRenotte ImportError Traceback (most recent call last)
      /srv/conda/envs/notebook/lib/python3.6/site-packages/gym/envs/classic_control/rendering.py in
      24 try:
      ---> 25 from pyglet.gl import *
      26 except ImportError as e:
      /srv/conda/envs/notebook/lib/python3.6/site-packages/pyglet/gl/__init__.py in
      94
      ---> 95 from pyglet.gl.lib import GLException
      96 from pyglet.gl.gl import *
      /srv/conda/envs/notebook/lib/python3.6/site-packages/pyglet/gl/lib.py in
      148 else:
      --> 149 from pyglet.gl.lib_glx import link_GL, link_GLU, link_GLX
      /srv/conda/envs/notebook/lib/python3.6/site-packages/pyglet/gl/lib_glx.py in
      44
      ---> 45 gl_lib = pyglet.lib.load_library('GL')
      46 glu_lib = pyglet.lib.load_library('GLU')
      /srv/conda/envs/notebook/lib/python3.6/site-packages/pyglet/lib.py in load_library(self, *names, **kwargs)
      163
      --> 164 raise ImportError('Library "%s" not found.' % names[0])
      165
      ImportError: Library "GL" not found.
      During handling of the above exception, another exception occurred:
      ImportError Traceback (most recent call last)
      in
      6
      7 while not done:
      ----> 8 env.render()
      9 action = random.choice([0,1,2,3,4,5])
      10 n_state, reward, done, info = env.step(action)
      /srv/conda/envs/notebook/lib/python3.6/site-packages/gym/core.py in render(self, mode, **kwargs)
      238
      239 def render(self, mode='human', **kwargs):
      --> 240 return self.env.render(mode, **kwargs)
      241
      242 def close(self):
      /srv/conda/envs/notebook/lib/python3.6/site-packages/gym/envs/atari/atari_env.py in render(self, mode)
      150 return img
      151 elif mode == 'human':
      --> 152 from gym.envs.classic_control import rendering
      153 if self.viewer is None:
      154 self.viewer = rendering.SimpleImageViewer()
      /srv/conda/envs/notebook/lib/python3.6/site-packages/gym/envs/classic_control/rendering.py in
      30 If you're running on a server, you may need a virtual frame buffer; something like this should work:
      31 'xvfb-run -s \"-screen 0 1400x900x24\" python '
      ---> 32 ''')
      33
      34 import math
      ImportError:
      Error occurred while running `from pyglet.gl import *`
      HINT: make sure you have OpenGL install. On Ubuntu, you can run 'apt-get install python-opengl'.
      If you're running on a server, you may need a virtual frame buffer; something like this should work:
      'xvfb-run -s "-screen 0 1400x900x24" python '

  • @dimitheodoro
    @dimitheodoro 2 года назад +1

    When hitting line:
    env.render()
    it says: Cannot connect to "None"

    • @NicholasRenotte
      @NicholasRenotte  2 года назад

      Running in Colab? Might need to try on desktop.

    • @dimitheodoro
      @dimitheodoro 2 года назад

      @@NicholasRenotte Yes, in Colab. Thanks, I will try!

  • @lahaale5840
    @lahaale5840 3 года назад +1

    Nice introduction. It seems the DQN method is value-based even though you are using BoltzmannQPolicy. BoltzmannQPolicy, like epsilon-greedy, is a method to balance exploitation and exploration. Methods like DPG, PPO, A2C, and DDPG can be considered policy-based methods.

  • @yasinsahin2962
    @yasinsahin2962 3 года назад +1

    I also want to study for a master's degree in computer engineering, as a control engineering graduate. My topic will be RL, which includes control theory and machine learning. Any advice? Is it a good path to go down?

    • @NicholasRenotte
      @NicholasRenotte  2 года назад +1

      Sounds awesome! Check out Linear Programming and MIPs as well. This might kick you off rom an RL side ruclips.net/video/Mut_u40Sqz4/видео.html, there's also some amazing stuff from DeepMind.

  • @samselvaraj8171
    @samselvaraj8171 2 года назад +1

    You haven't explained this well :/

    • @NicholasRenotte
      @NicholasRenotte  2 года назад

      Oh, my bad man. What did you need clarified? Checked this out? ruclips.net/video/Mut_u40Sqz4/видео.html

  • @KingErasmos
    @KingErasmos Год назад

    This is interesting but there’s quite a lot of assumed knowledge for a tutorial. dqn mae etc. you zipped through it so fast. Was cool to watch tho.