Deep Reinforcement Learning for Walking Robots - MATLAB and Simulink Robotics Arena
- Published: Jul 11, 2024
- Sebastian Castro demonstrates an example of controlling humanoid robot locomotion using deep reinforcement learning, specifically the Deep Deterministic Policy Gradient (DDPG) algorithm. The robot is simulated using Simscape Multibody™, while training the control policy is done using Reinforcement Learning Toolbox™.
In this video, Sebastian outlines the setup, training, and evaluation of reinforcement learning with Simulink® models. First, he introduces how to choose states, actions, and a reward function for the reinforcement learning problem. Then he describes the neural network structure and training algorithm parameters. Finally, he shows some training results and discusses the benefits and drawbacks of reinforcement learning.
You can find the example models used in this video in the MATLAB Central File Exchange: bit.ly/2HBxe79
For more information, you can access the following resources:
- Reinforcement Learning Tech Talks: bit.ly/2HBzMlS
- Blog and Videos: Walking Robot Modeling and Simulation: bit.ly/3JTs0ST
- Paper: Continuous Control with Deep Reinforcement Learning: bit.ly/2HAkJsp
- Paper: Emergence of Locomotion Behaviours in Rich Environments: bit.ly/2HBuTsO
--------------------------------------------------------------------------------------------------------
Get a free product trial: goo.gl/ZHFb5u
Learn more about MATLAB: goo.gl/8QV7ZZ
Learn more about Simulink: goo.gl/nqnbLe
See What's new in MATLAB and Simulink: goo.gl/pgGtod
© 2019 The MathWorks, Inc. MATLAB and Simulink are registered trademarks of The MathWorks, Inc.
See www.mathworks.com/trademarks for a list of additional trademarks. Other product or brand names may be trademarks or registered trademarks of their respective holders.
Science
Hi @Sebastian Castro, I am working on DRL algorithms to grasp objects with a robotic arm that has a gripper. Do you have any recommendations for me?
Thanks in advance,
Very good job, keep going ahead.
Hey, I want to implement clipped double deep Q-learning for task allocation in cloud resources (VMs).
Is it possible to use MATLAB and Simulink for network simulation?
Very useful
How can this demo be opened? The walkingRobotRL2D command does not seem to work. Thanks!
Great work, thank you. I will try to implement an agent that performs MPPT for PV arrays based on this.
And please send it to me. We can start a cooperation.
Is there a newer version of this for MATLAB R2020b?
Why do you use the previous action to calculate the reward and not the current action?
Making a video that shows a finished result is very good. But it would be even better to have one or more videos, or even a course, on how to apply neural networks in control systems, with examples that start from scratch. I've been trying without success to apply a neural network controller to any transfer function.
Dear sir,
I want to implement this on a mobile balancing robot. How do I model it and set up RL so that the robot balances?
And is MATLAB used only for simulation? How do I deploy this to the actual mobile balancing robot?
Thank you for this helpful video. I just have a question: how did you plot the robot during training? I need to see how my model acts during training. I appreciate any reply.
Don't use the parallel function. You could try commenting out the last if block in createDDPGOptions.m, from line 27 to 30.
They used Simscape Multibody.
How do I play the trained network?
How do I set initial conditions?
Do I have to buy the Reinforcement Learning Toolbox? Why is it not included directly in my MATLAB license?
Yes, Reinforcement Learning Toolbox is one of the requirements for this example. The list of required products is shown in the File Exchange/GitHub links.
I know that this is required. It was more a question of why, and whether there is any way to use the code without buying it.
Great short tutorial!! I want to implement the Q-learning algorithm in the agent. Does the RL toolbox have a Q-learning algorithm? If you can make another such tutorial on Q-learning implementation, that would be great. Thanks in advance!!
Yes, the toolbox has Q learning and Deep-Q Network (DQN) algorithms. I picked DDPG since I wanted a continuous action space vs. the discrete options provided by those other algorithms.
Thanks! But I couldn't find the Q learning algorithm in MATLAB 2018b. Is Q learning algorithm available only in 2019a/b?
@AtriyaBiswas Reinforcement Learning Toolbox is new in R2019a, so that would make sense.
Thanks!! @roboticseabass I have to get it installed.
Hi @Sebastian Castro. I want to build an agent with three action variables and the action variables are discrete. Suppose action variable 'A' has 4 discrete values A = [0, 650, 1500, 4500]; action variable 'B' has 6 discrete values B = [0, 25, 50, 75, 100, 125]; and action variable 'C' has 7 discrete values C = [-25, -12, -5, 0, 5, 12, 25];
Should I write the code for "actInfo = rlFiniteSetSpec([0, 650, 1500, 4500];[0, 25, 50, 75, 100, 125];[-25, -12, -5, 0, 5, 12, 25])" ?? I couldn't find any matlab example showing how to write the code for actInfo when using more than one discrete action variable.
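One general way to handle several discrete action variables is to enumerate every combination, so that the agent chooses a single element from one finite set. The sketch below shows the idea in Python rather than MATLAB; `action_table` and `decode_action` are hypothetical names, and how best to pass such a set to rlFiniteSetSpec is not confirmed anywhere in this thread.

```python
from itertools import product

# The three discrete action variables from the question above.
A = [0, 650, 1500, 4500]
B = [0, 25, 50, 75, 100, 125]
C = [-25, -12, -5, 0, 5, 12, 25]

# Every (a, b, c) combination becomes one element of a single finite action set.
action_table = list(product(A, B, C))


def decode_action(index):
    """Map a flat action index chosen by the agent back to (a, b, c)."""
    return action_table[index]


print(len(action_table))  # 4 * 6 * 7 = 168 combined actions
print(decode_action(0))   # (0, 0, -25)
```

The combined set grows multiplicatively with each added variable, so this approach is only practical while the product stays small.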
Hi Sebastian, great video! I have a question: when I try to use the pretrained agent from the example, I run into an error. Could you help me with that, please? The error is:
MATLAB System block 'walkingRobotRL3D/RL Agent/AgentWrapper' error occurred when invoking 'outputImpl' method of 'AgentWrapper'. The error was thrown from '
'M:\matlab_2019b\toolbox\rl\rl\+rl\+agent\AbstractPolicy.m' at line 133
'M:\matlab_2019b\toolbox\rl\rl\simulink\libs\AgentWrapper.m' at line 113'.
Invalid observation type or size.
Invalid observation type or size.
Dot indexing is not supported for variables of this type.
Thanks in advance!
I noticed this too with the 3D example as I tested some updates in MATLAB R2019b. So, I generated some new walking agents but have not published them yet. Email us at roboticsarena@mathworks.com and I can send you an agent file that works.
Great video.
I’ll do a project at my university to control a continuous process with reinforcement learning. Do you have any advice for me?
Based on my experience, my #1 tip is: Create a good reward function that penalizes jumping around the upper/lower limits of your possible action space. Otherwise, you just get an on-off controller.
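As a rough illustration of that tip, a reward can combine the main objective with explicit penalties for sitting at the action limits and for jumping between consecutive actions. This is a Python sketch under assumptions: the function name, the weights, and the squared tracking-error term are all hypothetical, and this is not the reward function used in the video.

```python
def shaped_reward(error, action, prev_action, low=-1.0, high=1.0,
                  w_error=1.0, w_rate=0.1, w_limit=0.5):
    """Hypothetical reward: track a setpoint while discouraging on-off control."""
    span = high - low
    # Penalize tracking error (the main objective).
    tracking = -w_error * error ** 2
    # Penalize large jumps between consecutive actions, normalized by the range.
    rate = -w_rate * ((action - prev_action) / span) ** 2
    # Penalize sitting at (or beyond) either action limit.
    at_limit = action <= low or action >= high
    limit = -w_limit if at_limit else 0.0
    return tracking + rate + limit


smooth = shaped_reward(error=0.1, action=0.2, prev_action=0.1)
bang_bang = shaped_reward(error=0.1, action=1.0, prev_action=-1.0)
print(smooth > bang_bang)  # True: same error, but the saturated jump scores worse
```

The weights set the trade-off: if the penalties are too large, the agent may prefer doing nothing over reducing the error.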
Do you use genetic algorithm?
No, this is the Deep Deterministic Policy Gradient (DDPG) reinforcement learning algorithm.
If you want, there's an earlier video in our series that shows Genetic Algorithms for joint waypoint optimization: ruclips.net/video/-dEX1SZOZEY/видео.html
Can I use reinforcement learning in MATLAB without this toolbox, only with Simulink and scripts?
Not unless you implement a lot of the functionality yourself. Neural network representations, RL algorithms, etc.
3:33 is not accurate. The actor is updated through the policy gradient, while the critic is updated through the TD error.
Thanks -- this was an oversimplification of the general gist of DDPG for beginners.
Indeed, the critic loss is found using temporal differencing (TD) using target networks to make this a little less unstable during training! And the actor loss is simply the negative critic estimate (negative because you want to maximize Q-value, or minimize negative Q-value).
However, in both cases, this loss *is* then backpropagated through the respective actor/critic networks to update their parameters.
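To make the two losses concrete, here is a minimal Python sketch of the quantities described above. It only computes the loss values; in a real implementation an autodiff framework would backpropagate them through the networks. The function names and the toy 1-D "networks" are hypothetical stand-ins, not code from the example.

```python
def critic_targets(rewards, next_states, dones, target_actor, target_critic, gamma=0.99):
    """TD targets y = r + gamma * Q'(s', mu'(s')), with no bootstrap at terminal states."""
    targets = []
    for r, s2, done in zip(rewards, next_states, dones):
        bootstrap = 0.0 if done else gamma * target_critic(s2, target_actor(s2))
        targets.append(r + bootstrap)
    return targets


def critic_loss(states, actions, targets, critic):
    """Mean squared TD error between Q(s, a) and the targets."""
    errors = [(critic(s, a) - y) ** 2 for s, a, y in zip(states, actions, targets)]
    return sum(errors) / len(errors)


def actor_loss(states, actor, critic):
    """Negative mean critic estimate of the actor's own actions."""
    values = [critic(s, actor(s)) for s in states]
    return -sum(values) / len(values)


# Toy 1-D stand-ins for the actor/critic and their target copies (illustration only).
actor = target_actor = lambda s: 0.5 * s
critic = target_critic = lambda s, a: s + a

y = critic_targets([1.0, 2.0], [2.0, 4.0], [False, True],
                   target_actor, target_critic, gamma=0.5)
print(y)                                               # [2.5, 2.0]
print(critic_loss([1.0, 1.0], [0.0, 1.0], y, critic))  # 1.125
print(actor_loss([1.0, 3.0], actor, critic))           # -3.0
```

In DDPG the target copies would also be nudged slowly toward the main networks after each update; that step is left out here.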
7:07 I get an error that says "Unrecognized function or variable 'numObs'." when I run "createDDPGNetworks.m". How can I solve this problem?