DeepGait: Planning and Control of Quadrupedal Gaits using Deep Reinforcement Learning (Presentation)
- Published: 3 Jul 2024
- Presentation @ ICRA 2020:
We train neural-network policies for terrain-aware locomotion,
which respectively plan and execute foothold and base motions
over challenging terrains in simulated 3D environments using
both proprioceptive and exteroceptive measurements.
Journal article accepted to IEEE Robotics and Automation Letters (RA-L) and presented at the IEEE International Conference on Robotics and Automation (ICRA) 2020 in Paris, France:
Paper: ieeexplore.ieee.org/abstract/...
Preprint: arxiv.org/abs/1909.08399
Video by Vassilios Tsounis
Music is "Always Then" courtesy of The KVB
www.thekvb.co.uk/
Looking forward to part 2!
Love it. Thank You!
Nice work!
wow ... nice results!
7:44 looks straight out of an 80's retro game where people ride robots instead of cars
This is really impressive; the explanation goes quite deep.
Hi Vassilios!
I am in Love ! 😍❤️
Waiting for part two
Waiting so much to see these anymals in action.
good work
I have a question: how do you get the terrain information? IMU? Camera (vision)? Lidar? I wonder how it works.
Thank you in advance.
Is there a way I could get access to the rviz configuration for the 80s theme? Looks very cool!
This visualization was made in raisimOgre so unfortunately there is no easy-to-use configuration to share. Stay tuned for when we release the code though.
what software do you use for simulations?
Why is this better than model-based MPC with pure mathematical optimization? Is it just better because it can learn to handle noisy contacts?
Just had a quick look at your paper, great work and thanks for sharing.
Quick question: for the GP controller, is it right that you sample from the policy's distribution until a feasible action is found? What if the probability of a feasible sample is very low in a certain situation?
Thanks, we are glad you enjoyed it. During deployment we do not need to re-sample until an action is valid; we only need to compute the mean of the policy's distribution to generate phase plans. That is the point of formulating the problem as an MDP: instead of using the policy as in sampling-based planning methods, we train the parameterized policy distribution with RL so that it learns to always output valid phase transitions.
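The distinction above (sample during training, take the mean at deployment) can be sketched roughly as follows. This is a minimal illustrative example, not the authors' code: the linear Gaussian policy, the weights `W`, and the fixed noise scale are all assumptions made up for the sketch.

```python
import numpy as np

def policy(observation, weights):
    """Toy Gaussian policy: returns (mean, std) of the action distribution."""
    mean = weights @ observation        # mean phase-transition action (illustrative)
    std = np.full_like(mean, 0.1)       # fixed exploration noise (assumed)
    return mean, std

rng = np.random.default_rng(0)
obs = np.array([0.5, -0.2, 1.0])        # made-up observation vector
W = np.eye(3)                           # made-up policy weights

# Training-time behavior: sample from the distribution for exploration.
mean, std = policy(obs, W)
train_action = mean + std * rng.standard_normal(mean.shape)

# Deployment: no re-sampling; use the distribution's mean directly.
deploy_action, _ = policy(obs, W)
```

With `W` set to the identity, the deployed (mean) action simply equals the observation here; the point is only that deployment is deterministic while training explores around the mean.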
@@leggedrobotics Thanks, understood.
@@leggedrobotics Hi, is it possible to fast-forward the learning process so that the robot can spend 1 million years learning in only a few weeks?
This is very good. Is the code open source? Thank you very much!
In which software are these 3D simulations done?
This work uses the RaiSim physics engine that was developed in-house. Link: raisim.com/
:)
is there any coding to share?
Unfortunately not yet. We do plan to open-source the code later this year though.
@@leggedrobotics That would be a great contribution!
@@leggedrobotics What is the constraint on the speed at which it walks? Does it have to go at that speed, or is that as fast as possible?
Nice work, never give up. Also, I want to be YouTube friends xP
Great work. I hope "part 2: back with vengeance" is a reference to Last Ninja 2 (ruclips.net/video/Gfkk9BnFB7w/видео.html)