3 Train a Unity RL Env using Stable Baselines3!

RL Hugh

2.3K views


Published: Aug 29, 2024
Follows on from • 1 Control Unity from P... In this video we create a gym env in Python that wraps the Unity reinforcement learning environment we created earlier. Then we train it using Stable Baselines3.
This uses the Peaceful Pie library, which is free and open source under an MIT license: github.com/hug...
The first video in this series is at studio.youtube... . If you want to jump right in, we start from the project at github.com/hug... , which is a Unity project. The Python scripts are in the 'python' sub-folder. You will need to create a Python 3.10 environment and run 'pip install peaceful-pie'.
The project at the end of this video will look a little like github.com/hug... .
Full playlist: • Control Unity from Python
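
A rough sketch of the wrapping idea described above: a gym-style env whose reset and step calls are forwarded to Unity. Everything named here is an assumption for illustration. `FakeUnityComms` stands in for the real Peaceful Pie connection, and the `Reset`/`Step` method names, the `StepResult` fields, and the step cap are hypothetical rather than taken from the video's code. In a real project the class would subclass `gym.Env` and declare `observation_space` and `action_space` so that Stable Baselines3 can train on it.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class StepResult:
    """Hypothetical payload returned by the Unity side after each step."""
    obs: List[float]
    reward: float
    finished: bool


class FakeUnityComms:
    """Stand-in for a real Unity connection, so this sketch runs without Unity.

    The toy 'game': the agent sits at position x and must reach x >= 3.
    """

    def __init__(self) -> None:
        self.x = 0.0

    def Reset(self) -> List[float]:
        self.x = 0.0
        return [self.x]

    def Step(self, action: int) -> StepResult:
        # Action 1 moves right; any other action moves left.
        self.x += 1.0 if action == 1 else -1.0
        finished = self.x >= 3.0
        reward = 1.0 if finished else -0.01
        return StepResult(obs=[self.x], reward=reward, finished=finished)


class UnityEnvSketch:
    """Gym-style wrapper (old gym API: step returns obs, reward, done, info)."""

    def __init__(self, comms: Any, max_steps: int = 200) -> None:
        self.comms = comms
        self.max_steps = max_steps
        self.steps = 0

    def reset(self) -> List[float]:
        self.steps = 0
        return self.comms.Reset()

    def step(self, action: int) -> Tuple[List[float], float, bool, Dict[str, Any]]:
        result = self.comms.Step(action)
        self.steps += 1
        # End the episode when Unity says so, or when the step cap is hit.
        done = result.finished or self.steps >= self.max_steps
        return result.obs, result.reward, done, {"finished": result.finished}


def run_episode(env: UnityEnvSketch, action: int) -> Tuple[int, float]:
    """Roll out one episode with a fixed action; return (steps, total reward)."""
    env.reset()
    total, done = 0.0, False
    while not done:
        _, reward, done, _ = env.step(action)
        total += reward
    return env.steps, total
```

The step cap is a design choice for the sketch: without it, an agent that never reaches the goal would keep an episode running forever.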

Comments • 32

@raphaelbaur4335 1 year ago ⁺¹
Thank you, I also like your consistent use of type hints!
@MechaCat123 1 year ago
Thank you for the RL videos, very enjoyable and informative! Keep it up!
@rlhugh 1 year ago
Thank you! Very happy to hear this. Please let me know if there are ideas you'd like to see me try.
@juleswombat5309 1 year ago
Excellent tutorial. I now have more freedom in the choice and implementation of RL algorithms with my Unity environments.
I just need to:
a) Study the SB3 documentation a bit more on saving the trained PPO model and reloading it for use in inference mode, and on stopping the training early once the target performance has been reached.
b) Work out some curriculum-based learning, I think, to progress the training on to more advanced scenarios or scene levels. Unity ML-Agents has this kind of built in.
c) Work out whether any agent team play is possible, where multiple collaborating agents are trained together (cf. the Dungeon Escape scenario you mentioned elsewhere).
BTW I lost my simulation timer ball. The bounces get bigger and bigger, so it comes back to bounce about as often as Halley's comet.
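
Points (a) and (b) above can be sketched together without committing to any particular library. The tracker below is a hypothetical helper, not part of SB3 or Peaceful Pie: it advances to the next level once the rolling mean episode reward clears that level's threshold, and reports `done` when the last level is cleared, which is the point at which training could stop. The thresholds and window size are made-up values. For the SB3 side, `model.save(...)`, `PPO.load(...)` and the `StopTrainingOnRewardThreshold` callback are the places to start; check the SB3 documentation for exact usage.

```python
from collections import deque
from typing import Deque, List


class CurriculumTracker:
    """Advance through levels when the rolling mean reward clears a threshold."""

    def __init__(self, thresholds: List[float], window: int = 5) -> None:
        self.thresholds = thresholds  # one reward threshold per level
        self.level = 0                # index of the current level
        self.rewards: Deque[float] = deque(maxlen=window)

    @property
    def done(self) -> bool:
        """True once every level has been cleared: a signal to stop training."""
        return self.level >= len(self.thresholds)

    def record_episode(self, reward: float) -> bool:
        """Record one episode reward; return True if the level just advanced."""
        if self.done:
            return False
        self.rewards.append(reward)
        window_full = len(self.rewards) == self.rewards.maxlen
        mean = sum(self.rewards) / len(self.rewards)
        if window_full and mean >= self.thresholds[self.level]:
            self.level += 1
            self.rewards.clear()  # start the next level with a fresh window
            return True
        return False
```

Waiting for a full window before advancing avoids promoting the agent on the strength of a single lucky episode.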
@keyhaven8151 3 months ago ⁺¹
Nice work!!! But we cannot find the "Step" method in unity_comms.py. Did you miss anything when uploading the file?
@rlhugh 3 months ago
Wow, the repo at github.com/hughperkins/peaceful-pie-video-resources has been private all this time :O
@rlhugh 3 months ago
Is this the code you are looking for? github.com/hughperkins/peaceful-pie-video-resources/blob/d7dcd50edef768f6bbc8cae5ac38eb43a4a8b129/3_train_rl_env/python/train_rl.py#L23-L28
@keyhaven8151 3 months ago ⁺¹
@rlhugh Wow! Nice! Thanks! This can help us a lot!
@slithercode4219 1 year ago
Would love to see tutorials for more realistic use cases! Great work though.
@rlhugh 1 year ago
Alright. What level of detail are you looking for? What kinds of uses are you interested in? E.g. Dungeon Escape? Racing a car around a track? One of the other ML-Agents examples? Something else?
@slithercode4219 1 year ago
@rlhugh Wow, I honestly didn't expect that you would reply so quickly! I was thinking more along the lines of stock trading, an RL agent that helps you memorize flashcards, or using an RL agent for recommendations. I can provide an example that I built if you would like. Be warned, though: I used only TensorFlow, since I'm not good with stable-baselines.
@rlhugh 1 year ago
@slithercode4219 Ah, that sounds quite specialized. I'm mostly targeting game scenarios for now. I have to specialize somehow; I can't learn everything. The flashcard game sounds potentially interesting. If you have a URL to an open source GitHub repository for that, I might take a look. No promises though (and I sort of roped myself into trying to use GPT for NPCs, which is taking a lot longer than I thought it would :P I got something simple working in a couple of hours. But the gap between "something simple working" and a "maybe interesting video" could easily be a couple of months :P )
@keyhaven8151 3 months ago
Thank you! You mentioned at the end of the video that Unity is single-threaded, and that this can be handled by encapsulating multiple agents. How is this achieved?
@rlhugh 3 months ago
Does this example help? github.com/hughperkins/peaceful-pie/tree/main/examples/DungeonEscape
@keyhaven8151 3 months ago
@rlhugh Thank you. But this seems very complicated, and we don't know where to start. Could you please provide a video explanation?
@rlhugh 3 months ago
Good idea!
@keyhaven8151 3 months ago ⁺¹
@rlhugh Thank you very much for taking up my suggestion. The videos you make are very friendly to beginners, and I look forward to your video!
@user-hv5un3oz3b 1 year ago ⁺¹
Great tutorial. I've successfully implemented my own single-agent training, but I'm having issues trying to train multiple agents. My idea is to control different agents through different ports, but this causes some errors in Unity. Does the current Peaceful Pie not support multiple agents?
@rlhugh 1 year ago
Use a single port. Peaceful Pie is basically just a networking protocol. On the Python side, train however you want. There's an example of training multiple agents using a single policy at github.com/hughperkins/peaceful-pie/tree/main/examples/DungeonEscape . I should probably write an example where each agent explicitly gets its own policy.
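
A minimal sketch of the "single port, single policy" idea, assuming one request per step that carries an action for every agent and one reply that carries every agent's observation and reward. `FakeMultiAgentSim` is a hypothetical stand-in for the Unity side and is not how the DungeonEscape example is written; see the linked repository for the real code.

```python
from typing import Callable, List, Sequence, Tuple

Observation = Sequence[float]
Policy = Callable[[Observation], int]


class FakeMultiAgentSim:
    """Stand-in for one Unity instance hosting several agents behind one port."""

    def __init__(self, num_agents: int) -> None:
        self.positions = [0.0] * num_agents

    def step(self, actions: List[int]) -> Tuple[List[List[float]], List[float]]:
        # One request carries an action per agent; one reply carries all results.
        if len(actions) != len(self.positions):
            raise ValueError("expected one action per agent")
        for i, action in enumerate(actions):
            self.positions[i] += 1.0 if action == 1 else -1.0
        observations = [[p] for p in self.positions]
        rewards = list(self.positions)  # toy reward: further right is better
        return observations, rewards


def shared_policy_step(
    sim: FakeMultiAgentSim,
    policy: Policy,
    observations: List[List[float]],
) -> Tuple[List[List[float]], List[float]]:
    """Apply the same policy to each agent's observation, then step once."""
    actions = [policy(obs) for obs in observations]
    return sim.step(actions)
```

Giving each agent its own policy would only change the list comprehension: pair each observation with its own policy instead of reusing one.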
@user-hv5un3oz3b 1 year ago ⁺¹
@rlhugh Thank you very much, looking forward to your future work.
@nicbleu 1 year ago
I wonder what the advantages are of using Stable Baselines3 with Unity. Do you think it is possible to do the same for Unreal Engine?
@rlhugh 1 year ago
Well, it's not just Stable Baselines3: you have the full power to use any Python code and libraries you want. Torch and so on. This is mostly aimed at researchers and machine learning engineers, who might already be used to using Python for machine learning.
@rlhugh 1 year ago
Unreal Engine is a totally different engine, which is outside of my expertise.
@user-ge2jw7eb3d 1 year ago
Could you please provide the versions of gym and stable-baselines3 used in the virtual environment in your video? Thank you!
@user-ge2jw7eb3d 1 year ago
😀
@rlhugh 1 year ago
Looks like gym==0.21.0, and stable-baselines3==1.6.2
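
One way to confirm that an environment matches those versions is to compare them against what is installed. This is a small sketch using only the standard library; the `PINNED` dictionary simply restates the two versions quoted above, and `check_versions` is a hypothetical helper name.

```python
from importlib import metadata
from typing import Dict

# Versions quoted in the reply above.
PINNED = {"gym": "0.21.0", "stable-baselines3": "1.6.2"}


def check_versions(pinned: Dict[str, str]) -> Dict[str, str]:
    """Return {package: problem} for every package that does not match its pin."""
    problems: Dict[str, str] = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = f"not installed (want {wanted})"
            continue
        if installed != wanted:
            problems[name] = f"installed {installed}, want {wanted}"
    return problems
```

Calling `check_versions(PINNED)` returns an empty dictionary when both packages match.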
@user-ge2jw7eb3d 1 year ago
Thank you! @rlhugh
@chafficplugins 1 year ago
Hi, the source code is not available on GitHub. It seems to be private.
@rlhugh 1 year ago
github.com/hughperkins/peaceful-pie/
@maxmciver629 10 months ago
How do I run a built Unity project from Python?
@rlhugh 10 months ago
You can use the subprocess module to launch new processes from Python.
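
A minimal sketch of that suggestion, using only the standard library. The helper names are hypothetical, and the executable path and any command-line flags depend on your own build, so they are left as parameters rather than guessed here.

```python
import subprocess
from typing import List, Optional


def launch_build(executable: str, extra_args: Optional[List[str]] = None) -> subprocess.Popen:
    """Start the built player as a child process and return its handle."""
    command = [executable] + list(extra_args or [])
    return subprocess.Popen(command)


def stop_build(process: subprocess.Popen, timeout: float = 10.0) -> int:
    """Ask the process to exit, kill it if it does not, and return its exit code."""
    if process.poll() is None:
        process.terminate()
        try:
            process.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            process.kill()
            process.wait()
    return process.returncode
```

Typical use would be to call `launch_build` with the path to your build before connecting from Python, and `stop_build` in a `finally` block once training ends, so the player does not outlive the script.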
