3 Train a Unity RL Env using Stable Baselines3!
- Published: 29 Aug 2024
- Follows on from • 1 Control Unity from P... In this video we create a gym env in Python that wraps the Unity reinforcement learning environment we created earlier. Then we train it using Stable Baselines3.
This uses the Peaceful Pie library, which is free and open source under an MIT license: github.com/hug...
The first video in this series is at studio.youtube... . If you want to jump right in, we are starting from the project at github.com/hug... , which is a Unity project. The Python scripts are in the 'python' sub-folder. You will need to create a Python 3.10 environment and run 'pip install peaceful-pie'.
The project at the end of this video will look a little like github.com/hug... .
Full playlist: • Control Unity from Python
Thank you, I also like your consistent use of type hints!
Thank you for the RL videos, very enjoyable and informative! Keep it up!
Thank you! Very happy to hear this. Please let me know if there are ideas you'd like to see me try.
Excellent Tutorial. I now have more freedom in the choice and implementation of RL Algorithms with my Unity Environments.
I just need
a) to study the SB3 documentation a bit more on saving the trained PPO model, reloading it for use in inference mode, and stopping training early once the target performance has been reached.
b) I think I can then work out some curriculum-based learning, to progress the training to more advanced scenarios or scene levels. Unity ML-Agents has this kind of built in.
c) to work out whether team play is possible, with multiple collaborating agents trained together (cf. the Dungeon Escape scenario you mentioned elsewhere).
BTW I lost my Simulation Timer Ball. The bounces get bigger and bigger, so it comes back to bounce about as often as Halley's comet
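For point (b), one simple way to structure a curriculum on the Python side is to promote the agent to the next level once its recent mean episode reward clears a threshold. The sketch below is a plain-Python illustration only: the `Curriculum` class, its thresholds, and the level names are hypothetical, not part of SB3 or Peaceful Pie, and how a level change is communicated to Unity is left open.

```python
from collections import deque
from typing import Deque, List


class Curriculum:
    """Hypothetical helper: advance through levels as mean reward improves."""

    def __init__(self, levels: List[str], promote_at: float, window: int = 20) -> None:
        self.levels = levels
        self.promote_at = promote_at  # mean reward needed to move up a level
        self.index = 0
        self.recent: Deque[float] = deque(maxlen=window)

    @property
    def level(self) -> str:
        return self.levels[self.index]

    def record_episode(self, episode_reward: float) -> bool:
        """Record one finished episode; return True if the level just advanced."""
        self.recent.append(episode_reward)
        window_full = len(self.recent) == self.recent.maxlen
        at_last_level = self.index >= len(self.levels) - 1
        if not window_full or at_last_level:
            return False
        if sum(self.recent) / len(self.recent) >= self.promote_at:
            self.index += 1
            self.recent.clear()  # start a fresh window on the new level
            return True
        return False
```

The env's `reset` could then read `curriculum.level` to decide which scene or difficulty to ask Unity for.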
Nice work!!! But we cannot find the "Step" method in unity_comms.py. Did you miss anything when uploading the file?
Wow, the repo at github.com/hughperkins/peaceful-pie-video-resources has been private all this time :O
Is this the code you are looking for? github.com/hughperkins/peaceful-pie-video-resources/blob/d7dcd50edef768f6bbc8cae5ac38eb43a4a8b129/3_train_rl_env/python/train_rl.py#L23-L28
@@rlhugh Wow! Nice! Thanks! This can help us a lot!
Would love to see tutorials for more realistic use cases! Great work though.
Alright. What level of detail are you looking for? What kinds of uses are you interested in? Eg Dungeon Escape? Racing a car around a track? One of the other mlagents examples? Something else?
@@rlhugh Wow, I honestly didn't expect that you would reply so quickly! I was thinking more along the lines of stock trading, an RL agent that helps you memorize flashcards, or an RL agent for recommendations. I can provide an example that I built if you would like. Be warned, though: I used only TensorFlow, since I'm not good with stable-baselines.
@@slithercode4219 Ah, that sounds quite specialized. I'm mostly targeting game scenarios for now. Have to specialize somehow. I can't learn everything. The flashcard game sounds potentially interesting. If you have a URL to an open source GitHub repository for that, I might take a look. No promises though (and I sort of roped myself into trying to use GPT for NPCs, which is taking a lot longer than I thought it would :P I got something simple working in a couple of hours. But the gap between "something simple working" and a "maybe interesting video" could easily be a couple of months :P )
Thank you! You mentioned at the end of the video that Unity is single-threaded, and that this can be handled by encapsulating multiple agents. How is this achieved?
Does this example help? github.com/hughperkins/peaceful-pie/tree/main/examples/DungeonEscape
@@rlhugh Thank you. But this seems very complicated, and we don't know where to start. Can you please provide a video explanation?
Good idea!
@@rlhugh Thank you very much for accepting my opinion. The video you made is very friendly to beginners, and I look forward to your video!
Great tutorial. I've successfully implemented my own single-agent training, but I'm having issues with multi-agent training. My idea is to control different agents through different ports, but this causes some errors in Unity. Does Peaceful Pie currently not support multiple agents?
Use a single port. Peaceful Pie is basically just a networking protocol. On the Python side, train however you want. There's an example of training multiple agents using a single policy at github.com/hughperkins/peaceful-pie/tree/main/examples/DungeonEscape . I should probably write an example where each agent explicitly gets its own policy.
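As a rough illustration of the single-port idea, one round trip can carry every agent's action and bring back every agent's observation, with one shared policy applied per agent on the Python side. This is a minimal sketch under stated assumptions: `StubComms` and `step_all` are hypothetical stand-ins for a connection to Unity, not the Peaceful Pie API, and the DungeonEscape example linked above may be organized differently.

```python
from typing import Callable, List, Sequence

Observation = List[float]
Policy = Callable[[Observation], int]


class StubComms:
    """Hypothetical stand-in for one connection to Unity on one port."""

    def __init__(self, num_agents: int) -> None:
        self.num_agents = num_agents
        self.steps = 0

    def step_all(self, actions: Sequence[int]) -> List[Observation]:
        # One call sends all actions and returns one observation per agent.
        assert len(actions) == self.num_agents
        self.steps += 1
        # Fake observation: [agent index, last action taken].
        return [[float(i), float(a)] for i, a in enumerate(actions)]


def act_for_all(policy: Policy, observations: Sequence[Observation]) -> List[int]:
    """Apply a single shared policy to each agent's observation."""
    return [policy(obs) for obs in observations]


def rollout(comms: StubComms, policy: Policy, num_steps: int) -> List[Observation]:
    """Step all agents together for num_steps over the one connection."""
    observations: List[Observation] = [
        [float(i), 0.0] for i in range(comms.num_agents)
    ]
    for _ in range(num_steps):
        actions = act_for_all(policy, observations)
        observations = comms.step_all(actions)
    return observations
```

Giving each agent its own policy would only change `act_for_all`, which would then take a list of policies instead of one; the single connection stays the same.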
@@rlhugh Thank you very much, looking forward to your future works.
I wonder what the advantages are of using Stable Baselines3 with Unity, and do you think it is possible to do this for Unreal Engine?
Well, it's not just stable baselines 3: you have the full power to use any python code and libraries you want. Torch and so on. This is mostly aimed at researchers and machine learning engineers, who might already be used to using python for machine learning.
Unreal Engine is a totally different engine, which is outside of my expertise.
Could you please provide the versions of gym and stable-baselines3 used in the virtual environment in your video? Thank you!
😀
Looks like gym==0.21.0, and stable-baselines3==1.6.2
Thank you! @@rlhugh
Hi, the source code is not available on GitHub. It seems to be private.
github.com/hughperkins/peaceful-pie/
How do I run a built Unity project from Python?
You can use the subprocess module to launch new processes from Python.
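A minimal sketch of that idea, assuming the built player accepts Unity's standard standalone command-line flags `-batchmode`, `-nographics`, and `-logFile`. The helper names `build_unity_command` and `launch_unity` are hypothetical, and the executable path is whatever your own build produces.

```python
import subprocess
from typing import List, Optional


def build_unity_command(
    executable: str,
    headless: bool = True,
    log_file: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for a built Unity player."""
    cmd = [executable]
    if headless:
        # No window and no graphics device; drop this if the env needs rendering.
        cmd += ["-batchmode", "-nographics"]
    if log_file is not None:
        cmd += ["-logFile", log_file]
    return cmd


def launch_unity(
    executable: str,
    headless: bool = True,
    log_file: Optional[str] = None,
) -> subprocess.Popen:
    """Start the player without blocking, so Python can connect to it afterwards."""
    return subprocess.Popen(build_unity_command(executable, headless, log_file))
```

Usage might look like `proc = launch_unity("path/to/your/build", log_file="unity.log")`, followed by `proc.terminate()` and `proc.wait()` once training has finished. `Popen` is used rather than `subprocess.run` because the player has to keep running while the Python side talks to it.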