AI Plays Trackmania - Map5 2:04:91

  • Published: 7 Sep 2024
  • The AI is trained via reinforcement learning.
    Game: Trackmania Nations Forever (TMNF)
    Map: tmnf.exchange/...
    Replay (.gbx file): drive.google.c...

Comments • 66

  • @lordnoom4919 · 1 year ago · +41

    Nice that it even figured out a rammstein hit to start a drift. Good work right here.

  • @wazthatme · 1 year ago · +4

    This makes me so happy to see. I could watch AI learning to play games all day

  • @m.i.c.h.o · 1 year ago · +7

    It could, in fact, neo slide.
    Wirtual.

  • @gugus8081 · 1 year ago · +4

    This is impressive, I'm not even sure I can beat that RTA... Keep it up!

  • @exlpt2234 · 1 year ago · +4

    This is insane, great work!

  • @heavysaur149 · 1 year ago · +6

    I wonder what the inputs are. Is it based on field of vision (what it sees via the camera), or on coordinates (it already knows the whole map and can see its position on it)?
    And do you feed in what it output on the previous frame (e.g. so it knows whether it is continuing its drift or not)?
    And speed? The rotation of the car, plus where it is heading (to know if it drifts)?
    I have so many questions

    • @linesight-rl · 1 year ago · +11

      Inputs contain a screenshot of what is displayed by the game, the relative position of a few checkpoints on the centerline of the circuit in front of the car, the agent's previous action, the car's speed, and the direction of the gravity vector.
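
      A minimal sketch of how such an observation could be assembled, with hypothetical names and shapes throughout (an illustration of the description above, not the project's actual code):

      ```python
      import numpy as np

      def build_observation(frame, checkpoints, car_position, car_orientation,
                            car_speed, previous_action, num_actions=12):
          """Assemble one observation from the inputs described above.

          frame           : grayscale screenshot of the game, shape (H, W)
          checkpoints     : (N, 3) upcoming centerline points, world coordinates
          car_orientation : (3, 3) rotation matrix of the car
          previous_action : integer index of the action taken on the last frame
          """
          # Express checkpoints relative to the car so the same corner
          # geometry looks identical anywhere on the map.
          relative_cps = (checkpoints - car_position) @ car_orientation

          # Gravity in the car's frame tells the network how the car is tilted.
          gravity = np.array([0.0, -1.0, 0.0]) @ car_orientation

          # One-hot encoding of the previous action (12 actions is an assumption).
          prev = np.zeros(num_actions, dtype=np.float32)
          prev[previous_action] = 1.0

          scalars = np.concatenate([relative_cps.ravel(), gravity,
                                    [car_speed], prev]).astype(np.float32)
          return frame.astype(np.float32) / 255.0, scalars
      ```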

  • @Metcoler · 1 year ago

    Nice work! It must have taken a lot of effort. It is very impressive that the car can initiate a drift, drive very close to walls, and hit apexes like nothing. Big respect for this piece of work. Keep it up!

    • @linesight-rl · 1 year ago

      Thanks a lot! It has indeed been a lot of work, and we're still working on it! Next steps include training on more varied and less boring maps. We'll post progress videos along the way 🙂

  • @okty8372 · 1 year ago · +4

    Is the AI able to generalize its "driving skills" to other maps? Amazing work btw! (I'm really interested in AI and love TM, so it's perfect content for me :) )

  • @gaiekkurvanov1841 · 1 year ago · +6

    Which algorithm is used?

    • @linesight-rl · 1 year ago · +5

      This is value-based reinforcement learning.
      We use a mixture of Implicit Quantile Networks, with n-step returns and dueling networks. We also implemented Prioritized Experience Replay, Persistent Advantage Learning, noisy layers for exploration, and quantile options (QUOTA), but those building blocks are currently not used.
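
      As a toy illustration of one of those building blocks, this is how an n-step bootstrapped target is typically computed in DQN-family methods (a sketch under assumed tensor shapes, not the project's implementation):

      ```python
      import torch

      def n_step_targets(rewards, bootstrap_q, dones, gamma=0.99, n=3):
          """n-step target: sum n discounted rewards, then bootstrap.

          rewards     : (B, n) rewards for the n steps after each sampled state
                        (assumed zero-padded past episode termination)
          bootstrap_q : (B,) value estimate of the state reached n steps later
          dones       : (B,) 1.0 if the episode terminated within those n steps
          """
          discounts = gamma ** torch.arange(n, dtype=torch.float32)
          returns = (rewards * discounts).sum(dim=1)
          return returns + (gamma ** n) * bootstrap_q * (1.0 - dones)
      ```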

  • @Arcsinx · 1 year ago · +1

    Crazy! Yosh should see this

  • @jorishenger1240 · 1 year ago · +5

    What if you let this AI loose on E02?

    • @linesight-rl · 1 year ago · +1

      I guess we'll have to try :)

    • @jorishenger1240 · 1 year ago · +1

      @linesight-rl Would be amazing to see; would the small jumps be a problem?

    • @linesight-rl · 1 year ago · +2

      @jorishenger1240 We're currently testing on more complex maps. Neither jumps, slopes, nor borderless roads seem to be a problem.

    • @jorishenger1240 · 1 year ago · +2

      @linesight-rl Amazing to see that tech has come so far that this is done by a person, not even a company or something. So cool

  • @Linck192 · 1 year ago · +1

    Why did you make the AI output these groups of inputs instead of 4 values, one for each direction?

    • @linesight-rl · 1 year ago · +1

      This is a requirement of DQN-like methods: each action is associated with a single value, and you pick the action with the highest value. DQN does not handle picking multiple actions at the same time.
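
      Concretely, the workaround is to enumerate key combinations as one flat action set; a sketch with assumed key combinations (the project's actual action set may differ):

      ```python
      from itertools import product

      # Each combination of keys becomes a single discrete action, so the
      # Q-network outputs one value per combination rather than per key.
      THROTTLE = (False, True)
      BRAKE = (False, True)
      STEER = ("left", "none", "right")

      ACTIONS = [{"accelerate": a, "brake": b, "steer": s}
                 for a, b, s in product(THROTTLE, BRAKE, STEER)]
      # 2 * 2 * 3 = 12 discrete actions -> a Q-network with 12 outputs.

      def greedy_action(q_values):
          """Pick the key combination whose Q-value is highest."""
          return ACTIONS[int(q_values.argmax())]
      ```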

  • @Gryffins90 · 1 year ago · +1

    Excellent project that I always wanted to try myself. I've seen your response to the other comment asking about contributing. I'm also interested in helping (data scientist myself), so get in touch if you're willing to extend the team. I have a 2080 Ti available at home.
    One suggestion for a future video: also show the keyboard inputs (only the four keys) in addition to the tree of inputs, as that is closer to how humans display their inputs.

  • @Stunde0Null0 · 1 year ago · +4

    Wirtual taking an L. kekw

  • @corbanizer7376 · 1 year ago

    Keep on going dude. This is sick

  • @rFey · 1 year ago · +1

    Idk if this would be possible, but I would love to see another angle to take ML/AI with Trackmania: feed it thousands of TASes or WRs on a bunch of maps with lots of different turns, block combinations, drifts and whatnot, and then see if it can get good times on real maps. My layman brain sees this as way more complicated, so it probably is, but y'know, a man can dream

    • @linesight-rl · 1 year ago · +4

      What you are describing is called "supervised learning", where an AI is fed expert demonstrations and tries to reproduce the behavior of that expert.
      In this video, we use another technique called "reinforcement learning", where the AI does not need to be given good runs; it is able to learn on its own.
      Supervised learning is generally easier, but has the drawbacks that it requires huge amounts of replays and that it will never become better than the expert it tries to mimic.
      Reinforcement learning may be more difficult, but it can theoretically find strategies that were never shown to it.
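
      To make the contrast concrete, a minimal behavior-cloning (supervised) update could look like the following, with hypothetical names throughout:

      ```python
      import torch.nn.functional as F

      def behavior_cloning_step(policy, optimizer, observations, expert_actions):
          """One supervised step: push the policy to copy the expert's choices.

          observations   : (B, obs_dim) states extracted from expert replays
          expert_actions : (B,) action indices the expert took in those states
          """
          logits = policy(observations)
          loss = F.cross_entropy(logits, expert_actions)  # imitation objective
          optimizer.zero_grad()
          loss.backward()
          optimizer.step()
          return loss.item()
      ```

      By construction, the ceiling of this objective is the expert it imitates, which is exactly the drawback described above.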

    • @rFey · 1 year ago

      @linesight-rl My idea was to use the information from supervised learning on random maps the AI hasn't "seen", but then I realized that wouldn't work unless you could also feed it block information or make some wild machine-vision solution 🤔

    • @RadiantDarkBlaze · 1 year ago

      @linesight-rl Is it possible to do something like starting a training run for a map as supervised learning, then switching the same training run to reinforcement learning once it reaches a certain fitness on the supervised part, so that it can surpass the player who provided the replays for that map as it goes through the reinforcement part?

    • @ryans3979 · 1 year ago · +1

      @RadiantDarkBlaze The idea you have does exist; it's typically called pre-training or sometimes bootstrapping. You train a model with one method (supervised learning could work) so that it has a baseline behavior. In the case of supervised learning, it might learn to imitate some of the various tech that TASes use. Then you can further train it using a different method, allowing it to refine itself and improve past its current level.
      The issue with that strategy is that, like linesight mentioned, you'd have to feed it a massive amount of replays. You likely don't have thousands upon thousands of TAS runs for a single map, so you'd need to feed it random TAS runs of other maps. If you do that, you have to deal with negative transfer, where the tech and skills it learns from other maps might interfere: you don't want it trying to use glitches that are impossible or useless on a simple map like this. It's harder to make a generalized AI than a specific one, and that's what you'd be doing with the supervised learning; it's a broader task than this AI, which is just running on a very simple map. It could work in theory, though; it's just more time-consuming and more computationally expensive to implement.
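
      In code form, the two-phase idea being discussed could look like this high-level loop (all names are hypothetical; the two update functions are passed in to keep the sketch self-contained):

      ```python
      def train_two_phase(policy, expert_dataset, env,
                          supervised_update, rl_update,
                          pretrain_epochs=10, rl_steps=1_000_000):
          """Pre-train by imitation, then refine with reinforcement learning."""
          # Phase 1: supervised pre-training gives the policy a baseline
          # behavior instead of starting from random play.
          for _ in range(pretrain_epochs):
              for observations, expert_actions in expert_dataset:
                  supervised_update(policy, observations, expert_actions)

          # Phase 2: reinforcement learning lets the same policy explore
          # and improve past the expert it started from.
          state = env.reset()
          for _ in range(rl_steps):
              action = policy.act(state, explore=True)
              next_state, reward, done = env.step(action)
              rl_update(policy, state, action, reward, next_state, done)
              state = env.reset() if done else next_state
      ```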

    • @RadiantDarkBlaze · 1 year ago

      @ryans3979 Would something like taking a single good human replay and putting it through the brute-forcer tool, saving every single tiny improvement to eventually gather 10k+ technically-unique replays, work for generating a supervised-learning set for a map? Or is there an express reason 10k+ (human or TAS) replays of a specific map are needed? I do think it's necessary to only train a specific net on a single specific track; I was never thinking the idea could be used for making a generalized all-rounder net.

  • @PassiveIZ · 1 year ago

    That's crazy after just 2700 runs and 30 hours

  • @OPEK. · 1 year ago · +1

    I’m interested to see how it handles random ramsteins and landing bugs tbh

    • @pinipilla · 1 year ago

      Those are not random; Trackmania physics are deterministic. It's just that the outcome changes a lot with a small input change, which is not a problem for a machine

  • @eddyreising6567 · 1 year ago

    Very impressive work!

  • @barakeel · 1 year ago · +1

    What was the reward when it was not able to finish the track yet?

    • @linesight-rl · 1 year ago · +2

      Simple question, simple answer: nothing. Neither a reward nor a punishment.
      This will likely trigger the question "what's the reward, then?". It's mostly progress along the track.
      I think we'll start adding voice-overs or some explanations in the next videos, look out for them :)
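
      A minimal sketch of a progress-along-the-track reward of that kind (the centerline projection is an assumed detail, not the project's exact formula):

      ```python
      import numpy as np

      def progress_reward(prev_position, position, centerline):
          """Reward = how far the car advanced along the track centerline.

          centerline : (N, 3) ordered points along the middle of the track
          A run that never finishes simply accumulates less progress; there
          is no explicit punishment for failing to reach the finish line.
          """
          def nearest_index(p):
              # Index of the closest centerline point approximates progress.
              return int(np.argmin(np.linalg.norm(centerline - p, axis=1)))

          # Driving backward yields a negative value here; whether to clip
          # that is a design choice.
          return float(nearest_index(position) - nearest_index(prev_position))
      ```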

  • @livingroom5899 · 1 year ago

    Better than I will ever be.

  • @lucacu3587 · 1 year ago · +2

    Any Wirtual vid watchers here???

  • @ibozz9187 · 1 year ago · +2

    Are those neoslides or normal drifts?

    • @lordnoom4919 · 1 year ago

      Looks to me like most are release drifts

    • @fontur5119 · 1 year ago

      Most of them are neoslides

    • @pekatour · 1 year ago

      @lordnoom4919 Aka neo drift

    • @lordnoom4919 · 1 year ago

      @pekatour Nope, you don't need to release during a neo, since a neo = steering --> stop steering --> start braking --> steer again, all while holding down acceleration.

    • @pekatour · 1 year ago

      @lordnoom4919 mb

  • @pixelmalfunction1772 · 10 months ago

    Is your AI on the leaderboard the one with the sub-1:50? Because that would be impressive if it found the cut. If it didn't, then I beat the AI by 2 seconds, but it probably did.

  • @vjproject · 1 year ago

    Why can't you fully see the tire marks? Modified, or low quality? 😅

  • @curcodes · 1 year ago

    Really good work

    • @curcodes · 1 year ago

      I got a challenge: next time, test your future AI on this, and see if it reaches 2.04 in

  • @Sagosmurfen · 1 year ago

    Neo slide god!! 😮

  • @xtraz9814 · 1 year ago

    Hello people from Wirtual videos

  • @11DowningStreet · 1 year ago

    How does this work? It looks really cool

  • @ArrakisMusicOfficial · 1 year ago

    What GPU? :)

    • @linesight-rl · 1 year ago · +1

      Nvidia 3060

    • @ArrakisMusicOfficial · 1 year ago · +1

      @linesight-rl How did you manage to get it to learn so quickly? 2900 runs is a ridiculously low amount for how good it got. You must have used very good priors; how did you do it? Careful reward modelling? Or a really good initial policy? Or a really good exploration policy? What RL method did you use? :)

  • @201pulse · 1 year ago

    Hi linesight, I'm an experienced data scientist and I would be interested in helping and contributing to this project. For some time I've actually wanted to do the same, so I might have some cool ideas. Are you interested?

    • @linesight-rl · 1 year ago

      Hi, thank you for your interest. While it is always helpful to have another person's perspective, this is a rapidly evolving 2-person project. At least in the short term, we prefer to keep it small.
      We will probably have a more open approach in the future and welcome contributions. You're welcome to ask again in a few videos' time!

    • @linesight-rl · 1 year ago

      How should we contact you when we are more open to contributions?

  • @Queen_Elizabeth249 · 1 year ago · +1

    I wonder if KarjeN could defeat this AI

    • @user-go5ee4cs3c · 1 year ago · +1

      At first I thought a human would be faster, but the length of the map...

    • @Queen_Elizabeth249 · 1 year ago

      @user-go5ee4cs3c True

    • @mk-ej3cz · 1 year ago · +1

      For sure he could

    • @hayabusa10055 · 1 year ago

      @mk-ej3cz As of now, yes, easily; but there's no telling how far it can be pushed, maybe even to the point that the AI makes TAS runs itself without your help

  • @zillion8954 · 1 year ago

    Now train it on an actual map

  • @ozzehh · 1 year ago

    Wirtual sucks compared