I’m really curious about two things. 1) How were rewards determined? You showed us how you received rewards as a human, but I'm left curious about how rewards are calculated. 2) What difficulty are the CPUs set to? And how did that affect it? Really neat to see the results, but so many more questions have arrived!
I was wondering exactly the same. It feels as though the AI is just reverse-engineering the reward algorithm, where the algorithm needs to be known in advance in order to determine "successful driving". I think I've missed something in the explanation here.
Cool video. It's nice to see how sophisticated these AIs are getting, especially during the livestream you did. I think it would be interesting to train an AI with other players online with Wiimmfi. Might have to check if that's allowed first, though
Thanks! Glad you saw the livestream, I wasn't sure how it would go! Yeah I'm not sure, I think the AI would definitely get banned for cheating since it would probably set off any bot detection that exists. Would be really cool to try though, would love to get in touch with some of the Wiimmfi Devs and try to make it happen
@@aitango Hi AI Tango, Wiimmfi dev here. YouTube seems to remove my comments for some reason (spam filter) - feel free to contact me through the info on my YouTube profile.
Self-learning AI has become an innovation for success; a lot of gaming companies are using it to provide much more dynamic and random experiences for gamers. Can't wait to see how it evolves. This video is a great way to show that practice makes perfect, a real-life skill that must be taken into account! ❤
@@aitango Honestly, most artificial intelligence in games uses random number generation but never actually learns; it just uses outcomes defined by numbers, not outcomes defined by what it actually knows and what it doesn't. I feel like learning by itself is a much better way for it to become powerful than the current methods used today for artificial intelligence in video games. It's a hard prospect to grasp, but if you ever decide to reach out to some company or get a job for this specifically, please let us know about it, as I would definitely like to follow you on that journey!
@@aitango Some companies are already working on that, like Sony AI, Unreal Engine Learning Agents... But please keep in mind we don't necessarily want our butts kicked as players 😂 Sometimes, predictable AIs are more fun to play against!
We are gonna try and post a video every 2 weeks on Fridays, thank you for the support. Takes a while to train the AI, create voice overs and edit. We really appreciate the love. 🙏
Absolutely incredible! Question: does this AI only work for Ghost Valley 2? It seems like with its learning process, it would brute force other tracks rather than making decisions based on the track itself. It would be interesting to see an AI that can learn new tracks quickly!
Thanks! So this AI was only trained for Ghost Valley 2, so it would likely struggle if I put it down on another track. If however it was trained on many different tracks at once, the AI would likely start to understand how tracks really work, rather than a specific track, so it could probably try tracks it's never seen before! This is definitely something I'm looking to do in the future!
@@aitango Oooh, you could have one version of this AI (Player 1) learn every track in sequence, and each race put it up against a fresh AI in the Player 2 slot to see how much P1's previous training helps or hinders learning new tracks.
I want to see how an AI would cope on a track with more randomness such as Koopa Cape or Toad's Factory. I would also like to see how it would fare with items on.
Would love to see a tutorial series on how you could get something like this running on your own PC. It would be so interesting to try this with all sorts of different variables. Great content man, very interesting!
This is really cool! Is the AI actually seeing the visuals / does it have access to its position and the position of other racers, or is it just running based on sequence and rewards? (is that a correct term?)
Really glad you like it! Yes, the AI is learning from looking at the screen, the same information people use to play the game. It only knows the position of the other racers from its current placement (1st, 2nd etc), and the minimap. The AI actually only uses rewards when training, not in the actual decision making process. This means that once the AI is trained, no rewards are needed for it to drive, just the screen!
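To illustrate the split described above (rewards drive training, but the trained AI only needs the screen), here is a minimal Python sketch. It assumes a value-based learner that predicts a reward-like score for each action, which the thread hints at but does not confirm; `choose_action` and `td_update` are hypothetical names, not the creator's code.

```python
def choose_action(q_values):
    """Pick the action with the highest predicted value.

    Note that no reward is passed in: once trained, the agent only
    needs its predictions (computed from the screen) to act.
    """
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_update(q_values, action, reward, next_q_values, lr=0.1, gamma=0.99):
    """One Q-learning style update. This is the only place reward is used."""
    target = reward + gamma * max(next_q_values)
    q_values[action] += lr * (target - q_values[action])
    return q_values


# Training time: the reward nudges the prediction for the action taken.
q = td_update([0.0, 0.0], action=0, reward=1.0, next_q_values=[0.0, 0.0], lr=0.5)

# Play time: only the predictions are consulted.
best = choose_action(q)
```

In a real agent the list of values would come from a neural network fed with screen pixels; the plain list here only stands in for that output.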
I really wish I knew how to do this! You are amazing and you have great things ahead of you for yt! Keep it up and Godspeed. You have earned a new subscriber
surely given enough time to train (and mushrooms) the ai could beat the world record? is this theoretically possible even if it were to take months of training? also keep up these videos i love ai and mario kart you're the goat
I don't think so, the WR has some insane strategies, I don't think the current reward system would be able to do it. Even with a more advanced reward system, the AI can get stuck and not evolve or take higher risks because that would drastically affect the reward it receives, so it is very unlikely an AI like this could beat the WR.
I can verify this is pretty accurate. Most AIs do sadly hit a plateau at some point, rather than just continuing to improve forever. There are however other AI models out there designed to use much more data which probably could beat world records, but running these models is only really possible for big companies with huge amounts of compute like Google and OpenAI
I know very little about Mario Kart strats but this was a chill and fun video to watch. Also kudos for putting music credits in the description, now I've added a few to my Spotify lmao
This video is cool. I wonder what would happen if you put the AI on other courses; would it perform well because it learned the basics from this track or would it have difficulties with basic things that it took for granted before?
It makes sense that it would be able to use what it learned here on certain tracks that have no obstacles on them. It has basic driving skills, understands the benefits of drifting, popping wheelies, tricks, and shortcuts, and actively avoids falling off. I feel like when things like Toad's Factory's stampers or Moonview Highway's cars come into play, it would need a lot more training before it would be ready.
Yeah this reply is pretty accurate; it might be able to drive a little on other tracks since it understands the basics, but would definitely struggle with anything too different from what it's seen before during training. If the AI was trained on multiple different tracks though, there's a decent chance it would be able to play new tracks straight away since it would start to have a good general knowledge of the game
This is sick, I wonder how much better it would be if it went on for, say, 200 hours, or watched time trials. Also, seeing it learn items would be cool. Thanks for making this video
Really glad you like it! I always wonder that too! Many AIs usually hit a plateau after a while, but this one looks like it was still improving from the graph, so would've been really cool! Perhaps in my next video you might see some items :)
I really like how thorough you were while explaining it in a very simple way. I was able to understand what you were saying completely. I’m fascinated with AI, but most people don’t explain it in simple terms, thus making it hard for me to understand what they are saying. Thanks for explaining it in a simple way so I could enjoy the video.
I wonder how the ai might be affected if you gave it a slight punishment for rapidly changing its prediction, or if that's something you can feasibly program. Ideally it would force it to commit more to its choices, thus making it take more direct lines while retaining the ability to change course at the last second to avoid a collision. As I think about it though, I think it might just make it more difficult for it to learn how to avoid crashes, since it may punish early avoidance measures.
There is actually an algorithm that attempts to do something similar to this, called Advantage Learning, which looks to increase the gaps between the predictions, forcing it to change its choice less. I really like the idea though, as for games like Mario Kart where the constant action swapping is really detrimental, it would definitely help!
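A rough sketch of the Advantage Learning idea mentioned above, assuming its commonly described form: the usual Q-learning target minus a penalty proportional to how far the chosen action's predicted value sits below the best one in the same state. The function name and the `alpha` and `gamma` defaults are illustrative assumptions, not values from the video.

```python
def advantage_learning_target(q_state, action, reward, q_next,
                              gamma=0.99, alpha=0.5):
    """Training target for Q(state, action) with an action-gap penalty.

    q_state: predicted values for every action in the current state
    q_next:  predicted values for every action in the next state
    alpha:   how strongly non-best actions are pushed down (0 = plain Q-learning)
    """
    bellman_target = reward + gamma * max(q_next)
    gap = max(q_state) - q_state[action]  # zero for the current best action
    return bellman_target - alpha * gap


q_state = [1.0, 0.8]  # action 0 currently looks slightly better
q_next = [1.0, 1.0]
best_target = advantage_learning_target(q_state, 0, 0.0, q_next, gamma=0.9)
other_target = advantage_learning_target(q_state, 1, 0.0, q_next, gamma=0.9)
```

With these numbers the best action keeps the plain target (0.9) while the runner-up is pushed down to roughly 0.8, so the gap between them widens and the agent has less reason to flip between near-equal choices.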
@@renakunisaki Well, the issue is that while it loses speed compared to what we know to be faster, there's almost a sort of, what's it called, a local optimum or something? While the ai COULD commit more to turns, committing even slightly more would have little improvement but increase the chances of a crash. It's entirely possible that the ai is stuck trying to optimize this tradeoff, and without shifting reward/punishment values, it may never get out of that rut. In order to improve without that, it must intentionally take actions which it knows will cause more crashes. It doesn't want to crash, so it doesn't try to improve.
Just proof that practice makes perfect. Even computers understand this. Really neat to see it actually lapping the CPUs in the end. Imagine if the CPUs in game were at this level.
Amazing video! I'm curious to see what happens if we modify the reward function as performance improves to assist the AI in learning new elements or overcoming plateaus, such as taking the yellow boost at the end. This could involve manually adjusting the reward function or using some form of meta-learning.
Watching this AI learn to drive in Mario Kart is like watching me try to be the best player: lots of crashes, occasional moments of brilliance, and always hitting a wall when things get tough
I like how when it goes for the trick jump, it hits the middle point of the choice between the left turn and right, then decides right based on that reward difference. It's really robotic and shows that it hits a very specific spot where it knows to turn for maximum reward based on the direction and turn reward of it versus the fastest turn being heading straight for it from that angle. Since it has to consider all directions when it's gaining the reward for optimal efficiency due to the "predicting" reward system it follows. I most noticed it in the Final Ai segment on the first laps.
How does this AI think about the track? Is it spatially aware, or does it just learn the best combination of buttons to press to get the highest reward? Could it be put into another track and do well, or would it have to start from scratch?
I think giving the AI rewards for uncommon events would make it even better, like lapping a racer (which should increase the score exponentially for each racer passed), or letting it keep track of its personal records and giving it points each time it breaks one, bc at the end the ai isn’t learning the track to get better but rather it’s learning the track to figure out ways to optimize the amount of points it could get
It actually wouldn't, at least not without a substantially larger and more complex model, and that would take much more computation to deal with. Rewards work better the more frequent, immediate, and relevant they are (this is actually true of human learning too.) You could say beating a previous time is the most relevant thing possible, but it's also the least frequent, least immediate thing possible, because it happens once at the end of the race. How does the AI relate beating a previous time to all its previous work? It has to "think", "OK. I did something better this time, but what was it?" How does it figure out which thing or things it did that led to beating a previous time?
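The credit-assignment point above can be made concrete with discounted returns, a standard reinforcement-learning quantity. A minimal Python sketch, assuming a discount factor `gamma` (the value used in the video is not stated) and a hypothetical helper `discounted_returns`: a single bonus at the end of the race reaches early steps only as a faint, delayed signal, while frequent rewards give every step something to learn from.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return, for each step, the discounted sum of rewards from that step on."""
    returns = []
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# One reward at the very end (e.g. "beat your previous time").
sparse = discounted_returns([0.0, 0.0, 1.0], gamma=0.5)
print(sparse)  # [0.25, 0.5, 1.0]

# A small reward on every step (e.g. "you are going fast right now").
dense = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
print(dense)  # [1.75, 1.5, 1.0]
```

The small `gamma` is only there to keep the numbers readable; with a realistic race lasting thousands of steps, the end-of-race signal at the first corner shrinks far more.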
To me, this feels like a parent teaching their kid how to do something. Your AI is clearly learning not just from the track layout, but also from the game's CPUs. Obviously your AI is far more advanced than the CPUs' AI, but it seems to be taking what the CPUs do and combining that info with the track layout to come up with the fastest route. The CPUs are like the parents, teaching their kid, your AI, to do something in the hopes of the kid surpassing them one day. Loved the video! I'd love to see you do more with this AI, and push its limits!
Really glad you enjoyed the video, and I hope to do lots more with the AI too! My dream is that the AI will be able to surpass my own skill level in any game I play!
Excellent video. I was really hoping you'd go a little bit deeper on the inputs you were feeding the AI. Does it only know its XYZ coordinates? Or even that? I was confused what inputs the AI was using to lower its predicted reward when it flew off the ledge. Or was that simply from the Z coordinate? Did I answer my own question?
Thanks, I'm glad you like it! The short answer is that this AI uses the screen's pixels to learn, rather than any coordinates, so it has to learn from the same information any human would have. If you want to see a little more about how the AI works, check out my last video (Mario Bros is too easy for insane AI). Even though it's on New Super Mario Bros, the way the input works is the same, and it shares lots of common features with this AI!
@@aitango I watched it afterward, and I indeed figured that you had applied the same system to this one. Super awesome that you have this working with only "visual" input. Did you ever show what the actual image the AI sees looks like, or was the crunched up clip we saw just an approximation?
The input you saw in the New Super Mario Bros video was the actual image the AI is given. The only difference is that the AI is given the previous 4 frames rather than just 1, to allow it to infer motion from the images
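For anyone wondering what "given the previous 4 frames" can look like in code, here is a minimal Python sketch of frame stacking. The `FrameStack` class is a hypothetical illustration rather than the creator's implementation, and frames are plain placeholder values instead of real images.

```python
from collections import deque


class FrameStack:
    """Keep the most recent frames so the agent can infer motion."""

    def __init__(self, num_frames=4):
        # A deque with maxlen drops the oldest frame automatically.
        self.frames = deque(maxlen=num_frames)

    def reset(self, first_frame):
        """Start an episode by filling the stack with copies of the first frame."""
        self.frames.clear()
        for _ in range(self.frames.maxlen):
            self.frames.append(first_frame)
        return list(self.frames)

    def push(self, frame):
        """Add the newest frame and return the stacked observation, oldest first."""
        self.frames.append(frame)
        return list(self.frames)


stack = FrameStack(num_frames=4)
observation = stack.reset("frame0")
observation = stack.push("frame1")
print(observation)  # ['frame0', 'frame0', 'frame0', 'frame1']
```

A single still image cannot show whether the kart is speeding up or slowing down; the difference between consecutive frames in the stack can.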
It'd be interesting to give it a kart and see if it learns things like rapid hopping (forgot the exact name, but it's when you hop every other frame). Could be useful for finding future shortcuts and/or quicker routes. It'd also be interesting to give it a limitation of "human realistic" inputs (20 single button presses per second or less, unable to consistently tap every other frame) to see if there are faster human viable routes that just haven't been found yet. This could do some cool time trials research, wish I could run it cause I have so many ideas 😅
Yeah I know what you mean, would be really cool if it could learn that! The AI does currently have a limitation of only being able to choose a new action every 4 frames, or around 15 times per second (900apm for starcraft nerds). Obviously this is way faster than a human, but maybe not fast enough for the rapid hopping!
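The numbers in the reply above follow from simple arithmetic, sketched here in Python. The 60 frames per second figure is an assumption that matches the quoted "around 15 times per second"; `decisions_per_second` and `hold_actions` are hypothetical helper names.

```python
def decisions_per_second(fps, frames_per_action):
    """How many times per second the agent may pick a new action."""
    return fps / frames_per_action


def hold_actions(actions, frames_per_action=4):
    """Expand one chosen action per decision into the inputs sent on each frame."""
    per_frame = []
    for action in actions:
        per_frame.extend([action] * frames_per_action)
    return per_frame


rate = decisions_per_second(fps=60, frames_per_action=4)
print(rate, rate * 60)  # 15.0 900.0

inputs = hold_actions(["hop", "drift"], frames_per_action=4)
```

Because every choice is held for 4 frames, an input pattern that alternates every other frame cannot be expressed at all under this limit, which fits the doubt about rapid hopping.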
*turns on items*
Perhaps in my next video your wish will be granted ;)
@@aitango if you are ready, so am I
@@aitango make it learn Mushroom Peaks
@@aitango You know the rules, and so we are
@@Kaishidow Nice try but those aren’t the lyrics
I appreciate how you do these on actual software and not just code your own software version of the game to simplify the task.
Coding Mario Kart Wii again wouldn’t be easier
@@GunnerSiIva yes but a lot of these "AI learns game" channels code a simplified version of the game that they can integrate their AI with more easily
Truuue, lowkey annoying when they do that
@@Manelneedsaname yeah, cause it's not really beating Mario Kart if it's not even the real Mario Kart, is it?
@@IrishAnonymous01 That's because you need a programming interface to interact with the game. In some cases, coding the game from scratch is just way easier!
Not having offroad on this track really seems to make it easier for the AI. cool stuff!
Yeah true, it helps the AI as if it does something wrong (like hit a wall or fall off), the feedback is pretty instant, making it clear what the AI needs to avoid!
@@aitango have you thought about how it might deal with items and if it could learn how to use them as well?
@@aitango Can you not program it to detect the different maps' roads and what's off-road?
@@CaptainBuggyTheClown
He *could* give the AI a preconceived idea of what the tracks are, but that defeats the purpose of having the AI learn Mario Kart.
Not just how to play Mario Kart, but learning how to recognize what bottomless pits are, optimize driving, and hell, even learn what driving *is* based on the 2 goals given to it.
He’s not teaching the AI what Mario Kart is, the AI is learning by itself to win by wanting to maintain speed and strive for first place.
If I understood him right, the AI might not even know what a race is, as much as it’s just trying to keep 2 numbers up very high, and is determining what actions will allow it to do that with the best possible advantage.
What is a reward so far as it's concerned?
it's so crazy to see the AI even go for the lowtrick during the shortcut after 48-80 hours of training lol
In most of the Mario Kart stuff I've done with AI, it always seemed to do low tricks surprisingly well which always baffled me, as I have no idea why it learned that so fast
@@aitango Yeah, it does hop a lot at the start so maybe it's just something it does on accident then realises it can get the most amount of points there by lowtricking. It's so cool how your reward system makes it unintentionally learn it though lol
With the right rewards it might figure out the instant finish glitch.
The fact it pulled the shortcut off at 12 hours of training is still very impressive
What's so crazy about it?
I love how aroused the reward function gets right before they cross the finish line on their final lap.
I gave it a big reward for finishing, so it really makes the reward spike which I guess for an AI counts as arousal haha
how do i delete someone else's comment
@@LilacMonarch says the (admittedly adorable) furry
@@LilacMonarch says the (creepy) furry
@@LilacMonarch says the (neutral as the furry is a stranger) furry
I think it'd be pretty interesting to see it compete against the staff ghost of that track once it learns to use items such as mushrooms.
That would be so hard for it to learn. Using a shroom would give the AI a huge reward (that's a lot of speed), so it would most likely just use the 3 shrooms one after the other every time.
If it's the weapons, the AI would spam them, but he would add a reward chart for that too
love how the reward bars perfectly sync with the music for a second at 6:34-6:36
I didn't even notice that haha
As the editor I can tell you this is a beautiful coincidence
@@aitango do you watch the videos though or when they are done they are done
It actually syncs almost perfectly at 6:46 - 6:48 as well 🤯
Conclusion: After 80 hours, AI becomes Verstappen.
DU DU DU DU MAX VERSTAPPEN DU DU DU DU
Back off or we all crash protocol
For a bit of reference, sub 1 minute on this track is a decent time for a human in time trials, and a rather good time without powerups in vs mode, so a mid 1:06 is intriguing for sure, definitely showing some potential room for improvement but also demonstrating some clear progress to go with the gaps in its methods.
The second to last clip was slightly faster, managing to get 1:05. There's definitely still room for improvement though, mainly missing the odd mini-turbo and missing the boost panel at the end
Have an AI and a human learn a game at the same time.
Would be very interesting to see how the learning curves differ.
Chess...
Average human after 80 hours - "wait, en passant is actually a rule?"
Engine after 80 hours -
*able to beat the best human in the world*
@@bobkreme2175 "Send me your god, I must consume their ELO"
Of course it heavily depends on the task and the model architecture, but humans will generally learn MUCH faster. The AI however has the much higher ceiling.
For real-time use cases, that is. If the AI can play millions of games in the same time a human plays one, it's obviously different
@@bobkreme2175 yea 80 hours is such a large sample size for the AI to body us.
I have a feeling we win the first hour or two, then the AI starts beating us with the tricks we showed it
It wouldn't be a fair comparison at all. If we are talking about a regular game (let's say Mario Kart), the human player (even if this person has never seen Mario Kart before) would start with prior knowledge, since the game is designed to be easily understood by humans. The AI on the other hand would not understand anything, and would play like a human with the screen disconnected, but maybe quicker to react to changes in state. If a human player played as if this was a full-time job for 2 weeks (80 hrs), there would be an improvement, but the progress would probably look fairly linear (or would maybe flatten a bit if completely new to the game), while the AI would have a much larger rise in the first few hours, which would flatten quite a lot after just a few hours. The change in the AI's curve would most likely make the human curve look constant in comparison.
If we did this in a fairer manner, the "game" should have a few buttons that could be pressed, and randomized pixel graphics that change depending on the inputs and timings, with a scoring system that gives points based on some determined criteria, which at the start is seemingly random to both the player and the AI. In this case the AI would almost certainly win consistently over the player, with a scoring gap that would increase with every single run.
It's fascinating to see how much AI has improved as a whole, not only in the game itself but in general. 5 years ago we wouldn't have been able to do this much with AI already. Also you're criminally underrated, keep it up!
Yeah it's pretty amazing, it feels like new AI research is coming out constantly and it just keeps getting better and better so fast. Thank you so much, I really appreciate the kind words!
Depends. This is hardly different from MarI/O, an AI that happened years ago.
Simply: an electric-powered programmable calculator, calculating how to get the most reward numbers by trying every possible input. It's not really much more than an algorithm that saves its inputs and at what time it has to do said inputs to get the highest reward number possible.
nah this stuff was definitely possible when I took an AI class over 10 years ago
This is just factually wrong. Here's a MUCH more advanced mario kart AI by Sethbling (actually learning the *game* not just a track) from 4 years ago. ruclips.net/video/Tnu4O_xEmVk/видео.html and 3 years before that he did the same thing but with Super Mario World. This stuff (and much more advanced) has been available for over a decade easily.
Imagine leaving it on for thousands of hours and learning how to do the Ultra Shortcut
The what
That's sadly improbable due to how the rewards are set up
With this manually set up reward system, it would be impossible because it wouldn't even be optimal for the AI. It technically isn't learning how to beat the track in the fastest way possible, it is learning how to maximize the handcrafted reward function.
@@myithspa25 look up the history of ultra shortcuts by Summoning Salt
it seems to reset based on if it crashed, meaning that it might not even be able to do that
i'm very happy this popped up in my recommendations; i'm on the spectrum and i've had pretty consistent mario kart hyperfixation periods since about 2017, and recently I've been very interested in the process of machine learning, so this video was basically a perfect match of two of my special interests. target audience reached! :)
I'm really glad you liked the video! I remember years ago something similar happened to me! Back around 2019 I was really into StarCraft 2, so when Google released an AI to play it just as I was getting into AI it absolutely blew my mind!
pretty cool the ai learned to drive like someone really trying to avoid wheelie bumps
There's a good chance that was on purpose, as getting wheelie bumped causes a massive drop in speed and therefore reward. I actually trained other mario kart AIs which avoided wheelieing for a really long time because of this
This is probably one of the more original ideas I've seen about MKWii, I have not seen any other video like this but this is great!
Ok, but right as you change the map the ai starts screwing up again. People need to start training ai’s to actually use the positioning of the walls, jumps, etc. to make them actually play instead of just following a list of instructions that was mutated until perfection.
i hate to say it but you got a point
I’m really curious how the AI was able to take input from the game. Was it using computer vision to actually process the whole screen in real time or was it integrated with the game engine in some way that let it get position data? I’m especially curious how it dealt with knowing where the other CPUs were and responding to bumps from them
+1
Looking at 0:40, I assume they used the dolphin-memory-engine package and have access to read the game's memory.
It would have to be using some kind of memory hooks. Whole screen processing is very slow and probably couldn't handle processing a race in real-time. It also wouldn't know what speed it was going just by screen alone.
I'm curious too. It'd be very underwhelming if the AI is essentially blind and doing this by trial and error. It seems like sometimes it just runs into a wall.
+1
thanks so much, the music, the clips, the talking, so good, this absolutely made my day better
So glad you enjoyed it, great to hear such kind comments!
Having the AI receive greater rewards based on its speed seems to work great. It sounds like common sense, but every other attempt I've seen at training a racing AI just sets checkpoints and rewards crossing them, with no variable to increase the reward if the time between checkpoints is lower or the speed is higher.
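A rough Python sketch of the difference this comment describes. The function names, the checkpoint bonus of 1.0, and the max speed of 100 are all made up for illustration; the video does not give its actual reward formula.

```python
def checkpoint_reward(crossed_checkpoint: bool) -> float:
    """Sparse reward: a fixed bonus only when a checkpoint is crossed."""
    return 1.0 if crossed_checkpoint else 0.0


def speed_reward(speed: float, max_speed: float, crossed_checkpoint: bool) -> float:
    """Dense reward: a per-step term proportional to speed, plus the same bonus."""
    speed_term = max(0.0, min(speed / max_speed, 1.0))  # clamp to [0, 1]
    return speed_term + checkpoint_reward(crossed_checkpoint)


def total(run, use_speed: bool) -> float:
    """Sum the reward over a run given as (speed, crossed_checkpoint) steps."""
    if use_speed:
        return sum(speed_reward(s, 100.0, c) for s, c in run)
    return sum(checkpoint_reward(c) for _, c in run)


# Two hypothetical runs over the same 4 steps; both cross one checkpoint at the end.
slow_run = [(20.0, False), (20.0, False), (20.0, False), (20.0, True)]
fast_run = [(80.0, False), (80.0, False), (80.0, False), (80.0, True)]

print(total(slow_run, use_speed=False), total(fast_run, use_speed=False))
print(total(slow_run, use_speed=True), total(fast_run, use_speed=True))
```

With the checkpoint-only reward the two runs score the same, so nothing tells the AI that driving faster was better. With the speed term the fast run scores higher on every step.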
The AI just collects and replicates gameplay from the average funky kong user
No way this video has so little views! You put so much effort into this, keep up the good work, definitely subbing.
Thank you so much, always great to hear! Will look to keep making more content
Me too man
I’ll check the other videos
Thanks, will really help me out!
Would love to see more details about how intermediate rewards were determined and how game state was encoded, like was it seeing a downscale greyscale screen grab, or peeking at position and map data in emulator memory?
Great video as usual, like i have mentioned in a community post making the AI learn the track moonview highway would be a great video imo
Thanks! Yeah Moonview would definitely be a great track to try as the AI would have to try so hard to learn to avoid cars haha
Having it use an outside drift kart on moonview highway would be sick
This is some real quality Mario Kart Wii content! Glad this popped up in my recommendations, looking Forward for part 2 :D
Really glad you like it! I'll make sure to make the next part as good as possible!
Next video should be out on Friday 1st September
good job Ai-chan. im proud of you for improving so much ❤
You have no idea how long I’ve been waiting for something like this to happen!!
I like how the reward skyrockets at the end of lap 3, as if it’s getting excited
Yeah that final lap finish is like a drug to the AI haha
I’m really curious about two things. 1) How were rewards determined? You showed us how you received rewards as a human, but I'm left curious about how rewards are calculated. 2) What difficulty are the CPUs set to? And how did that affect it?
Really neat to see the results, but so many more questions have arrived!
I was wondering exactly the same.
It feels as though the AI is just reverse-engineering the reward algorithm, where the algorithm needs to be known in advance in order to determine "successful driving".
I think I've missed something in the explanation here.
I kind of just want to know what difficulty they were set to
I wanted to know that too, how are rewards calculated?
Hey, great video as always! It'd be cool to see you race the AI, maybe someday!
Thanks! I'll have to give it a try at some point
Always gotta love q learning, it can either work, or never work, I appreciate the time that was needed to complete this
Yeah you really never know haha. Even with advanced variants of Q-Learning like this one, things can still be a bit unpredictable
Cool video. It's nice to see how sophisticated these AIs are getting, especially during the livestream you did. I think it would be interesting to train an AI with other players online with Wiimmfi. Might have to check if that's allowed first, though
Thanks! Glad you saw the livestream, I wasn't sure how it would go! Yeah I'm not sure, I think the AI would definitely get banned for cheating since it would probably set off any bot detection that exists. Would be really cool to try though, would love to get in touch with some of the Wiimmfi Devs and try and make it happen though
@@aitango Hi AI Tango, Wiimmfi dev here. YouTube seems to remove my comments for some reason (spam filter) - feel free to contact me through the info on my YouTube profile.
@@aitango you can set up a private room and livestream the bot racing to encourage people to join to try and race the ai
Training an AI is kinda like teaching a dog to do tricks.
Make sure to walk your dog every day.
The fact that self learning AI has become an innovation for success, a lot of gaming companies are using it to provide a much more Dynamic and Random experiences for gamers. Can't wait to see how it evolves. This video is a great way to show, Practice makes perfect. A real life skill that must be taken into account! ❤
I've always thought that would be such a cool idea, in fact I would love to do this for a job if it becomes popular! Really glad you liked it!
@@aitango Honestly, most artificial intelligence in games uses random number generation but never actually learns; it just uses outcomes defined by numbers, not outcomes defined by what it actually knows and what it doesn't. I feel like actual learning is a much better way for it to become more powerful than the current methods used today for artificial intelligence in video games. It's a hard prospect to grasp, but if you ever decide to reach out to a company or get a job doing this specifically, please let us know about it, as I would definitely like to follow you on that journey!
Will do! I think it could lead to so much more interesting gameplay so would love to be a part of it
Well said
@@aitango Some companies are already working on that, like Sony AI, Unreal Engine Learning Agents... But please keep in mind we don't necessarily want our butts kicked as players 😂 Sometimes, predictable AIs are more fun to play against!
So glad he finally lapped someone right at the end!
Yeah I was really happy it got that, the second to last clip technically finished in a faster time, but lapping the CPU was too satisfying!
Keep it up, love these vids
Thanks, will do! Really glad you're enjoying it!
This was a good ass video man! Keep doing what you do and success will come your way 🙏
We are gonna try and post a video every 2 weeks on Fridays, thank you for the support. Takes a while to train the AI, create voice overs and edit. We really appreciate the love. 🙏
Fascinating video man, I wonder how far this can go if someone put lots of time into perfecting this AI. Nice work.
Thanks a ton!
Everyone’s welcoming Donkey Kong’s cousin to Mario Kart and everybody’s happy until 8:00
The concept of rewards being so vague here reminds me of that one Tumblr post where it goes like “can I get a burger with uhhh ingredience?”
Lmfao going straight into the wall at the beginning
Absolutely incredible! Question: does this AI only work for Ghost Valley 2? It seems like with its learning process, it would brute force other tracks rather than making decisions based on the track itself. It would be interesting to see an AI that can learn new tracks quickly!
Thanks! So this AI was only trained for Ghost Valley 2, so it would likely struggle if I put it down on another track. If however it was trained on many different tracks at once, the AI would likely start to understand how tracks really work, rather than a specific track, so it could probably try tracks it's never seen before! This is definitely something I'm looking to do in the future!
@@aitango not sure how hard it would be to do, but love this idea and we should make it happen.
@@aitango Oooh, you could have one version of this AI (Player 1) learn every track in sequence, and each race put it up against a fresh AI in the Player 2 slot to see how much P1's previous training helps or hinders learning new tracks.
Really good video, excited to see where this channel is going
I want to see how an AI would cope on a track with more randomness such as Koopa Cape or Toad's Factory. I would also like to see how it would fare with items on.
Would love to see a tutorial series on how you could get something like this running on your own PC. It would be so interesting to try this with all sorts of different variables. Great content man, very interesting!
This is really cool! Is the AI actually seeing the visuals / have access to knowing its position and the position of other racers or is it just running based on sequence and rewards? (is that a correct term?)
Really glad you like it! Yes, the AI is learning from looking at the screen, the same information people use to play the game. It only knows the position of the other racers from its current placement (1st, 2nd etc), and the minimap. The AI actually only uses rewards when training, not in the actual decision making process. This means that once the AI is trained, no rewards are needed for it to drive, just the screen!
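To make the "no rewards needed once trained" point concrete, here is a minimal Python sketch of what decision making could look like in a Q-learning style agent. The action list and the `fake_q_network` stand-in are hypothetical; the real network and action set are not given in this thread.

```python
from typing import Callable, Sequence

# Hypothetical action set, for illustration only.
ACTIONS = ["straight", "left", "right", "drift_left", "drift_right"]


def greedy_action(q_values: Sequence[float]) -> int:
    """Return the index of the action with the highest predicted value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


def act(screen, q_network: Callable[[object], Sequence[float]]) -> str:
    """Inference step: screen in, action out. No reward is read here."""
    return ACTIONS[greedy_action(q_network(screen))]


def fake_q_network(screen) -> Sequence[float]:
    """Stand-in for a trained network: ignores the screen, returns fixed predictions."""
    return [0.1, 0.7, 0.3, 0.2, 0.5]


print(act(screen=None, q_network=fake_q_network))
```

The reward only appears in the training loop, where it is used to adjust the network's predictions. At play time the network's predictions alone pick the action.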
@@aitango incredible! Thanks for letting me know and keep up the great work!
I really wish I knew how to do this! You are amazing and you have a great things ahead of you for yt! Keep it up and Godspeed. You have earned a new subscriber
Thank you so much, it really means a lot to hear such kind words!
surely given enough time to train (and mushrooms) the ai could beat the world record? is this theoretically possible even if it were to take months of training? also keep up these videos i love ai and mario kart you're the goat
I don't think so, the WR has some insane strategies, I don't think the current reward system would be able to do it.
Even with a more advanced reward system, the AI can get stuck and not evolve or take higher risks because that would drastically affect the reward it receives, so it is very unlikely an AI like this could beat the WR.
@@tdiogo_gamer very unlikely or impossible?
@@billztr I don't exactly know, but if it isn't impossible, it is EXTREMELY unlikely.
I can verify this is pretty accurate. Most AIs do sadly hit a plateau at some point, rather than just continuing to improve forever. There are however other AI models out there designed to use much more data which probably could beat world records, but running these models is only really possible for big companies with huge amounts of compute like google and openai
This channel is criminally underrated!
Lots of passion and hard work are going into these videos, glad to see you enjoying them.
How long do you leave it training until it starts speedrunning 🤔
I know very little about Mario Kart strats but this was a chill and fun video to watch
also kudos for putting music credits in the description, now I've added a few to my spotify lmao
This video is cool
I wonder what would happen if you put the AI on other courses; would it perform well because it learned the basics from this track or would it have difficulties with basic things that it took for granted before?
It makes sense that it would be able to use what it learned here on certain tracks that have no obstacles on them. It has basic driving skills, understands the benefits of drifting, popping wheelies, tricks, and shortcuts, and actively avoids falling off. I feel like when things like Toad's Factory's stampers or Moonview Highway's cars come into play, it would need a lot more training before it would be ready.
Yeah this reply is pretty accurate; it might be able to drive a little on other tracks since it understands the basics, but would definitely struggle with anything too different to what it's seen before during training. If the AI was trained on multiple different tracks though, there's a decent chance it would be able to play new tracks straight away since it would start to have a good general knowledge of the game
first video ive seen by you, definitely watching more this is great
Next video should be out on the 1st of September, we try and post every other Friday.
This is sick, I wonder how much better it would be if it went on for say 200 hours or watching Time trials also seeing it learn items would be cool, thanks for making this video
Really glad you like it! I always wonder that too! Many AIs usually hit a plateau after a while, but this one looks like it was still improving from the graph, so would've been really cool! Perhaps in my next video you might see some items :)
I really like how thorough you were while explaining it in a very simple way. I was able to understand what you were saying completely. I’m fascinated with AI, but most people don’t explain it in simple terms, thus making it hard for me to understand what they are saying. Thanks for explaining it in a simple way so I could enjoy the video.
then just learn how to code AI :P
It's interesting how the AI skips the boost arrow just before the finish line, every single lap.
By the time it reached the end it learned to stick to the middle so it had no chance to encounter it.
incredible video. cannot wait to see more from this channel ❤
Thank you so much!! Will try my best to keep new videos coming!
You should see how long it would take for the AI to beat a world record in mario kart!
underrated channel this is so interesting also
200cc rainbow road vs the wr ghost non tas would be sick
Thanks! That sounds like quite the challenge, might have to work up to that one haha
Imagine if you could replace the original CPUs with this better AI what hard challenges would come out
Yeah I would love to see, I could imagine Mario Kart YouTubers doing vs AI challenges and stuff like that
@@aitango You might want to reach out to the MKW modding community to see if it's feasible. That idea is overflowing with potential.
You can already race ghosts of other fast players through CTGPR
the music and the ai racing funky kong surprisingly gives me nostalgia of mariokart time trial youtube vids back when the game first came out
now put it on a different course and see what happens
What is my purpose?
You pass the butter
I wonder how the ai might be affected if you gave it a slight punishment for rapidly changing its prediction, or if that's something you can feasibly program.
Ideally it would force it to commit more to its choices, thus making it take more direct lines while retaining the ability to change course at the last second to avoid a collision.
As I think about it though, I think it might just make it more difficult for it to learn how to avoid crashes, since it may punish early avoidance measures.
There is actually an algorithm that attempts to do something similar to this, called Advantage Learning, which looks to increase the gaps between the predictions, forcing it to change its choice less. I really like the idea though, as for games like Mario Kart where the constant action swapping is really detrimental, it would definitely help!
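For anyone curious, here is a small Python sketch of one common formulation of the Advantage Learning target: the usual Q-learning target minus a penalty proportional to how far the chosen action's prediction sits below the best prediction in the current state. The `gamma` and `alpha` values are placeholders, and this is not necessarily the exact variant used in the video.

```python
from typing import Sequence


def q_learning_target(reward: float, next_q: Sequence[float], gamma: float = 0.99) -> float:
    """Standard target: reward plus the discounted best prediction in the next state."""
    return reward + gamma * max(next_q)


def advantage_learning_target(
    reward: float,
    q_current: Sequence[float],
    action: int,
    next_q: Sequence[float],
    gamma: float = 0.99,
    alpha: float = 0.9,
) -> float:
    """Q-learning target minus alpha times the gap between the best action and the taken one."""
    gap = max(q_current) - q_current[action]  # zero for the greedy action
    return q_learning_target(reward, next_q, gamma) - alpha * gap


q_now = [1.0, 0.5]   # predictions for two actions in the current state
q_next = [2.0, 1.0]  # predictions in the next state

best = advantage_learning_target(0.0, q_now, action=0, next_q=q_next, gamma=0.5)
worse = advantage_learning_target(0.0, q_now, action=1, next_q=q_next, gamma=0.5)
print(best, worse)
```

The greedy action keeps its normal target while every other action is pushed down, so the gap between the best prediction and the rest grows and the choice flips less often.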
In theory the game already does it since you lose speed.
@@renakunisaki Well, the issue is that while it loses speed compared to what we know to be faster, there's almost a sort of, what's it called, a local optimum or something? While the AI COULD commit more to turns, committing even slightly more would bring little improvement but increase the chances of a crash. It's entirely possible that the AI is stuck trying to optimize this tradeoff, and without shifting reward/punishment values, it may never get out of that rut. In order to improve without that, it must intentionally take actions which it knows will cause more crashes. It doesn't want to crash, so it doesn't try to improve.
I can watch your videos the whole day
Good to hear! I'll have to make more so you don't run out of videos to watch!
Honestly the most rudimentary AI training video I've seen in 3 years
Crazy that the AI lapped Rosalina in the last race! Nice work bro!
Just proof that practice makes perfect. Even computers understand this. Really neat to see it actually lapping the CPU's in the end. Imagine if the CPU's in game were at this level.
Amazing video! I'm curious to see what happens if we modify the reward function as performance improves to assist the AI in learning new elements or overcoming plateaus, such as taking the yellow boost at the end. This could involve manually adjusting the reward function or using some form of meta-learning.
Watching this AI learn to drive in Mario Kart is like watching me try to be the best player: lots of crashes, occasional moments of brilliance, and always hitting a wall when things get tough
We all have to learn somehow!
This is amazing content! I will be sticking around for more like this.
Glad you enjoy it, and good to hear you’re looking forward to more!
For all the AI generated images masquerading as art and other ways AI has been used for evil, you're the one using it for good.
Amazing video!
Thanks, I'm really glad you like it!
I like how when it goes for the trick jump, it hits the middle point of the choice between the left turn and right, then decides right based on that reward difference. It's really robotic and shows that it hits a very specific spot where it knows to turn for maximum reward based on the direction and turn reward of it versus the fastest turn being heading straight for it from that angle. Since it has to consider all directions when it's gaining the reward for optimal efficiency due to the "predicting" reward system it follows. I most noticed it in the Final Ai segment on the first laps.
How TAS’ are born
We need a mod that replaces the cpus with 11 of these ai funky kongs lmao
Absolute unit of a music taste man. I was jamming out the entire video. Oh, the AI thing is cool too!
Glad you enjoyed the music! And of course the AI too haha
Imagine that the CPU of the game has an extreme difficulty with this training
How does this AI think about the track? Is it spatially aware, or does it just learn the best combination of buttons to press to get the highest reward? Could it be put into another track and do well, or would it have to start from scratch?
The CPU in the next Mario kart game:
Nintendo hire me please haha
Think you proved just how vital proper training is. Awesome video.
I think giving the AI rewards for uncommon events would make it even better, like lapping a racer (which should increase the score exponentially for each racer passed), or letting it keep track of its personal records and giving it points for each time they break it, bc at the end the ai isn’t learning the track to get better but rather it’s learning the track to figure ways to optimize the amount of points it could get
It actually wouldn't, at least not without a substantially larger and more complex model, and that would take much more computation to deal with.
Rewards work better the more frequent, immediate, and relevant they are (this is actually true of human learning too.)
You could say beating a previous time is the most relevant thing possible, but it's also the least frequent, least immediate thing possible, because it happens once at the end of the race.
How does the AI relate beating a previous time to all its previous work? It has to "think", "OK. I did something better this time, but what was it?" How does it figure out which thing or things it did that led to beating a previous time?
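A small Python sketch of why an end-of-race bonus is such a weak signal. It computes discounted returns, which are what value-based methods like Q-learning learn to predict. The discount factor of 0.99, the 900-step race, and the reward sizes are all assumptions for illustration.

```python
from typing import List, Sequence


def discounted_returns(rewards: Sequence[float], gamma: float) -> List[float]:
    """Return, for each step, the discounted sum of that step's and all later rewards."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


steps = 900  # assumed: about one minute of racing at 15 decisions per second
sparse = [0.0] * (steps - 1) + [100.0]  # one big bonus at the finish line
dense = [1.0] * steps                    # a small reward on every step

sparse_returns = discounted_returns(sparse, gamma=0.99)
dense_returns = discounted_returns(dense, gamma=0.99)

print(sparse_returns[0], dense_returns[0])
```

Under these assumptions the finish-line bonus has almost vanished by the time it is discounted back to the first decision of the race, while the per-step reward still gives that decision a strong signal.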
To me, this feels like a parent teaching their kid how to do something. Your AI is clearly learning not just from the track layout, but also from the game's CPUs. Obviously your AI is far more advanced than the CPUs' AI, but it seems to be taking what the CPUs do and combining that info with the track layout to come up with the fastest route. The CPUs are like the parents, teaching their kid, your AI, to do something in the hopes of the kid surpassing them one day.
Loved the video! I'd love to see you do more with this AI, and push its limits!
Really glad you enjoyed the video, and I hope to do lots more with the AI too! My dream is that the AI will be able to surpass my own skill level in any game I play!
That's a dream I'm gonna watch you reach. You earned a subscriber today, my friend.
NICE VIDEO!! surprisingly low amount of views given how cool this video was!! you're absolutely going to go very far in YouTube :D
Thank you, I really appreciate it!
"Mario kart but every CPU learns and shares a collective consciousness"
Would love to see this on more tracks, great vid
The very first clip makes for a really impactful start 😂 Driving straight into a wall
I'm so happy to see a video like this, they always remind me of witnessing the birth of Neuro-Sama
Would love to see more tracks and items on! Video was done very well. New sub👍
Glad you liked it, hope it was fun and engaging.🙂
This video made my day so much better
I'm really glad to hear that!
I would like to see this AI try other courses without any training on the new course (but still without a wipe)
Excellent video. I was really hoping you'd go a little bit deeper on the inputs you were feeding the AI. Does it only know its XYZ coordinates? Or even that? I was confused what inputs the AI was using to lower its predicted reward when it flew off the ledge. Or was that simply from the Z coordinate? Did I answer my own question?
Thanks, I'm glad you like it! The short answer is that this AI uses the screen's pixels to learn, rather than any coordinates, so it has to learn from the same information any human would have. If you want to see a little more about how the AI works, check out my last video (Mario Bros is too easy for insane AI). Even though it's on New Super Mario Bros, the way the input works is the same, and it shares lots of common features with this AI!
Oh wait, I just realised you left a comment on that video haha
@@aitango I watched it afterward, and I indeed figured that you had applied the same system to this one. Super awesome that you have this working with only "visual" input. Did you ever show what the actual image the AI sees looks like, or was the crunched up clip we saw just an approximation?
The input you saw in the New Super Mario Bros video was the actual image the AI is given. The only difference is that the AI is given the previous 4 frames rather than just 1, to allow it to infer motion from the images
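A minimal Python sketch of how "the previous 4 frames" can be kept around. The `FrameStack` class is a hypothetical helper, not the creator's code, and the frames here are plain strings standing in for downscaled screen images.

```python
from collections import deque
from typing import Any, List


class FrameStack:
    """Keeps the most recent `num_frames` frames, oldest first."""

    def __init__(self, num_frames: int = 4) -> None:
        self.frames: deque = deque(maxlen=num_frames)

    def reset(self, first_frame: Any) -> List[Any]:
        """At the start of a race, fill the stack with copies of the first frame."""
        self.frames.clear()
        for _ in range(self.frames.maxlen):
            self.frames.append(first_frame)
        return list(self.frames)

    def push(self, frame: Any) -> List[Any]:
        """Add the newest frame; the oldest one drops out automatically."""
        self.frames.append(frame)
        return list(self.frames)


stack = FrameStack(num_frames=4)
print(stack.reset("f0"))
print(stack.push("f1"))
```

A single image cannot show which way the kart is moving or how fast. Comparing the same object across a few consecutive frames can, which is why the network is fed the whole stack as its input.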
Really good video, please dont stop making videos
Don't worry, there's definitely more coming!
The new Mario Kart should have a new option like this in the Grand Prix selection, "150cc harder AI"
Sick super cool video
Glad you liked it!
Cool video. That's quite impressive driving for an AI.
How about you do a race against your AI? It would be very cool to see who can do it better.
This is so cool! Great video
I’m interested to see if it will ever discover stuff like ultra shortcuts/glitches to match or beat the human WR on different tracks
It'd be interesting to give it a kart and see if it learns things like rapid hopping (forgot the exact name, but it's when you hop every other frame). Could be useful for finding future shortcuts and/or quicker routes. It'd also be interesting to give it a limitation of "human realistic" inputs (20 single button presses per second or less, unable to consistently tap every other frame) to see if there are faster human viable routes that just haven't been found yet. This could do some cool time trials research, wish I could run it cause I have so many ideas 😅
Yeah I know what you mean, would be really cool if it could learn that! The AI does currently have a limitation of only being able to choose a new action every 4 frames, or around 15 times per second (900apm for starcraft nerds). Obviously this is way faster than a human, but maybe not fast enough for the rapid hopping!
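The numbers in this reply work out as follows, assuming the game runs at 60 frames per second (the frame rate is my assumption, the 4-frame limit is from the reply). The `can_keep_up` helper is hypothetical and treats rapid hopping as needing a new input at least every 2 frames.

```python
FPS = 60               # assumed frame rate
FRAMES_PER_ACTION = 4  # from the reply: one new action every 4 frames

actions_per_second = FPS / FRAMES_PER_ACTION  # 60 / 4 = 15
actions_per_minute = actions_per_second * 60  # 15 * 60 = 900


def can_keep_up(input_change_every_n_frames: int) -> bool:
    """True if one decision every FRAMES_PER_ACTION frames is often enough."""
    return FRAMES_PER_ACTION <= input_change_every_n_frames


print(actions_per_second, actions_per_minute)
print(can_keep_up(2))  # an input pattern that changes every other frame
```

So the AI decides about 15 times per second, which matches the 900 APM figure, but an input pattern that changes every other frame would need decisions at twice that rate.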
Oh yeah this one doing numbers. Amazing video!!!!
Thank you so much, glad you enjoyed!
Maybe you shouldn’t use the most op kart combo if you want to compare AI and Cpus
I noticed the ending is the ai lapping the CPUs for the first time. Nice touch
Thanks!