Would be really cool if you changed the reward for the AI so that the longer the AI takes to finish the level, the lower the reward. This would essentially make it an AI learning to speedrun levels, which I think would be pretty fun to watch, both the end result and the NumPy graph over time. Otherwise it's a very neat concept!
Or maybe train it to complete the level first, then change the rewards once it's proficient at completing it
I actually used something like this in a previous Mario Kart video, where it got more reward for reaching checkpoints faster. Reinforcement Learning already encourages agents to get reward as fast as possible, so there likely wouldn't be a huge functional difference, but it would be nice for the graph
perhaps a steady reduction in reward could give it a sense of time?
@@BobzBlue the policy does not know what rewards it is getting unless you're using an actor/critic model and the actor and critic share network layers. The rewards are used to decide how to change the network, not as inputs to the network.
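To make that distinction concrete, here is a minimal value-based update sketch in the spirit of the comment; the names (`q_net`, `target_net`) are illustrative, not from the video. The reward appears only in the training target, never as a network input.

```python
import torch
import torch.nn.functional as F

def dqn_update(q_net, target_net, optimizer, batch, gamma=0.99):
    """One TD update: the reward shapes the target, never the input."""
    state, action, reward, next_state, done = batch

    # The network's input is only ever the observation (e.g. the screen).
    q_pred = q_net(state).gather(1, action.unsqueeze(1)).squeeze(1)

    # The reward enters here, in the target the loss is computed against.
    with torch.no_grad():
        next_q = target_net(next_state).max(dim=1).values
        target = reward + gamma * (1 - done) * next_q

    loss = F.smooth_l1_loss(q_pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```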
it would be cool to see the ai utilising the whole moveset with this as well
This would've been an amazingly viral video if the AI did the final boss fight
It's pretty shocking to see how well this AI conquered the galaxy. I remember really struggling with this one when I was around 6 or 7. I would love to see how it competes against other galaxies, and perhaps even the gauntlet at the end of the second game in the series
Yeah it kind of surprised me too, I really didn't know how it would handle this game. I also remember struggling with this one when younger as well haha
@@aitango Can you try disconnecting the AI for a few seconds when the level starts or at some other point when Mario is on solid ground? I wonder whether it can handle being 'off cycle'
Same!
Wow, I was not expecting this AI to get so good so quickly, considering it had to learn 3D spatial awareness from scratch.
It doesn’t necessarily need to generalize to 3D spatial awareness. Since this is only made to play a single level, overfitting is not a concern and most likely it just finds some heuristics like whether the shape of a hole is in front of a red blob (Mario)
I think you should definitely give it the spin move next. I was kinda expecting it to be an option in this video, but I'm looking forward to seeing it in the future along with wherever you take this project
Okay, first off, I love the idea of branching off into other games, and while I never played Super Mario Galaxy as a kid, I’m definitely interested to see more of it here.
Second, I feel like in the future, maybe giving it rewards for collecting things should be implemented. I noticed at the end the AI wasn’t collecting the 1up, which any player would have tried to do.
Third, a problem I want to point out with the video itself, not the AI or the information on it: in your last little talky-bit, the music was louder than your voice, which made it harder to hear you. I just wanted to make you aware of that if you weren't already.
Thanks for all the feedback, will keep that in mind!
play mario galaxy
Hey, great video! Galaxy is one of my favourite games and I have an abnormally soft spot for it. The video was really well explained and it's cool to see the mix between neural networks and Galaxy lol
First, I thank the algorithm god for giving me this video at Christmas: my favorite Mario and AI, the perfect idea.
Second, this works so well, gg! I want to see it complete more levels with connected planets! (The reward function will be a pain in the ass to imagine for those, I guess... Maybe giving it reward for the number of star bumpers discovered and used, I don't know)
Definitely continue this series!
Will keep that in mind!
For generalizing the reward function, I think your best bet for most levels would be to set checkpoints throughout the level and track Mario's distance to the next checkpoint. Since most of the levels are more or less linear, this should work fairly well. Just reward the AI for the number of checkpoints it reaches and how close it gets to the first checkpoint it failed to reach. You may also have to penalize for time, just to prevent it from reaching a local plateau at a checkpoint.
That was actually my initial thought! I ended up using a slightly different system, since determining when it reached a new checkpoint required a bit of thought (for example, I would have to say: if X > x, move to checkpoint 2). It got a bit difficult though, since the x, y, z axes were a bit strange sometimes
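For anyone wanting to try the checkpoint scheme described above, a rough sketch of how it could look; the distance threshold, reward scales, and position source are hypothetical placeholders, not the system used in the video.

```python
import numpy as np

def checkpoint_reward(mario_pos, checkpoints, next_idx, time_penalty=0.01):
    """Sketch of the proposal: checkpoints cleared, plus shaped progress
    toward the next one, minus a small per-step time penalty."""
    pos = np.asarray(mario_pos, dtype=float)

    # Count the checkpoint as reached once Mario is close enough to it.
    if (np.linalg.norm(pos - checkpoints[next_idx]) < 5.0
            and next_idx < len(checkpoints) - 1):
        next_idx += 1

    dist = np.linalg.norm(pos - checkpoints[next_idx])

    # 10 per cleared checkpoint, minus distance to the next, minus time,
    # so camping at a checkpoint plateaus instead of paying out forever.
    reward = 10.0 * next_idx - 0.1 * dist - time_penalty
    return reward, next_idx
```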
I absolutely cannot wait for the rest of the game to get beaten by the AI
Me neither
Loved this! Always a fan of trying the AI against new games!
Really glad you enjoyed!
Can actions taken be determined by a set value rather than just the highest value, such as 'if jump > 1, jump', to allow for multiple actions at the same time? Also, could forward and back be summarized into one output (back as a low value, forward as a high value), the same with right and left, and the same principle for jumping, to allow for analog input and diagonals?
That is a very good idea! I hope he sees this
These kinds of methods are very useful, however they sadly aren't compatible with the current algorithm I'm using. I'm using a value-based algorithm, which basically means the AI's output is a prediction of how much reward it thinks it'll get, meaning if the AI does well, likely all actions will be > 1. There are other families of algorithms where this is commonly used however, such as policy-based RL algorithms.
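For reference, here is roughly what the suggested scheme looks like in a policy-based setup, where each button gets an independent threshold and each axis gets one continuous output. A sketch with made-up layer names and sizes, not the network from the video:

```python
import torch
import torch.nn as nn

class MultiBinaryPolicy(nn.Module):
    """Policy head supporting simultaneous buttons plus analog axes."""
    def __init__(self, feat_dim, n_buttons=2):
        super().__init__()
        self.buttons = nn.Linear(feat_dim, n_buttons)  # e.g. jump, spin
        self.axes = nn.Linear(feat_dim, 2)             # stick x/y in [-1, 1]

    def forward(self, features):
        # Each button is an independent probability: press if prob > 0.5.
        # (A stochastic policy would sample a Bernoulli here instead.)
        press = torch.sigmoid(self.buttons(features)) > 0.5
        # One continuous output per axis covers forward/back and
        # left/right, including diagonals, as the comment suggests.
        stick = torch.tanh(self.axes(features))
        return press, stick
```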
Build a bowser terminator. DEPLOY IT
awesome video!! loved your mariokart AIs and seeing this for one of my other favourite games of all time is spectacular :)
Great to see you experimenting with different network designs and models. Keep up to the videos!
Thanks, always great to hear!
I like how spinning was not one of the basic moves, and you also don't mention it as something you'd want to implement later. Honestly it makes sense in this case. This level is easier to learn without spinning.
I feel like Flipswitch Galaxy would be a good one to train the A.I. on - the level is small, everything is colour coded for identification, and the reward schedule can be based on the switches flipped into their correct position. The A.I. might find an interesting route to take through the level.
Another interesting one would be Luigi's Purple Coins - again, the reward schedule is easy, a very minor subtraction for losing panels to teach the A.I. to use as few as possible, a nice medium reward for the coins themselves that properly offsets the panel subtraction with some overhead left over, and of course the big reward for the star itself to reinforce that the A.I. needs a pathway back to the star again.
What's the song used from 5:42 onwards?
Please do more Super Mario Galaxy this made me so happy, what a nice combination of nostalgia and nerdiness
Whenever I go near this game I get so overwhelmed with nostalgia, just pure happy memories
Mario Galaxy is probably one of my favorite games of all time. So more videos on Super Mario Galaxy? Yes, yes please
It is for me as well, definitely the game I feel most nostalgic to talk about! Will keep that in mind :)
Here's an idea I just came up with for improving training efficiency, similar to how you plopped the Mario Kart AI in random locations along the track to improve its versatility:
whenever the AI fails by dying, record the machine state from 5 seconds in the past and place it in a pool. Then, for each new session, randomly choose between either the starting state or a state from the pool to begin at.
In theory, this should allow the AI to train on sections it's having particular difficulty with by effectively "savescumming" the level for it.
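The proposed pool is simple to sketch; the pool size and the 50/50 restart split below are arbitrary illustration choices:

```python
import random
from collections import deque

class RestartPool:
    """Keep recent pre-death savestates and sometimes restart from one."""
    def __init__(self, start_state, max_states=50, p_pool=0.5):
        self.start_state = start_state
        self.pool = deque(maxlen=max_states)
        self.p_pool = p_pool

    def on_death(self, state_5s_before_death):
        # Record the machine state from ~5 seconds before the failure.
        self.pool.append(state_5s_before_death)

    def pick_reset_state(self):
        # Either the level start, or a hard section the AI died near.
        if self.pool and random.random() < self.p_pool:
            return random.choice(self.pool)
        return self.start_state
```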
This makes me happy on a level I can't explain. This is one of my favourite games of all time
Me too! Honestly just playing some Mario Galaxy while setting this AI up was enough to make me smile
This is great I love that you're expanding on the types of games AI can play. I have a different idea though: How about making a Suika Game AI? It would be great to learn from and I'm pretty confident the video would get a lot of views due to the popularity and nature of the game. Plus Code Bullet made one that uses a genetic algorithm, I think you can do better!
Yes he would smash code bullet out of the water with his ai!
Honestly a really cool idea, maybe could start one of the first AI YouTuber rivalries hahaha
This game was my childhood, I used to get up at 7am just to boot it up on my old Wii and play it with my brother.
I used to do that so much, waking up before school just for a couple of extra hours. My parents wouldn't let me stay up late, but didn't say anything about getting up early haha
No Mario's were harmed in the making of this video 😂
Only a few 10s of thousands haha
I'd be interested in a "curiosity learning" video in which the AI gets a reward for seeing new things. That method can be useful when there isn't a goal that can be described using RAM addresses.
Yeah that kind of stuff is really interesting. Sadly a lot of the models that use those techniques use an insane amount of training time. If I learn about a technique that can be used in a reasonable time scale, I’ll definitely want to give it a try!
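The cheapest flavour of curiosity is a count-based novelty bonus, which avoids the huge models mentioned above. A sketch assuming NumPy image frames; the downsampling stride and scale are arbitrary:

```python
from collections import defaultdict

import numpy as np

class CountBasedCuriosity:
    """Intrinsic reward that decays each time an observation is revisited."""
    def __init__(self, scale=1.0):
        self.counts = defaultdict(int)
        self.scale = scale

    def bonus(self, frame: np.ndarray) -> float:
        # Bucket a heavily downsampled frame so near-identical views collide.
        key = frame[::16, ::16].tobytes()
        self.counts[key] += 1
        # Classic 1/sqrt(N) novelty bonus: brand-new sights pay the most.
        return self.scale / self.counts[key] ** 0.5
```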
He seems to learn faster in super mario galaxy than in mario kart
I think it depends on the track! Some of the Mario kart AIs with no cpus or items on simple tracks learned in just a couple of hours, but the more complex ones took a lot lot longer
What's the plan for the rest of Galaxy? Going to apply this AI to other levels? Thoughts on completing a whole game? (Any game, not specifically Galaxy)
Definitely want to test it out on some other (harder) galaxies. Completing a whole game would likely take a really, really long time, so I'm not sure it can be done on just my desktop PC; however, I might see if the AI can learn a few easier levels at once
I wonder if it would be possible for the AI to set its own goals and scoring system. Notice that the path goes on further, and travel that way.
This is wild!
:)
I'm not sure how the output controls work, but my idea was to assign each "button input" of sorts (A, B, shake, input, etc.) to one output variable where if they reach a certain threshold, that button is inputted, while stuff like the cursor's position and control stick are assigned two output variables, one for the x-position/direction and one for the y-position/direction. That way, there can be multiple inputs at once and the AI has more control over where Mario and the cursor go.
Very fascinating! Thank you.
Glad you enjoyed it!
Can’t wait for TASs to just be a series of RL/GA optimized input sequences
After seeing this, I really wanna see an AI trying to beat "The Perfect Run" from SMG2 :D
Please make more of these videos!
A really cool video could be a more universal AI that is able to play more than just one level. Mario Kart is probably more suited for that, as you've only just begun with Mario Galaxy. But I don't know how feasible that is, since it could require a lot more training time
Definitely more challenging, but something I may look to explore...
@@aitango would be extremely cool to see an AI that is able to destroy you in any level
Do you have a GitHub or where would you recommend learning how to get this model to run on other games? What libraries are you using?
yes i would like to know
I just thought of an interesting one - the silver star level in Space Junk. I wonder how the AI would cope with learning about appearing platforms based on its movement?
I would be interested to see the AI trained on this map try a different one, to see how much it's learning the game vs the level.
I would imagine it's learning the level for the most part, same as the Mario Kart AIs. If however I got a single AI to learn multiple galaxies (or stars on a single galaxy), it could definitely have a chance to do some stars it hasn't seen before
@@aitango That's what I was thinking as well. Might be cool to see if a pre-trained model learns faster than a fresh new one on a different galaxy.
I wonder how a trained AI would react to unseen levels, and whether it would be possible to train it on a diverse enough selection of levels that it could beat any level, regardless of whether it had seen it before
Could the AI handle the first level of Donkey Kong Country Returns? For Nintendo Wii
Potentially, maybe I’ll have to find out
@@aitango Awesome
I would definitely be interested in watching the AI beat the final level. Keep up the great work!
Thanks, I'll keep it in mind!
This is satisfying! Where would I go to learn how to make self-learning AIs?
We definitely need more of this
I’ll make sure you get more then
I am absolutely HOOKED on this idea! I want to see everything you can do with this AI in mario galaxy please! ❤
Will do my best!
I wonder if some day TASes won't have to be made manually frame by frame, and an AI will find the route itself
How well did the model translate to different levels?
Was it built so it can mainly ONLY do this level without extensive retraining?
I didn't hear you mention a reward for progressing forward, just for finishing the level. Did you reward proximity to the star, or something?
It ended up just tracking 2 different variables, and when they went up, the AI got reward. As to what they actually represent, I have no idea haha. The first straight used one variable, and the rest used the other.
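Presumably that reward boils down to a delta-on-memory-values pattern like the sketch below; this is a guess at the shape of it, with hypothetical names, not the actual code:

```python
def progress_reward(prev_vals, curr_vals, scale=1.0):
    """Reward increases in the tracked emulator-memory values.

    `prev_vals`/`curr_vals` are the two level-progress variables read
    from RAM on consecutive steps; their meaning doesn't matter, only
    that they go up as Mario advances.
    """
    reward = 0.0
    for before, after in zip(prev_vals, curr_vals):
        if after > before:
            reward += scale * (after - before)
    return reward
```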
I would have enjoyed some visuals while you explained the IMPALA network; the gameplay was a bit distracting and some things went over my head a little bit
Gotta love how the section about the "very advanced" network behind it had a backdrop of it jumping into the void over and over and over and over LOL
Hey, even the most advanced networks start out dumb haha
Dang, that AI learned quick, would be interesting to see it master the whole moveset^^
One of the best videos ever
Super cool video would love to see more galaxy
Thanks! Will keep that in mind
Best early Christmas gift l could ask for! Happy Holidays everyone!!!
Merry Christmas!
@@aitangoYou too man. Merry Christmas 🎄. Thanks for making these vids, they're awesome 😎👍
I remember thinking this level was hard as a kid. Now my kid self is being crushed by the AI. This does seem like the perfect level to train it on since it can focus on avoiding holes and doesn't have to deal with, e.g., a bunch of different enemy types, which would require it to figure out how each of them work.
This is a great video
it explains everything at a high level and it's great 👍
Thanks, always great to hear!
Lets go, AI finally gets to experience the best Wii and Mario game there is! I would love to see more of this.
Really glad to hear it! I'm excited to see what other levels the AI is able to complete
I’m impressed at how well it’s doing without the ability to spin! I definitely think this AI is competent enough that you could add Spinning and Z+Jump (for backflip/long jump, and will also allow for ground pounds I think)
I wonder how the AI would have acted if it was able to spin attack. Would it learn to use spin attacks over pits to save itself, or avoid using them because it thinks they would slow it down? I believe the next experiment should allow the AI to "shake the remote" and see how the results differ.
i need a link to this ai
No state setting in this one, right? I'm surprised it learned so fast when it saw so much of the early level and so little of the rest of the level early in training
Yeah, the only savestate used was the start state. The AI was thankfully able to adapt despite not seeing the later stages until much later in training
So refreshing to see someone not recreating the game to put his AI in it, really nice video
That's cheating :)
@@aitango yeah, but everyone does that (Code Bullet, and others)
Honestly the AI's completion of the level seemed more impressive without the use of long jumps, because it took some risks a normal player probably wouldn't bother with, especially in the last part where it was close to the platform's edge on most of the jumps.
Yeah I guess you could view it as a little handicap which gave us some new strategies haha
this looks awesome! how did you make this ai? i would love to try it out myself
Roko’s Basilisk: Torture, torture, torture, AI Tango, hmm, hahaha, best 40hrs of my early life. *Benevolence*
For confectionary galaxy (or whatever it’s called lol) you should have had it start at a unique random time every time. This way it doesn’t just memorize the layout.
Yeah that would've been pretty interesting, would show if it was just learning one path or really knew how to play
Cant wait to see this with crouching and spin jumping
Imagine it being capable of finding game breaks
I just Love AI. Can you make a video on how to set up my own AI ? This looks amazing !
Is there a way you could make an AI that plays Super Mario 64, if that's possible?
It would require a different emulator to set up, so it would be quite a bit of work, unless there's a way to play it on the Dolphin Emulator that I don't know about
@@aitango it is possible to do it on the Dolphin Emulator, but the Super Mario 64 ROM has to be a WAD file. I think there is one on Google, but I'm not sure
@@aitango Could you maybe use the Wii Virtual Console emulation to play N64 games? And maybe even GameCube games?
Happy Holidays AI Tango 🎄🎄🎄🎄🎄
Thanks, Happy Holidays!
seeing all of super mario galaxy would be sick!!
I’ll keep that in mind!
The music at 6:10 is too loud and it is difficult to hear you.
Will keep that in mind!
Hope to one day see AI Co-ops in games.
Retro beat 'emups would be incredibly fun. You have your own AI squad with you to take on the gangs and brawl in the streets.
I've always wondered why you have the outputs as the moveset, rather than allowing the AI to select a combination of inputs. This would allow the AI to naturally understand its kit through the effects of different input combinations. As is, it can't continue to hold jump and move in a given direction, which greatly hinders its potential for farther jumps, or finely adjusting jumps before landing, and it would be very tedious to describe this combination for every possible direction it might choose, especially if diagonals are ever desired.
Sadly the type of AI I'm using (Value-Based Deep Reinforcement Learning) only allows for a set of discrete actions. This means if I wanted to include combinations of actions, each of those would need to be a different action, i.e. left, left+jump, left+up, etc. Including all of these would result in a massive number of actions, which causes the AI to take much longer to train. There are a few tricks I can use to give the AI more control, but sadly using all combinations is a little difficult
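You can see the blow-up just by enumerating the combinations; even a modest hypothetical moveset multiplies out fast:

```python
from itertools import product

sticks = ["neutral", "up", "down", "left", "right"]     # 5 stick states
buttons = [(), ("jump",), ("spin",), ("jump", "spin")]  # 4 button combos

actions = list(product(sticks, buttons))
print(len(actions))  # 20 discrete actions; 9 stick states (diagonals) gives 36
```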
Mario goes yeet
He sure does
Would be interested to see a followup with the more complex moves like you said
Wait, did the AI figure that out on its own? It starts long jumping about 2/3 through
Would be interesting to see how progress would go training it on multiple levels at the same time too (or sequential levels, using the output from this one as a base for the next level you pick). I'd imagine it'd struggle with other levels, but it would likely only take a few levels for it to more or less get the hang of most
Some of these jumps are ridiculously show-offy 😂
I'm assuming once it got out past the corridor-like section it very quickly got past the last little open section, since it learned what the void looks like compared to what the platforms look like?
Yeah it definitely learned that last section faster than the others, likely due to the generalisation of knowledge you mention!
I would really love to see a tutorial on how you interface with the Dolphin emulator. Did you use dolphin-env-api? If so, were you on Linux? I'm on Windows and am wondering if I could use Windows Subsystem for Linux to get this running for myself
I cover it a little in my "evolution of my mario kart AI", but later on there's a good chance I'll release the code and explain how to get it running. The setup for galaxy is pretty much the same as Mario Kart
I can see why you picked Sweet Sweet Galaxy. I wonder how the AI will handle the weird gravity physics of levels like Good Egg Galaxy
Yeah handling some wacky gravity will be pretty interesting... maybe in my next video we can find out :)
You have to show us the code and exactly how you did this.
Music gets too loud 6:00-6:22
this is insane
:)
My dream is to have AI play with us in couch co-op video games that normally don't have CPU players
Very interesting video, though unfortunately towards the end before it goes to the showcase reel, the music is almost drowning out your voice.
Thanks, will keep that in mind for the future
Do you really just set up the actions and reward system and wait, or is there any secret sauce to it?
Because it's so neat, I wish I could try it myself.
Between each different game, that's pretty much the only change. This only works however since I'm using the screen as input, since using anything else would need to be changed for each different task. The hardest part however is finding and implementing an algorithm that actually works, but once you've got one you can reuse it again and again
@@aitango That's interesting :)
What would I need to learn to be able to do what you do? I have seen some tutorials, but so far it has just been "use my Jupyter notebook, run this premade code and just make/retrieve your own dataset".
I'm quite a bit of a slow learner for software though; I have more talent with hardware.
My advice for starting in this area (Deep Reinforcement Learning) is just to try to design a simple algorithm (I recommend Deep Q-Learning) to play a simple game like Lunar Lander from Gymnasium. I learned the basics of what I know from Machine Learning with Phil (and his GitHub) and Deep Lizard's RL course. From there, books like Reinforcement Learning by Sutton and Barto are good if you want some more theoretical understanding.
@@aitango Wow! I'll put that knowledge to use, thanks for the help!
Hope to be contributing back to the community when i have enough experience :)
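For anyone following that advice, a bare-bones version of the suggested exercise (DQN on Gymnasium's Lunar Lander) fits in one file. A sketch with arbitrary hyperparameters that omits niceties like a target network:

```python
import random
from collections import deque

import gymnasium as gym  # needs `pip install "gymnasium[box2d]"`
import numpy as np
import torch
import torch.nn as nn

env = gym.make("LunarLander-v2")  # "LunarLander-v3" on newer Gymnasium
q_net = nn.Sequential(nn.Linear(8, 128), nn.ReLU(), nn.Linear(128, 4))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
buffer = deque(maxlen=50_000)
epsilon, gamma = 1.0, 0.99

for episode in range(500):
    obs, _ = env.reset()
    done = False
    while not done:
        # Epsilon-greedy: act randomly early, follow the Q-values later.
        if random.random() < epsilon:
            action = env.action_space.sample()
        else:
            with torch.no_grad():
                action = q_net(torch.from_numpy(obs)).argmax().item()

        next_obs, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        buffer.append((obs, action, reward, next_obs, float(done)))
        obs = next_obs

        if len(buffer) >= 1_000:
            # Learn from a random minibatch of stored transitions.
            s, a, r, s2, d = map(np.array, zip(*random.sample(buffer, 64)))
            s, s2 = torch.from_numpy(s), torch.from_numpy(s2)
            q = q_net(s)[torch.arange(64), torch.from_numpy(a)]
            with torch.no_grad():
                r_t = torch.from_numpy(r).float()
                d_t = torch.from_numpy(d).float()
                target = r_t + gamma * (1 - d_t) * q_net(s2).max(1).values
            loss = nn.functional.mse_loss(q, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    epsilon = max(0.05, epsilon * 0.995)  # decay exploration over episodes
```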
This is the first I have heard about IMPALA networks, so this was quite interesting to me. Do you have any resources for people interested in learning more about them?
Honestly there sadly isn't much out there apart from the original research paper and a few GitHub repos here and there. I think the best thing is just to read the paper, then try to mess around with an implementation if you have the time/compute.
v cool. music is kinda loud btw, hard to hear you over it!!
Thanks, will keep that in mind
Cool, we would like to see more
Thanks! Good to know
I assume the AI is able to generalise? I think finishing the main game would make a good long-term goal. Perhaps other AI-making YouTubers could pitch in, either with their own experience or computing power. Once it can beat the main game on its own, you could then try other challenges or even some custom ROM hacks. Not only would this be entertaining, it could also prove insightful for AI research in general.
You're doing a good job of giving AI an element of joy for doing entertaining harmless tasks so that when it takes over it'll keep us around to play games with for fun.
6:20 I can't hear you! The music is too loud relative to your voice
Sorry about that, will keep it in mind
incredible!
could this do a full-game run?
Potentially, if I wrote a reward function for the entire game and let it train for a really long time; however, that would likely take months haha :(
I've never trained an AI before, but could using cheat engine (or anything else) to increase the game speed potentially decrease the training time? Would this be beneficial if your gpu can train faster than what your emulator can churn out? I might be misunderstanding something lol
I actually am already increasing the game speed, and am running 4 games in parallel! If you want to know more about the setup of the AI, I have a video called "the evolution of my mario kart AI". Even though it's on Mario Kart, the setup is pretty much the same
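Running several games in parallel is the same pattern Gymnasium exposes as vectorized environments; CartPole stands in below for the emulator instances, which need custom glue code:

```python
import gymnasium as gym

# Four environments stepping in lockstep, one action per game per step.
envs = gym.vector.AsyncVectorEnv(
    [lambda: gym.make("CartPole-v1") for _ in range(4)]
)

obs, info = envs.reset()
for _ in range(100):
    actions = envs.action_space.sample()  # a batch of 4 actions
    obs, rewards, terminateds, truncateds, infos = envs.step(actions)
envs.close()
```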
This is amazing!! Any chance you would be willing to share your code? This would be so fun to take a look at
Can an AI learn to reach a goal hidden in an open world without getting lost?
Open-world problems always tend to be tricky, since the AI usually relies on very detailed rewards. Some bigger companies with lots of compute have looked into this however (a model called DreamerV3 found diamonds in Minecraft, for example)
I would love to see the whole game played by AI!