AI Learns to Play Super Smash Bros
- Published: Sep 20, 2024
- AI agent learns to play Super Smash Bros. Melee using pseudo-imitation learning!
Source Code: github.com/eff...
Libmelee: github.com/alt...
Kofybrek's Video: • Training my First NEUR...
DQN Paper: arxiv.org/pdf/...
The Unsuccessful Melee Paper: web.stanford.e...
The Successful Melee Paper: arxiv.org/pdf/...
Yes, my avatar is different now: • The Haircut
-----------------------------------------------------------
Support Me (THANK YOU!): eff.sh/support
Discord: eff.sh/discord
GitHub: github.com/eff...
Interesting.
I'm a former competitive Melee/Project Melee player, and I think I might be able to help you understand why some of the behaviors were the way that they were here.
I'm guessing the fact that you de-emphasized the actions associated with wave dashing meant that the AI didn't really learn how to jump. I'm also guessing that the reason your AI did so many smash attacks was because a lot of players use the C stick to do aerial attacks. The C stick is basically a macro for smash attack (up, down, left, right input with an associated A press).
So my guess is that this is the reason why you see your AI standing at certain spots under the CPU spamming Smash attacks. It's probably trying to imitate the pro player's behavior of jumping at the CPU and doing an aerial, but you've basically told it never to jump because of wavedashing. Instead, it's skipping the jump input and going straight for the C stick.
Alternatively, there's only different kinds of attacks on the ground (tilt versus smash) so a player would never need to do a non-smash attack in the air. If it's not taking C stick inputs as a reference, then it might just be seeing aerial attacks come out with the same inputs that a smash attack would come out on the ground and doing them that way. Either way, the AI sure seems to love doing smash attacks instead of aerials. xD
Damn, thanks for your reply, I think you just found what was holding this project back… The training does count c-stick movements so that definitely tracks. One of the inputs to the network is whether or not it is in the air, but I think you’re right in that it *would* jump but because it’s incentivized not to it just kinda goes “idk smash attack maybe?” I’m trying to think and the only way I really see around that would be to collect a bunch of data of games without wavedashing, and feed that through without the reweighting. Definitely would be an interesting project, and it would probably get better results, but I don’t know where I could get that much data. Maybe one day with a big enough audience I could ask people to send in any replays they have from before they learned about waveshining. Either way, great insight!
@@effdotsh Perhaps you could make a macro for wavedashing.
Also it would be cool if you made another AI for the character select screen, and it can attempt to counterpick you.
Ha an insta-counterpick bot would be hilarious. Also a wavedash macro would also probably work... wish I thought of that. It does potentially introduce the problem that the AI can't "change its mind" about wavedashing, but even then it would be better off. Now about that counterpick bot...
@@effdotsh
I’d love to help with any insight whatsoever. If there’s anything you want to know about the game that may or may not help, I’d love to answer.
I know nothing about coding or programming an AI, but I spent a metric ton of time on practicing Melee.
Hopefully my comment made sense since I’m coming at this basically ONLY from the info from your video with literally zero prior knowledge on the topic other than actually playing the game.
I think it’s also a testament to your ability to break down all the intricacies of this process into a format that even I could understand well enough to give some feedback.
@@effdotsh
Can you elaborate on “can’t change its mind” about wave dashing?
Do you mean to say that it can’t decide to wave dash and then un-decide to do it once it’s started inputting the actions, or do you mean that it wouldn’t be able to alternate between wave dash lengths similar to how you could only program the AI to recover at a 90 or 45 degree angle?
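For anyone curious what the wavedash macro idea from this thread might look like in code, here is a minimal Python sketch. It is not taken from the video's source code. The input format (a set of pressed buttons plus a stick position per frame), the function name `wavedash_macro`, the 3-frame jumpsquat default, and the stick values are all assumptions for illustration, and the exact frame timing would need checking in-game for each character.

```python
def wavedash_macro(direction, jumpsquat_frames=3, stick=(0.8, -0.6)):
    """Build a hypothetical per-frame input list for one wavedash.

    direction: -1 for left, +1 for right.
    jumpsquat_frames: assumed number of frames before the character leaves
        the ground (character dependent; 3 is only a placeholder default).
    stick: assumed diagonal-down stick position for the air dodge.
    """
    neutral = (0.0, 0.0)
    # Frame 0: press jump.
    frames = [{"buttons": {"Y"}, "stick": neutral}]
    # Wait out the rest of the jumpsquat with no inputs held.
    frames += [{"buttons": set(), "stick": neutral}
               for _ in range(jumpsquat_frames - 1)]
    # Final frame: air dodge diagonally into the ground.
    frames.append({"buttons": {"L"},
                   "stick": (direction * stick[0], stick[1])})
    return frames


if __name__ == "__main__":
    for i, frame in enumerate(wavedash_macro(direction=-1)):
        print(i, sorted(frame["buttons"]), frame["stick"])
```

Since the whole sequence is queued up front, an agent that selects this macro is committed until the last frame has played, which is one way to read the "can't change its mind" concern above.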
8:03 As a Fox main i can confirm this gameplay is optimal
Should've just ended the video there tbh
i loved that part where the AI said "its smashin' time" and smashed everyone in the room
😳
Phrasing
shut up bot
I wish I was in that room when it smashed everyone 😳
Better moment was when link joined and said "oh boy smooching time"
Great job! I can see you have a real passion for this stuff, keep it up!
I always thought it would be cool to have a learning AI start from scratch with a basic goal like get the opponent into the blast zone as fast as possible and see what it comes up with.
Would it discover new tech? Would it learn to wavedash?
Would it tech-chase?
It would be a super useful learning tool, especially if you gave it human reaction time and input speed.
But I can’t imagine how complicated and intensive that would be.
This part of the AI just jumping reminds me of that old copypasta where people said they let the bots in ̶C̶o̶u̶n̶t̶e̶r̶-̶S̶t̶r̶i̶k̶e̶ Quake play by themselves over the weekend, and on Monday the person saw that the bots were not moving "as they discovered that if they did not fight no one would need to die" - Oh! The old days of the internet. All fake but still gold.
are you talking about the old 4chan post? because I could have sworn it was about quake.
@@braxaculee You are right! I went to my school friends (as this is a +15 years memory) and they said it was about Quake (I imagined it was CS as it was huge part of my school culture)
@@SilasC that one is a classic. eerie and unsettling yet believable.
Some information that might help you:
In terms of complex controller inputs, there is a controller called a “boxx” which is entirely analog, and keeps the important values mapped to a modifier (such as shield drops, up tilts, and forward tilts). There is also a way to turn a keyboard into a “boxx” style controller, with all the necessary inputs to play the game at a high level.
Secondly, Slippi automatically saves all game replays unless specified otherwise, so top players probably have hundreds (if not thousands) of replays just sitting on their hard drives. Reaching out to them could be a good way of expanding your database by a large amount
I’d love to watch a follow up video that aims to improve on the current model, so keep up the good work!
I second the Slippi recommendation. My little brother is in the smash community and pretty much literally all of them use it. I don't actually know how much data you need for this kind of thing, but that _has_ to be enough!
It’s entirely digital** not analog
Being a competitive smash player, this is super dope to see being done. I know nothing about machine learning, but I am curious if it would be possible to do an evolutionary style ai for melee where it's able to learn what to do in specific situations. Melee has many states of play in order to account for, you have neutral, advantage, punish and edge guarding for some examples. Each of those states have moves that are better for those specific situations. I wonder that since you tried to make an ai replicate a pro player, without the ai understanding the context of the move, it turned out the way it did. Maybe an evolutionary model might be able to learn the context of what moves are good in specific states of the game?
Wow, fascinating! Awesome work dude :)
(btw I really appreciate the additional information in the subtitles, it's highly appreciated. And they're not auto-generated so that's neat too)
The “code bullet wannabe” bit really got me lol
Nice job man, I got this from my recommended. Hope your channel blows up!
🤞day one and this is already my most viewed video so even if it stops here I'm already super happy
@@effdotsh Awesome man! Hopefully the algo keeps it going for you. Best of luck and subscribed 🤞
We now need to create a tournament where different AI's compete against each other. To make the best computer player.
Yeah it would be pretty cool! This is something I thought about as I was making this project, and def wanna follow through with once I have a big enough audience and a budget for prizes.
Alt-F4 has made an unbeatable Fox AI, "smashbot", of course it's all hard coded to do something in every situation so it's very different but definitely check it out!
He is perfectly crouch canceling (crouching just before getting hit reduces the knockback and hitstun), that's insane
As a Statistics and Data Science major, it’s interesting to see a well explained crossover between my fav game, Smash, and my fav field, DL/ML!!
why are all the most interesting videos made by the most underrated youtubers 🙏😭
Funny Code Bullet reference, way to take advantage of his poor deprived audience, including me 😅 I'll subscribe for that
your videos will only get better!! keep up the consistency
I have been thinking about this idea for a while so i am glad to see you gave this a try.
dude this is actually so sick. you might get this same comment a lot here (and also you might already know this) but you should consider looking into slippi if you're interested in ever doing more ai stuff for this game in the future from the actual community. slippi offers the ability to play online with other real players so it would be interesting to see how the ai fares against actual players online. there's a discord too where a lot of people are working on ai for melee and machine learning for stats and all that. melee is my favorite game and i'm so happy you've joined this community and learned to love our game. thanks for making this video man this is awesome
I've just gotten into SSBM content recently so not sure if that's why the algorithm blessed me with this awesome video, but I loved it! Great stuff Code Bullet Wanna Be :)
"It didn't work, it just resorted to continuously jumping". I don't know man.... that sounds a lot like Melee Champion H-Box.
Lol that bug spiral is so relatable. Cheers for not giving up dude, cool vid!
this vid has to blow up, great dedication
I think this video is awesome and I love everything about it and think the channel is cool! I think you defo have room to grow content creation wise but you have my subscription broski :)
incredibly underrated video and channel, you’ve earned my subscription. i hope you keep up the brilliant content :)
4:52 that's me!
Cool video!
Why is this so much work for 500 subs, bro this is incredible
off topic, but the sync between your words and your avatar's mouth movement is fucking nuts. great video.
edit from the future: just saw the ai lip sync video and after seeing that its somehow made even more impressive.
Ai is impressive, it will probably be able to do most anything in future!
Dude! You should have way more than 800 subs! Keep up the fantastic work!
For a second I thought you said "their philosophies", which probably would be very effective against the Falcos
This brings back memories of playing melee with my friends (all older than me) and one was really good until a new challenger appeared that was just as good but also competed in world tournaments. Playing against those two and the other friend (who really liked to use heavy characters compared to fox and Falco for the other two) really helped me get pretty good. I mained Link btw, it was the only character versatile enough to compete with a heavy tank and two fast speed demons.
action vs self.action is the most relatable thing ever
Watching stuff like this just makes me more amazed with the human mind.
how some people can make a science of a 20 year old nintendo game. Lets look 20 years in the future, you see me visiting a master class at the ai Professor online University about quake champions and its philosophy.
criminally undersubbed channel
oh awesome this got sm more views
This was interesting even though i couldnt understand most of the technical details
If you intend to do a sequel to this, I think it would be a good idea to make the control stick inputs more discrete and give it more options in terms of angle and whatnot. I think that you should do it as a polar coordinate instead of x/y like the controller output does. Sticking with your number system, you could make it so that angle is determined with 1-12, and distance from center 1-3, letting it do tilts and angles while keeping overhead simple. It should be somewhat easy to convert the training data from xy to polar
That's a neat idea!! I'm about 105% sure there's about 10 different ways I've seen to convert between [the two most common] coordinate systems that I learned in math class but never had a use for, I don't see why they wouldn't work here
@@idontwantahandlethough where does the extra 5% come from?
I have only ever seen one polar/Cartesian conversion method
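The polar idea from this thread can be sketched in a few lines of Python. This is only an illustration under assumptions: the stick position is taken to be in [-1, 1] on both axes with 0 as neutral (rescale first if your API uses a different range), the `deadzone` value is made up, and the function names are hypothetical. The 12 angle bins and 3 distance bins follow the suggestion above.

```python
import math

N_ANGLES = 12  # angle bins 1..12, as suggested in the comment
N_RADII = 3    # distance-from-center bins 1..3


def xy_to_polar_bin(x, y, deadzone=0.1):
    """Map a stick position to (angle_bin, radius_bin).

    Returns (0, 0) for a neutral stick. The deadzone is an assumed value.
    """
    r = min(math.hypot(x, y), 1.0)
    if r < deadzone:
        return (0, 0)
    theta = math.atan2(y, x) % (2 * math.pi)  # angle in [0, 2*pi)
    step = 2 * math.pi / N_ANGLES
    # Round to the nearest bin center; wrap so angles near 2*pi map to bin 1.
    angle_bin = int(round(theta / step)) % N_ANGLES + 1
    radius_bin = min(math.ceil(r * N_RADII), N_RADII)
    return (angle_bin, radius_bin)


def polar_bin_to_xy(angle_bin, radius_bin):
    """Convert a discrete (angle_bin, radius_bin) back to stick x/y."""
    if angle_bin == 0 or radius_bin == 0:
        return (0.0, 0.0)
    theta = (angle_bin - 1) * 2 * math.pi / N_ANGLES
    r = radius_bin / N_RADII
    return (r * math.cos(theta), r * math.sin(theta))


if __name__ == "__main__":
    print(xy_to_polar_bin(1.0, 0.0))   # full tilt right
    print(xy_to_polar_bin(0.0, 0.4))   # partial tilt up
    print(polar_bin_to_xy(4, 3))       # back to x/y
```

With this layout the network picks one of 12 x 3 + 1 = 37 stick options, and existing x/y training data can be relabeled by running it through `xy_to_polar_bin`.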
This is really cool, subbed
Love the content, hope youtube recommends this to more people.
"you codebullet wannabe" DAMN THE SELF ROAST
The way I would do this: first, I would bind all the controls to something the AI could interact with. Then I would make a reward system that rewards dealing damage, taking a stock, combos, and character-specific options like using super armor or a reflector, and penalizes taking damage, having its shield broken, being off stage, and losing a stock. Then I would program more niche things, such as how to use platforms or the ledge. This may eventually cause the AI to become campy and wait out the timer, so to prevent that, I'd make it lose all its points if the game goes to time with even stocks or a stock deficit and ends in a loss.
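A reward scheme along those lines could be sketched like this in Python. Everything here is an assumption for illustration: the `Snapshot` fields, the function name `frame_reward`, and all the weights are hypothetical, and only the damage, stock, and off-stage terms from the comment are covered.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Hypothetical per-frame summary of the game state."""
    own_percent: float
    own_stocks: int
    opp_percent: float
    opp_stocks: int
    own_off_stage: bool


def frame_reward(prev, cur, damage_w=0.01, stock_w=1.0, off_stage_w=0.001):
    """Reward for moving from snapshot `prev` to snapshot `cur`.

    All weights are made-up starting points, not tuned values.
    """
    reward = 0.0
    # Percent resets to 0 when a stock is lost, so clamp the deltas at 0.
    reward += damage_w * max(cur.opp_percent - prev.opp_percent, 0.0)
    reward -= damage_w * max(cur.own_percent - prev.own_percent, 0.0)
    # Stocks taken and stocks lost.
    reward += stock_w * max(prev.opp_stocks - cur.opp_stocks, 0)
    reward -= stock_w * max(prev.own_stocks - cur.own_stocks, 0)
    # Small per-frame penalty for being off stage.
    if cur.own_off_stage:
        reward -= off_stage_w
    return reward


if __name__ == "__main__":
    before = Snapshot(0.0, 4, 0.0, 4, False)
    after = Snapshot(0.0, 4, 12.0, 4, False)
    print(frame_reward(before, after))
```

The relative size of the weights matters more than the absolute values: if the off-stage penalty is too large next to the stock reward, the agent may never leave the stage to edge guard.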
Hahaha Mistakes like that are why I spend twice as long on projects.
I once thought a 5 was an S for my table name in SQL terminal and was wondering for hrs what was wrong with my tables.
Good video, hope to work with & learn more from you. :)
I’ve been waiting for this. With Slippi replays you should be able to make AI versions of Top players. That would be sick
Would be interesting if it had relatively human limits, so it wasn't just a terminator and it felt like someone holding a controller.
So like a built-in delay to mimic a fast human reaction time (0.05-0.1 seconds?), and a restriction on certain input sequences that would be physically impossible.
Being visual-only would be super cool as someone already made a bot that reads and reacts to game data instantaneously, and it just absolutely slays in the most unfair ways possible.
...perhaps a mistake percent that increases with every input, that when triggered decreases and does something knowingly erroneous like holding the stick too hard or missing an input...
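The built-in reaction delay could be sketched as a small buffer that only hands the agent game states from a few frames ago. This is a hedged Python illustration, not anything from the video: the class name `DelayedObserver` is hypothetical, and the frame counts assume the game runs at 60 frames per second, so 0.05 to 0.1 seconds is roughly 3 to 6 frames.

```python
from collections import deque


class DelayedObserver:
    """Feeds the agent observations that are `delay_frames` frames old."""

    def __init__(self, delay_frames, initial_state=None):
        # Pre-fill so the first few frames return a placeholder state.
        self.buffer = deque([initial_state] * delay_frames)

    def observe(self, state):
        """Store the newest state and return the delayed one."""
        self.buffer.append(state)
        return self.buffer.popleft()


if __name__ == "__main__":
    observer = DelayedObserver(delay_frames=3)
    for frame in range(6):
        print(frame, observer.observe(frame))
```

The agent would act on whatever `observe` returns each frame, so it can never react faster than the configured delay.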
"It didn't work, it just resorted to continuously jumping (on the top ledge)"
Oh no, its working, its solved the game
I’ve done some AI courses over college, not an expert, but I have a general idea of how things go with this. I recommend looking into the programming behind amiibos used for smash ultimate and see if you are able to generate a good idea on how to implement something similar to melee. It’s hard to copy player input when every player does a move differently. You could maybe set a parameter for wave dashes also. Teach the Ai how to do a wave dash and then if it reads the input close enough from the pro player, it will do that wave dash that it knows. Same can go for the rest of the tech in the game. Good stuff regardless! I would be excited to see a series on this. Weekly updates or monthly updates would be sick.
You have a very charismatic personality and this video was really engaging I'm sure you'll blow up soon enough, I think you need to work on the unscripted portions though, you give a much more introverted vibe near the end which makes it a bit harder to stay engaged along with the person you're commentating with being too energetic in a way that feels off especially since he's sometimes just saying things to say them even when he doesn't really get what's going on. All in all its a good video though
High quality! Keep it up
Plays just like a little brother would. Spamming one attack
So you just made Amiibo+
Edit: I am your 1,000th subscriber. If you blow up, remember me
AGHHGHGH!!!! Number 1000!!! TYSM!!!! If you wanna join the discord I can give you the holy "1000th Sub" role lmao
me who knows nothing about computers watching this: "Just show me the robot beating mango"
Honestly super well done! Def a sub these concepts are super cool
Im trying to figure out how to make a bot thats good at dbfz. Besides having a lot of modding experience in my early 20s with Fo4 and Skyrim, I'm totally new to programming. This is an inspiring example of what I'm shooting for.
When all of the reinforcement learning just amounted to the agent jumping I COULD RELATE.
This is why we are supercomputers, just think about the power it takes to 'adapt'
The return of the king 👑
Maybe try to incorporate advantage:disadvantage odds, like with stage control, to help the bot better guide itself to exploring good strategies
Thank you algorithm! Really cool idea! I would love to see more!
Great video! I'm not going to say much here but this probably would've benefited from training the ai against itself, either way you earned a sub :D
Then he found out about Slippi replays... Or maybe he hasn't. That'll help with this process a lot.
my man coded a fucking Amiibo.
If you do more nintendo/smash content, you’ll see a lot more traffic for sure. This video’s going to 10k+ views quick
bro. this is genius!
I wish you'd given it a slightly higher reward for killing after taunting XD
Btw there is a smashbot ai, and it has competed against pro player and has won!
This is too underrated
no idea who you are but now im subbed
ALTF4 can help you with the AI.
Great video
P.S. Altf4 is the name of the guy that does the AI for the SmashBot.
AFAIK smashbot doesn't have a neural network it's just a really cracked cpu, but yeah ALTF4 would probably be very helpful anyhow.
ALTF4 is the creator of libmelee which is used in this video as the API to the game's instance.
Have you considered using the replays to train a relatively small model to predict what inputs are being pressed for a given frame?
This model could then generate the inputs for an unlabeled smash video meaning you could use all smash videos from YouTube as part of the training data for the main model. OpenAI did this for their Minecraft playing model.
One thing that people who make smash AIs never know about (and, to be fair, literally idk how you would know about this unless you’re very familiar with top players haha) are issues with controllers. There are a lot of minor shitty things that exist in the game’s code that massively impact gameplay that are actually dependent based on the mechanisms of your specific controller.
The major MAJOR issue is that most serious players play with UCF (universal controller fix) which fixes some of these issues… but then if you try and train an AI based on those inputs into vanilla melee, then they may not work the same (you can run into a similar issue if someone mods custom controls, such as removing tap jump). The other thing though is I have no clue how this impacts what an AI does… as in, I have absolutely no clue if those same controller issues would apply to a created AI. A bad controller, for example, might have a ~40% chance to successfully dashback even when executing the exact same inputs. If an AI tries to replicate that, do they also have the same odds? But also, if training off of someone with a bad controller, they’re less likely to try and dashback even when it may be the optimal option…
Idk, all I’m saying is that anyone very serious about AIs playing smash REALLY needs to look into the controller as a potential barrier (or variable) too…
You said the Fox could only perform one move, but he was clearly using shine into fire Fox each time lol
I love the hours of work, the computer science and all the programming needed, and plays at your 10 year old brother level lol
Slippi replay files would have probably made training the model much easier
Watching the AI try to wavedash was cute lol. It looked like when a human is first learning
"That would take 83 days. I don't have that time." Also: "3 months of work down the drain."
Loved the code bullet joke
you could have trained the AI on BORPS sets. he's a pro player who uses literally no techskill.
I felt that when you said you forgot to do ‘self.action’ and wasted an excessive amount of time.
Although you should consider testing your setup on a randomly initialized network to see if the behavior changes *before* completely changing your model to PyTorch
I did! I may not have explained this very well, but the action taken was the output of the neural network, and at the beginning it always appeared to be training properly. It was for the feedback where it was always being fed zero (eg. even if the performed action was an fsmash, the agent would learn that the jump reward is 0.42)
Well, glad you solved it. If you continue on this project, it could be interesting to feed the network the past few frames to give it a short term memory. Perhaps that would even provide enough context to learn sequences of moves to do advanced movement.
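Feeding the network the past few frames is often done by stacking recent feature vectors into one longer input. Here is a minimal Python sketch of that idea. The class name `FrameStack` and the feature sizes are hypothetical and not from the video's code.

```python
from collections import deque


class FrameStack:
    """Keeps the last `n_frames` feature vectors and flattens them."""

    def __init__(self, n_frames, feature_size):
        # Start with zero-filled frames so the output size is constant.
        self.frames = deque(
            ([0.0] * feature_size for _ in range(n_frames)),
            maxlen=n_frames,
        )

    def push(self, features):
        """Add the newest frame and return the stacked network input."""
        self.frames.append(list(features))  # oldest frame drops off
        return [value for frame in self.frames for value in frame]


if __name__ == "__main__":
    stack = FrameStack(n_frames=3, feature_size=2)
    print(stack.push([1.0, 2.0]))
    print(stack.push([3.0, 4.0]))
```

The network's input layer would then need `n_frames * feature_size` inputs instead of `feature_size`.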
ai training is just like ringing a million pavlov bells
I would happily volunteer to play smash to give it some data to work with. I'll play all day! I'm really good but I don't wave dash compulsively.
Hi cool project.
If I was doing this, I would get a large set of pro games, use the character positions and state (ie are they on the ground, in hit stun, in landing lag, %damage etc.), and controller inputs, as input to the neural net, and the result of the stock as output. Then I could potentially get a training sample for each frame of each game, although just taking samples when there's player input would be better. You could get hundreds of samples per game across 100s (preferably 1000s) of games.
Then the net would learn the likelihood of the result of each given stock, given the current state and any potential controller inputs, and choose the controller inputs most likely to result in winning the stock.
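The selection step described here, picking the inputs with the highest predicted chance of winning the stock, can be sketched as follows in Python. The trained network is replaced by a toy stand-in function, and every name (`choose_inputs`, `toy_win_probability`, the state and input formats) is hypothetical.

```python
def choose_inputs(state, candidates, win_probability):
    """Return the candidate inputs with the highest predicted win chance.

    win_probability(state, inputs) stands in for the trained network.
    """
    return max(candidates, key=lambda inputs: win_probability(state, inputs))


def toy_win_probability(state, inputs):
    """Toy stand-in model: prefers attacking when the opponent is close."""
    if state["distance"] < 10 and inputs == "up_smash":
        return 0.7
    if inputs == "dash_forward":
        return 0.5
    return 0.3


if __name__ == "__main__":
    options = ["dash_forward", "up_smash", "shield"]
    print(choose_inputs({"distance": 5}, options, toy_win_probability))
    print(choose_inputs({"distance": 50}, options, toy_win_probability))
```

In practice the candidate list would be the discrete set of controller inputs the bot is allowed to use, scored in one batch per frame.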
nice video! :D
as a web developer the pain and suffering this project may have provoked in you is beyond my human comprehension
AI learns how to play smash bros
CPUs: Am I a joke to you?
I love that the AI just uses a lot of smash attacks.
I also created a melee bot for a class project, but it only learned how to jump over lasers or spam smash attacks. For my project I tried learning from a window of (states, actions) -> next action with rnns
Wow, that AI really knows how to play "SMASH" Bros!
HA
You created an amiibo for melee
i did this same thing with brawl for a school project in 8th grade, cool to see someone else try to do it in a different way
This is really cool shit. I've thought about programming an AI to tell me in English how to beat my opponent in real time in a game like Mortal Kombat 11. I would have to find some way to train it (that's the hard part) but then it would also get filtered through an imperfect human too, so I'm not sure it would ever really work. The goal would be to know what the opponent is about to do before they do: pick up on their pseudo-random patterns. If they always try to grab after a certain move, the AI should see that move and tell me to jump out of the way before they have a chance to grab at all, for example. Of course things get really complicated really fast but I always thought it would be pretty cool to have it tell me how to play in real time. There are certain patterns you don't even see after analyzing a replay that an AI should easily pick up on in theory.
great video, one question (sry if you already answered) but do you know if the ai uses DI when it gets hit?
Great video! Now I see why it took so long.
Only 368? Feels like I’m early to something lol
5 videos from now you're gonna have millions of views.
Comment for the YouTube algorithm overlord
7:23 I feel your suffering
Now verse it against an amiibo
Luckily the spacies’ up smash is one of the best in the game, so spamming it is far from the worst strategy
As a thought, what if you ignored actions, and instead trained it to produce raw controller inputs? Stick inputs would just be taking output from the NN, buttons would be done using a threshold.
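Decoding raw network outputs into controller inputs, as suggested here, could look roughly like this in Python. This is a sketch under assumptions: the output layout (two stick values followed by one value per button), the button list, and the function names are all hypothetical, and the outputs are treated as unbounded values that get squashed with tanh and a sigmoid.

```python
import math

# Assumed output layout: [stick_x, stick_y, one value per button].
BUTTONS = ("A", "B", "X", "Z", "L")


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def decode_outputs(outputs, threshold=0.5):
    """Turn raw network outputs into a stick position and pressed buttons."""
    # Squash the stick outputs into [-1, 1].
    stick = (math.tanh(outputs[0]), math.tanh(outputs[1]))
    # A button counts as pressed when its probability exceeds the threshold.
    pressed = {
        name for name, raw in zip(BUTTONS, outputs[2:])
        if sigmoid(raw) > threshold
    }
    return stick, pressed


if __name__ == "__main__":
    stick, pressed = decode_outputs([0.0, 10.0, 3.0, -3.0, 0.0, -1.0, 2.0])
    print(stick, sorted(pressed))
```

One design question with this approach is the threshold itself: a fixed 0.5 is the simplest choice, but sampling each button from its probability would let the bot explore more during training.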