Really interesting how it ended up just using one arm. I wonder if that was just easier for it to randomly learn, where the AI only had to learn to control 1 arm rather than 2, or if it is actually somehow more efficient than 2 armed.
I'd guess it's because it wasn't incentivized to learn to use both: if one works, why try learning that the other works too? You could fix that by giving each arm a strength value that drops while it is webbed onto something and recovers while it is not in use.
Makes sense to me, using two arms requires coordination between the two, whereas using one arm only means that the AI can just shoot a web upward on one side, then the same for the other side, with no chance of one arm fucking up the other.
Very well explained, I salute you man. Also, I love how reward systems in AI/machine learning are basically dopamine hits for the AI in the form of code. Not that they are advanced enough to "feel it", mind you. They also kind of have similar capabilities to "job experience", neuroplasticity and muscle memory, though more so simulated, which is just an amusing thought to think about.
You taught me more about Deep Reinforcement Learning and Neural Networks in 10 minutes than the two semesters I wasted in a "capstone" class in college.
I rarely write comments and only got here through the YouTube algorithm. But MAN I really enjoyed this video. Not too technical and not too basic. I rarely watch a recommended video for more than 2-3 minutes but I HAD to watch this all the way through. You explained it in a very entertaining way and I understand AI MUCH better than before. Thank you!
The problem lies in allowing it to be able to optimize with just one. A sufficient cooldown should be part of each web shooter to better imitate how Spider-Man can't make webs just instantly appear like laser beams. Once it loses efficiency with one arm, it should start trying to go faster with both.
It's not every day you find someone who backflips and someone who explains AI... It's even rarer to find a channel that does both. Backflip you magnificent AI
imagine an online spider man game like pogo stuck and you need to control your hands with the mouse and shoot webs to get through levels, and then when you finish it it gives you like an open world map with stuff to do
Appreciate the breakdown in the entire video, but 14:04 onward is a vibe that needs to be made into a short. "AI Spooderman webslinging at Sunset" You'll get 1 million views easy 🤣
I think you did a very good job explaining a lot of this, as someone who knows little about machine learning. However, there was one part I was really lost. I was hoping you could clarify: how do the "hidden nodes" work / what do they do? I assume they're the actual math that lets the AI decide what to do, but I don't really get how.
Each time information flows through the neural network all the connections between the nodes alter it. Having hidden nodes drastically increases the amount of connections, which gives the AI more control over how it filters / transforms that information. There isn't any difference between input / hidden / output nodes besides their location in the network. Essentially the more hidden nodes you have, the more the AI can alter the information it receives = the more intelligence it can have
Another way of looking at it is that more nodes means more ways for pieces of information to "weigh" against each other. Each node essentially has a value and a weight for how much that node influences its connection. It's all basically percents. You want lots and lots of connections because it leads to a more in-depth and nuanced way for information to relate. The relative angles of the arms and each joint should relate to how the web is fired, and if it cares about efficient motion through air resistance, then it will also want to factor in the other body parts too. The more connections, the more complex and nuanced relationships you can factor into the system.
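To make the two replies above a bit more concrete, here is a minimal sketch of information flowing through a tiny network. The layer sizes (4 inputs, 8 hidden nodes, 2 outputs), the random weights and the `tanh` activation are all assumptions for illustration, not the network used in the video.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """One fully connected layer: a weight per connection plus a bias per node."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(layer, inputs):
    """Each node sums its weighted inputs, adds its bias, and squashes the result."""
    weights, biases = layer
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def count_connections(sizes):
    """Number of weighted connections between consecutive layers."""
    return sum(a * b for a, b in zip(sizes, sizes[1:]))

rng = random.Random(0)
sizes = [4, 8, 2]  # hypothetical: 4 inputs, 8 hidden nodes, 2 outputs
layers = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

signal = [0.5, -0.2, 0.1, 0.9]  # made-up observation values
for layer in layers:
    signal = forward(layer, signal)  # every layer transforms the information

print("outputs:", signal)
print("connections without hidden layer:", count_connections([4, 2]))
print("connections with hidden layer:", count_connections(sizes))
```

With these made-up sizes, adding the hidden layer takes the network from 8 to 48 connections, which is the "more ways for information to relate" point in numbers.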
I'd love to see what adding a limit to the number of webs would do. If it's only got a certain amount per arm, would it alternate arms? Would it swing further before using the next web? I wonder if it would look more like the Spider-Man we're familiar with or if it would come up with some crazy nonsense 😅
I've been interested in AI/ML for a while now, more specifically Reinforcement Learning which is the one described in the video. I already knew the general idea of what it is and how it works but I must say, your description was the most clear one I've heard so far.
Actually there's a course on Coursera by Stanford Professor Andrew Ng, who's one of the founders of deep learning methods, that's pretty easy and good for an introduction. I know a lot of people and friends of mine recommended it, and I'd definitely recommend it.
At first I thought this video would just be another neural-network-plays-a-game video and that I wouldn't learn anything new from it, but I am so glad that I was wrong! You presented these complicated topics so well, along with some comedy along the way! I wonder how advanced the AI could get given more inputs and outputs to control the other parts of the body, more time, and a more advanced reward function to encourage spiderman to stay up straight, move fluidly, avoid walls, and perhaps add in some acrobatic flair
10:49 - "Yo Spiderman, you good bro?" But in all seriousness, this was a really good video - explained all the concepts in a concise and easy to follow way (even if some of them went past my head lol). Looking forward to more content bro!
Been there since piderman was just a skydiver occasionally smacking its face against a wall. But seriously you explained the algorithm really well and I'm just surprised how it went from a broken atari session to... this.
This is the most easy to understand description of RL that I've come across! They should play this video for first year students. Funny and informative, keep it up!
2 things, I find it extremely fascinating that an AI decided it was going to be right handed instead of ambidextrous. It looks like spiderman is asleep and he's unconsciously swinging.
The moment you said "If you're still here... enjoy" something in my brain just clicked like the keyword for a sleeper agent to wake up. You nailed that phrase! felt so good to be hit with a moment of nostalgia unexpectedly like that :)
Interesting! Usually, the AI has many ghost clones of itself in each generation to decrease the learning time significantly. Do you not need to do this because of PPO? Is it really just that effective? Or was this a relatively easy task for AI to learn compared to something more complex like running?
It's perfectly reasonable to have multiple AI running at the same time. You can also speed up the environment. I tried both methods but I found that running the environment faster was better
You would do that with a genetic algorithm because there is no back propagation, just mutations. It also takes a lot longer to converge on desired behavior.
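For anyone wondering what "no back propagation, just mutations" looks like, here is a toy sketch of the clone-and-mutate approach. The fitness function, the target values and all the numbers are made up for illustration; it is not the training setup from the video, which used PPO.

```python
import random

def fitness(params, target=(1.0, -2.0, 0.5)):
    """Toy score: higher is better, best possible is 0 when params hit the target."""
    return -sum((p - t) ** 2 for p, t in zip(params, target))

def evolve(generations=50, clones=20, sigma=0.1, seed=0):
    """Each generation, spawn mutated clones of the best agent and keep any improvement."""
    rng = random.Random(seed)
    best = [0.0, 0.0, 0.0]
    history = [fitness(best)]
    for _ in range(generations):
        # The "ghost clones": copies of the current best with random noise added.
        mutants = [[p + rng.gauss(0.0, sigma) for p in best] for _ in range(clones)]
        candidate = max(mutants, key=fitness)
        # No gradients anywhere: a clone only replaces the parent if it scores higher.
        if fitness(candidate) > fitness(best):
            best = candidate
        history.append(fitness(best))
    return best, history

best, history = evolve()
print("start fitness:", history[0])
print("final fitness:", history[-1])
```

The many clones per generation are what make up for the lack of a gradient, which is roughly why this style tends to need more samples than a backprop-based method.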
The slap noise near the start from it hitting the floor had me laughing way more than it should have. It also made me think of when I was little and had a tall bed: I fell off it, and apparently my mom heard the “smack” of me hitting the floor and came in and saw me lying face down on the floor, still sleeping. I laugh whenever I imagine it.
Interesting video. I would be interested to see what would happen if you turned hitting the wall into a fail condition. It's too bad the AI doesn't have some control over its lower body. Maybe locking the legs together and allowing the AI to pivot as a sort of weight to help build momentum.
12:59 This is the very first time after the dislike update I see a youtuber publishing private dislikes ever. So should I suppose people still use it? Not that bad, I’d say
"Why do you keep walking into the wall" "If I break enough bones, they will learn how to climb up the wall, achieve orbital velocity, and cure cancer eventually"
Spiderman: homeschooled
Iron Spider: Retaught.
Spiderman: Learning from zero
@@dandabossthesecond3599 no he said homeschooled because of the theme of “home” coming in Spider-Man titles
@@aahilmemon ik
The A-lazy-ing SpAIderman (the next new one in the spiderverse?)
I love how chill and lazy the AI Spiderman seems. It's like Spiderman putting the least effort into moving about with his webs, whilst allowing his body to just ragdoll with it all.
damn 69 likes i feel bad for u
He's napping while swingin'
Why look cool when it gets the job done?
@@Dionyzos Asleep is the new cool.
this is what happens when peter is knocked out and the spidersense is keeping him safe
I like how the AI just resorts to using one hand for web shooting once it gets going. It’s like “why do I need two hands to shoot? Seems like a waste”
The real Spider-Man should take notes, obviously the way he’s been doing it is less efficient
@@maxiliarydendrite8926 Sacrificing Efficiency for Style is something Spiderman would do, though
Yes
It's also better for a living being to use both arms bc it would even out the amount of strain. Plus the arm that's used all the time would probably be more muscular than the unused arm and I think we all know what that typically implies lol
@@shytendeakatamanoir9740 Spider-Man. Quite the beautiful word.
13:06
- he almost fell down
- saved himself in the last second
- celebration backflip
I don't think it could've been more perfectly timed with the commentary ending
He also kicked the guy in the face 😂
And showed us a couple of the cute fishie pedestrians :)
that part is so smooth
Yes
I think the reason the ai moved to using only one hand was to minimize the randomness that happens to its decisions. Since half the time they don't affect the outcome if you don't use one of them.
could have been solved if the webs toggled hands
@@BusinessWolf1 you about to toggle these hand lmao
oh yeah, this is big brain
"You can't screw with my movements if I just don't move!"
I wondered if dropping 1 of the inputs (the left hand) allowed it to use more of the hidden layers to improve its performance with the right hand. I'm not a doctor but that might be analogous to right/left-handedness in humans.
@@BusinessWolf1 or reward facing forward
because of the spiderverse this is a canon spiderman
I love this. This was the same with Sonic for a while, too.
Makes sense
We have to know what's his canon story
@@DorkViews Someone made a typo in the code of its predecessor, causing it to have an error. Truly heartbreaking.
@DorkViews Let's run it back one more time. My name is 01010000 01100101 01110100 01100101 01110010 00100000 01010000 01100001 01110010 01101011 01100101 01110010 and for a very few minutes, I have been the computer's one and only Spider-Man. I had my Uncle Ben be deleted and Gwen die to a virus, but it's okay. I have my own responsibility: saving the world.
ten years ago i would not imagine myself sitting here eating my food while watching an AI grow up to be spiderman
It's all fun and games until it becomes self-aware and launches the nuclear missiles.
@@miller-joel so true i hate when it happens ong
@@puplos125 ruins a perfectly good Tuesday like nothing else
Everyone can wear the mask
@@lord_gyver LMAO
"With great distance, comes great rewards" - Piderman
sπderman
Siperman
Der-Man
spooder man
Sperman
5:54 Does this mean you could teach a jellyfish to be SpiderMan 10 times faster than this computer?
If the neurons were stripped blank without being damaged Id say why not
@@ogluqqychess4452 this reminds me of a project by some science youtuber to use human neurons to pilot a drone
@@talison718 isn't nearly every drone piloted by human neurons?
@@moritzkramer355 yup, but I am talking about putting neurons in a plate and connecting them to wires and then using a simulator to train them to fly a drone
@@talison718 quite unnecessary if you already have a brain but cool i guess
It's ironic because in most spider-man games, quick "thwips" are usually faster than using full swings, as you keep momentum better. Long swings have a curve to them and usually take a bit longer than just jumping off at the apex, which essentially looks like you are only swinging half way before thwipping again. It seems the AI has learned this.
Hope to god insomniac adds thwips to spiderman 2💀
Mathematically, the fastest way to move would be to have a constant force in a constant direction balancing out drag and gravity. The way to approximate constant force and direction is to constantly reshoot in the same direction.
@@bscutajar Well, almost the same direction. There would need to be constant angular adjustments to maintain the optimum elevation
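A rough way to write down the "constant force balancing drag and gravity" idea, as a sketch only: assume linear drag with a made-up coefficient $c$, a mass $m$, and a constant web pull $\mathbf{F}$ (none of these come from the video). The speed settles where the forces cancel:

```latex
m\,\dot{\mathbf{v}} = \mathbf{F} + m\,\mathbf{g} - c\,\mathbf{v}
\quad\Longrightarrow\quad
\mathbf{v}_{\infty} = \frac{\mathbf{F} + m\,\mathbf{g}}{c}
```

Under that assumption, holding altitude needs the vertical part of the pull to equal the weight, $F_y = m g$, and whatever is left over horizontally sets the cruising speed, $v_x = F_x / c$. That is also why the web angle needs the constant small adjustments mentioned in the reply above.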
The Grappendix
@@yesno1085 cruelty squad reference?
Let’s just sit and applaud the fact this man can code this stuff, animate fun stuff, and WRITE what happens in a coherent way for new people
Holy crap 👏👏👏👏👏👏👏
Even more
He composes the music that he uses for his videos
@@Wizzkidwas no way what?? Even better!
yeop i sure love seeing how everyone and their dog is smarter and more successful than me
@@crylune same, lol
I love how the web-slinging sound is just you going _"chu" "shue" & "shu"_
Don't forget "shuye"
I lowkey hear the words "chew" "chewy"
A thousand times, thank you for the segment at the end following the trained model! It is SO frustrating when a channel explains something for ten minutes, and then instead of giving you the gratification of a finished product (something I'm sure the creator enjoyed plenty of) they just end off with "whelp thanks for watching! byeeeee!"
Yessssss!!!!! For real!!!!!
@@pinkie723 foshooooo
@@pinkie723 Based pfp
@@R0TEK Thanks lol
That's why I just skip to the end. I have a brain. Be like me.
I like how it uses little micro-adjustments like you would do with thrusters in space. It's cool to see it so casually correct its course.
I love how occasionally it does a spider-man like trick or flip, but for the most part it just like flails around and lets gravity have its way.
I'd love to see a part 2 to this that attempts to make variants by adding silly additional rewards (aka reinforcement learning) to the current spiderman, like a version that tries to do as many backflips as possible whilst also going fast in a forwards direction.
I think specifying that the left arm must fire webs at the left wall and the right arm must fire webs at the right wall would be a good option for this.
Keeping the face forward would be fun, but could cancel out backflips. Avoiding hitting buildings would probably be for the best.
@@markellii3093 Well, they could attach Unity's equivalent of a scene component to the ragdoll's pelvis, and only check its yaw to make sure it remains looking forward while still allowing it to do flips
Sounds cool but first I want to see it learn not to hit the walls or ground, and to only use the web up to 1x/second. I think it would feel more natural. Maybe also keeping the head upright and the face forward too.
@@Bruva_Ayamhyt typically you don't want to be too restrictive, because with an AI like this it will either cheat your restrictions or be locked out of potentially required steps of learning. Not to mention the interesting or groundbreaking solutions it could come up with outside of your parameters. That's why I said left for left, and so on. After thinking on it a while longer, I think it could be broken down even further into "reducing the amount of points earned for every successive use of the same arm in a row". Mostly because, as long as the AI hasn't totally given up on ever using its other arm, the point adjustment can just be made and the AI should be able to continue training successfully without regressing too far.
I'd like to see this with more constraints added (such as web cooldown, web transit time, Spider-man needing to see where he's going instead of flopping around, how body position affects momentum, etc).
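A minimal sketch of the "fewer points for every successive use of the same arm" idea from the comment above. The decay rate, the floor and the base reward are made-up numbers, and this is one possible reading of the suggestion rather than anything from the video's code.

```python
def arm_repeat_multiplier(streak, decay=0.8, floor=0.2):
    """Reward multiplier for the streak-th shot in a row with the same arm (1 = first)."""
    return max(floor, decay ** (streak - 1))

def shaped_rewards(arm_sequence, base_reward=1.0):
    """Scale the reward for each web shot by how often the same arm was just used."""
    rewards = []
    streak = 0
    last_arm = None
    for arm in arm_sequence:
        streak = streak + 1 if arm == last_arm else 1
        last_arm = arm
        rewards.append(base_reward * arm_repeat_multiplier(streak))
    return rewards

alternating = shaped_rewards(["L", "R", "L", "R"])
one_armed = shaped_rewards(["R", "R", "R", "R"])
print("alternating:", alternating)
print("one-armed:  ", one_armed)
```

The floor keeps a one-armed swing worth something, which fits the point about not being too restrictive: the habit gets cheaper to break without ever being outright forbidden.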
and how long the pizza can stay hot mhmhm
doable but if this took 11 hours that might take a few days
Yeah I like this idea as well. Maybe each arm can only do one web every 1-2 seconds, or the right arm can only hit the right wall and the left arm can only hit the left wall.
@@megatroneata9911 yes but if he adds more artificial neurons then that can shorten the time needed
@@megatroneata9911 As long as no new inputs are added, the constraints will actually increase the training speed by reducing the search space. However, things like adding vision for the actor or adding additional factors to the environment like more physics can make things exponentially more difficult.
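Here is a small sketch of the per-arm web cooldown suggested in this thread. The class name, the 1 second cooldown and the attempt timings are all hypothetical; the point is only that a cooldown blocks some actions outright instead of adding new inputs.

```python
class WebShooter:
    """One arm's shooter: refuses to fire again until the cooldown has passed."""

    def __init__(self, cooldown=1.0):
        self.cooldown = cooldown
        self.ready_at = 0.0

    def try_fire(self, now):
        """Return True and start the cooldown if the shooter is ready at time `now`."""
        if now < self.ready_at:
            return False  # request ignored: the arm is still cooling down
        self.ready_at = now + self.cooldown
        return True

def count_fires(attempt_times, cooldown=1.0):
    """How many of the requested shots actually go off for a single arm."""
    shooter = WebShooter(cooldown)
    return sum(1 for t in attempt_times if shooter.try_fire(t))

# A spammy policy that asks for a web every 0.25 s over two seconds.
attempts = [i * 0.25 for i in range(8)]
print("requested:", len(attempts), "fired:", count_fires(attempts))
```

In this made-up run only the shots at 0.0 s and 1.0 s go off, so spamming one arm stops paying and the second arm becomes the only way to shoot more often.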
You should make it consider hitting walls a bad thing (to train it to stick closer to the middle) and consider one of the rewards to keep the body facing forward, while facing back would be a discount. That way, it will probably give favor to alternating which arm it uses, to keep facing forward and to stay at the center of the road.
12:48 “It’s so good in fact, that it doesn’t need to look where it’s going” It developed Spidey sense without any programming lmao 😂
What's his canon event
Tax evasion
Falling flat on the ground for the first time
Uncle yen died in a car crash rite after Aiter Airker was dropped off at school by him
His canon event is the code breaking
I don't see how that's my problem
I learned more in 15 mins here than I did in a semester of Reinforcement Learning. Maybe not, but this one is a lot simpler and visually satisfying
14:50 looks like a trick that spider-man would actually do while swinging😂
Yeah, especially miles morales Spiderman, he's just going for style above all else
I thought the AI was just trained to swing like Spider-man, not actually become Spider-man
This is not showing off, it's a necessity to swing properly...
So would this 00:18
It's always fascinating to watch how quickly an AI can go from absolute failure to quite competent at a task. Thanks for breaking it all down, it's a very complex topic that makes my head spin - but you explained it very well, in simple enough terms and with fantastic visual aids, that I was able to follow along very easily and feel like I understand the general process much, much better than I did before!
10:52
Seizure man, Seizure man
Has a seizure when he can
Lights a flash, he's collapsed
Epileptic on the task
LOOK OUT!
Dear god, there’s flashing lights
This is very good
appreciate the swinging bit at the end, but also for providing a full explanation
a bunch of youtubers I've watched will start out explaining something then blur all the words together to make it seem uber complex, and then skip right to the final product without actually explaining the steps they took to get there
*cough cough* Dani
You explained all of the concepts really well for someone who doesn't know anything about AI, but as someone who does know a bit, one part I was curious about, which you didn't go into, is how you defined the reward. For example, did you want it to simply never fall to the ground, or were there other things you punished / rewarded apart from the general things you mentioned? I would be more interested in the process as well, which changes you made along the way etc. Maybe a separate video that is tailored to an audience that knows a bit about AI and goes more into depth would be interesting!
I am going to write a small paper explaining the details that I didn't cover in the video as well as uploading code
@@b2stud cool, thanks!
@@b2stud You're an amazing web developer.
@@b2stud very excited for that!
:)
They need to teach ai to value literal style points as well as their primary goal.
YES PLEASE
Literal style points as opposed to figurative style points?
@@symbiote1982pk yes. Style points in normal conversation is typically just a way to tell somebody they did something cool. Style points in the example I posted would be another goal the AI could track and would be an actual score system to improve learning.
Hope this helps
The exaggerated swagger
Hmm, how would you quantify style in the reward system? Backflips, using two hands, and facing forward are worth more points or something?
the sound at 0:18 jumpscared me
Lol it was funny tho
I love that this both contains some of the most down-to-earth explanations of how AI works which were really informative, and also this: 10:34
6:44 that twist was fire tho
@1:30 Liked just for the gargling
I got an ad there LOL
😂😂😂
Dave Dick Damesone
I wasn't here for learning AI, but this is probably the most motivating video I've ever seen. You explained like everything necessary, so it gives the feeling that it is so easy
That is because he skipped the 4 hardest parts.
Choosing the right algorithm to train with (in this case PPO)
Choosing the right state to learn from
Choosing the right reward to learn with
Choosing the right actions for the model to take
Those 4 parts are the parts where it goes from "science" to "art/intuition"
But you should try it!!!
And by hard I mean it is the equivalent of choosing what color to paint a painting. Picking a paint is not a difficult process. Picking a paint that will mesh well with all the other paints and end up with a really good painting is much more difficult.
@@dtracers do you work in this field?
I normally do not write comments but as someone who knows a lot about AI this was the best video I have ever seen explain the concepts.
A "part 2" explaining the couple of concepts glossed over would be super interesting in a "fine tuning the spiderman"
I wish we had AI interfaces that were as easy as you made them seem and if they get to that point then your video is the perfect "here is what you need to know to make your own AI" starter video.
Your sense of humor is stellar man. You definitely deserve more subs.
tyty
12:02 Exaggerated Swagger
Of a White Guy
This was awesome, you explained it in greater detail than any other programming youtuber I've watched. I noticed that the AI is still very jumpy, shooting unnecessary webs out. I wonder if including the number of webs shot in the reward system would incentivize the long and wide swings we see Spiderman usually do, as opposed to the short spastic swings the AI is currently doing.
I did play around with penalizing the AI for casting too many webs, but I think it's very hard to get right. If you penalize it too much (which honestly wasn't even that much) then the AI becomes hesitant to shoot webs and it stops learning.
@@b2stud what if you rewarded it for longer webs instead of penalizing it for casting too many, with a small in-code timer to measure web time?
Or set an in-game timer so the AI can only cast a web after a certain time has passed since the last one. If you're rewarding speed, that's kind of an incentive for the AI to spew more webs to get less pendulum-type motion and more linear motion, so I'd suggest using that reward only in combination with the timer.
@@b2stud actually I just realised you can reward the AI for longer web distances and for more altitude on the web, and do that in combination with the speed and distance rewards
@@arpita1shrivas All of those are good ideas, but the last one is very smart. Definitely easy to mess up, but if implemented right would make beautiful swings.
I think the most effective way to do it might be to have him track his velocity along each axis and diminish rewards for losing too much velocity at once, though you might also need to track acceleration along each axis to keep track of the changes in velocity, so it could be a pain. Distance and speed are also good, but to maximize returns I feel tracking per-axis velocity and acceleration would shape behavior better, because it allows shorter web swings when they're warranted (i.e., if you'd lose velocity by hitting something, or trade away too much velocity in one direction on a long web) without creating much incentive for short, high-speed slingshot webs, as those will almost always lose you a lot of velocity along some axis. Maybe I'm overthinking it though.
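For anyone who wants to play with the reward ideas in this thread, here is a rough sketch of how they could be combined into one shaped reward. This is not the video's code: `StepInfo`, `shaped_reward` and every weight below are made up for illustration, and as mentioned above the web penalty is easy to overdo, so it is kept tiny.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Hypothetical per-step measurements from the simulation."""
    forward_speed: float   # speed along the street this step
    webs_fired: int        # number of new webs cast this step
    web_length: float      # length of the attached web (0.0 if none)
    anchor_height: float   # height of the web's anchor point (0.0 if none)


def shaped_reward(info: StepInfo,
                  web_penalty: float = 0.01,
                  length_bonus: float = 0.002,
                  height_bonus: float = 0.001) -> float:
    """Speed is the main reward; the other terms only nudge the swing style."""
    reward = info.forward_speed
    reward -= web_penalty * info.webs_fired      # discourage web spam
    reward += length_bonus * info.web_length     # favour longer webs
    reward += height_bonus * info.anchor_height  # favour higher anchors
    return reward
```

The shaping terms are deliberately small next to the speed term, so the agent is still mostly paid for moving fast and only gently pushed toward longer, higher swings.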
Amazing video, just as your content always is! I'd like to see the AI being more rewarded for speed, so it goes flying through the city.
Thanks for always making these topics (which require a lot of effort to comprehend) into something simple.
Really interesting how it ended up just using one arm. I wonder if that was just easier for it to randomly learn, where the AI only had to learn to control 1 arm rather than 2, or if it is actually somehow more efficient than 2 armed.
I'd guess it's because it wasn't incentivized to learn to use both.
If one works,
why try learning that the other works too?
You could fix that by giving each arm a strength value:
the value drops when webbed onto something but increases when not used.
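A minimal sketch of that per-arm strength idea, assuming a step-based simulation. The class name `ArmStrength` and the drain and recovery rates are invented for illustration and are not from the video.

```python
class ArmStrength:
    """Hypothetical per-arm strength value clamped to the range [0.0, 1.0]."""

    def __init__(self, drain: float = 0.2, recover: float = 0.1) -> None:
        self.value = 1.0        # start fully rested
        self.drain = drain      # lost per step while a web is attached
        self.recover = recover  # regained per step while the arm is idle

    def step(self, attached: bool) -> float:
        """Update the strength for one simulation step and return it."""
        if attached:
            self.value = max(0.0, self.value - self.drain)
        else:
            self.value = min(1.0, self.value + self.recover)
        return self.value
```

The value could then scale the pull of that arm's web, or be fed in as an extra input, so that leaning on one arm all the time stops paying off.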
Makes sense to me, using two arms requires coordination between the two, whereas using one arm only means that the AI can just shoot a web upward on one side, then the same for the other side, with no chance of one arm fucking up the other.
It isn’t concerned with style points after all. 😅
Very well explained, I salute you man. Also, I love how reward systems in AI/machine learning are basically dopamine hits for the AI in the form of code. Not that they are advanced enough to "feel it", mind you. They also kind of have capabilities similar to "job experience", neuroplasticity and muscle memory, though more simulated, which is just an amusing thought.
Please make 10 hour loops of ai swinging, this helped me sleep so well
Strangely pleasant to watch him just swing for 2 minutes straight
This guy is really producing the high quality content out there. Love the videos so keep it up💯
"Roman Sakutin" passed off your work as his own, and also inserted an advertisement in the video. You can try throwing a strike on his video.
Why did I find the "Just Swinging" part so relaxing 😭
You taught me more about Deep Reinforcement Learning and Neural Networks in 10 minutes than the two semesters I wasted in a "capstone" class in college.
I rarely write comments and only got here through the YouTube algorithm. But MAN I really enjoyed this video. Not too technical and not too basic. I rarely watch a recommended video for more than 2-3 minutes but I HAD to watch this all the way through. You explained it in a very entertaining way and I understand AI MUCH better than before. Thank you!
Np. I'm very happy to hear that!
I think it would have been a good idea to reward the AI for using both of its hands, possibly alternating or swinging with both.
@Erinç Argımak fair, but we want style darnit
The problem lies in allowing it to be able to optimize with just one. A sufficient cooldown should be part of each web shooter to better imitate how Spider-Man can't make webs just instantly appear like laser beams. Once it loses efficiency with one arm, it should start trying to go faster with both.
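The cooldown idea could look something like the sketch below, assuming the simulation advances in discrete steps. `WebShooter` and the default of 30 steps are hypothetical and not taken from the video.

```python
class WebShooter:
    """Hypothetical web shooter that must wait between shots."""

    def __init__(self, cooldown_steps: int = 30) -> None:
        self.cooldown_steps = cooldown_steps
        self.steps_until_ready = 0  # 0 means the shooter can fire now

    def tick(self) -> None:
        """Advance one simulation step."""
        if self.steps_until_ready > 0:
            self.steps_until_ready -= 1

    def try_fire(self) -> bool:
        """Fire if ready and start the cooldown; otherwise do nothing."""
        if self.steps_until_ready > 0:
            return False
        self.steps_until_ready = self.cooldown_steps
        return True
```

With one shooter per arm, the second arm is the only way to fire again while the first is still cooling down, which is the pressure toward alternating that this comment describes.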
yeah style points
It's not everyday you find someone who backflips and someone who explains AI...
It's even rarer to find a channel that does both.
Backflip you magnificent AI
The chances are millions to one
The AI backflips, he explains, neither does both
imagine an online spider man game like Pogostuck where you control your hands with the mouse and shoot webs to get through levels, and then when you finish it, it gives you an open world map with stuff to do
0:19 Best thing ever.
Appreciate the breakdown in the entire video, but 14:04 onward is a vibe that needs to be made into a short. "AI Spooderman webslinging at Sunset" You'll get 1 million views easy 🤣
I think you did a very good job explaining a lot of this, as someone who knows little about machine learning. However, there was one part where I was really lost. I was hoping you could clarify: how do the "hidden nodes" work / what do they do? I assume they're the actual math that lets the AI decide what to do, but I don't really get how.
Each time information flows through the neural network, all the connections between the nodes alter it. Having hidden nodes drastically increases the number of connections, which gives the AI more control over how it filters / transforms that information. There isn't any difference between input / hidden / output nodes besides their location in the network.
Essentially the more hidden nodes you have, the more the AI can alter the information it receives = the more intelligence it can have
Another way of looking at it is that more nodes means more ways for information to "weigh" against each other. Each node essentially has a value and a weight for how much that node influences its connection. It's all basically percents. You want lots and lots of connections because it leads to a more in-depth and nuanced way for information to relate.
The relative angles of the arms and each joint should relate to how the web is fired, and if caring about efficient motion through air resistance, then it will also want to factor in the other body parts too.
The more connections, the more complex and nuanced relationships you can factor into the system.
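To make the hidden node explanation above concrete, here is a tiny forward pass in plain Python. It is only a sketch of a generic fully connected network, not the network from the video; the function names, the `tanh` activation and the layer sizes in the test are assumptions.

```python
import math


def layer(inputs, weights, biases):
    """One layer: each node takes a weighted sum of all inputs, then squashes it."""
    return [
        math.tanh(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_biases):
    """Inputs flow through the hidden nodes first, then on to the output nodes."""
    hidden = layer(inputs, hidden_weights, hidden_biases)
    return layer(hidden, output_weights, output_biases)
```

Each row of `hidden_weights` is one hidden node, so adding a node adds a whole row of connections whose weights training can adjust. That is the sense in which more hidden nodes give the network more ways to transform its inputs.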
love the little mid-air pose at 13:11
I'd love to see what adding a limit to the number of webs would do. If it's only got a certain amount per arm, would it alternate arms? Would it swing further before using the next web? I wonder if it would look more like the Spider-Man we're familiar with or if it would come up with some crazy nonsense 😅
1:36 “Saudi arabia riyals”
Yes ﷼
This was such a well detailed video! I love learning the more technical side of these AI endeavors. Great video!
Watching it catch itself at 14:20 was really cool
I've been interested in AI/ML for a while now, more specifically Reinforcement Learning which is the one described in the video. I already knew the general idea of what it is and how it works but I must say, your description was the most clear one I've heard so far.
Actually there's a course on Coursera by Stanford professor Andrew Ng, one of the founders of deep learning methods, that's pretty easy and good for an introduction. A lot of people and friends of mine recommended it, and I'd definitely recommend it too.
Why is this actually one of my favorite YouTube videos ever
14:48 was insane
At first I thought this video would just be another neural-network-plays-a-game video and that I wouldn't learn anything new from it, but I am so glad that I was wrong! You presented these complicated topics so well, along with some comedy along the way!
I wonder how advanced the AI could get given more inputs and outputs to control the other parts of the body, more time, and a more advanced reward function to encourage spiderman to stay up straight, move fluidly, avoid walls, and perhaps add in some acrobatic flair
12:29
"overall, it just seems more confident with its actions."
*slams into a wall*
Wdym bro is wall running like you do in the PS4 games, actually he's better at wall running than every single spider man to exist
10:49 - "Yo Spiderman, you good bro?"
But in all seriousness, this was a really good video - explained all the concepts in a concise and easy to follow way (even if some of them went past my head lol).
Looking forward to more content bro!
Mate that thumbnail is actually mindboggling good. Never tapped on a video that frekin quick!😂
b2studios is just a sane code bullet
That sound when it hit the ground😂 0:19
0:39 that house makes me go brrrrrrr 🎈
That makes me wanna throw darts
Had to scroll pretty low to find a comment about the village
That makes me want to pop bloons
Been there since piderman was just a skydiver occasionally smacking its face against a wall
But seriously you explained the algorithm really well and I'm just surprised how it went from a broken atari session to... this.
0:19 what in the world😂😂😂😂
I used this timestamp about 20 times
@@isaakhatcher lol
This is the most easy to understand description of RL that I've come across! They should play this video for first year students. Funny and informative, keep it up!
3:05 *Oh no, it's Chairman Drek- HE'S BACK, and this time he's HIGH RISEN!*
2:46 "..we don't really consider the value of time." I have all videos at 1.5x speed by default.
7:30
For some reason saying "it will become slightly less wrong" instead of "slightly more correct/slightly better" gave me a chuckle.
This is a great video - came to it way late but nice work.
We need AI Spiderman in the next Spiderverse movie
Brilliantly explained video, I absolutely love your visual style as well as the little ball mascots, they got a name?
Thank you! They do have a name, I just call them fish
@@b2stud FIIIISH
13:34 Ai learns how to wall run
Yes
2 things, I find it extremely fascinating that an AI decided it was going to be right handed instead of ambidextrous. It looks like spiderman is asleep and he's unconsciously swinging.
Remember spiderman three when he just wakes up on the side of a building in the black suit? Now you know how he looked while swinging there 😂
1:01 the youtube like button animation going off because you said "button presses" is wild
2:26 Just realized this is a B2Studio version of J Jonah Jameson. Very cute! :D
look at his phone background in that shot
I want this as my wallpaper engine background, just endless swinging
I learnt so much from this video... You are fantastic and so clear with your explanations. Subbed for sure!
4:10 *Throws phone happily*
Yo when he said button the like button lit up 1:04
Feel like a little of the pizza delivery music from the Spider-Man 2 game for PS2 over the just swinging section woulda been nice haha
The moment you said "If you're still here... enjoy" something in my brain just clicked like the keyword for a sleeper agent to wake up. You nailed that phrase! felt so good to be hit with a moment of nostalgia unexpectedly like that :)
Interesting! Usually, the AI has many ghost clones of itself in each generation to decrease the learning time significantly.
Do you not need to do this because of PPO? Is it really just that effective? Or was this a relatively easy task for AI to learn compared to something more complex like running?
It's perfectly reasonable to have multiple AI running at the same time. You can also speed up the environment. I tried both methods but I found that running the environment faster was better
You would do that with a genetic algorithm because there is no back propagation, just mutations. It also takes a lot longer to converge on desired behavior.
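For contrast with PPO, a mutation-only search like the one described in this reply can be sketched as below. This is a toy version with a fixed random seed, not anything from the video: `evolve`, its parameters and the example fitness function are all hypothetical.

```python
import random


def evolve(fitness, n_params=2, pop_size=20, generations=30, sigma=0.1, seed=0):
    """Toy genetic search: no gradients, only random mutation and selection."""
    rng = random.Random(seed)
    population = [
        [rng.uniform(-1.0, 1.0) for _ in range(n_params)]
        for _ in range(pop_size)
    ]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Every clone is a randomly mutated copy of the current best.
        population = [
            [p + rng.gauss(0.0, sigma) for p in best]
            for _ in range(pop_size - 1)
        ]
        population.append(best)  # keep the unmutated best so progress is never lost
        best = max(population, key=fitness)
    return best


def example_fitness(params):
    """Hypothetical fitness: highest when every parameter is zero."""
    return -sum(p * p for p in params)
```

Each generation needs a whole population of clones just to find out which random change helped, which is one way to see why this approach usually wants many copies running at once and tends to converge more slowly than gradient-based training.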
The most detailed explanation of AI learning I've ever seen. Thank you!
How spidersman canonically swings his web
I was expecting an AI to actually learn to websling in a Spiderman videogame
The slap noise near the start from it hitting the floor had me laughing way more than it should have
It also made me think of when I was little and had a tall bed and I fell off it and apparently my mom heard the “smack” of me hitting the floor and came in and saw me laying face down on the floor still sleeping, I laugh whenever I imagine it
I now need an AI spiderman web slinging for 1 hour straight
Interesting video. I would be interested to see what would happen if you turned hitting the wall into a fail condition. It's too bad the AI doesn't have some control over its lower body. Maybe locking the legs together and allowing the AI to pivot as a sort of weight to help build momentum.
I'm really digging the dedicated bus lanes and wide sidewalks!
This has gotta be the first video that actually explains it beyond putting it in the simplest possible terms pretty cool
Imagine how many glass panes Piderman broke hitting all those buildings.
12:59 This is the very first time after the dislike update I see a youtuber publishing private dislikes ever.
So should I suppose people still use it?
Not that bad, I’d say
It is quite a collection of dislikes, I'm quite proud of it
I still think removing the dislike counter from viewing was pointless and more harmful than good.
"Why do you keep walking into the wall"
"If I break enough bones, they will learn how to climb up the wall, achieve orbital velocity, and cure cancer eventually"
0:19 best part
The movement at 14:46 was ACTUALLY really cool!!