A.I. Learns to play Snake using Deep Q Learning
- Published: Jul 11, 2019
- Can an AI learn to play the perfect game of Snake?
Huge thanks to Brilliant.org for supporting this channel, check them out: www.brilliant.org/CodeBullet
Twitter: / code_bullet
Patreon: / codebullet
Discord: / discord
Art created by @Dachi.art / dachi.art
"It'll not take 3 months"
Top 10 anime betrayals
And thats a fact
Oof
Sadly it's TRUE
cameron schiralli probably a new season of Brooklyn 9-9
Only if it was four months
"It will not take three months."
Four months later...
Wow, 21 likes in one day. There must be a lot of people waiting for him to make a new video
@@marekmiarka8550 and youre one of them.
@@lithuanianinbound589 we all are one of them on this blessed day
Technically it hasn’t been 3 months (but 4)?
Code bullet is pulling a ceeday
Code bullet 3 months ago: "it will not take 3 months"
Me after 3 months: "lmfao"
Pyren 4 months
Sure bud.
I need to change my bank.
i mean... he wasnt wrong, it took 5 months...
He’s right it didn’t take 3 months. We’re on 4.
Next is 1 year
@@Qazsnivy You have no idea how right you were
“I’m as stubborn as I am lazy” ~ Code Bullet
Relatable
very stubborn.
very lazy.
Slazy
@Daedric Skyrim Same omggg
*Relatable*
How i did it:
* Relatable *
*Relatable*
This ended up being a snake version of a dvd screensaver hitting the corner
RavenCrow only reply
@@hashirama1507 that's not true
:((((
enjoyed all 5 myself :))))) @Andres Lorenzo
@@danielstein3659 I didn't, but I'm also not 12 years old anymore so I suppose I'm old and bitter.
"What is my purpose"
You play snake.
Oh my god
Welcome to the club, pal.
Rick and Morty very nice
What if he actually died and we’re just sitting here making fun of him for not being able to upload within three months
holy fuck dont fukcking scare me
Would be funny
He was in uni so he couldn’t upload but it’s done now and he’s editing a video
Dominic Ward I thought the same thing
It was the fbi this is his clone making this vid
"It wont take 3 months, Im thinking a couple of weeks max" - Code bullet before not uploading for 3 months
@@samuelbowling9530🤨 I was searching aswell what a coincidence🤨
wowwwwww
Can you make a AI that can make more AI videos than you
Only 2 months!! In oktober its gonna be 3 months!! So in 5 days xD
@@sychoecho9497 that a great idea!
"I'm as stubborn as I am lazy." uf, i felt that, i felt that on a spiritual level
Not a #MeToo but me too :3
Yap, me too...
If you rotate the frame you feed the machine so that the snake is always facing upwards, you effectively quadruple your training data by allowing the snake to treat isomorphic game states as identical. It'd take a little work to correctly rotate it, then rotate the output, but the savings in training time would probably be worth it.
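The rotation idea above can be sketched roughly as follows. This is only an illustration, assuming the board is a list of rows and the heading is one of four strings; `canonicalize` and `to_real_direction` are made-up names, not anything from the video.

```python
HEADINGS = ["up", "right", "down", "left"]  # clockwise order


def rotate_ccw(grid):
    """Rotate a list-of-rows grid by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*grid)][::-1]


def canonicalize(grid, heading):
    """Rotate the board so the snake faces up; also return the number of turns used."""
    turns = HEADINGS.index(heading)  # counterclockwise quarter-turns needed to face up
    for _ in range(turns):
        grid = rotate_ccw(grid)
    return grid, turns


def to_real_direction(canonical_direction, turns):
    """Map a direction chosen in the rotated frame back to the real board."""
    return HEADINGS[(HEADINGS.index(canonical_direction) + turns) % 4]
```

The network would only ever see the rotated board, and whatever direction it picks is passed through `to_real_direction` before it is applied to the game.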
Why not giving him a reward base on the time between eating apples?.
It's been 4 months... RIP
Because if you go straight for the apples you could accidentally run into yourself: the snake is trying to get there as fast as possible and won't realize where its body is until it hits it, because it's not focusing on that.
A Thousand Year Old Fossil negative punishment
A far enough game encourages you to take the longest time/route possible to get to the Apple, that is, to fill up the whole screen with their body before going for the Apple
A time-based reward would be very counterintuitive to that
@@DawnAfternoon That highly depends.
I'd say the key is to store some values and modify them constantly.
As for the video, the AI doesn't have any memory. But it needs some memory values to adapt its strategies better.
Example:
You can count the amount of apples the AI got. Just decrease the time-reward every few apples and the AI will be more aggressive at start and more careful later.
Somewhere, Code realizes he could've just made the head green.
I already suggested that. like 4 months ago
That was my first idea, and yet here we are with this shitshow of an AI lol
Four months ago
Sounds like that would be.............. For pussies
@@kevinwells9751 if you say the thing he did is a shitshow, then make an AI yourself, program the game, and let it learn......
i am waiting for you to upload such a video, where you make it better than him ;)
I have an idea
You should put a time limit for the snake to find the apple
So it can find the most efficient way to find the apple and to prevent it from doing shit over and over again without finding the apple
Good idea!
Not gonna work i guess. if the snake is to long this will kill him most of the time lmao
@@theboman4816 I mean you can extend the time by how long he is, e.g. each block gives it 5 more seconds or something, but in the end even that time scaling would need to be adjusted multiple times or it probably won't work
@BlazePlayz YT yea but in the end there are still chance it will kill itself :D
@@jessenxxx floor(8 + 0.2b)? I can't remember what length the snake starts, should give 8s to find the apple to begin
what would you do if adrian suddenly spelled "HELP" with his tail after 20000 games.
Insult it for making a useless formation.
"Won't take 3 months"
9 days before it hits the 4 month marks.
He never said that it wouldn't take longer than 3 months though, so technically he's still right
"It will take a couple of weeks."
Another 3 months later.
2:30
CB: hey artist can u draw my avatar for coding videos
Artist: sure what poses do u need?
CB: punching
Artist: why would u need that for a programming video
CB: with both hands
@@firestar9650 how could this be a nazi Joke?
CB: Oh! And one with a pistol.
Artist: hoW IS THAT CODING RELATED???
@@P.cookie CB: IT'S FOR PUNISHING MY AI!!!!!
Artist:......o....ok...
Am I going to hit you with my left or my right fist?
All of I could think of
@Firestar how do you even pull goddamn *Nazis* from "draw my guy punching for programming videos"? it's a joke about the artist being concerned why someone needs to punch for programming videos, it's not that hard
"it definitely won't be 3 months"
Me 3 months later: bruh
Heyy
4...
its ok he just needed to grab some milk
"I won't die for 3 months"
Dies for 4 months.
I love how the snake just randomly decides to be a squiggly boi sometimes
No your not your just a squiggly boy
he squiggle
he squirm
he a big fat worm
Mcdoggo In cape SAME
Can you make a game easy and bug free
Make a fortnite ai smart
AI:what is my purpose
CB: you eat apples
AI: oh my god
Ywuhhly ywuhhly ywoh
Ywoh nottt jwoooh you lithuaniaaaan speeekaeeeerre
yeet yeet can you translate this to English
I just wrote how your name is pronounced in the international phonetic alphabet
yeet yeet ok, but if I do wanna read it right, flip ur head upside dowb
yeet yeet also, do you get my Rick and morty reference
"It will definitely not take 3 months"
-- *Code Bullet 4 months ago*
I can’t believe you’re still SO underrated after all these years
*that's what she said*
1. Punish for going too long without an apple.
2. Recurring. Yes. I dared saying that.
like a -1 punishment for every move, + 99 when he gets the apple.
@@potato_hoarder I find it problematic. If the snake, at some point, has to travel more than 100 blocks, wouldn't it lose points and therefore perform worse?
Niezbo yea, maybe calculate how many blocks a snake in average would need to travel to hit an apple
And if it's over 1.5 or 2 times this average (bc AIs are dumb) let it lose points
@@niezbo it would still be better than continually wandering around without getting the apple: remember it has no other choice but to move.
What do you mean by 'Recurring'?
idea: punish it whenever it takes more than 20 seconds to find the apple
@Notabotatall _ 27 Correct
That’s the idea I had it would make it have a way better way of searching
Or just make it to where the apple is always in it's vision?
Like if the apple is out of the snake's field of view, make a separate field for the apple?
@Notabotatall _ 27 WTF is "ceying" ?!
A Day in The Life of Code Bullet:
Wake Up
Coffee
Crying
Doritos
Coffee
Working on his number addiction
Doritos
More Coffee
Coding
Doritos
Even More Coffee
Sleep
Sounds like hes more of a coffee addict honestly
The last one is wrong he doesn’t sleep
@@duncannonnn4259 Yeah he’s a uni student he has no clue what sleep is
make a game of this
K
could you just not have made the head of the snake a different color? like, purple, or something?
things that make you go hmmmmm:
"won't take three months"
Nah mate, it'll take four months
@@Tytoalba777 my bet's on 5
Na i think It'll be 1 years
Nah my bet is never
He'll be back, he always comes back.
Me: "Oh he's still alive."
CB: "Yeah, I'm still alive."
Lmao
He's going to create an AI that ends up killing him. I just know it, not that I want that to happen.
@@jamesgockel854 Shh... don't tell him.
I had to double check i wasn't seeing things
James Gockel why are you checked??
@@cyclus_gaming he's verified I think
honestly bro - Ive been watching your channel for a while, and it makes me laugh so much - Its a wonder break for me from my own code
This guy is really out here taking something mind-bogglingly boring to watch (i.e. coding) and making it entertaining, somewhat educational, and funny. Definitely worth a sub
It feels like the natural selection algorithm should have taken time into account: if the snake doesn't find the next apple in X * 3 + 50 (just an example) steps, then it "dies", where X is the current length of the snake, since the longer the snake, the harder it is to navigate the board.
In your case, where time is "infinite" you just ended up with a safe snake that doesn't focus on searching, only avoiding obstacles and occasionally picking up these goddam apples.
This also would have improved the learning speed, with all the procrastination snakes getting killed right away ¯\_(ツ)_/¯
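A minimal sketch of that step budget, using the example formula X * 3 + 50 from the comment. The function names and the default constants are placeholders chosen here, and the numbers would need tuning.

```python
def step_budget(length, per_segment=3, base=50):
    """Steps the snake may take without eating before it is killed."""
    return length * per_segment + base


def is_starved(steps_since_apple, length):
    """True once the snake has used up its budget for the current apple."""
    return steps_since_apple > step_budget(length)
```

The game loop would reset `steps_since_apple` to zero on every apple and end the episode as a death when `is_starved` returns True.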
Justice and honor glad I'm not the only one
it's not a natural selection algorithm, it's q learning.
When the snake is too long, it needs to kinda "fold" or "sort" itself before eating apple to avoid collision with its body right?
Maybe the time will need to be extended as the snake goes longer ?
Yes he should add time. You can just add time to the feedback (reinforcement learning) of Q-learning.
EX)
1. The snake gets the apple in X steps.
2. If X > 10, punish the AI by some function of X points.
3. If it dies, punish it.
4. If it gets the apple, give it a reward.
The snake would be incentivized to get the apple in a timely manner since it gets punished depending on the amount of time wasted. Here 10 is just the number of steps before we consider it to be wasting time. Here you would need to balance how much punishment it gets on death vs wasted time.
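The four rules above could look something like this as a per-step reward function. It is a sketch under assumptions: the reward sizes, the linear penalty, and the name `step_reward` are all invented for illustration, and only the threshold of 10 steps comes from the comment.

```python
def step_reward(ate_apple, died, steps_since_apple,
                grace_steps=10, apple_reward=1.0,
                death_penalty=-1.0, waste_penalty=0.01):
    """Reward for one game step, following the four rules in the comment."""
    if died:
        return death_penalty  # rule 3: punish death
    if ate_apple:
        # rule 2: every step beyond the grace period shrinks the apple reward
        wasted = max(0, steps_since_apple - grace_steps)
        return apple_reward - waste_penalty * wasted  # rule 4: reward the apple
    return 0.0  # an ordinary move is neither rewarded nor punished
```

The ratio between `death_penalty` and `waste_penalty` is the balance the comment mentions: too much waste penalty and the snake rushes into its own body, too little and it wanders.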
---------
Another way to improve learning is to give the snake a better sense of its surroundings. Both the boxed region and the simplified region cut the snake's vision down significantly.
How to fix this:
1. -> One way would be to feed the whole screen to the snake and call it a day. This is slow (as we found out) so not really realistic.
You can improve this by using a CNN. This would use a "window" and scan it along all the positions. In other words, instead of feeding all w*h inputs, we can scan 7x7 windows along the board with the same "network". Since it's the same network scanning, we just need to build a network for 49 inputs (7 x 7) rather than w*h inputs. This would allow for a much smaller network size and significantly speed up training.
2. -> Another way without using complex networks would be to feed the network with "interesting" points. All the unused space is really not needed. So we can feed the network a few things: The apple, the borders of the game and finally the snake body.
You can actually do better than this and get rid of more useless information. For example we can reduce the border to 4 corners (we don't need all the squares connecting the borders). Even better than this, just provide the height / width of the game (no need to give (0, 0), (h, w), (h, 0), (0, w) when we just need h and w).
On top of that, we don't need to give the entire snake body to the AI. We just need the head position, tail position, and all the positions of "bends" in the snake from previous turns. The rest of the snake can be assumed by the AI.
The only problem now is that the snake size is dynamic (can be 3 bends, or 150 bends). The network cannot change the number of inputs without more complex code (such as an RNN). So we can fix this by assuming a max number of "bends" like 50. This way we fill in the bends as inputs to the network and put 0's for the unused portions. We can have the code kill the snake if it surpasses the max number of bends and use that as another reinforcement punishment for the AI, basically teaching it to limit the number of turns it makes.
Now the number of inputs are 1 apple + 2 for borders + 50 bends + 1 head + 1 tail = 55 inputs.
We also have feedback on getting the apple, getting the apple in time, death, and the overall efficiency (number of turns made, which was maxed out at 50).
Training should go by much much faster with a lot more improvement given the different feedback and FULL visibility. We can even expand the network to a much larger size to really learn some techniques.
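Here is one way the "interesting points" input could be built. It is a sketch that assumes the body is a list of (x, y) cells from head to tail; `find_bends` and `encode_state` are hypothetical names. Note that once every position is written out as two numbers, the 55 items counted above become 2 + 2 * 53 = 108 values.

```python
MAX_BENDS = 50  # cap suggested in the comment


def find_bends(body):
    """Return the cells where the snake changes direction (body runs head to tail)."""
    bends = []
    for prev, cur, nxt in zip(body, body[1:], body[2:]):
        incoming = (cur[0] - prev[0], cur[1] - prev[1])
        outgoing = (nxt[0] - cur[0], nxt[1] - cur[1])
        if incoming != outgoing:
            bends.append(cur)
    return bends


def encode_state(width, height, apple, body, max_bends=MAX_BENDS):
    """Build a fixed-length input list, or return None if there are too many bends."""
    bends = find_bends(body)
    if len(bends) > max_bends:
        return None  # the caller would treat this as a death, as the comment proposes
    features = [width, height, *apple, *body[0], *body[-1]]
    for x, y in bends:
        features += [x, y]
    features += [0, 0] * (max_bends - len(bends))  # zero padding for unused bend slots
    return features
```

One caveat with zero padding: an unused slot looks the same as a bend at (0, 0), so a separate bend-count input or a padding value outside the board might be worth trying.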
I'm no coder, but wouldn't that be pretty ineffective, since the apple begins in different places. It would try to go in the direction of the first trial's apple, but it wouldn't be able to find randomized apples
well programming + animations just screams out long video release dates so you dont need to apologize, we know what we subscribed to
edit: also code bullet, why dont you make the head of the snake a different color so the bot will know where the head is and you can make it see the whole screen!
What animations? These are premade images being swapped out and, at most, being manipulated with basic 2D transforms on whatever editor he uses.
Weblure Joltik still has too draw
@@weblure still counts as animating
They’re premade and not by him
@@klcompany7303 No; he did not draw them.
I love his pure joy and enthusiasm when making this video it was brilliant
I love binging your stuff because you go from polite and funny text to being the most hilariously hostile creative in a snap XD
I named him Adrian
**2 seconds later**
ADRIAN WHAT THE F
09:00
You forgot the question mark.
DID A BROTHER CALL MY NAME, FOR THE EMPEROR
@@adrianradu8428 yesh
Adrian Radu I AM HERE TOO BROTHET
"I'm as stubborn as I'm lazy....."
-Code Bullet 2019
Skeptisk so Stubbornly lazy
same
youtube recommended me this exactly a year after it was made
"It definitely wont take three months, you can bank on that."
My house would have been foreclosed by now then.
“There’s no way it could take me 3 months” - Codebullet, 3 months ago.
"Theres no way it could take me 3 months" - Codebullet *1 year ago*
I mean, he died in the accident, so it makes sense
4 months ago
*_Who else would legitamitely love a Livestream of Adrian's training?_*
who is adrian?
@@marley7776 oh my b
YES
I'm Adrian
@@bastion212 But can you play 20,000 games of snake in a row with no breaks?
How have I never seen your stuff yet. This is amazing :D
been about 3 months, must mean that mr. code bullet must be starting on his next project. HYPE!
7:43
"You'd either call a doctor, or an exorcist" had me dying laughing
oh whoops, fixing that typo
“How long has it been?” “THREE MONTHS?!?”
Who else thinks this is just going to be his intro for the next video
It has been 3 months -.- so yeh kinda annoying 4 no vids
It's 4 months
Caden Allison what are you wooooshing?
@@opponentbacon R/YouAreStupid
Well guess who is right...
The video and idea is all great! I just wanted to give a little input that may help. Give the snake the coordinates to the apple and let it figure out the direction to move and such since that's sorta what we determine with our eyes. Also, training the snake wise you might want to train it to play mostly when it has a long tail since that is when the game is hardest and needs a specific behavior of planning not to run into itself. Hope that helps a bit! Oh and for the direction the snake is going, you could maybe cheat and make the head a different color for it then later on change it back to white and have it train more on the existing weights.
“I’m still alive!”
Three months later…
Bruh 😂
All comments are how he has taken more than 3 months
You have no videos or 10k subs
I've been duped
As a suggestion, maybe punish him if he goes too long without finding food. So he can't get caught in a loop.
Edit: oh wow, this blew up a bit more than I expected. @skeletalZ @Alex.Doan @Ian and @Jonathan.Yang have my favorite suggestions. I feel like a mix between my suggestion, @skeletalZ 's suggestion, and @Alex.Doan 's suggestion could work really well. But at the same time, a mix of @Ian 's suggestion and @Jonathan.Yang 's should work really well too.
Owen McLaughlin issue with that is if the snake is too dumb to find the food, it doesn’t have any other option. So punishing it won’t result in anything.
That would be against his animal rights >:(
This can be done with the use of a discount factor
Hmm
Maybe a timer based on the snake size
Owen McLaughlin and a reward for finding the food in that time limit
YouTube: is that a gun
YouTube: DEMONITIZED
domonetized*
He's 1 step closer to making scp-079
Yin Yang ?
Wow thanks for not taking 3 months!
*IT'S TAKING MORE*
So I'm new to the channel and I like the idea of using the whole map for the snake to see.
What I notice is that after the snake is bigger than the "sight" box it will have a high chance to eat itself, because it doesn't remember the position of its body and will box itself in.
What I would suggest is to use the whole map as you intended, but you could change the color of the head so it knows where the head is (sometimes the simplest solutions are the right ones ;)
You could also try putting a starvation timer to motivate the snake to go eat.
I'll look forward to your next video
Change the color of the head
- Everyone in the comment, 2019
Lol
My idea too. Give it a head, and now the AI knows which way the snake is facing. And now it could see the whole screen.
"How can the AI know where the head of the snake is"
Me: "Tell it"
CB: "Shit ... my solution is way more complicated"
The head was only a minor problem. The bigger problem was that the AI had too many inputs, which is why he shrank its vision range. This just happened to solve the problem of knowing where the head was, which was a bonus.
not a different color, give him a cute face :D
I was think this for the problem of where the head is: Make the head a different color than the apples and rest of the body. Maybe like a blue or something?
Yeah, that was the most obvious solution.
I was thinking of that as well and waiting to see it implemented. It can see the whole screen so it always knows where the apple is, it knows where it’s head is, and it does that all with one frame of 1600 pixels.
It’s actually way less efficient. You’d need to check all the sample until you find the head and only then look at the 4 adjacent pixels to know the direction. With his method you skip the looking for the head part, it’s always in the same place
Make it the same colour as the apples. Then you get point every frame!
Adam Mullarkey color and function is a different thing
This guy is genuinely so funny unintentionally 😂
I love consistent upload schedules
Looks like he found a way to clone himself. If he can do that, he can definitely do this.
@5dope ayy
1:00 flips the avatar but name on shirt is backwards and too lazy so just writes cb in arial white font lol
that cracked me up.
It's the funniest when these "top-quality content" channels do it too.
It is a top quality channel what do you mean?
69 likes
Yeh we saw that u don't have to say it again
CB: Creative Bibliography.
Man, u r amazing! Both smart and hilariously funny, we don't get a lot of those recently...
Code bullet: “it will not take 3 months”
Me after four months: so that was a f*cking lie
No it isn't. I didn't take 3 months. It's taking more
*AC Unity flashbacks*
As a fellow Adrian speaking, he did better at snake than I ever could.
Fellow Adrian here to humbly agree.
Fellow adrian is calling bs
Not Adrian here to ruin the Adrian reply chain
MrCinch ok Shepard
@@buckiez Other Adrian to complain about the sudden loss in the Adrian snake chain. How are we gonna succeed in the snake game and appease the Q Gods if you cut the Adrian snake off like this?
Theory: Code bullet is taking so long because he's still secretly working on the Enigma decoder
Yes
Dancing animations are more important
When you get back. We miss your video's. We love them.
When he said the name Adrian I'm like HOW DID HE KNOW MY NAME then he's like I named it that and now every time I hear Adrian I'm like wassup
Next you should try and do:
AI CREATES REGULAR UPLOADING SCHEDULE!!!
Yeah, we're all waiting!
Spicy
Lol
1 vid every leap day lmao
I don't think we should give him shit, i mean, these videos are pretty hard to make.
Have YOU ever coded a game from scratch with the pressure of the expectation from millions of people?
You at least upload more often than cgp grey and oversimplified combined!
The holy Trinity
FANKIEonPC rip
Cgp grey is a live streamer and podcaster though
@@dhuill8900 rip the bois
you would watch them wouldn't you
For Christmas I want a new video from code bullet
Right bruv just take your time you do what's necessary in your life you amazing human being
Love your videos you glorious nut job! Rock on! Code on!
"it will not take 3 months"
3 months later: nothing
Pschhh its gonna be 3 months ! In 5 days so it was only 2 months when you wrote the comment 😂😂
@@Melinstri Ok Ok now its 3 months+
Why not make the head a different color and feed blurred pixels from outside the vision
That's a great idea.
And does it need to see all the empty squares? If you just give it the information of the Apple, the head, and the body couldn’t it extrapolate all the empty space?
Or perhaps keep the nearby vision, but give the snake which quadrant of the game board the apple is in, so it at least knows where to look.
I feel like the head part might be kinda cheating but love the idea of blurring the outside so that like every 4 pixels is condensed and if there's an apple it's red. Great idea!
I was thinking the same. If the initial problem is it can't see where the head is, can't the head be a different color and feed it the whole map at once? Maybe like that it will work and build a strategy based on where the whole body is at the time and where the apple is
Two years and it still brings back memories
10:50 HOW IS THAT NOT A THUMBS UP
what about colouring the snakes head FEFEFE and the body FFFFFF so it looks the same to humans, but it can see it's own head cos it's a different colour
Literally the first thing I thought he’d do. Talk about overkill
you don't have to color it differently, just give it a different value to the ai and keep the color the same
@@shiinondogewalker2809 further on this, why not just give he ai a version of the screen where each block is a single value in a 2d array (0-3 could be empty, body, head, food) . This would be incredibly small and really the bare minimum without losing any data
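The single-value-per-block board from this reply is easy to sketch. The 0-3 codes follow the reply; the function name and the (x, y) body list running from head to tail are assumptions made for the example.

```python
EMPTY, BODY, HEAD, FOOD = 0, 1, 2, 3  # cell codes suggested in the reply


def encode_board(width, height, body, food):
    """Return the board as rows of cell codes; body[0] is the head."""
    grid = [[EMPTY] * width for _ in range(height)]
    for x, y in body[1:]:
        grid[y][x] = BODY
    head_x, head_y = body[0]
    grid[head_y][head_x] = HEAD
    food_x, food_y = food
    grid[food_y][food_x] = FOOD
    return grid
```

For a 40x40 board that is still 1600 inputs, so this fixes the "where is the head" problem but not the input-size problem other replies bring up.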
Nerds be watching nerds code AI
@LintyCarcass that was showing 2 pictures
Had you made it 4 months we would’ve had a problem
AH FUCKING SEE THAT JOKE SJNAHDHDH
Dont give him any ideas
Mista in 15 years it is possible for the date to be 4/4/44
@@philiplee8834 in 2 years we will have 2/2/22
oh no
wait that was a 4 letter sentence
Love the attitude 💯💯
3 months? Try 13
And so he did
I want "Adrian wtf" merch cause I think that was everyone's reaction to him
What if you add a penalty for wasting time? For going x amount of time without eating for example.
This is a reasonable thing to do, and in fact is pretty commonly done with Q-learning algorithms. It might come with some downsides in the case of snake (getting so greedy for food that you corner yourself) but that seems like a fair price to pay.
Come to think of it, I wonder if Code Bullet simply chose not to discuss that possibility because the video was already tangential and long-winded as is.
this was my idea too. Because otherwise the only thing the snake has to do is survive long enough so that it has almost unlimited time to grow slowly
Yeah you could give it like a hunger value. If (hunger == 0) { snakeHitpoints = 0; }
Sorry for the Java example 😂
This was something I was considering while watching as well. It might help to create a different behavior than its pong like behavior while searching for food.
you could also just give it the screen position or coordinates of the apple, the head and the body, since you made the game yourself so it'll be easily accessible to the AI. Then it'll be way less inputs (maybe more as the snake gets bigger but once it reaches the halfway point you could just give it all the empty squares instead)
You are such a great youtube coder.
PLEASE UPLOAD I MISS YOU SO MUCH YOUR FUCKING COMEDY
Its so fucking good
...
Why didn't you just, you know, gave the snake a head?
Like
Slightly different color pixel for example
Then you can make it see the whole screen
We would know that the different color is the head but the AI wouldn't.
I still don't understand why he was trying to find the head originally, since he eventually seemed to just have that information and could center the square of invisibility around it. But the problem with it seeing the whole screen is that's too many pixels for it to have scan, so it just takes absolutely forever to train. It doesn't matter how easy it would be to find the head with that scan, you just don't want to scan the whole screen for any reason or it's going to be too slow.
@@MitchellD249 its not that he couldn't find it, the ai couldn't because it only saw individual still frames, and had no memory. Either end could be the head from that information.
Why does it have to be pixels tho? Why not just give it bounds, tail, head, apple coordinates and the direction? The rest is void anyway.
@@DropkickedBarracuda So how did the AI find it in the end?
Every vaguely science-related channel: **Exists**
Brilliant: _allow us to introduce ourselves_
Adjust the snakes vision so that it can see the entire board but with gradually decreasing resolution the further away things are . Same number of inputs but full field of view.
Above distance x the input only knows if there is an apple somewhere within 4 nearby blocks, above distance y its 9 blocks etc...
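The coarse part of this idea boils down to block pooling: collapse each group of cells into one input that says whether anything interesting is inside. Below is a rough sketch of just that building block; `pool_max` is a made-up helper, and stitching a sharp near view together with coarser far views is left out.

```python
def pool_max(grid, block):
    """Collapse each block x block group of cells into its maximum value."""
    height, width = len(grid), len(grid[0])
    pooled = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            cells = [grid[y][x]
                     for y in range(top, min(top + block, height))
                     for x in range(left, min(left + block, width))]
            row.append(max(cells))
        pooled.append(row)
    return pooled
```

With `block=2` each input covers 4 cells and with `block=3` it covers 9, matching the distances x and y in the comment.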
Q: What do you call a snake that only eats desert?
A: A pie-thon! 😂🤣
*GET OFF THE STAGE*
Chicken pie.
gotem
that's just not funny.. and the emojis make it even worse..
Booo!
Oh brother this guy STINKS!
@@lividman2372 hehe thx for finishing my meme
“Milking snake”
Code bullet - 2019
Why didnt you just make the head a different color and then Tell the ai “Thats you”??
Because the og snake didn't do that, so why should he
Well that's basically what he did by putting it in the middle. He could've increased the viewdistance to like 30 to achieve a effect that would come closer to what you say, I guess he thinks that it's too many inputs once again.
Smug Anime Girl oh yeah that Makes sense
That’s what I was thinking!,
Maybe he could give the AI a list of the positions of the body parts and maybe a bit of extra information that tells which one is the head.
Thx for all the training mate - Adrian
The fact that I could forget about this channel and then remember about it again months later and still not have another upload says something I think 🤔
Ok you made a Snake ai, but does it know how to abuse his Up-Tilt?
Nikita intensifies.
Raghav Pant beep beep
All snakes need to know is to press b
@@andrewpereira888 and thats a fact
What do you mean?
So Adrian is basically looking for the apple randomly, but any human player knows where the apple is all the time, so why not just tell him where it is, either with or instead of the 20x20 fov
How do you tell it where it is? 2 solutions:
1) giving the full grid with 0 where there is nothing and 1 where there is the apple. This is the solution used here and it is often used because, by experience, AIs perform well with that form, but it requires a lot of inputs
2) Giving it coordinates, so 2 inputs, one for each the horizontal and vertical axis:
So fewer inputs, so far so good. The problem is that AIs often struggle to understand this kind of information, because they need to learn a more complicated task: sometimes x=10 means go up and sometimes it means go down, so the network has to learn things like subtraction... Not as easy as it looks for an AI that learns by itself.
@@kasonnara Maybe you could give the coordinates relative to the head. So negative/positive numbers would always mean the same direction and it would become a minimization problem.
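A tiny sketch of the relative-coordinate suggestion: the subtraction is done for the network, so the sign of each input always points the same way. The helper names are invented for the example.

```python
def apple_offset(head, apple):
    """Apple position relative to the head: positive x is right, positive y is down."""
    return (apple[0] - head[0], apple[1] - head[1])


def manhattan_distance(head, apple):
    """Grid distance to the apple, the quantity the snake would try to shrink."""
    dx, dy = apple_offset(head, apple)
    return abs(dx) + abs(dy)
```

For example, an apple at (10, 7) gives an offset of (6, 0) from a head at (4, 7) and (-2, 0) from a head at (12, 7), so the same absolute x now produces opposite signs depending on which side the snake is on.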
your awsome ive subbed i love it
Man, I wish code bullet went with idea 3. That sounds like it would make an amazing video that could instantly become his most popular video ever.
Punish AI.
AI learns punishment motivates.
AI takes over.
AI makes Robots.
Robots Punish humans.
Duh
Nah, AI converts punishment to positive reward, become masochist, does everything wrong and digitally orgasms all the time.
Then humans learn
Humans makes babies
Babies punish ai
Infinite loop
@@LtNeverL2 true
@@Atlas_Redux lol
skynet
Code Bullet: going to milk snake for a fourth video
Adrian: (chuckles) I'm in danger
I really want to see some tutorials from you!
Q-learning works on Markov Decision Processes. In order to avoid any memory in the neural network, the input feature vector needs to encode all the information about the state. If it doesn't, you have created a Partially Observable Markov Decision Process (POMDP), and need to add memory for your NN to manage, making it some form of RNN.
There are efficient ways to decrease the number of input features quickly in your neural net without manually throwing out potentially-necessary information: attention-based neural networks, which, happily, outperform RNNs on the types of sequence tasks where you would normally have this problem.
I am surprised the 2-frames method was too large an input vector, given that the original paper used 4 frames of an Atari game at a time as the input vector.
Impressive results though.
Came here from the next video to learn the rules you imposed, so let's hope you found these ideas in the interim :)
Reward it for getting apples faster. Solved.
Don't forget it has to avoid trapping itself as well. If it can't see its own body outside of its field of vision, then it can't learn to avoid trapping itself because it can never realize it's trapping itself. No memory is a pain in the butt.
@@VoltaicBacklash That's what I thought of. If I knew how to code one of the things I'd do after seeing it entrap itself would've been to make it aware of it's body so that doesn't happen. Has the added effect of allowing it to find apples next to itself much faster later on.
And to prevent the initial problem of seeing too much of the screen and bugging out, make it so if a body part has another body part on all 4 sides, it becomes invisible to the AI / "deactivated"
This is probably easier said than done, though, so I'm gonna stop there
True
Voltaic Backlash is right, also, Thank you Darwin... Both for explaining evolution, and making advanced ways to modify an AI to do work for us.... And make Code Bullet make hilarious videos in the dumbest yet smartest way.
Yes
Do A.I. learns to play Getting Over It
lol too hard
That game requires too much precise movement
ruclips.net/video/HOa5B3COET4/видео.html look at this video :D hh
One of 4 things will happen then: 1: Code Bullet will smash his computer to bits because of how insanely hard it would be to program. 2: If he doesn't kill himself making the project, the AI will self-destruct by thinking itself to death *cough* *cough* (Cortana). 3: The computer processing unit will come out of his computer and murder him. 4: The AI will become sentient and destroy the world with a hammer
@@Kappo288 that's a TAS not AI
I love just listening to this guy talk...
I like the idea of idea 3. Its seems like a good topic to focus on