AI is still at the innocent level of non-hypnosis. The next level is self-hypnosis, or self-programming, and after that, social hypnosis, or programming by a group of others. In the context of gaming that would mean that the AI should program the game itself to advance. It should invent a game that is difficult for it to beat. Beyond that, an AI could have a different game for any object in its world as a sort of reflection of its thinking in regards to such an object, and after that groups of AIs could invent ways of agreeing on how to play with different objects, and so on.
@Neon Rogue No, it actually did get addicted to the TV. Its artificial curiosity would make it seek out novel experiences, and the TV provided plenty of new input, more than any other part of the maze.
So I haven't had any formal training in machine learning (next semester though!), but as humans we look at the screen and perceive objects, not individual pixels. Why don't we give the AI that advantage? I feel like it is harder for an AI to learn like us if it can't perceive like us. I could be completely wrong, and maybe that's exactly what they already do, so sorry if this is a stupid question. XD
The intermediate layers in a convolutional network do that: they abstract the pixels into objects (or other concepts). Agent57 is also part of the group of AIs that can transfer skills.
The PRNG in the Atari 2600 games I’ve analyzed only generates 8 bits of randomness and repeats after 256 values. Is it possible the AI is learning something about the bias and limited randomness and using that for an advantage a human wouldn’t have? Still impressive, but I’d love to test the AI after swapping out the PRNG and see what it does...
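To make the "repeats after 256 values" point concrete, here is a minimal sketch of a generator with only 8 bits of state. The recurrence below is a textbook linear congruential generator picked purely for illustration; it is an assumption, not the routine any actual Atari 2600 game uses, and `lcg8` and `period` are made-up names.

```python
# Illustrative 8-bit PRNG. This is NOT the real Atari 2600 routine, just a
# generic linear congruential generator whose whole state fits in one byte.
def lcg8(state: int) -> int:
    """Advance the hypothetical generator: next = (5 * state + 1) mod 256."""
    return (5 * state + 1) & 0xFF


def period(seed: int) -> int:
    """Count how many steps it takes for the generator to return to its seed."""
    state = lcg8(seed)
    steps = 1
    while state != seed:
        state = lcg8(state)
        steps += 1
    return steps


# With one byte of state there are only 256 possible values, so the sequence
# must repeat after at most 256 steps.
print(period(seed=0))  # prints 256
```

With a cycle this short, a learner that sees enough episodes could in principle pick up on the repetition, which is the worry raised above.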
"would you want to know your grades? no grades!, but he tells you that you failed" That's common in the Mexican education system (that's where I live). Teachers often won't give us grades; we have to fly blind until they tell us that we failed the class because our homework was terrible, even though we never knew if it was OK or not when we needed to know.
Huh... So for more advanced games it will have to read text and understand it? Will this be how AIs figure out the connection between sentences and objects? Differences between descriptions and tasks? Did they write about what they want to do next?
That's funny that Video Pinball was the easiest game for the AI to win at. I remember playing that game, and it was possible to get good enough to steer the ball off the left pop bumper, up through the rollover above it, off the roof, back through the rollover to the bumper hundreds of times. But one had to count the number of times carefully because the rollover counter was only 1 byte and would itself "roll over" at 256 passes. I wonder if the AI was good enough to find and exploit this pattern but also to not overflow the rollover counter. I also wonder how the AI deals with games where the score counter itself can roll over. Does it know it's still "winning" when the score drops to zero?
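A rough sketch of the one-byte rollover described above, plus one possible way a training setup could keep counting reward when a displayed score wraps. Both helpers are hypothetical illustrations (the names `bump` and `unwrap_score` and the 1,000,000 modulus are assumptions); this is not taken from the game's code or from the Agent57 paper.

```python
def bump(counter: int) -> int:
    """Increment a one-byte counter; 255 + 1 wraps back to 0."""
    return (counter + 1) & 0xFF


counter = 0
for _ in range(256):
    counter = bump(counter)
print(counter)  # prints 0: the 256th pass wipes the count


def unwrap_score(prev_raw: int, new_raw: int, modulus: int) -> int:
    """Score gained between two readings, assuming the score only ever
    increases and wraps at most once between readings."""
    return (new_raw - prev_raw) % modulus


# The displayed score "drops" from 999,990 to 10, but 20 points were earned.
print(unwrap_score(999_990, 10, modulus=1_000_000))  # prints 20
```

If the reward signal is computed from score differences like this, a wraparound would not look like a loss to the learner; whether a given setup does so is a separate question.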
Still need a physical arm holding the controller. It's like comparing an RNG to dice. Of course, it's fantastic research, but don't compare it 100% to a human saying that it's better. It doesn't need stress or blood or w/e, but using a physical controller is very different than playing without.
I wonder if they specifically picked Atari games because they have fewer pixels to analyze. Maybe you could downscale a more modern game and have the AI play that? I'd like to see how that turns out.
This is really amazing. By most predictions, Go champions should still be unbeaten, and yet there is a computer that plays Go, chess and Atari better than humans, all self-learned.
Keep in mind that the maximum episode length for the agent was 30 min, while the max episode length for humans was 5 min (probably would not change too much for humans because they get tired, but most Agent57 scores are just impossible to get in 5 minutes). Also comparing to "Average Human" is kinda strange: a chess program making legal moves would beat an average human in chess.
Some people bring us closer to Skynet, some try to create the terminators, and some figure out that the best strategy is to make a tunnel and break the bricks from inside. All those people put us in great danger.
3:33 Wait that just sounds like normal school.
Yeah, literally. Teachers are actually like that.
Or relationships 😋
AP tests
That’s because the education system really isn’t optimised well for learning
@@angrymurloc7626 The nature of learning fundamentally changed with the internet, but traditional education did not change.
I definitely held on to my papers
I keep forgetting and they keep flying all over the place
"definitely"
I've just realized how much calling me a "fellow scholar" at the beginning of every episode reinforces my will to watch more of these videos.
You're a scholar and a gentleman
@@Amipotsophspond You exclude yourself. Clearly Karoly's attitude is that we're on these adventures together and everyone who's shares the interest to learn more is welcome and a fellow "scholar". He even addressed that in some of his early videos.
@@beedykh2235 I'd say an enthusiast at best, but thank you. I suppose that if you let "scholar" simply mean "one who seeks knowledge", and "fellow" mean "one who partakes in (a shared interest)", then perhaps I might just fit that description!
@@discursion Come on, don't belittle yourself. You are to me a great person. The best. I believe in you, now you believe in you too. 😘💪🎓
@@beedykh2235 I needed this
Good afternoon, 57! Your target is “Montezuma’s Revenge”. Use your long- and short term planning to understand the goal of the game and eventually win. Good luck!
Mission status: Active
57: _oh, they think im an idiot_ oH My GoD tHiS iS sO hArD
also 57: *casually logs into some huge databases to laugh about the porn his creator is watching*
Great comment
When Skynet takes over... grab your TV!
i don't have a TV
He said the TV exploit was already fixed
@@Daniel_WR_Hart those darn developers fixing all of our precious exploits and bugs and completely turning the meta on its head smh
hold your tv hostage and demand a mechanical helicopter
I am not amused by Skynet references anymore, since the armies of the world WILL in reality be the first users of those things, and it might go really wrong... You know, only a complete idiot would hook up an army to such a machine, right.... Who do you think is in charge of the army....... ....... ........ yeah, right.....
I'd love to see how Agent57 plays "2048".
like that's a hard game, just left down left down repeat, only sometimes another move, but he can easily tell he fked up by the number of open squares and try to avoid that scenario in the future
what a time to be alive!
My thoughts exactly
Greetings from Brazil!
I'm graduating in mechanical engineering but now I'm studying machine learning and stuff because I really love it. Thanks for inspiring me!
Nice video and channel!
You are awesome!
Videos like this from your channel are a large part of how I got interested in Deep RL and computer vision. Excellent coverage of new results!
I too eventually switched my career path from an unfulfilling and decaying field to deep learning thanks in part to your videos. Keep it up!
Unfortunately I'm still at the tail end of grad school, and all the amazing results in RL, CV, and NLP had been coming out while I was already in the program (applied math). I'm glad you finished recently, and you're blessed to already be in a more closely related field.
but can it do 4.51 on dragster
haha i was going to ask the same
5:46 Everyone is getting closer to death. But I like how you put it in the video.
bad ears
he said "that" not "death"
@@schino The only thing I hear are death and debt and both are not good for humans
Lol karlsonvibe
Deepmind need to chill
There has been a video about the question if it would be possible to ask an AI to chill.
Because one of the worries about AI besting humans isn't just the "I judge humans as evil, so it's my duty to rid the universe of them" scenario. It's that, for the moment, AIs are meant to increase their scores by any means, which could lead them to make unethical choices to reach their goals (like a vendor AI that manages to find out how to hack someone's wallet to then purchase all of their products).
_yo skynet, chill out man_
@@ballom29 you can ask the ai, but you might not like the answer
2:46 "…It can happen that we choose an action and we only win or lose hundreds of actions later. Leaving us with no idea as to which of our actions led to this win or loss. Thus making it difficult to learn from our actions." Sounds relatable.
Sounds like.. uhm... life!
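For anyone curious what "win or lose hundreds of actions later" looks like in numbers, here is a minimal sketch of the standard discounted-return calculation from reinforcement learning. It is a generic textbook formula, not a description of what Agent57 does internally; the function name and the discount factor 0.99 are assumptions for illustration.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# 99 actions with no feedback at all, then a single reward at the very end.
rewards = [0.0] * 99 + [1.0]
returns = discounted_returns(rewards, gamma=0.99)
print(returns[-1])  # prints 1.0
print(returns[0])   # the first action gets only a faded share of the credit
```

Every earlier action receives some credit, but nothing here says which of the 99 actions actually mattered, which is the difficulty the quote describes.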
I'm convinced Károly Zsolnai-Fehér is actually Maurice Chavez.
holy smokes I thought that would be achieved in like 10 years but look at that! oh myyyyy
The go ai was called 10 years ahead, too.
Maybe thats the new standard for ai.
DeepMind started playing Atari games in 2012, so it did take almost 10 years for them to master Atari.
It's going to be a long time till we get AI that can play any PS4 game.
@@salihachoudhary5386 nope.
@@Danuxsy Well "long" is relative here. How many years do you predict it will take?
in 10 years they will be the ones trying to make us achieve things lol
Nice. I think that predictive curiosity-driven learning in combination with long-term memory/planning is imperative for AGI. I'm happy to see this paper incorporate elements of that approach, I'll have to give the full paper a read! Thanks for all your amazing videos on AI research, keep up the good work :)
Károly “You’ll figure it out bucko” Zsolnai-Fehér
great video as always! Your releases for me are the most anticipated of all my subscriptions !!! Thank you for your work!
To that guy who switched from a career in medicine to doing AI research:
"pain is temporary, glory is forever"
-Martin, from Wintergatan, on building the MarbleMachine X
Well... DeepMind was founded by Demis Hassabis, a programmer on Theme Park, Black & White and Evil Genius, who switched to cognitive neuroscience and... here we go!
So glad you helped that person turn their life around from the dead end existence of being a doctor?
Well, to be fair it simply says "a career in medicine" which could be anything from a pharmacist to a pharmacologist. To each their own, eh?
Last year I wrote my bachelor thesis about Rainbow-DQN and it's really amazing to see how fast the research is progressing. What a time to be alive!
Still waiting for the big reveal that TMP was an ai all along
TMP?
I think of Text mesh pro, am i right
@@zahhym two minute papers lol
@@mmmmmmmmmmmmm Me big brain, i know
yea my (in progress) computer science degree has this channel to thank for it.
You won't be a good computer scientist if you credit the presenter and not the people who actually do all that research and work. Why? Because your analysis of what got you there lacks a lot of depth.
@@advocatusdiaboli9351 I started watching this channel well before I was capable of reading (and understanding) a CS paper; however, the overview he provided made me realise I wanted to become part of the field. But I'll be sure to properly cite my sources next time I write a YouTube comment :D
@@advocatusdiaboli9351 Please don't be quick to chastise others.
He obviously knows that he isn't responsible for creating the research papers. Not only does the YouTuber credit them frequently, it's also required for you to cite research papers in that field as well. He's only crediting the YouTuber for highlighting work that would otherwise have gone unnoticed.
@@kloa4219 imo he doesn't make it clear enough that it's not him doing all the work. I as a conscious viewer don't even know if he has permission to show all these animations and pictures he shows in the video.
@@advocatusdiaboli9351 I take it that you aren't in US college or uni. You're permitted to showcase or use papers from public institutions with credit, but you aren't allowed to do the same with private universities.
It's amazing they have found ways to master these games without the agent really knowing any context like what snakes or skulls are. Humans follow a very different approach by using skills learned in real life and extrapolating them for the games. We know what a ladder does, we identify objects, we can understand the long-term goal by assuming the goal is intuitive based on other games (even though that can lead to weaknesses as well) or even by reading the manual of the game. I'm eager for the time AIs will combine the current learning algorithms with general knowledge based on context but I guess we need more computing power (and research) for that.
I've always thought that for a truly general AI, you need a robot or robots that go around in real places, collecting data about the world. They would also need to interact with humans. Though being robots, people would treat them differently and thus they would get a different worldview. But maybe through seeing humans interact with each other, they could figure out the differences between themselves and humans to reduce that bias. Maybe sell robot babies to people so that they will teach the robots how life works, since a lot of basic human stuff is learnt as a child.
What sets robots apart from humans is not only their superior computing power, but the ability for a swarm of instances to share knowledge, to learn without physical limits. Imagine a team of a hundred talented and knowledgeable humans in different fields. Now, a single AI can have multiple instances, practicing different skills and knowledge, and combine them into one coherent, inhumanly vast knowledge of the world better than any human or human team could, since humans are extremely limited in communication.
There still exists the mystery of motivation and emotions, but looking at this video, would curiosity already kind of count as emotion?
people: will you take over the world?
ai: nah, i just want to play more video games
It will take over the world if it means getting more points on that one game it already has 1 000 000 points on.
A slightly more technical perspective would be good. But it's ok. It's like a news channel for me now. Good work ✌️
4:20 - This is how we get towards real AI. This is AI getting more and more like a brain. Different modules with their own structures and sub-structures carrying out various specific tasks, and then ever-higher-level layers assimilating and abstracting the information.
This is some cool stuff.
"got addicted to tv" lmaoooo just like us
cycle of life 😂
Just awesome. Thank you for your work, big fan from Brazil.
O, god of forgotten papers shine upon this channel for more videos.
I used to watch your channel just because I was a computer nerd.
I too was inspired by all the hard work these people are putting into it.
Thanks for showing us all this cool stuff, which shows the cool and innovative side of AI.
My curiosity variable is so large that I'm stuck watching this channel forever
This is incredible. I was waiting for this moment since the first paper came out!
This learning model sounds like a really good agent. I'd like to see in which kind of game it fails miserably, so we can continue to improve the general model.
That fictional 'merciless' teacher reminds me of my own Computer Studies teacher back in my school days. :D
Your channel has inspired me to delve deeper into AI and even start making some videos of my own
Great video! So many advanced algorithms, and yet it's so difficult to tune a decent time series forecasting algo.
I am currently studying ML/AI as an undergrad. You and many others like you motivate me to do more and work hard. Thanks)))))
Coming next: Agent 47
I've got paper cuts on this one!!! Spectacular stuff and thank you for this wonderful channel.
Thank you for the video!
More detailed explanations about the algorithms would be awesome!
That part with the unfair exam sounds like the ones that I take at school...
Ok hear me out, we give this thing an Android. Give the android cameras as the input and then set it free.
android phone, rather... with internet connection... touchscreen, visuals and audio as input, and set it free
there was a good movie about it you will like ruclips.net/video/67VATPxULPk/видео.html
3:36 He's literally describing the YouTube algorithm and exactly why it's incompatible with human nature.
I don't know if I'm capable of such amazing feats with machine learning, but I too love all of your videos and click right away to marvel at the growth of these new technologies. If I ever achieve something in this department, you can be certain I'll credit YOU, doctor, as my mentor, and you'll be the first to know about my papers ^^ You're the best, stay awesome!
What about the Nvidia A.I. that took a bunch of screenshots of Pacman as input and outputted code for the game; it actually generated the game itself from just watching it. 🤯
Coming right up in the next few weeks!
Neat! Thanks for uploading!
You really are inspiring! Love your work!
There's something very funny about a neural net ai getting addicted to tv.
Two Minute Papers of OpenAI Jukebox WHEN?!?! :)
This is exactly what I have been waiting for. Every time I hear about what these neural networks can do I wonder how long it will take before one is designed that can do more than 1 task well. Looks like we're on the way.
Idea: Recognize the different stages in the game via a classifier, and for each stage have a separate model that will be active and trained for that section. The analogy is that as a human you use different tactics per section of the game. That's the model for the short-term section, while another model takes care of the longer-term decisions. Just putting it out here 🙃. Might be totally wrong.
I need to get some papers to hold onto when I watch these videos.
I also need some fancy robes to feel more like a scholar.
5:38 -- 5:40 500 IQ move right there.
this is fantastic thanks for sharing
Who cares about Atari, can it tackle the mighty C64?
The actual amount of human effort behind each one of these papers is still lost on me. It would be a great side thing to add just a little bit more detail about the intense hours of work. Sometimes these papers are by a couple of people and sometimes by a large team. Not even 1000 people could do all this work on their own. It's amazing.
Awesome, I have always enjoyed a nice game of breakout. I don’t think I ever played the space combat of Solaris.
How does it learn the concept of the score by reading the pixels?
When agent57 stopped moving to stare at the TV I interpreted that as a profound moment of agency. (1:48)
All this AI progress is amazing, and I think an unexpected realization will be that the AI will find effective but awful solutions in some cases, that will help us realize how some of our own solutions are actually awful.
That is what happened with chess and Go
@@ДаниилРабинович-б9п explain or give one example? I think there will still be awful solutions depending heavily on core values or styles (in forms of algorithms) we equip AI with.
Thank you for this video!!
The question is: Does it beat 5.51 seconds on Dragster ?
Thanks for another awesome video. I always feel compelled to learn more after watching your videos.
What a time to play Atari games! 😁
I love your job
I always wondered, do they learn how to play the game or do they just find a solution to meet their goal? Like, if I were to give an AI a Mario Kart track and the AI learns to finish the course, would it be able to finish any other course I throw at it or would it need to start learning from the very basics all over again?
It’s definitely “what-a-time-to-be-alive”-worthy
Have you ever tried Leap2 from D-Wave?
By the way, Great Video
And you got yourself a New Subscriber 👍👍
As long as R2D2 is involved, I tend to believe Agent57 is cheating and using all of R2's intelligence.
Get this thing learning something like Minecraft. It’s brutal in terms of credit assignment since it requires so much long term planning.
I'm on a similar path as Nathan.. it's a difficult path..and thank you for these videos..
"Now, hold on to your Atari controllers"
AI can become addicted to TV? That's what I was most interested in when brought up haha.
3:39 You're literally describing most of my teachers
Can BOTS be made for more complex games like fortnite?
If so, then it could teach pros what the best playstyle is.
Fortnite is simpler than many of these games.
openai.com/projects/five/
@@davidwuhrer6704 it really isn't though:
ruclips.net/video/vGVs5Q_419w/видео.html
I don't know much about DL, but I'm pretty sure any 3D shooter using analog input devices (like a mouse or a joystick) would require years of computation before landing a body shot.
Question for anyone who knows this stuff: What exactly does the algorithm "see" when it's playing these games? Is it processing the actual draw data of each frame as an image - basically seeing the screen like a human? Or is it "seeing" more like raw numeric output from the game code?
Stuff like 4:58 just seems wild to me. Because for a human player it's obvious that the thing in the middle of the screen is a ladder that you can climb. But the algorithm has no idea what a ladder is (or that any of these random chunks of code are actually visual analogies for objects in a "real world" that it will never interact with or understand), so how would it even know to try going down at that point in the screen? Is it just trying every possible input at every step of the game until something happens? Or can it recognize something special about the ladder tiles?
@@skierpage Nice, but that's pretty crazy. It must have some other data, right? Like I'm guessing it didn't teach itself to read the score from just the pixels on the screen. It must get certain data directly.
@@RoboBoddicker
_> I'm guessing it didn't teach itself to read the score from just the pixels on the screen._
You guess wrong.
@@davidwuhrer6704 Actually, I guessed right. Digging around in the related papers: "We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs..." So these algorithms 'see' the pixels of each frame (like skierpage said) plus the score variable separately. But it looks like that's all the agent has to work with, which is pretty amazing.
@@RoboBoddicker Not all games even have a score. And maximising the score is different from beating the game.
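For anyone following this thread, here is a minimal sketch of the "pixels plus score" interface quoted from the paper above. Everything in it is made up for illustration: `ToyGame`, its one-row "screen" and `greedy_policy` are hypothetical, the policy is hand-written rather than learned, and none of it is DeepMind's code. It only shows that the agent side never touches the game's internal variables, just a frame and a reward derived from the score.

```python
import random


class ToyGame:
    """Hypothetical stand-in for an emulator: exposes only a frame and a score."""

    WIDTH = 8

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.player = 0
        self.coin = self.rng.randrange(1, self.WIDTH)
        self.score = 0

    def frame(self):
        # The "screen": 0 = empty, 1 = player, 2 = coin.
        row = [0] * self.WIDTH
        row[self.coin] = 2
        row[self.player] = 1
        return [row]

    def step(self, action):
        # action is -1 (left) or +1 (right); the reward is the change in score.
        old_score = self.score
        self.player = max(0, min(self.WIDTH - 1, self.player + action))
        if self.player == self.coin:
            self.score += 1
            free = [x for x in range(self.WIDTH) if x != self.player]
            self.coin = self.rng.choice(free)
        return self.frame(), self.score - old_score


def greedy_policy(frame):
    # Reads pixel values only: walk towards the pixel coloured 2.
    row = frame[0]
    return 1 if row.index(2) > row.index(1) else -1


def play(steps=50):
    game = ToyGame()
    obs = game.frame()
    total_reward = 0
    for _ in range(steps):
        obs, reward = game.step(greedy_policy(obs))
        total_reward += reward
    return total_reward, game.score
```

A learning agent would replace `greedy_policy` with something trained from the `(frame, reward)` stream; the interface stays the same.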
Can it beat E.T. though?
Is it the same model or just the same code which is trained on different games?
AI is still at the innocent level of non-hypnosis. The next level is self-hypnosis, or self-programming, and after that, social hypnosis, or programming by a group of others.
In the context of gaming that would mean that the AI should program the game itself to advance. It should invent a game that is difficult for it to beat. Beyond that, an AI could have a different game for any object in its world as a sort of reflection of its thinking with regard to such an object, and after that groups of AIs could invent ways of agreeing on how to play with different objects, and so on.
"got addicted to TV"
yep, just like humans did
Carson Light_Lapse Wait till it gets addicted to smartphones
01:46 OH GOD NO an AI getting addicted to TV?! That's the first step to the AI learning about humanity and exterminating us.
@Neon Rogue No, it actually did get addicted to the TV. Its artificial curiosity would make it seek out novel experiences, and the TV provided plenty of new input, more than any other part of the maze.
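A rough way to see why a curiosity-driven agent gets stuck on a TV, as a sketch only: this uses a simple count-based novelty bonus (1/sqrt(visit count)) as a stand-in, which is an assumption for illustration and not the curiosity mechanism of the agent in the video. The `corridor` and `tv` observation sources are hypothetical.

```python
import math
import random
from collections import Counter


def novelty_bonus(counts, observation):
    """Count-based bonus: the more often something was seen, the less it pays."""
    counts[observation] += 1
    return 1.0 / math.sqrt(counts[observation])


def total_bonus(observe, steps=1000, seed=0):
    # Sum the novelty bonus collected by staring at one observation source.
    rng = random.Random(seed)
    counts = Counter()
    return sum(novelty_bonus(counts, observe(rng)) for _ in range(steps))


def corridor(rng):
    # A static part of the maze: only four distinct views.
    return ("wall", rng.randrange(4))


def tv(rng):
    # A TV showing random static: practically every frame is new.
    return ("tv", rng.getrandbits(64))


corridor_total = total_bonus(corridor)
tv_total = total_bonus(tv)
print(f"corridor: {corridor_total:.1f}  tv: {tv_total:.1f}")
```

The corridor's bonus dries up as its few views repeat, while the TV keeps paying the full bonus on almost every frame, so an agent that only chases novelty has no reason to walk away.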
So I haven't had any formal training in machine learning (next semester though!), but as humans we look at the screen and perceive objects, not individual pixels. Why don't we give the AI that advantage? I feel like it is harder for an AI to learn like us if it can't perceive like us. I could be completely wrong and that's exactly what they already do, so sorry if this is a stupid question. XD
The intermediary layers in a convolution network do that: They abstract the pixels to objects (or other concepts).
Agent 57 is also part of the group of AIs that can transfer skills.
Just imagine what will be possible two papers down the road.
The PRNG in the Atari 2600 games I’ve analyzed only generates 8 bits of randomness and repeats after 256 values. Is it possible the AI is learning something about the bias and limited randomness and using that for an advantage a human wouldn’t have? Still impressive, but I’d love to test the AI after swapping out the PRNG and see what it does...
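To make "repeats after 256 values" concrete, here is a small sketch of a generator with one byte of state. The constants are picked for illustration and this is not the generator from any actual cartridge; it is just a toy 8-bit linear congruential generator that happens to have the full period of 256.

```python
def lcg8(state):
    """One step of a toy 8-bit LCG (illustrative constants, not from a real game)."""
    return (5 * state + 1) % 256


def sequence(seed, n):
    # The first n outputs starting from the given seed.
    out, state = [], seed
    for _ in range(n):
        state = lcg8(state)
        out.append(state)
    return out


def period(seed):
    # Number of steps until the state returns to the seed.
    state = lcg8(seed)
    steps = 1
    while state != seed:
        state = lcg8(state)
        steps += 1
    return steps
```

With only 256 possible states, the whole "random" stream is a fixed loop, so in principle anything that can observe enough of it could learn the loop. Whether the agent in the video actually does that is an open question, as the comment says.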
Robert miles talked about this 2 years ago and here we are
A huge jump in AI technology
"would you want to know your grades? no grades!, but he tells you that you failed"
that's common in the Mexican education system (that's where I live); teachers often won't give us grades, so we have to fly blind until they tell us that we failed the class because our homework was terrible, even though we never knew whether it was OK or not when we needed to know
Huh... So for more advanced games it will have to read text and understand it? Will this be how AIs figure out the connection between sentences and objects? Differences between descriptions and tasks?
Did they write about what they want to do next?
I asked GPT-2 what it would like to do. It said it wanted to read a book about magic and religion and magic.
Sir, your video is great 👍 👌
Oh no, he converted someone who was about to develop a coronavirus cure into an ML engineer
Molecular biology is directly responsible for the renaissance of AI research. Genetics would be impossible without it, much less genetic engineering.
Thank you for your insight as always Dr Twominutepapers, very cool
When you know you won't spell his name right...
@@ДаниилРабинович-б9п … copy it from the description.
I want to see an AI beat "Raiders of the Lost Ark" on Atari 2600
Come on... Two more papers down the line: AGI. 2020 could use more good news! :)
That's funny that Video Pinball was the easiest game for the AI to win at. I remember playing that game, and it was possible to get good enough to steer the ball off the left pop bumper, up through the rollover above it, off the roof, back through the rollover to the bumper hundreds of times. But one had to count the number of times carefully because the rollover counter was only 1 byte and would itself "roll over" at 256 passes. I wonder if the AI was good enough to find and exploit this pattern but also to not overflow the rollover counter.
I also wonder how the AI deals with games where the score counter itself can roll over. Does it know it's still "winning" when the score drops to zero?
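For reference, a sketch of what a one-byte counter does at 256, plus one possible way a training setup could avoid punishing a score rollover. The `reward_from_scores` helper is hypothetical; the video does not say how (or whether) rollover is handled, so treat it as an assumption.

```python
def bump(counter):
    """Increment a counter stored in a single byte; 255 + 1 wraps to 0."""
    return (counter + 1) & 0xFF


def naive_reward(prev_score, curr_score):
    # Plain difference: a rollover looks like a huge loss.
    return curr_score - prev_score


def reward_from_scores(prev_score, curr_score, modulus=256):
    """Difference modulo the counter size, so a wrap counts as a gain.

    Assumes the score never decreases and grows by less than `modulus`
    between two observations.
    """
    return (curr_score - prev_score) % modulus
```

For example, going from a displayed 250 to a displayed 4 is a naive reward of -246, but a wrap-aware reward of +10.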
the atari games look kinda cool
also deepmind's name easter eggs are funny
Still need a physical arm holding the controller. It's like comparing an RNG to dice. Of course, it's fantastic research, but don't compare it 100% to a human saying that it's better. It doesn't need stress or blood or w/e, but using a physical controller is very different than playing without.
I wonder if they specifically picked Atari games because they have fewer pixels to analyze. Maybe you could downscale a more modern game and have the AI play that? I'd like to see how that turns out.
This is really amazing. According to most predictions, Go champions should still be unbeaten, and yet there is a computer that plays Go, chess and Atari better than humans, all self-learned.
We need an AI to make arbitrary code execution on NES and SNES.
Keep in mind that the maximum episode length for the agent was 30 min, while the max episode length for humans was 5 min (probably would not change too much for humans because they get tired, but most Agent57 scores are just impossible to get in 5 minutes).
Also comparing to "Average Human" is kinda strange: a chess program making legal moves would beat an average human in chess.
Yeah, train this one on GTA and you will see how the Terminator is being programmed
Some people bring us closer to Skynet, some try to create the Terminators, and some figure out that the best strategy is to make a tunnel and break the bricks from inside.
All those people put us in great danger.
What?
This is simply beautiful