OpenAI Five: When AI beats professional gamers

Arxiv Insights

26K views

Published: Dec 31, 2024

Comments • 89

@paulstevenconyngham7880 6 years ago ⁺³⁵
an order of magnitude better than the Taj Ravals explanation.
@abcqer555 6 years ago ⁺¹
Agreed
@DragonofStorm 5 years ago
yeah, this is much more coherent and focused
@EdViaja 6 years ago ⁺²⁰
excellent video... You are really good at explaining the news in AI. Good job!
@wyattaxton4085 3 years ago
instablaster...
@nateshrager512 6 years ago ⁺⁶
Fantastic video! Best reporting and breakdown of this event I've seen. Bravo!
@johannesCmayer 6 years ago ⁺⁴⁶
your audio is really quiet compared to other videos, maybe look into that.
@ArxivInsights 6 years ago ⁺⁷
Johannes C. Mayer You know what, I've noticed that too but I was expecting YouTube to normalize the audio volume but apparently it didn't.. And there's no way to adjust the overall volume after uploading. Bummer :p I guess people will just have to turn up volumes!
@loopuleasa 6 years ago
you might be able to do it retroactively
@that_tabby 6 years ago
Re-upload it with higher levels. In the long run this might be the best idea.
@ArxivInsights 6 years ago ⁺¹
André Tabarra Hmm tough choice. Can't right now anyway (5 day road trip :p) But isn't there an option to upload a different audio track? Not sure if you can fully replace the audio though..
@wlorenz65 6 years ago ⁺¹
Video debug info says "relative_loudness": "-17.609". So Google knows it, but instead of just running MP3Gain.exe from 20 years ago over the MP4, they have decided that loudness normalization should be the browser's job.
@shmigelsky 6 years ago
Wow, that was really good. Aside from diving into PPO (and why they used PPO vs DDPG or others), it would be really interesting to see your explanation of the full neural architecture diagram. For example, going through each pathway and what purpose it serves, why it is, how the gradient is flowing back through the network, etc. Links to the related content (i.e. github, diagrams, OpenAI videos) would also be helpful.
@FlyingOctopus0 6 years ago
Isn't the space of all actions that can be performed exponentially large? So I think the question of overfitting is a question of priors and what states/actions are considered equal. Like in images we use convolution to introduce spatial invariance, which makes translated images equal. Maybe we as humans also have some invariances that may make our "sampling" dense.
@abhijeetghodgaonkar 6 years ago ⁺³
When will the PPO video roll out? Did PPO have a big impact, considering that they achieved this good result by partially overfitting the huge training data?
@ArxivInsights 6 years ago ⁺²
RouxNBlind PPO video will be out in about three weeks I think (gotta write out a small script, record the video and edit, all during my spare time :p). But PPO definitely had a large role to play since it is such a stable RL algo!
@markoavramovic1659 6 years ago
Roshan upon death drops an item that a single person can pick up that revives them upon death once, and gives some gold to the entire team, but there are no buffs for entire team unlike in lol. ( unless you consider getting 200 gold a buff )
@ArxivInsights 6 years ago
Thx for pointing this out! As you can see, I'm not a Dota II player myself :)
@tomfillot5453 6 years ago
I'm really interested in how they are dealing with incomplete information. I'm getting into ANNs a bit, and the thing I have the most problems with is dealing with a variable amount of information. Data structure basically (how are qualitative data encoded ?!) Booleans only take you so far, and not knowing if there is something is very different from knowing there is nothing.
Somehow, it doesn't seem to have really understood how to predict movements, since the ward strategy lacks any comprehension of critical passages to keep in sight. Maybe just sheer probability and a strong formation of the 5 champions to respond quickly.
@1forgiven2 6 years ago
Some model structures can deal with variable-sized input data, but what often happens is that data is "forced" into fixed-size features, such as cropping images to identical sizes, embedding words into vectors and padding sequence data to a certain length.
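The padding idea in the reply above can be sketched in a few lines of plain Python. This is only an illustration: the function name `pad_sequences`, the pad value and the mask convention are assumptions made here, not anything taken from OpenAI Five. The mask is one common way to keep "there is nothing here" separate from "this slot is just padding", which is the distinction the question above asks about.

```python
def pad_sequences(seqs, max_len, pad_value=0):
    """Force variable-length sequences into a fixed length.

    Returns (padded, mask). mask[i][j] is 1 where padded[i][j] holds
    real data and 0 where it holds padding, so a model can tell
    "unknown / absent" apart from a genuine value equal to pad_value.
    Hypothetical helper for illustration only.
    """
    padded, mask = [], []
    for seq in seqs:
        kept = list(seq)[:max_len]      # truncate sequences that are too long
        n_pad = max_len - len(kept)     # number of padding slots needed
        padded.append(kept + [pad_value] * n_pad)
        mask.append([1] * len(kept) + [0] * n_pad)
    return padded, mask


if __name__ == "__main__":
    data, mask = pad_sequences([[3, 4], [5, 6, 7, 8]], max_len=3)
    print(data)
    print(mask)
```

Feeding the mask to the network alongside the padded data is a design choice; some architectures instead handle variable-length input directly.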
@hck1bloodday 6 years ago
"Somehow, it doesn't seem to have really understood how to predict movements, since the ward strategy lacks any comprehension of critical passages to keep in sight."
Ward and dust usage is an artifact, like the Roshan check every now and then. The bots have fixed item builds, and they bought wards and dust because it is hardcoded. If the inventory is full, they just use the consumable item to free up a slot when the courier brings new items, so they ward/dust wherever they are at that moment.
The OpenAI Five team has said that they are working to lift the hardcoded build limitation and train the bots to learn a proper build for each hero in each match. Let's see if it works at The International ;)
@abcqer555 6 years ago
Fantastic video!! Please keep up the good work and keep them coming :)
@2DReanimation 6 years ago
Thank you for this very clear, thorough but concise summary!
@purelogic4533 6 years ago ⁺¹
Enjoyed the explanation.. now I can appreciate what the model is achieving on a dense space. Basically if you train on a universe of all possible actions, with the reward function penalizing and strengthening actions accordingly, you can outdo human performance. But this is nothing more than a brute force implementation.
@this_is_private 5 years ago
what about communication? did the ai communicate with each other?
@ivansummers7259 6 years ago ⁺¹
As always good quality content. Audio levels are low though. Keep up the good work.
@MurloKX 6 years ago ⁺⁴
Umm why is the reward for killing an opposing hero negative?
@Muskar2 6 years ago ⁺¹
"This score supplements the score for the gold/experience gained. The explicit "Kill" score is negative to reduce the agents' reward received by a kill, but the total is still positive."
@Muskar2 6 years ago ⁺¹
I'm guessing their bots become too obsessed with killing without it.
@MurloKX 6 years ago
That makes sense. I was guessing it would be something to discourage them from going only after kills. Thank you!
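The arithmetic behind the quoted note can be sketched as follows. All weights and amounts below are illustrative placeholders chosen for this example, not OpenAI's published values, and `kill_event_reward` is a hypothetical helper. The point is only that a negative explicit kill term can trim the reward for a kill while the total stays positive.

```python
def kill_event_reward(gold_gained, xp_gained,
                      w_gold=0.01, w_xp=0.005, w_kill=-0.5):
    """Reward for a single kill event under assumed, made-up weights.

    resource_part: reward from the gold and experience the kill yields.
    total:         resource_part plus the explicit (negative) kill term.
    """
    resource_part = w_gold * gold_gained + w_xp * xp_gained
    total = resource_part + w_kill
    return resource_part, total


if __name__ == "__main__":
    resource_part, total = kill_event_reward(gold_gained=200, xp_gained=300)
    # The kill is still rewarded, just less than gold/XP alone would suggest.
    print(resource_part, total)
```

With these placeholder numbers the kill term removes part of the gold/XP reward without flipping its sign, which matches the reading that the term exists to damp kill-chasing and not to punish kills.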
@Dr_Menny 6 years ago
Very interesting video. Thank you very much!
I'd like to know how the 5 AIs cooperate. As it's written on the OpenAI blog, there's just a global parameter for "teamplay". This means that each AI works alone, right? When does this "teamplay" parameter become important?
@ArxivInsights 6 years ago ⁺¹
Each hero is played by a different Neural Net (but they do share parameters in their input processing pipeline). There is no explicit communication, but with the team-spirit hyperparameter approaching 1, all bots basically share the same goal function so their behavior automatically becomes in-tune. There might actually be very subtle ways the bots communicate with each other that emerge through training but it's rather hard to check this.
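One way to read the team-spirit description above is as a blend between each hero's own reward and the team average. The sketch below assumes exactly that linear blend. The function name `blend_rewards` and the formula are assumptions for illustration and are not confirmed details of OpenAI's implementation.

```python
def blend_rewards(individual, team_spirit):
    """Blend per-hero rewards with the team mean.

    team_spirit = 0.0 -> every hero optimizes only its own reward.
    team_spirit = 1.0 -> every hero receives the same (mean) reward,
                         i.e. all heroes share one goal function.
    Assumed formula: r_i' = (1 - t) * r_i + t * mean(r).
    """
    if not 0.0 <= team_spirit <= 1.0:
        raise ValueError("team_spirit must be in [0, 1]")
    team_mean = sum(individual) / len(individual)
    return [(1.0 - team_spirit) * r + team_spirit * team_mean
            for r in individual]


if __name__ == "__main__":
    rewards = [1.0, 0.0, 0.0, 0.0, 0.0]   # only one hero earned anything
    print(blend_rewards(rewards, 0.0))     # selfish: rewards unchanged
    print(blend_rewards(rewards, 1.0))     # fully shared: identical rewards
```

Under this reading, cooperation needs no explicit message passing: as the parameter approaches 1, helping a teammate earn reward becomes as valuable to a hero as earning it directly.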
@Dr_Menny 6 years ago
Clear, thank you for your answer! Waiting for your next video :)
@varshneydevansh 6 years ago
You're doing great man.
Soon, I am also coming.
Love your work.
@poojanpatel2437 6 years ago
Just waiting for your video.
@JonathanCGroberg 6 years ago ⁺¹
To be fair they threw together 5 random high ranked players who don't play together. Obviously, the game comes down to teamwork. Also, the players were not allowed to watch any games of the bots; as you can tell, the second game they did much better. In the pro scene players study the current meta where they watch other teams play hundreds of games. Also it was the 3rd game that Twitch chat picked the draft for the bots.
@ArxivInsights 6 years ago ⁺¹
Jonathan groberg Agreed, the pro team was not top human performance, but with these kinds of AI systems, once you get close to expert level, it's usually simply a matter of scaling before you surpass that. And as for counterplay, I'm really certain that given one week with the bots, a pro team would be able to find weaknesses indeed. But still, from an AI perspective, this is an impressive achievement!
@ArxivInsights 6 years ago
And yes, I misspoke (game 2 vs 3) while recording and was too lazy to redo the take :p
@jennyxie5382 6 years ago
It is still an amazing feat, considering the machine learned by itself and is able to play Dota. It is something you would not have thought possible 10 years ago, when AI was very stupid and did not understand stuff.
I feel like the AI is able to mimic a lot of concepts and ideas that make us human ~.~ watching the games.
@insidetrip101 6 years ago
After having studied AI a (ridiculously) small amount, I can now totally understand why someone like Chollet would say what he's saying.
However, since I've not actually gotten into this stuff any further (as of yet) than to understand the math and how to create relatively simple classifier networks, I still remember what it was like to be a complete layperson and not understand what's going on.
My point is that before, I thought the Singularity was just around the corner. Now, I think it's still imminent because our path is still in constant progress; however, I'm not sure we're thinking about it in the right way.
To Chollet I would ask, what makes us think that we don't require a "ridiculous" amount of "training" data before we start making decent decisions? Sure, we might not require thousands of years of play to be able to become pro at Dota 2 (although there are many people who could play for that time and not become pro), but we do require almost a year before we even start to say a handful of one-syllable words. Two to three (or even more) years before we can formulate simple ideas. 10, 15, 20 (sometimes never) years before we have the capacity to start understanding complex mathematical, scientific, philosophic, etc. ideas.
So, yeah, we do require a LOT less "training" data than OpenAI does for Dota II, but the difference--at least in my still layperson opinion--seems to me to be in degree rather than dimension.
Are there some clever tricks that we're going to be able to do to use less training data, make the models more general, etc.? Of course; but I'm not certain that the difference between the tricks we use today, and the tricks we use tomorrow, will be much different than the hardware we use today versus the hardware we use tomorrow.
It's very possible that this is it, and we just need years of our own training data on building particular AI systems before we'll end up making a general one . . .
After all, there is a way in which--if we have enough computing power--we could just define these networks to recursively interact with each other according to the specific task needed. Of course, there's emphasis on the computing power there, but what's really going to be the difference there between the millions upon millions of years of evolutionary history that was required before we walked the Earth?
It seems to me that we're kind of putting intelligence on a bit of a pedestal.
@ArxivInsights 6 years ago
I think the main thing we currently lack in AI is generalization and transfer learning. You're absolutely right that human infants take years to learn how to see, walk and talk. But once they've mastered those skills, humans are incredibly good at using those as a platform for building towards more complicated skills. Our intelligence seems to be an intricately connected system of highly modular and reusable blocks.
Current ML approaches have very limited transfer learning capabilities, so for every new (hard) task, you basically have to start from scratch and learn everything end-to-end, hence the low sample efficiency.
Solving this is currently the biggest open problem in AI research and there aren't any fundamental solutions to it (yet).
But as you can imagine, fixing this would hurtle AI capabilities from current levels right past superhuman levels in the blink of an eye..
@imranrashid2890 6 years ago
Could you make a video on Deepmind's Logic Units paper?
@wiiiiktor 6 years ago
waiting for the 2nd part! :-) about the proximal policy
@chrishare 6 years ago
Great video, mate.
@peschebichsu 3 years ago
Great video. Always impressive to see such a thing. As a LoL player I'm a bit sad it's Dota and not League of Legends. Is there really no good content about an AI trying to beat humans in LoL (I think there is one called AlphaLOL, but couldn't find any good commentary about it)?
@arkasaha4412 6 years ago
I love your analysis man, would you consider making videos on various reinforcement learning algorithms?
@ArxivInsights 6 years ago ⁺³
Arka Saha Well, a video on PPO is up next! After that I might do some stuff on representation learning since that's the area I'm currently doing a PhD in. Obviously closely related to RL.. But I'm gonna host some community polls to see what other things I could cover!
@arkasaha4412 6 years ago
Thanks for the reply, looking forward to your next video. :)
@hwhd 6 years ago ⁺¹
Great video!
@bosr 6 years ago
Great video! Thanks a lot.
@areallyboredindividual8766 4 years ago
I wonder if it would be possible to train an AI to play Deus Ex
@paulstevenconyngham7880 6 years ago
hey man, love your work - don't want to sound unappreciative, as I know how much time it must take to make one of these, just wondering if your PPO video is still coming?
@ArxivInsights 6 years ago ⁺¹
Script is done, filming this weekend, hopefully uploaded next week! I'm slacking lately, I know :p
@paulstevenconyngham7880 6 years ago
ruclips.net/video/VZ2HcRl4wSk/видео.html
@theecherry9115 6 years ago ⁺¹
Wow ! Nice Job!
@abhijeetghodgaonkar 6 years ago
I love Dota and Go, and I am working in ML, and I loved the video!
@iblabla60 6 years ago
Very interesting. Keep going
@norlesh 5 months ago
Had to LOL watching this in 2024 when OpenAI was introduced as a non profit during the intro.
@ThibaultNeveu 6 years ago
Thanks!
@venugopalbv2069 4 years ago
Audio is very low
@marekforst8358 6 years ago
16:00 omg!! people who are not 99.5 percentile and above will always think that the bots were confused. THEY WERE NOT! they actually played really, really well. and did the best thing that they could with their shitty picks
EDIT: ofc they made mistakes, but overall they played much better than any human would.
@randomdude4136 6 years ago
Much better than any human in some aspects yes, like reaction speed to blink engages, but in many many other areas no. They made mistakes mid tier matchmaking players wouldn't make.
@wiiiiktor 6 years ago
cool video, thnx!
@tomtorger9502 5 years ago
Great videos! I hope you do a comparison to AlphaStar :-)
@shubhpatni2123 6 years ago
waiting for the championship
@codeincomplete 6 years ago
Wouldn't the Dota professional players be an example of overfitting as well? ;)
@curtleyambrose100 6 years ago
Good job my friend. Here's a virtual burger for you as a reward. "Burger".
@codymaverick94 6 years ago ⁺³
"simple video game" really?
@ArxivInsights 6 years ago ⁺²
As replied before, that quote was primarily aimed at people who have never played video games and would be surprised that research organisations spend millions on creating AI systems to tackle those. I've played many hours of League of legends so in no way did I intend to undermine the complexity of these games, they are great! But you have to keep in mind that some people wouldn't necessarily understand why 'video games' can be so complex for AI systems!
@randomdude4136 6 years ago
Not trying to take anything away from this development, but this is a bit of an overexaggeration of the strength of the human team. Yes, they are in the top 0.5-1 percentile of players, but you have to take into account that the average "player" probably has around 20-40 hours in Dota. It's a free game after all and is constantly advertised on Steam. If I took a high ELO team of randoms in Divine 5 from Dota's matchmaking system, they would likely beat this human team as well. In Go terms I'd say they were 1-2 dan at the very, very most, and if I was being real, probably in the non-pro dan ranks.
@ArxivInsights 6 years ago
Fair point! And as we've seen, OpenAI Five indeed isn't strong enough yet to beat professional teams. But for many benchmarks in AI, figuring out how to bridge the gap between "absolute crap" and "decent" performance is usually much harder than going from "decent" to "superhuman". In many cases simply scaling up compute will do the trick. However that has been the case for many past challenges. Perhaps we're starting to hit the boundaries of what scaled up 'curve fitting' can do?
@randomdude4136 6 years ago
Agreed, I'm sure OpenAI will be much more powerful next year. Even if it still doesn't manage to beat pro teams, just expanding the hero pool to a range much larger than the current 15 would be impressive.
@xinofiero7829 6 years ago
Lol he just said dota 2 is a simple game 😂😂😂😂😂😂
@perceptoshmegington3371 4 years ago
Those NA casters aren't pro lol, OG (the world champions) destroyed OpenAI 5.
@Plaxer02 3 years ago
the AI destroyed OG not the other way around. Even though it's 1 year later, the AI is right now literally unbeatable, and changed the way Dota gets played
@kodokbeku45 6 years ago
dota 2 simple game?? you may be drunk while making that video
@loopuleasa 6 years ago ⁺⁶
"why would they spend so much time to research AI for simple video games"
Dude, do you even know what the game is about? Dota is not simple at all, it's one of the most brutal multiplayer games out there, and people reach 5 thousand hours plus while barely being good at the game. "Trivial and uninteresting", that is false.
Inaccurate on the intention of Dota2 OpenAI. The team didn't want to benchmark using dota2 as an environment, they wished to tackle a very complex and generalistic game like dota2 in order to advance the frontier.
If you go to the OpenAI blog, they used the same architecture to train robot arms to manipulate objects in a smooth fashion.
Source: Software Engineer, Dota2 player and watcher, OpenAI fanboy, and I watched the live bot game and read all their blogs
@ArxivInsights 6 years ago ⁺¹⁶
loopuleasa That quote was primarily aimed at people who have never played video games and would be surprised that research organisations spend millions on creating AI systems to tackle those. I've played many hours of League of legends so in no way did I intend to undermine the complexity of these games, they are great! But you have to keep in mind that some people wouldn't necessarily understand why 'video games' can be so complex for AI systems!
@tomfillot5453 6 years ago
What I would love to see as a Starcraft 2 player, is if a similar method can be used to "solve" SC2. I have a feeling that SC2 is a tougher thing.
For example, I feel like a good portion of the skill of OpenAI was having 130 years+ of match-up analysis to get the edge right away. No such advantages could be gained on starcraft, especially in mirror match-ups.
Also, it would be hard in a BO5, but I'm guessing sufficient analysis could break even that one. Could a good team, training against OpenAI for a longer time, learn how to make it trip ?
And if they hard code a build in the starcraft ai, it may become way too predictable. Though Innovation, arguably one of the strongest terran, is known to do basically always the same build and winning anyway.
@PhoenixKDIX 6 years ago ⁺³
Starcraft seems way, way easier to me. Dota has more units, abilities, items, discrete actions, etc.
I legitimately think Starcraft's Open AI would be beating everyone at this point already, especially if they relaxed the whole "we want to give humans a chance in reaction time". A bot could execute completely flawless macro and micro at the same time, giving itself an advantage literally every second the game continues. It would have perfect knowledge of the attack range of every unit, not having to eyeball anything, but knowing exactly when and where to move each individual unit backwards or forwards for perfect damage splits and concaves. I don't think Starcraft is even half as interesting as Dota as a platform for advancing the field's capabilities.
@saurabhjhanjee2408 6 years ago
Kuran Nasir in the highest levels of Starcraft, being able to micro and macro impeccably is only part of the fight, especially in starcraft 1 which is said to be almost perfectly balanced as there is a counter for every strategy.
@ShiftingSkys 6 years ago
Lmao Simple video games. Dota 2 is far from simple.
