A. I. Learns to Play Starcraft 2 (Reinforcement Learning)

  • Published: Apr 22, 2022
  • Tinkering with reinforcement learning via Stable Baselines 3 and Starcraft 2.
    Code and model: github.com/Sentdex/SC2RL
    Stable Baselines 3 tutorial: pythonprogramming.net/introdu...
    Neural Networks from Scratch book: nnfs.io
    Channel membership: / @sentdex
    Discord: / discord
    Reddit: / sentdex
    Support the content: pythonprogramming.net/support...
    Twitter: / sentdex
    Instagram: / sentdex
    Facebook: / pythonprogramming.net
    Twitch: / sentdex
    #artificialintelligence #machinelearning #python

Comments • 308

  • @serta5727
    @serta5727 2 years ago +17

    I have to say you make the most understandable learning materials
    Your website together with the videos. All the code is there, the book, the playlists from scratch. Most professional educators can’t do this 🤗

  • @awsamalmughrabi860
    @awsamalmughrabi860 2 years ago +1

    I like how in depth this video is, really enjoyed it!

  • @JohnJackKeane
    @JohnJackKeane 2 years ago +2

    I do not code or have the desire to code, but this video is beautiful. I enjoy StarCraft videos seeing people micromanage, but the thought and process that goes into creating a “program” to do the same thing is fascinating. The amount of work and work to obtain the knowledge that goes into the work is far underrated. I hope for you the best!

  • @fuba44
    @fuba44 2 years ago +277

    Very interesting idea with a macro ai and a strategic ai, sort of working in tandem forming a symbiotic relationship of sorts.. could maybe even break that down even further, like on a per unit type basis... Tho i imagine the complexity explodes at that point.

    • @sentdex
      @sentdex 2 years ago +55

      We have very few unit types, at least here. For the full game, there are more, and even here I wasn't utilizing all the things a voidray can actually do, but certainly there are ways to have a "voidray" algo and a "probe" algo...etc. Definitely something to think on.

    • @hikari1690
      @hikari1690 2 years ago +12

      This sounds like how deepfakes work. Have 2 ai models compete with each other to improve each other.
      So the macro ai needs to try to defeat the strategy ai and vice versa

    • @prodj.mixapeofficial6431
      @prodj.mixapeofficial6431 2 years ago +3

      I believe Dota has 5 controllable units, with an individual OpenAI agent per unit, and modified communication between the 5 to mimic real human gameplay.

    • @Dethek
      @Dethek 2 years ago

      When I was looking into the AI for starcraft i was thinking of the following:
      Overarching AI - makes final decision on what action to take
      Supported by:
      Strategy AI - use training from professional replays to assess based on what player has seen, what is their likely strategy, and then choose strategy based on that
      Macro AI
      Micro AI

    • @TheFalconerNZ
      @TheFalconerNZ 2 years ago

      @ccriztoff Get his book lol ;-)

  • @kailalueni3251
    @kailalueni3251 8 months ago +1

    I love your idea of drawing your own minimap! That's a smart way to make more information available easily.

  • @fuba44
    @fuba44 2 years ago +28

    This was an interesting video. I will have a look at your example code for sure, wanna try to tinker a bit. Thanx for all your hard work.

  • @Derrekito
    @Derrekito 2 years ago +9

    Never before has a marketing ploy worked so well on me. I'm looking forward to receiving the hardcover version of the book!

  • @sebbes333
    @sebbes333 2 years ago +84

    *__* One thing I feel is missing from the map, is a kind of "ghost" of where enemies have been seen previously, which could become "points of interest" for scouting in the future.
    The "ghosts" could "fade" over time, but never fade to zero again (capped at minimum 1, starting at like 255 or something), to make the algorithm prioritize the most recent ghost locations.
    Also, instead of scouting with void rays, wouldn't it be cheaper to scout with drones (to generate ghost areas)? Scouting would probably target mineral areas without ghosts, to see if enemies have expanded, while voidrays can scout areas WITH ghosts, to see if the enemies are still there & try to defeat them there. You can also send a probe first to a ghost area, to determine enemy strength before attacking.
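The fading "ghost" layer described above could be sketched roughly as follows, assuming a 2D uint8 grid the same size as the drawn minimap. The names `mark_sighting` and `decay_ghosts` and the decay step are hypothetical, not taken from the video's code.

```python
import numpy as np

GHOST_MAX = 255  # value written when an enemy is sighted
GHOST_MIN = 1    # floor for any cell that has ever held a ghost
DECAY_STEP = 5   # hypothetical fade per update


def mark_sighting(ghosts: np.ndarray, row: int, col: int) -> None:
    """Record an enemy sighting at full intensity."""
    ghosts[row, col] = GHOST_MAX


def decay_ghosts(ghosts: np.ndarray) -> None:
    """Fade ghosts in place; cells that were never sighted stay at 0."""
    seen = ghosts > 0
    # Work in a wider integer type so the subtraction cannot wrap around.
    faded = np.maximum(ghosts.astype(np.int16) - DECAY_STEP, GHOST_MIN)
    ghosts[seen] = faded[seen].astype(np.uint8)


ghosts = np.zeros((4, 4), dtype=np.uint8)
mark_sighting(ghosts, 1, 2)
for _ in range(100):
    decay_ghosts(ghosts)
```

A layer like this would be one more input plane for the model; whether it actually helps scouting is something only training could show.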

    • @Lithane97
      @Lithane97 2 years ago +6

      Better yet, just train an observer to scout ghosts, it's almost like they're made for that 👍
      Wouldn't even require any logic really, just if ghost entity train observer and have it sit there all game.

    • @achtsekundenfurz7876
      @achtsekundenfurz7876 2 years ago +2

      I can imagine some ways to refine the AI using more inputs:
      -- time elapsed since game started (there's hardly any risk of attack at all in the 1st minute, but at a late stage, the risk is much higher),
      -- current resource totals (letting resources sit in the "bank" is usually worse than expanding the economy or forces),
      -- # of "ghosts" on the map (where enemies were sighted and lost again).
      About rewards and penalties, I'd suggest the following:
      -- adjust the reward/punishment for victory/defeat: a "good" AI should aim for a quick victory, but not at all costs. Maybe set the victory reward to 24,000 / sqrt(seconds played) and cap at 1000 (i.e. don't reward any higher for games lasting
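As a worked example of the victory reward proposed above, here is a minimal sketch of 24,000 / sqrt(seconds played) with a cap of 1000. The function name `victory_reward` is hypothetical; only the two constants come from the comment.

```python
import math


def victory_reward(seconds_played: float) -> float:
    """Win reward of 24,000 / sqrt(seconds played), capped at 1000."""
    if seconds_played <= 0:
        return 1000.0  # guard against division by zero
    return min(1000.0, 24_000 / math.sqrt(seconds_played))


quick_win = victory_reward(576)   # sqrt(576) = 24, so exactly at the cap
slow_win = victory_reward(2304)   # sqrt(2304) = 48, so half the cap
```

With these numbers, any win in 576 seconds or less earns the full 1000, and a win that takes four times as long earns 500.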

    • @tjw2469
      @tjw2469 3 months ago

      @Lithane97 if there is a raven+cyclone/raven+viking/missile turret then it's a dead observer

  • @Ammothief41
    @Ammothief41 2 years ago

    Thanks for putting all of that together. Looks neat.

  • @gavinmorton7682
    @gavinmorton7682 2 years ago

    this is such a cool project! would love to see this keep going

  • @protoplmz
    @protoplmz 2 years ago

    Hey! I love the update here. I followed the original series you put out. As a SC2 veteran I noticed deficiencies and deviated in a strong way halfway through. I set up separate models to handle the decision making for each aspect of the game. This makes it so it can make the decision to use its army separately from the decision of progressing tech (or not). I stopped around the time I couldn't figure out how to have it build its own strategies, as I ended up giving it a long set of possible actions and letting it pick, and it felt too 'guided'. It was able to beat "Very Hard" 50% of the time vs random's 0%.
    Was my first exercise with ML. I got the chance to apply the concept at work for something outside of my scope. Used both that and the SC2 project as demonstration in an interview and got a promotion out of it. This inspires me to try my hand at it again!
    EDIT: To handle army movement which you mentioned in the video, I chopped the maps up into a grid and gave it decisions to make where it could attack-move its army to any of these at will. 9 worked the best but you could make it much more granular. It used this to both attack and defend.
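The grid approach in the EDIT above might look something like this: split the map into cells and let each discrete action index map to the center of one cell. This is a sketch under assumed map dimensions; `grid_targets` is a hypothetical helper, not the commenter's code.

```python
def grid_targets(map_width: float, map_height: float, cells_per_side: int = 3):
    """Return the center point of every grid cell, row by row."""
    cell_w = map_width / cells_per_side
    cell_h = map_height / cells_per_side
    return [
        ((col + 0.5) * cell_w, (row + 0.5) * cell_h)
        for row in range(cells_per_side)
        for col in range(cells_per_side)
    ]


# Assumed 180 x 120 map; a 3 x 3 grid gives the 9 targets mentioned above.
targets = grid_targets(180, 120)
center_of_map = targets[4]
```

An agent would then pick an index from 0 to 8 and the bot would attack-move the army to `targets[index]`; raising `cells_per_side` makes the choice more granular at the cost of a larger action space.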

  • @pognar
    @pognar 2 years ago

    I have played starcraft for years and years, and I love this channel.
    This is going to be great.

  • @serta5727
    @serta5727 2 years ago +1

    Can’t get enough of learning this awesome stuff

  • @adityachawla7523
    @adityachawla7523 2 years ago +51

    Here is an idea: You can use more than 3 channels to give spatial information to your network. No need to limit yourself by the conventional idea of 3 channels! If you are worried about how to visualize this, just think of it as an extra map.
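A minimal sketch of what an observation with more than 3 channels could look like, using a NumPy array of shape (height, width, channels). The channel names are hypothetical examples, and whether a given library's CNN policy accepts more than 3 channels out of the box is something to check for that library.

```python
import numpy as np

# Hypothetical layers; each one is "an extra map" stacked on the others.
CHANNELS = ["structures", "units", "enemies", "minerals", "visibility"]


def empty_observation(height: int, width: int) -> np.ndarray:
    """Allocate one uint8 plane per channel."""
    return np.zeros((height, width, len(CHANNELS)), dtype=np.uint8)


obs = empty_observation(64, 64)
obs[10, 20, CHANNELS.index("enemies")] = 255  # enemy seen at row 10, col 20

# For visualization, any single plane can still be viewed as a grayscale image.
enemy_plane = obs[:, :, CHANNELS.index("enemies")]
```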

  • @serta5727
    @serta5727 2 years ago +1

    Wow congratulations I think what you did is amazing 🤩 I would like to do something like this for software testing for a while but it is so complicated

  • @faithful451
    @faithful451 2 years ago +2

    I'd love to see the next video in this series with dual macro and micro algorithms and improving the win percentage

  • @Neceros
    @Neceros 1 year ago +1

    This is great! I'd love to see if something like this could compete in the arena

  • @whitey9933
    @whitey9933 2 years ago

    Looks great, always been interested in the Alpha Star gameplay and how it manages all the different tasks.
    For the enemy search, you can focus on undiscovered minerals (enemies would normally congregate around mineral fields), which is probably better than random search.

  • @XmKevinChen
    @XmKevinChen 2 years ago +3

    It’s a very interesting video about the ML + gaming. As a newbie to this AI world, it also gives lots of incentive to continue learning.

  • @PathToPrestige
    @PathToPrestige 2 years ago

    I very rarely reply to these kinds of videos.. but hats off. Even though the project structure is messy, your genuine "realistic" practical approach was very enjoyable to watch.

  • @adye88
    @adye88 2 years ago +12

    This is freaking intense! also for the hunters problem: Why not make a "return to safe space" function for them when they detect enemies. That way they only perform scouting duties.

    • @adye88
      @adye88 2 years ago

      And obviously set a variable for safe space= position holding command center

  • @RickBeacham
    @RickBeacham 2 years ago

    Great stuff! Super interesting.

  • @BretBowlby
    @BretBowlby 2 years ago +1

    I like the ideas here, but be sure that you've got a task that can understand the advantage of having high ground vision (better attacking) vs not having high ground vision. Also, I'd consider having the model constantly scouting, as all information gained on the player's actions can lead to better counter attacks and so forth. But yeah I'm loving this. keep'em coming!

  • @Singularitarian
    @Singularitarian 2 years ago +1

    Very illuminating!

  • @binxuwang4960
    @binxuwang4960 2 years ago +1

    Already super impressive that you could do rl for macro level strategy!
    Totally agree that to solve a certain problem, how to formulate the state, action and reward is key

  • @Mutual_Information
    @Mutual_Information 2 years ago +3

    Wow I'm literally working on a series on RL theory and I was just wondering how the hell you'd code things up to actually play Warcraft 3. Starcraft 2, close enough! Such a useful channel

  • @VaSoapman
    @VaSoapman 2 years ago +44

    Why not give rewards based on how many enemy units/buildings are destroyed?
    Then give a penalty based on how many of its own units/buildings are destroyed?
    Also to help the AI prioritize winning over stalling, you could increase the value of a win based on how fast it won.
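One way to sketch the reward shaping suggested in this comment: a per-step reward from kills minus losses, plus a win bonus that shrinks the longer the game runs. All names, weights, and the 30-minute horizon below are hypothetical choices for illustration, not values from the video.

```python
def step_reward(enemy_destroyed: int, own_destroyed: int,
                kill_weight: float = 1.0, loss_weight: float = 1.0) -> float:
    """Reward kills and penalize losses since the previous step."""
    return kill_weight * enemy_destroyed - loss_weight * own_destroyed


def win_bonus(seconds_played: float, base: float = 500.0,
              time_limit: float = 1800.0) -> float:
    """Scale the win bonus from 2x base (instant win) down to 1x base."""
    remaining = max(0.0, 1.0 - seconds_played / time_limit)
    return base * (1.0 + remaining)


trade = step_reward(enemy_destroyed=3, own_destroyed=1)
fast = win_bonus(900)    # halfway to the limit
slow = win_bonus(3600)   # past the limit, so only the base bonus
```

The weights decide how aggressive the resulting agent is likely to be; a loss weight much higher than the kill weight could teach it to avoid fighting altogether.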

    • @nrobo3840
      @nrobo3840 2 years ago +6

      Yeah, adding a time decay to the win reward was where my mind immediately went.

    • @moseszero3281
      @moseszero3281 2 years ago

      I was thinking a k/d reward and a lowering of all rewards for time

  • @ericzahn274
    @ericzahn274 2 years ago

    Great vid. Buying the book.

  • @SocalNewsOne
    @SocalNewsOne 1 year ago

    Thanks! Your tutorials were the first that worked for me. Biggest problem that I had was the directory path for the Starcraft maps.

    • @sentdex
      @sentdex 1 year ago

      Thank you for the super!

  • @robanson32
    @robanson32 2 years ago

    Great book! Love my copy

  • @wootcrisp
    @wootcrisp 2 years ago

    Nicely done.

  • @Stthow
    @Stthow 2 years ago

    Amazing video dude. Gj.

  • @kevintyrrell7409
    @kevintyrrell7409 2 years ago +3

    14:49 That's some next-level Gateway placement.

  • @cheddar500
    @cheddar500 2 years ago

    Very satisfying to watch

  • @kylee.7654
    @kylee.7654 2 years ago +1

    At 4:52 regarding your comment, I added
        async def on_start(self):
            self.last_sent = 0
    after the on_step function. It makes it a little clearer

  • @serta5727
    @serta5727 2 years ago

    Thanks for the amazing content

  • @BasicAndSimple
    @BasicAndSimple 2 years ago

    Book Purchased. Thanks

  • @EnderSword
    @EnderSword 2 years ago +84

    Kind of neat, I'm wondering if you looked at the AlphaStar research at all to do this, or looked into the StarCraft 2 AI community? There's about 70 coders of various bots and AI that compete against each other and it'd give you a ton of ideas on build choices and especially unit control and decision making.

    • @Leonhart_93
      @Leonhart_93 2 years ago +8

      The AI coders in the community don't make true AI, they just give them a set of commands and responses to various actions. A true AI learns from successes and failures (reinforcement) with very little initial programming.

    • @PeterRAmice
      @PeterRAmice 2 years ago +6

      @Leonhart_93 while this has some truth to it, what you are referring to is machine learning. The ai spectrum is much wider than learning like a human, the best way of describing ai imo is: a machine which observes its environment and executes actions which maximize its goals. So with that definition in mind I would argue those people are actually building ai's which do not automatically learn from their past experiences and thus they do not build machine learning ai's, which alpha did.

    • @Leonhart_93
      @Leonhart_93 2 years ago +5

      @PeterRAmice We just called bots that follow specific sets of instructions AI in the past out of laziness and limited understanding. It doesn't apply to current times anymore, fake AI and true AI have almost nothing in common. We can't use the same word to describe them both, so a "bot" is proper for the fake AI.

    • @Leonhart_93
      @Leonhart_93 2 years ago +1

      @string name; Yes, bots. I played vs the top bots of the sc2 bots community, they are really good. They won't be easy unless you are at least masters, which is impressive for a bot. The major problem with those fake AI is that they can always be cheesed in some way, no human programmer can ever input the right answer for every situation.
      Btw, AlphaStar never had complete map vision, it wouldn't have been a valid test. It had complete vision of whatever parts it could see since there was no player-like camera which removed any delay from responses. I think that's ok, even bots respond to everything with 0 delay.
      AlphaStar has potential, but it will never progress past a certain point if they don't train it permanently on the ladder vs pro players and actually see current tactics.

    • @ErazerPT
      @ErazerPT 2 years ago +3

      @string name; It's no more cheesy than a grandmaster switching cams at 400+apm (yes, they do it...). And while "beating the best" might sound like a great end goal, all you need is to beat 99% to already go WAY beyond what humans can do (on average). There's a few F1 top racers, there's billions of "common drivers", for a driving ML model which is more important, beating the top F1 or consistently outdoing "Average Joe"?
      p.s. that one "human trick" that beat the model in one game was a simple "loop", as the model got stuck reacting to the same thing in a loop, back and forward. You can observe that level of idiocy in humans too at times ;)

  • @denisf.7409
    @denisf.7409 2 years ago

    It's amazing how you are doing it. Your videos are really inspiring

  • @tibielias
    @tibielias 2 years ago +1

    What an awesome video! I wonder how making an API like this for other RTS games would be possible and then training AI models for those separately. 🤔

  • @MFTomp09
    @MFTomp09 2 years ago +6

    I wonder if modifying the reward structure to include a small reward for scouting, like finding new enemy structures or something, would be useful to get more wins in those games where you said they regrouped and came back with a larger force to beat you later

    • @AlexGrom
      @AlexGrom 1 year ago

      Later on there is potential to counter based on what and when was seen. You see early barracks - prepare to counter marines, marauders or reapers.

  • @vladimirtchuiev2218
    @vladimirtchuiev2218 2 years ago

    This video is so God damn cool, I have a current project that I try to make chess self play work on very limited resources, I think SC2 will be my next project if the actual python API is open. What is your GPU, and how long did it take for you to train the agent?

  • @teardowndynamic6171
    @teardowndynamic6171 2 years ago

    i am trying to make an AI that will farm for me in rust, but i am so lost xD, if I understand you are not using computer vision because the camera movement is too complicated ? so you are building data from minimap only ? if i wanted to train my AI to farm sulfur nodes in rust what would be your approach ?

  • @ButtersDClown
    @ButtersDClown 2 years ago

    Very cool idea. I think programming a few meta builds into your algorithm and seeing how it learns with time (if achieved "this" by "this time" do "this" otherwise do "this") like doing a rush build etc.

  • @benoitkinziki3916
    @benoitkinziki3916 2 years ago +1

    For the reward mechanism you could probably build an LSTM that gives you the probability of winning for each action you take and you should probably include a time penalty to avoid the bot dragging the game out

  • @Telos8
    @Telos8 2 years ago

    Any plans on a part 2 with the microgame plan implemented and see how it runs in tandem?

  • @J3553xAnotherFan
    @J3553xAnotherFan 2 years ago +23

    This is now the 3rd programming/ artificial intelligence channel that I've found myself watching even though my ability to code (or even Math) is so awful that if there was a gun to my head I would beg to just be shot. But I find it satisfying to watch. Like a time-lapse of an ant colony diligently working away.

  • @dogtato
    @dogtato 2 years ago

    very interesting to see how you structured it to use ML decisions for higher level decision making. would definitely be interested in seeing how you approach a micro script and specifically wonder about the ability to add new behaviors without having to retrain from scratch

  • @cedrickram3180
    @cedrickram3180 2 years ago

    Some time series analysis (windowed access to what has been searched, where stuff was, ...) would probably help the AI make better decisions. The data of just the map does not do a good job of storing time-information.
    Your rewards seem like a good fit. Great video!

  • @th1nhng0
    @th1nhng0 2 years ago

    This is what I'm looking for

  • @BalimaarTheBassFish
    @BalimaarTheBassFish 2 years ago

    I'll be interested to see how you link the different AIs with their different specialties together. My only concern would be there is bound to be some overlap, how would the AI resolve 'competition' against itself when one or more AI specialties want to control the same thing?
    ugh I can English I swear!

  • @Magicks
    @Magicks 2 years ago

    well done sir

  • @serta5727
    @serta5727 2 years ago

    So good blows my mind

  • @floydbarber7528
    @floydbarber7528 1 year ago

    oh man, i needed that book 3 months ago, made with 3 others our own NN and genetic algorithm to play mario. also with reinforcement learning. i was thinking about how hard it would be for sc to do so. but it doesn't seem too hard, but you didn't write your own neural network right?

  • @Gameboy499
    @Gameboy499 2 years ago

    Hello, may I ask how to automate the process of training there? Or do I need to manually restart the game every time?

  • @arashiiku417
    @arashiiku417 2 years ago

    Would it be possible for you to script the last few enemy coordinates where the ai encounters them and then project a trajectory to where the enemy may be?

  • @eight7934
    @eight7934 2 years ago

    looking forward to see how this turns out after its polished.

  • @jeremyheng8573
    @jeremyheng8573 2 years ago

    very inspiring video! Looking forward for more reinforcement learning tutorial!

  • @davidcristobal7152
    @davidcristobal7152 2 years ago

    Does Stable-baselines allow to store states - reward pairs in harddisk? I developed a modification of the MemorySequential class in keras-rl to use little memory in ram. My algorithm uses a thread to store states (images or whatever) as numpy arrays in my ssd disk, and keeps a randomized subset of the states in every loop of the algorithm in order to train the agent without using tons of RAM (which i don't have). It's a sloppy implementation so I was wondering if stable-baselines has something like that

  • @ducnguyen4973
    @ducnguyen4973 2 years ago

    Cool video

  • @nastrimarcello
    @nastrimarcello 2 years ago +14

    This is amazing. Amazing code, amazing explanation, amazing editing.
    Only one suggestion: when possible, don't use try:...except:pass
    As this can lead to hellish problems.
    If you know what exception you are having in that try-except statement, using that exception explicitly is better (even if you are just going to 'pass' it)
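To illustrate the point about naming the exception, here is a small self-contained sketch. The function `parse_supply` is a made-up example, not code from the video: it swallows only the failure it expects and lets everything else surface.

```python
def parse_supply(raw):
    """Convert a supply reading to int, returning None for malformed text."""
    try:
        return int(raw)
    except ValueError:
        # Expected failure: the text was not a number, so report "no value".
        return None


good = parse_supply("12")
bad = parse_supply("n/a")
```

Passing `None` here raises a `TypeError` instead of silently returning `None`, which is the behavior you want: a bare `except: pass` would have hidden that bug, along with typos and even Ctrl+C.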

  • @robwolters7401
    @robwolters7401 2 years ago

    In my experience grouping attacks and synchronising targets is very important.

  • @themaster8432
    @themaster8432 2 years ago

    is there a c# version of the code snippets? maybe another resource that can teach machine learning algorithms in c# also? :)

  • @alrey72
    @alrey72 2 years ago

    Can some of the values be included in the iteration or training ... like for example the reward values?

  • @alansmithee419
    @alansmithee419 2 years ago

    2:20
    But does the ai then think there's very little there, or are they dim just for us to more intuitively understand the video?
    If the former, why would the AI go to those areas if it believes there's nothing there?
    Or does it have to learn itself that dim means there's very little, thereby also learning for itself that very dim means unknown?

  • @keanamrazek3745
    @keanamrazek3745 2 years ago

    What if you were to only use the reward function you did in the initial training and then use the win-lose reward for model refinement?

  • @danielglidewell
    @danielglidewell 2 years ago

    I wasn’t in the mood to watch the video when I read the title, but when I realized what the thumbnail was I stopped by to drop a like lol.

  • @whateverppl1229
    @whateverppl1229 2 years ago

    9:20 that's what I figured you'd do but my question is would it be a bad idea to take away points if an enemy unit/building dies? because then, it would be rewarded for attacking. (more points from a kill than a loss, or individually price every enemy unit/building as its own value and same with ally losses) to help teach it to not lose units but to do damage.

  • @le_med
    @le_med 2 years ago

    Any chance of putting the books on amazon as well?

  • @calebb4782
    @calebb4782 2 years ago

    Couldn't you use the mini map and cursor to move camera by clicking said mini map? or am I missing something?

  • @fuba44
    @fuba44 2 years ago +2

    In regards to rewards, did you try "resource worth of kills" divided by "resource worth of losses" + the win or lose bonus. ? (Maybe with a modifier on workers to make them more juicy targets) + maybe something to do with map exploration.. to find hiding bases.. just spitballing here, i know you already did a lot on this project.

    • @sentdex
      @sentdex 2 years ago +3

      I am not sure I tried that exactly, I think you really want to have some sort of total "resource" reward that doesn't punish you for building units/buildings, but then you might needlessly build things.
      For destruction, what you describe may work well as a reward, so maybe combat against a worker is worth less than attacking a city hall...etc.
      I am not sure it's wise to force the AI to reward things higher or not though, weird things happen the more biases you insert into rewards, at least that's what I've found so far with RL. Ideally, the rwd function is as simplistic as possible.

    • @BalimaarTheBassFish
      @BalimaarTheBassFish 2 years ago

      @sentdex "so maybe combat against a worker is worth less than attacking a city hall...etc"
      Which could be dangerous against a Terran opponent whose workers can repair buildings while you're attacking them. Had an issue like this with my CNN attempt at SC2 where i lost my entire army because they were too busy focusing a terran command center while completely ignoring the half dozen marines shooting them.

  • @achtsekundenfurz7876
    @achtsekundenfurz7876 2 years ago

    Just a quick note: the "can afford" check at 04:47 is NOT totally redundant. You're inside a "for each idle stargate" sort of loop, and if two are idle, you could end up in a situation where you can afford one but not the other -- and depending on the capabilities of the ex-handler, tripping an exception due to insufficient resources could crash the AI.
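The situation described in this comment can be reproduced with a small stand-alone simulation. Nothing below uses the real SC2 API: `Bank`, `train_from_idle`, and the cost tuple are hypothetical stand-ins that only model a budget being checked inside a loop.

```python
from dataclasses import dataclass

# Illustrative cost only; check the real unit cost in your game version.
UNIT_COST = (250, 150)  # (minerals, gas)


@dataclass
class Bank:
    minerals: int
    gas: int

    def can_afford(self, cost) -> bool:
        return self.minerals >= cost[0] and self.gas >= cost[1]

    def spend(self, cost) -> None:
        self.minerals -= cost[0]
        self.gas -= cost[1]


def train_from_idle(bank: Bank, idle_stargates):
    """Queue one unit per idle stargate, re-checking funds each time."""
    queued = []
    for stargate in idle_stargates:
        if not bank.can_afford(UNIT_COST):
            break  # an earlier order in this loop may have used the budget
        bank.spend(UNIT_COST)
        queued.append(stargate)
    return queued


bank = Bank(minerals=300, gas=200)
queued = train_from_idle(bank, ["stargate_a", "stargate_b"])
```

With two idle stargates and funds for only one unit, the per-iteration check stops the second order instead of letting it fail.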

  • @user-gs6lg4gd3b
    @user-gs6lg4gd3b 2 years ago +2

    You actually need more than just voidrays. And for other units some scripts for fighting patterns. You can probably start with archon-immortal-charge zealots composition. It has almost no fighting patterns

    • @WTfire10
      @WTfire10 2 years ago +2

      No voidrays are the only unit a protoss needs

  • @serta5727
    @serta5727 2 years ago

    That is very cool and powerful

  • @romanlee7082
    @romanlee7082 2 years ago

    Hi, thank you very much for sharing this video. It opened a new window for me to know AI. May I know where I can download the code corresponding to your video please? Thanks again.

  • @kja2ja
    @kja2ja 2 years ago

    Interesting! But doesn't the built-in AI already do all this? You just need to adjust the difficulty level, no? What is the difference between this Python code vs the SC AI? Awesome info!

  • @Shazumbi
    @Shazumbi 2 years ago

    I know nothing about coding or anything else, really, that went on in this video. But I do have a question, as this is incredibly interesting; how do you make the program know if an action is "good" or "bad"? Trying to rack my pea brain on how one would write this out. Even if you say the final outcome is greater than x (as a "good" outcome) how do you 'convince' your program that it should continue to try for that outcome?

  • @FF7Cloud
    @FF7Cloud 2 years ago

    it might help to allow a phoenix now and then for scouting purposes since void rays are super slow

  • @kritikusi-666
    @kritikusi-666 2 years ago

    Is there a way to put this model to test vs ladder?

  • @teardowndynamic6171
    @teardowndynamic6171 2 years ago

    i know nothing about programming or AI but this is just so fun to watch

  • @cmilkau
    @cmilkau 1 year ago

    Interesting actions. Not only do they encode a lot of knowledge about the game, they include deep causal chains that otherwise would take long to learn.

  • @stonecoldscubasteveo4827
    @stonecoldscubasteveo4827 2 years ago

    Reward for resources spent. this will incentivize expansion and rapid army growth until max out. At that point change the reward to enemy units/structures killed. Something like (big reward) for spending money on nexus/probe/stargate (bigger reward) for void ray, (penalty) for having too much money banked up unless supply is >190. Then (big reward) for killing enemy unit/structure, while dialing back on rewards for building structures. zero out the rewards for probes over 70-80 and for pylons over 200 supply. When supply drops due to combat, flip the rewards back to making void rays to max out again.

  • @jin416
    @jin416 2 years ago

    super cool !!!!!

  • @canadiancoding
    @canadiancoding 2 years ago

    Might also want to look into upgrades if you haven't. Units in mid-late without any upgrades are much worse in SC and this might have quite an impact.

  • @cowjuicethepallytank
    @cowjuicethepallytank 2 years ago

    Some potential rewards (or punishments) could be losing a voidray is a negative percentage of the positive reinforcement for attacking. Locating the enemy could be a small reward every x seconds to incentivise optimal searching patterns. Another question I have is what information does the API have access to? Does it have the capability of identifying enemy units? Are you able to get unit counts of the AI's specific units? Do we have the capability of training upgrades?
    In general, I think that given the correct training it may be possible to find certain timings of when best to scout and taking optimal scouting paths as well as best attack timings in terms of time in game as well as potentially within build order. The difficulty, depending on how far you take it, could come down to army composition and as you were saying, micro.
    Lastly, showing my lack of knowledge in AI learning. Would it be possible to train the AI using professional gameplay wins, then use that as a baseline "build order" for then using the reinforcement learning?

  • @serta5727
    @serta5727 2 years ago +1

    So coooool 🤗

  • @serta5727
    @serta5727 2 years ago

    Wish to also learn those skills for software testing

  • @serta5727
    @serta5727 2 years ago +1

    I find it very interesting 🤓

  • @lucasbussinger3955
    @lucasbussinger3955 1 year ago

    Does anyone know what app he uses to keep track of the data? (13:59)

  • @ccgamerlol
    @ccgamerlol 2 years ago

    like Deepmind Alphastar, cool, would love to see full gameplay of this, please?

  • @bronsoncarder2491
    @bronsoncarder2491 2 years ago +11

    Here's my issue with your approach:
    Your actions are basically just a hard coded list of commands. You could essentially just create a hierarchy of those commands and apply a little probability and get similar results.
    The way you've set this up, the AI will never develop novel strategies. It can, at best, play with the topmost level of human strategy available (and that's only if you spend the time to hardcode that into each action). And, that's cool, but... I feel like the point of an exercise like this should be to see how the AI "thinks" about the task and what novel strategies might arise from that.
    Idk, I do understand that the computing power to decide between the thousands of different options available at any given moment in an RTS is beyond most personal computers, but... I feel like hard-coding the actions kind of defeats the whole purpose.

    • @JOHNSMITH-ve3rq
      @JOHNSMITH-ve3rq 2 years ago +1

      hard agree. love the channel but yeah -- all the hardcoded rules are confusing. Can't you simply give it the barest of initial game parameters - no strategy, no rules - and let it learn from winning strategies?

  • @ReallyWhy123
    @ReallyWhy123 2 years ago

    this book is impressive

  • @dracomurdock6349
    @dracomurdock6349 2 years ago

    The criteria I would try to ensure it has highest on its priority is- if you win, only- unit efficiency. IE: how many resources did this unit earn, or destroy for an opponent, relative to its own cost? Averaging them out, and defining those units by a percentage based on the actions they were made to perform- and segmenting the game into the first 5 minutes and the rest of the game- you could provide a huge assist to the AI learning more complicated macro and micro strategies.

  • @djsyntic
    @djsyntic 2 years ago

    When you got to talking about how to handle the gas extractor on your minimap, my thought was that you handled it strangely. So keep in mind that the RGB values for the colors you put on your map are arbitrary and serve to help you visually more than the computer. But you could have encoded some meaningful data into the RGB itself. For example, instead of saying "This building is green, this building is dark green" and so on, you could have put all building/unit type info into the R-value of RGB. IE: This building is R-value 12, this building is R-value 13, and so on. Then the G-Value could represent something else, like building health. IE: R-12, G-255 means it's a Refinery at full health while R-12, G-1 means the Refinery is about to explode if it takes any more damage. Finally, the B-Value could then be used as some sort of indicator of something specific to that building. R-13 might be a Barracks, and B-2 might mean that it's in the middle of training something and has 2 units of time before it finishes and can do something else. On the other hand, R-14 might be a Gas node, and B-# could indicate how much gas that node has, while R-15 indicates that this is an extractor with the B-# still indicating how much is still in the node.
    Sure to YOU R-14 and R-15 are basically the same amounts of red and your eyes wouldn't be able to tell the difference, but to a computer, those are two distinct values.
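The per-channel encoding described in this comment could be sketched like this, with the R value holding a type id, G holding health, and B holding a type-specific extra. The ids 12 and 13 come from the comment's example; `encode_cell` and the 8 x 8 map are hypothetical.

```python
import numpy as np

REFINERY, BARRACKS = 12, 13  # type ids from the comment's example


def encode_cell(type_id: int, health_fraction: float, extra: int) -> np.ndarray:
    """Pack (type, health, extra) into one RGB-style uint8 triple."""
    if health_fraction <= 0:
        health = 0
    else:
        # Keep living units at 1 or more so they differ from "nothing here".
        health = max(1, min(255, int(health_fraction * 255)))
    return np.array([type_id, health, min(max(extra, 0), 255)], dtype=np.uint8)


minimap = np.zeros((8, 8, 3), dtype=np.uint8)
minimap[2, 3] = encode_cell(REFINERY, 1.0, 0)    # full-health refinery
minimap[5, 5] = encode_cell(BARRACKS, 0.001, 2)  # nearly dead, 2 time units left
```

One thing to weigh with this design: neighboring type ids such as 12 and 13 are numerically close, so a network may find them harder to separate than one plane per type.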

  • @erics3596
    @erics3596 2 years ago +1

    Do you want Skynet? Because this is how you get Skynet :) (also great strats and explanation on how this works)

  • @rpraver1
    @rpraver1 2 years ago

    Long time follower and purchased your book. Why not touch on genetic algo from scratch?

  • @FireTouched
    @FireTouched 2 years ago

    I wonder about the reward structure. It doesn't really feel like looking for optimised play as the only negative reward you mentioned was the loss itself and after that only determining efficiency by the total score. But what about tracking negative rewards (loss, loss of units/structures/resource access, etc.) and comparing the positive and negative score? That way the AI could pick a winning strategy that accrues few losses over one that accrues many losses - despite both having the same end score. And in turn the AI would be able to know the errors due to the dip in the comparison.
    Also maybe implementing a way that reduces positive/increases negative score over time? That way stalling would also be discouraged.

  • @matheusGMN
    @matheusGMN 2 years ago

    your strategy of multiple AIs to coordinate everything at the end that you mention is the same one Paradox Interactive uses in games like Stellaris and EU4

  • @TheThunderSpirit
    @TheThunderSpirit 2 years ago

    u have to use pipes for interprocess comm. or at least udp

  • @witherslayer8673
    @witherslayer8673 2 years ago

    how about building more than just void rays (having stats of each unit, cost, and space. may save for big units, or LOTS of small units)
    and where air units can go, and where ground units can travel