As an actual full-time professional, it was quite interesting to watch. But this AI in its current state would get absolutely destroyed by a real pro :). Still a fun endeavour, and for sure very hard to come up with a system like this. Good job
The goal of Texas Holdem is not to make the best 5 card hand. Your goal in Texas Holdem is to put yourself in situations where you have the larger share of equity.
Interesting effort. Part of the problem you may be encountering is that GTO solvers generally need an input for your opponent's expected range. GTO by default assumes the other person is also playing GTO, but as they deviate, the optimal strategy is for you to deviate too.
@@MrZweene The maths has no problem at all with the "optimum" strategy existing at the level that your response would be unique to every possible opponent in every possible spot. The need for a generalised set of actions is an engineering and practical one not a mathematical one.
You guys don't always seem to be playing all that great, but I think even you would be able to beat this so called GTO playing AI. I think every action was just horrible.
Why the dancing animals? WHY?? 3:58... "Poker is a game of people" - D. Brunson. Check and mate. No, seriously, well-made video, you made me laugh, thanks. Love the AI voice, it's hilarious 🤭
I wanted to attempt a project like this myself. I would consider using poker solvers as the basis of the strategy and using their output as a database, because they implement GTO and give you the proper move in every scenario.
The concept of a style of play that can't be exploited is interesting, but it's not actually the 'best' strategy in poker. The biggest winners are the ones who are able to exploit their opponents' blind spots and weaknesses, which necessarily involves exposing themselves to the risk of counter-exploitation.
I agree with the general reaction to the last hand. I don't see how the math could be right: staying in should have a significantly lower EV (easy to calculate), especially if you weigh the bet size relative to the pot against the percentages you reported. In fact, I think your opponent started to realize that issue/calculation bug on the previous all-in bet. I haven't read any papers on this, but even with imperfect information the bet-to-pot ratio should have a significant effect on the odds (even if some Nash optimization requires additional randomness injected into that decision). I would gamble that a normal CNN with a lot of training would vastly outperform this (in its current form, you should be able to beat it if you can figure out the true Nash value of each state). Either way it's a cool project, but I think there are much simpler approaches and code implementations that could perform better here. Thumbs up for a great project idea regardless.
It would be difficult to have the AI factor in the opponents' body language and other tells, but what about their strategy? Would it be possible for the AI to use in-game machine learning to adapt its playing style based on the opponents' hands (the ones they don't muck), raise frequency, bet sizing, etc.? Additionally, as we all know, playing poker with fake money doesn't really work super well. I was wondering if you and your friends were using real money or just chips? This would drastically change how your friends were playing. Great video, I learned a lot. Thanks for taking the time!!
It didn't make only 1 mistake. After it calls the first all-in with the QT, it is clear that it is easy to exploit. You wait until you hit a hand where the AI hasn't shown particular strength, and then you go all-in. That it wants to call 1/3 of the time in the last hand shows that this simple strategy will most likely win close to 100% of the time. There are other exploits, but you are far from being able to win against any pro poker player. You have a lot more work to do before it stands a chance against even semi-pros.
That's why the paper he derived this from says at the end that you still need human data to train on. The AI's self-play expects the opposition to play optimally, but humans play differently, and each player has different behavior to exploit or be exploited by.
Am I wrong to think the AI saw quickly that “he who bets last and bets the biggest usually wins” then figured the best way to do that was to bet as much as it could as soon as possible?
What I'm missing in this is the AI trying to figure out the cards other players have based on their individual strategies, actions, and known data. For example if it figures out after a round or two that a specific player bets high and ends up winning, that's a pretty good indicator that when they bet high they have cards that would let them win. The AI knows the cards it has and the cards on the table, so by matching all of that together it could make a reasonable assumption about how likely each player is to have a winning hand.
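The kind of inference described above (a player who bets high usually turns out to have it) can be written as a Bayes update over the opponent's possible holdings. This is only a minimal sketch: the three hand classes and every probability below are made-up illustrative numbers, not anything taken from the video or from real player data.

```python
def posterior_given_big_bet(prior, bet_big_given_hand):
    """Bayes' rule: P(hand class | big bet) is proportional to
    P(big bet | hand class) * P(hand class)."""
    unnormalised = {h: prior[h] * bet_big_given_hand[h] for h in prior}
    total = sum(unnormalised.values())
    return {h: w / total for h, w in unnormalised.items()}


# Hypothetical numbers: how often this opponent holds each class of hand,
# and how often we have seen them bet big with each class at showdown.
prior = {"strong": 0.2, "medium": 0.5, "weak": 0.3}
bet_big_given_hand = {"strong": 0.8, "medium": 0.2, "weak": 0.1}

posterior = posterior_given_big_bet(prior, bet_big_given_hand)
for hand_class, probability in posterior.items():
    print(f"{hand_class}: {probability:.3f}")
```

With these assumed numbers a big bet moves "strong" from 20% of the range to a bit over half of it, which is the "reasonable assumption" the comment is asking for. The hard part in practice is getting trustworthy estimates of the betting frequencies from a small number of showdowns.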
It doesn't need to. In poker it's just extremely hard to not lose tremendously vs a nash equilibrium strategy and it's impossible to win. The bot can just play as if it's playing against an unexploitable opponent, and it will crush someone whose strategy has nothing to do with the unexploitable strategy, and it will comfortably beat a human who has spent a decade studying nash equilibrium strategies. If it adjusted to play against the poor strategy it would absolutely decimate them even moreso. However, in such a small sample, we will not discover someone's strategy. And trying to make significant adjustments based on a small sample just opens up the bot to being exploited itself. Sometimes human vs human you can find ways to exploit someone's strategy in a comically small sample like 100 hands, mostly because a lot of humans are terrible at the game, but a bot would really need like 10,000 hands just to get a vague idea of someone's strategy. Plus, if it adjusts before that, a smart player could intentionally make it adjust one way then overadjust their own strategy the other way to exploit the bot. If anything, it might be better to datamine online heads-up games and take an aggregate strategy of all profitable humans, and the bot could maybe even play a strategy that's mostly GTO with some adjustments mixed in from this online data. TLDR trying to make a bot adjust its strategy based on its opponent is probably a bad idea unless it's making very small careful adjustments over 10k+, 100k+ hands.
Actually I take back some of what I said. I think a bot could squeeze out a lot of EV by making adjustments over small samples as small as like 50 hands. But it could open the bot up to being exploited for sure. And in any case, the nash equilibrium strategy doesn't need to be improved, it's unbeatable.
It's just playing according to its preset understanding of optimal play. If it was an actual AI that was taking in other players' tendencies, it wouldn't bother making theoretically correct calls like the 54o hand, as it'd know it's rarely good against Brandon. Against Olivier Busquet, though, I'd say it's not a horrible call. Edit: Think of solvers (like this AI) more as extremely complex and detailed flow charts.
Heads up all in calls need to be much more conservative. Really, overall, it needs to have less weight put on the opponent's bet if it's going to survive any real players at all. Also understand that, poker solvers already exist, no need to reinvent the wheel there, all the AI needs to do is take a broader look at equity, not just a single momentary hand but the totality of hands, and what the pot values are because pot value has a whole lot to do with strategy. Think of a bet as buying a ticket to the lotto, and the cards in play determine the odds of winning. It appears to not be doing any of that. Also understand that, data is available on how the professional players play certain hands, as is data on how people play online as well. That's a powerful tool that you don't have access to at the moment, but it's simple enough to get access, you just buy it. You could even have a way to have your system ingest that data, and a separate play mode for unknown players, such that you could eventually learn a given player and profile them. Good luck.
Great work. Loved the video. It seems like your friends worked out how to read the AI by the last round. When it checked after seeing each Ace and then the King, Brandon correctly guessed it had a bad hand. A human in the AI's position probably would have bluffed.
If the AI is working correctly, this is similar to saying that you picked up on the strategy of a Nash equilibrium bot in rock paper scissors, where the bot's strategy is to randomly pick each action 1/3 of the time.
@@cpatterson365 you've been playing different poker. Not necessarily wrong. You manage risk differently depending on the relative bet, stack and pot sizes. The bot didn't. Look if you feel your chance of winning is 1 in 3 on a bluff, you won't go into it as likely with all-in as opposed to having 10 more games ahead. If you have infinite chips or games there is no difference. I'm just saying that calling vs folding being 5 or 30% is wrong. This percentage should change depending on the chips involved.
@@OmateYayami Theoretically the Martingale system works on roulette, but only because it assumes you have infinite money and unlimited bet sizes. For these things to have any relevance, you need to look at the finite case. And funnily enough, as soon as you look at the Martingale system with a finite bankroll, it theoretically always fails. Creating a system on an assumption that is not real makes a system that is not relatable. Even then, this bot would not stand a chance against a pro, even with infinite money and time.
@@NSelsted I might have oversimplified, but you do not have infinite bet sizes here. There is always a limit. Despite the name "no limit", the pot is capped at the sum of all stacks, which is always finite. You can't always place a bigger bet and reduce the game to needing one win. I oversimplified in a way that lets you argue infinite aggression will always win in an infinite game, but that was not the point. The point was that the bot's finite aggression was too much and unstable. With infinite games it would have been alleviated, but that was too broad a relaxation, which you properly exploited. I am not sure a pro could reliably win against a properly written bot. I have my doubts.
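The Martingale point can be checked with a few lines of arithmetic. The sketch below assumes an even-money bet on a single-zero roulette wheel (win probability 18/37) and a bankroll that covers only a fixed number of doubled bets; both are assumptions chosen for illustration, and `martingale_cycle_ev` is a hypothetical helper name.

```python
def martingale_cycle_ev(max_bets: int, win_prob: float = 18 / 37) -> float:
    """Expected profit, in base units, of one Martingale cycle with a finite bankroll.

    The bettor stakes 1, 2, 4, ... units, stopping at the first win (net +1 unit)
    or after `max_bets` straight losses (net loss of 2**max_bets - 1 units).
    """
    lose_prob = 1.0 - win_prob
    p_bust = lose_prob ** max_bets      # chance of losing every bet in the cycle
    bust_loss = 2 ** max_bets - 1       # total staked when that happens
    return (1.0 - p_bust) * 1.0 - p_bust * bust_loss


for n in (5, 10, 20):
    print(n, round(martingale_cycle_ev(n), 4))
```

Algebraically the expectation is 1 - (2q)^n, where q is the chance of losing a single bet. With q above one half this is negative for every finite n, and a deeper bankroll only makes each cycle's expected loss larger.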
12:51 Heads up, I wouldn't exactly call this a semi-bluff, as BB's range should be way weaker than the AI's, so it's a value raise. Very standard.
13:26 This hand is atrocious. I'm guessing you guys are playing 50/100 this hand, hard to tell. In which case we're ~30bb effective and this should just be a fold for the AI, as even against the bottom of SB's range it's not looking too hot. Plus, the AI blocks the more natural semi-bluffs like Qxhh for a second nut flush draw. If we can really call a 7.25x shove a natural semi-bluff. Why does the AI call? If I *had* to rationalize a call, I would say that this shove is either the nuts or air, and because we block the Qh, we have more equity against most air hands, except Khx. This is a horrible justification for a call, but it's what it might be thinking. Interesting note: at 20bb effective, the solver calls QhTd at ~10% frequency.
15:02 It might seem insane, but this hand is not so horrible from your AI's perspective. It's actually correct from a GTO perspective. The reason for the call is that you are beating 137/185 hands (~74% equity) and getting ~1.53:1 pot odds (~40% equity required to call and break even). If we folded every time, we would be overfolding and losing money in the long run, given the price is just so good.
I think your AI plays kinda okay, considering it's only learnt over 1m hands. The QT hand is crazy. One other consideration is that it's not gonna adjust to the mistakes made by other players, so it's gonna make some whack decisions like this one. I'd be interested to play it and also see how it plays against GTO Wizard.
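The pot-odds arithmetic in the 15:02 note can be reproduced directly. This is a small sketch using only the figures quoted in that comment (137 of 185 hands beaten, roughly 1.53:1 on the call); the function names are hypothetical, and treating "hands beaten" as equity ignores ties and uneven combo weights.

```python
def required_equity(pot_odds_to_one: float) -> float:
    """Break-even equity when the pot is laying `pot_odds_to_one`:1 on a call."""
    return 1.0 / (1.0 + pot_odds_to_one)


def call_ev(equity: float, pot_before_call: float, call_amount: float) -> float:
    """Chip EV of calling: win the pot with probability `equity`,
    lose the call amount the rest of the time."""
    return equity * pot_before_call - (1.0 - equity) * call_amount


needed = required_equity(1.53)   # about 0.395, i.e. the ~40% quoted above
estimated = 137 / 185            # about 0.741
print(round(needed, 3), round(estimated, 3))
print(call_ev(estimated, pot_before_call=1.53, call_amount=1.0) > 0)
```

A call shows a profit whenever the estimated equity is above the break-even figure, which is why the quoted numbers make this a call even though the hand looks bad. The same function applied to the QT hand would need its own pot size and equity estimate, which the comment does not give.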
GTO doesn't know what to do when a donkey overplays marginal hands and doesn't follow standard range conventions. The strategy assumes both players play very rationally/reasonably with specific ranges of hands. Once you have a non-conformist player, the strategy starts to show gaps. For example, OMC only plays AA and KK; GTO doesn't factor that in and therefore will consistently be at a disadvantage, assuming the player is playing a more reasonable range (i.e. ~50% from button, ~10% UTG). If you adjust your bot on the fly to account for this it will perform as intended, but you'd need to know and make that adjustment for every player. Also, when you watch top players, they play close to GTO but break away from it on occasion to exploit their opponents. This AI tries to adapt to the player, which is interesting since it makes it closer to exploitative than GTO, which is more profitable if done correctly.
@@wesleykim1758 Of course, exploitative will always be more profitable, and deviating is the correct thing to do. GTO will always be net zero (or lose to rake) vs other GTO bots. However, an exploitative player will never beat a GTO bot (if it's truly playing perfectly). Not to mention bots have no human emotions and will always be playing their A game. With all that being said, most bots aren't playing perfect GTO as of yet, and a good amount of them are losing players that only profit from rakeback/promos.
I'd say that you programmed it in a mathematical way using equity. I don't know much about programming, so please correct me if I am wrong, but a better way would be for the AI to narrow the opponent's range down using the information that the player gives, and use that to make accurate decisions.
You could have all kinds of heuristics for a player if you were recording their decisions, but that might contravene the rules of a provider. But for personal use you could do everything you suggested and go even further to exploit the Mersenne Twister algorithm that all casinos use to shuffle their poker decks.
Just tweak the algo. Tell it, when it faces elimination, it should tighten its range and only call when it's 90% and above, or whatever .... why am I helping you? lol
The problem is that calling there 30% of the time is WAY too much lol. It seems to provide a percentage chance to take actions that should be a 0% frequency (like the call with Q high). This AI will hemorrhage money. I'm guessing the issue is that it's not able to account for your opponent's range, and assumes they could have any 2 random cards for any situation, though I don't know the exact details of setting the AI up, so that may just be an assumption on my part.
There is the famous guy who won the AI-only poker tournament. His AI just always went all in, and the other AIs figured that since it's a large bet, he must have the better hand, and just folded.
😂😂
Then those AIs were completely horrible, or the one that only went all-in got incredibly lucky.
@@Ohrami It was a school project kinda thing, so I assume it was mix of both.
Did the others get no AA or KK?
Lol😂
Hey! Poker player here. First of all, great job, it must not be easy at all!
I think you would have better success playing against your friends (and bad poker players in general) if you programme your AI not to play at equilibrium but to play unbalanced in order to exploit these weaker players' leaks!
Well done on the poker AI project and on challenging your friends. It was definitely worth it, but there are a few areas that could use some improvement. Your explanation of Nash and CFR was quite solid, but the practical implementation seemed to lack depth, especially in handling complex game states beyond simple abstractions.
The separation of pre-flop and flop is what some may consider a big flaw for a more robust AI. Your idea to use equity for card abstraction is great, but it seems that your method hugely oversimplifies complex situations, as you noted with hands of very similar equity playing differently.
Now, this would be of interest: a tournament of poker AI systems. If you're game for it, let's set up a match against Poker AI from pokerbotAI. It will be impressive to see how different strategies and their various implementations fare against each other.
Variable change is a big factor too. Take 66 and KQ: similar equity, but each has a distinct strategy. In single-raised pots, both 66 and KQ are a 100% raise into an unopened pot from any position. Post-flop is where they differ. 66 needs to pot control on a board with one overcard. Meanwhile, KQ can c-bet 1/3 pot to deny equity and/or semi-bluff as the aggressor. Variable change for poker is then the product of nut advantage and range advantage.
@@DalePumento The adaptation of strategy according to hand strength and position is key, and the variations of changes in post-flop decisions are huge. Distinctions between hands like 66 and KQ, and their respective strategies based on board texture, all point to complexities that a more advanced AI should address.
@@worldofpoker I agree with you that OP oversimplified his AI model. The decision tree is too large for the model that he designed. Hopefully with more time and more model training, he could improve it.
Hahhaa the punt jam on AA2 followed by the even bigger punt call "I'm folding, don't get used to it" - I believe it lmao 😂
Confused why it calls the huge jam on AA2 with QT. Is it because pre-flop and flop are decoupled, somehow? Or is it not realising the difference between all in and small bet?
It could mainly be because this spot never comes up in a Nash equilibrium strategy. I think the bot is not playing GTO in general and has some flaws, but even a well-made bot might struggle with this spot, because a 6x pot jam should never be played.
@@BarvGwydh Possibly. But watching the film I suspect his preflop model isn’t putting enough aces in this line because it’s decoupled from post flop action. Obviously with more aces in his range here, the need to call weaker hands reduces. To get the algorithm to compute a balanced preflop strategy requires CFR to be done across multiple streets, and he implies that he solved preflop and postflop separately. Basically if you just train a preflop model based on maximising equity before you see the flop, your AI won’t have enough board coverage when it’s deep, and it won’t slow play enough because it won’t see the value it gets from those things post-flop.
The timing on the Hellmuth rage clip was great
This has been a project that I've been working on for a pretty long time... so I'm happy to finally release it! Let me know if you guys want a follow-up, since there are still quite a few things I can improve the AI on. I'd like to challenge an actual professional poker player next time. If you enjoyed watching the video, make sure to leave a like, it helps grow the channel :))
This is an incredible video. You combined two of my favorite things, AI and poker. Can't believe this does not have more views! Such high production quality also. As for me, I would definitely like a follow-up to this video!
It sucks dude. I built a way better one. My poker bot would destroy yours
AI will take over eventually. It's not smart to be making videos like this. It will definitely ruin online poker one day if it hasn't already... Sad shame
for sure, this is very interesting, please continue
Maybe try training the AI on all the top poker tournaments (or just Dan Negreanu strategy) & also use facial recognition for player emotion/stress.
Or simply take into account the time to react factor.
Great project overall though! Kudos
A friend of mine coded a poker bot for texas hold 'em back around 2006. He set one instance running 1c/2c cash game tables and was returning around 10c/day profit playing the odds and position only. I haven't seen him for years to see how he went once he weighted it for variance and stake styles.
He's probably in prison now for theft on gambling sites.
@@spacebomb9126 I highly doubt it. The guy was a top level coder and wrote most of the original office polymorphic viruses for bug bounties with Microsoft. And worked a government job. More likely he knew his way around a VPN and a dummy account.
this is so sick! i need to learn to code cus my god it seems so satisfying completing a project like this
I went from an honest government ad warning me about AI, to this...
Great vid that one
"Honest government" is an oxymoron
@@authenticallysuperficial9874 yup, that's the point of it
@@authenticallysuperficial9874 It's a YouTube channel that warns people about the bullshit during elections. It has pretty girls and plenty of swearing in Australian. The Juice Media.
"government warning about ai" 😂
This needs a part 2
Good job man, 👏 props to you regardless of the loss... Take failure, learn from it and embrace it to always make yourself better.
This was really fascinating, would love to see a tournament against pros (but like a table of 6)
The Phil Hellmuth reference is gold!
It would be interesting to see how this would develop further if the AI took into account its position at the table, to see how many players bet before it or would have an opportunity to bet after. It's interesting to see how other players could interpret the AI's decisions. Most poker players would tell you to always bet/raise/reraise the same amount preflop for the hands you play, as changing the bet amount would only signal that you have a strong hand or are bluffing. 😃
The bot will use randomization so that its sizes do not make it exploitable. It will also simply be balanced no matter what: it will have an unexploitable mix of value and bluffs. Nash equilibrium cannot be beaten. At best the opponent can break even if they also use a Nash equilibrium strategy, like the rock paper scissors example where the NE strategy is to pick each action 1/3 of the time. If the bot used multiple sizes it would still balance them. For example, if it wanted to open raise a hand to 2 different sizes preflop, it might use 2bb half the time and 2.5bb the other half of the time.
@@BarvGwydh That makes sense for the betting, but I still think it would make a difference based on the position you are sitting in. If the bot calls/raises a lower equity hand at the bottom of its range, it has a higher chance of winning when it bets last at the table and there is only one other person in the pot than when it's UTG and is betting the same hand into 6 people, or calling the hand with 6 people in the pot. Nash equilibrium can be beaten in a game with more than two people if others have shifting strategies. It only ensures that a single player can't improve by deviating.
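The rock paper scissors claim above is easy to verify by computing expected payoffs exactly. In this sketch the `rock_heavy` opponent is a made-up leaky strategy used purely for illustration.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
# PAYOFF[mine][theirs]: +1 for a win, 0 for a tie, -1 for a loss.
PAYOFF = {
    "rock": {"rock": 0, "paper": -1, "scissors": 1},
    "paper": {"rock": 1, "paper": 0, "scissors": -1},
    "scissors": {"rock": -1, "paper": 1, "scissors": 0},
}


def expected_payoff(mine, theirs):
    """Expected payoff per game for the mixed strategy `mine` against `theirs`."""
    return sum(mine[a] * theirs[b] * PAYOFF[a][b] for a in MOVES for b in MOVES)


uniform = {m: Fraction(1, 3) for m in MOVES}  # the Nash equilibrium strategy
rock_heavy = {"rock": Fraction(1, 2), "paper": Fraction(1, 4), "scissors": Fraction(1, 4)}
always_paper = {"rock": Fraction(0), "paper": Fraction(1), "scissors": Fraction(0)}

print(expected_payoff(uniform, rock_heavy))       # 0
print(expected_payoff(always_paper, rock_heavy))  # 1/4
```

The uniform strategy never loses in expectation, but against the leaky opponent it also only breaks even, while the exploitative `always_paper` wins a quarter of a unit per game. That is the same trade-off other commenters raise about GTO versus exploitative play. In poker, unlike rock paper scissors, many mistakes against an equilibrium strategy do cost the opponent money.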
I have been wanting to see this for so long! Great job on the AI, it seemed to be making right plays most of the time!
Yeah but most of the time is bad enough to make you go broke...
It really isn’t. Seems easy to exploit.
Why are there two 6 of diamonds cards in the deck at 12:55?
Nice catch
At 13:09 it's a completely different board too, ?3QQK lol. He wins with 6d7c
The AI didn't seem to be playing very close to a GTO solution in most spots
I absolutely love your videos man, you inspired me a lot, and now I will be an intern at the biggest IT company in Poland at 16 years old!
good job!! keep pushing
Awesome! Congratulations on getting so far with the project! I find it very interesting how much you achieved without knowing anything about poker. I believe that, precisely because of your lack of knowledge about the game, you may have underestimated it in some way or expected positive results where they don't necessarily hold significant value.
First of all, playing GTO is the foundational study for any good poker player, but it’s not how they actually play. Why? Because they constantly rely on adaptations, with GTO being the pivot. For example, if you play against someone who goes All-In on every post-flop hand, you would likely need to start calling with a stronger range of hands. That’s how you would win, and it has less to do with strict GTO and more with adapting to the player's behavior and moves.
An interesting approach could be to implement a model similar to reinforcement learning or perhaps include more parameters that indicate the game context when training the model. The reality is that GTO play is exploitable; to be unexploitable, it would need to face another player who is also using GTO.
AI was on tilt after the punt all-in call, its ok, we've all been there 😫😫
Nice work!! Keep going! Don't be surprised that the AI would tell you to call the last hand, or even the QTs hand: the AI's solution is based on optimal play against an opponent who plays perfectly (or at least bluffs much more than your friends do). So to compete with your friends you might need to make a lot of adjustments that your AI can't understand.
It just means the humans are not playing at Nash equilibrium
did not expect cornell cs 2850 to make a cameo in this video
You'd want to get a complete solve of the game tree for several different opening bet sizes and feed that into your AI. It wouldn't even need to do any AI magic unless you end up in a branch of the tree you didn't solve for. Every presolved scenario would just be a database lookup.
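The lookup-with-fallback idea in this comment could be sketched roughly as below. Everything here is hypothetical: the state key format, the `PRESOLVED` table, its placeholder frequencies, and the `fallback_solver` callback are made up for illustration and do not come from any real solver output.

```python
# Hypothetical presolved table keyed by (street, hole cards, action history).
# The frequencies are placeholders, not real solver results.
PRESOLVED = {
    ("preflop", "AKs", ("open_2.5bb",)): {"raise": 0.8, "call": 0.2, "fold": 0.0},
    ("preflop", "72o", ("open_2.5bb",)): {"raise": 0.0, "call": 0.0, "fold": 1.0},
}


def strategy_for(state, fallback_solver):
    """Return stored action frequencies if the spot was solved offline.

    Only when the state is missing from the table do we pay for a live solve.
    """
    cached = PRESOLVED.get(state)
    if cached is not None:
        return cached
    return fallback_solver(state)


def crude_fallback(state):
    """Stand-in for an on-demand solver: here it just folds everything."""
    return {"raise": 0.0, "call": 0.0, "fold": 1.0}


print(strategy_for(("preflop", "AKs", ("open_2.5bb",)), crude_fallback))
print(strategy_for(("flop", "AKs", ("open_2.5bb", "call")), crude_fallback))
```

In practice the table would be far too large for an in-memory dict, which is where the card and bet-size abstractions discussed elsewhere in the thread come in.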
This vid deserves 1000x more views than what it has now, holy moly
okay now make it solve multiway hands! nice vid 😊
Just a heads up... your video shows two 6 of Diamonds, one is in the AI hand and then a second is the turn card...
"Private" Cards.. I love it. I'm using this instead of hole cards from now on. Sir! These are my on cards, they are private!
As an actual full-time professional, it was quite interesting to watch. But this AI in its current state would get absolutely destroyed by a real pro :).
Still a fun endeavour and for sure very hard to come up with making this system. Good job
The goal of Texas Holdem is not to make the best 5 card hand. Your goal in Texas Holdem is to put yourself in situations where you have the larger share of equity.
Interesting effort. Part of the problem you may be encountering is that GTO solvers generally need an input for your opponent's expected range.
GTO by default assumes the other person is also playing GTO. But as they deviate, the optimal strategy is for you to also deviate.
That’s bs. How can a game theory optimum depend on other people's behavior?
@@MrZweene The maths has no problem at all with the "optimum" strategy existing at the level that your response would be unique to every possible opponent in every possible spot. The need for a generalised set of actions is an engineering and practical one not a mathematical one.
Thanks, this was very interesting. Out of curiosity I wanted to implement it myself but never got the time to do it.
wow this channel is going to blow up, great idea and very fun execution!
we will demolish that AI
ChatGPT? More like ChatATM
You guys don't always seem to be playing all that great, but I think even you would be able to beat this so called GTO playing AI. I think every action was just horrible.
Love the video man. Just wish your mic was a little bit better. Only complaint though, keep running.
you should secretly scan your friends cards and make your Ai win 100% of the time 😂😂
Why the dancing animals? WHY?? 3:58... "Poker is a game of people" - D. Brunson. Check and mate. No, seriously, well-made video, you made me laugh, thanks. Love the AI voice, it's hilarious 🤭
I wanted to attempt a project like this myself. I would consider using poker solvers as the basis of a strategy and using their output as a database, because they implement GTO and give you the proper move in every scenario.
The concept of a style of play that can't be exploited is interesting but not actually the 'best' strategy in poker. The biggest winners are the ones who are able to exploit their opponents' blind spots and weaknesses, which necessarily involves exposing themselves to the risk of counter-exploitation.
I wonder if the ai treats calling the all in as a semi bluff
You should’ve made a results part at the end, good video thooo 🎉🎉
it would be so lovely if you would do the Python tutorial or some kind of Playlist to educate your audience. Keep up, love your content!!
Nice work Steven! Keep it up 👍🏻
impressive! I love this topic! Did u use RL to train?
I agree with the general reaction to the last hand. I don't see how the math could be right: it should be a significantly lower EV to stay in (easy to calculate), especially if you weigh the bet / pot size against the percentages you reported. In fact, I think your opponent started to realize that issue/calculation bug in the previous all-in bet. I haven't read any papers on this, but even with imperfect information the bet / pot size should have a significant effect on the odds (even if some Nash optimization requires some additional randomness injected into that decision). I would bet that a normal CNN with a lot of training would vastly outperform this (in its current form, you should be able to beat it if you can figure out the true Nash value of each state).
Either way it's a cool project, but I think there are much simpler approaches and code implementations that could perform better here. Thumbs up for a great project idea regardless.
Good vid
It would be difficult to have the AI factor in the opponents' body language and other tells, but what about their strategy? Would it be possible for the AI to use in-game machine learning to adapt its playing style based on the opponents' hands (the ones they don't muck), raise frequency, bet sizing, etc.?
Additionally, as we all know, playing poker with fake money doesn't really work super well. I was wondering if you and your friends were using real money or just chips? This would drastically change how your friends were playing.
Great video, I learned a lot. Thanks for taking the time!!
pretty cool idea, imagine if you got glasses w/ a camera and an earpiece and took this AI to a casino
Sounds like a great way to get kicked out of a casino
There's an old saying "F*ck around and find out."
Try it and see what happens, kiddo.
It didn't only make 1 mistake. After it calls the first all-in with the QT, it is clear that it is easy to exploit. You wait until you hit a hand where the AI hasn't shown particular strength, and then you go all-in if you have hit. That it wants to call 1/3 of the time in the last hand shows that this simple strategy will most likely win close to 100% of the time. There are other exploits, but you are far from being able to win against any pro poker player. You have a lot more work to do before it stands a chance against even semi-pros.
I love that by the end, the AI, through training on other people's play, becomes a degenerate gambler. 😂😂😂
Love your videos!!! Keep pushing❤
That's why the paper he drew from says at the end that you still need human data to train on. The AI's self-play expects the opposition to play optimally. But humans play differently, and each player has different behavior to exploit or be exploited by.
the cards delt, of 52 say exactly the cards that will come. like it am known. play the new drones am greater fun
Am I wrong to think the AI saw quickly that “he who bets last and bets the biggest usually wins” then figured the best way to do that was to bet as much as it could as soon as possible?
I've been getting crushed by these "AIs" in online rooms for years
................................................................................ That is how I felt watching this video
I agree.
They are teaching people how to scam people online in poker.
That was unreal
This video was awesome! Don't really have anything else to say just commenting so this gets more views.
like the percentage of the cards yourself have increase every fold, as well with every card dealt.
I might be late to the party but you could make it easier on yourself by hard coding some preflop charts and only focusing on your postflop strategy.
so cool
What I'm missing in this is the AI trying to figure out the cards other players have based on their individual strategies, actions, and known data. For example if it figures out after a round or two that a specific player bets high and ends up winning, that's a pretty good indicator that when they bet high they have cards that would let them win. The AI knows the cards it has and the cards on the table, so by matching all of that together it could make a reasonable assumption about how likely each player is to have a winning hand.
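What this comment describes is essentially a Bayesian update of the opponent's range after each observed action. Here is a minimal sketch, assuming three coarse hand buckets and made-up probabilities for how often a given player makes a big bet with each; the bucket names, the numbers, and the `update_range` helper are all hypothetical.

```python
def update_range(prior, likelihood):
    """Bayes' rule: P(hand | action) is proportional to P(action | hand) * P(hand)."""
    unnormalized = {hand: prior[hand] * likelihood[hand] for hand in prior}
    total = sum(unnormalized.values())
    return {hand: weight / total for hand, weight in unnormalized.items()}


# Hypothetical belief about the opponent's holdings before they act.
prior = {"strong": 0.2, "medium": 0.3, "weak": 0.5}

# Hypothetical observed tendency: how often this player bets big with each bucket.
big_bet_given_hand = {"strong": 0.8, "medium": 0.2, "weak": 0.1}

posterior = update_range(prior, big_bet_given_hand)
for hand, prob in posterior.items():
    print(f"{hand}: {prob:.3f}")
```

The hard part, as the replies point out, is getting a trustworthy estimate of those per-player tendencies from the few hands that actually reach showdown.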
It doesn't need to. In poker it's just extremely hard to not lose tremendously vs a nash equilibrium strategy and it's impossible to win. The bot can just play as if it's playing against an unexploitable opponent, and it will crush someone whose strategy has nothing to do with the unexploitable strategy, and it will comfortably beat a human who has spent a decade studying nash equilibrium strategies. If it adjusted to play against the poor strategy it would absolutely decimate them even moreso. However, in such a small sample, we will not discover someone's strategy. And trying to make significant adjustments based on a small sample just opens up the bot to being exploited itself. Sometimes human vs human you can find ways to exploit someone's strategy in a comically small sample like 100 hands, mostly because a lot of humans are terrible at the game, but a bot would really need like 10,000 hands just to get a vague idea of someone's strategy. Plus, if it adjusts before that, a smart player could intentionally make it adjust one way then overadjust their own strategy the other way to exploit the bot.
If anything, it might be better to datamine online heads-up games and take an aggregate strategy of all profitable humans, and the bot could maybe even play a strategy that's mostly GTO with some adjustments mixed in from this online data.
TLDR trying to make a bot adjust its strategy based on its opponent is probably a bad idea unless it's making very small careful adjustments over 10k+, 100k+ hands.
Actually I take back some of what I said. I think a bot could squeeze out a lot of EV by making adjustments over small samples as small as like 50 hands. But it could open the bot up to being exploited for sure. And in any case, the nash equilibrium strategy doesn't need to be improved, it's unbeatable.
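For a rough feel of the sample-size argument, here is a sketch that uses the normal approximation to the binomial to estimate how precisely an opponent's frequency can be measured. It assumes every hand gives one independent observation of the spot in question, which is optimistic, since any specific situation comes up far less often than once per hand. The `frequency_margin` helper and the 30% example frequency are illustrative assumptions.

```python
import math


def frequency_margin(p, n, z=1.96):
    """Approximate 95% margin of error for an observed frequency p over n trials."""
    return z * math.sqrt(p * (1.0 - p) / n)


# Hypothetical: the opponent's true frequency for some action is around 30%.
for hands in (50, 100, 10_000, 100_000):
    margin = frequency_margin(0.30, hands)
    print(f"{hands:>7} observations: 30% +/- {margin * 100:.1f} points")
```

The margin shrinks with the square root of the sample, so 100 times more hands only buys 10 times more precision.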
It's just playing according to its preset understanding of optimal play. If it was an actual AI that was taking in other players' tendencies, it wouldn't bother making theoretically correct calls like the 54o hand as it'd know it's rarely good against Brandon. Against Olivier Busquet, though, I'd say it's not a horrible call.
Edit:
Think of solvers (like this AI) more as extremely complex and detailed flow charts.
Wow, surely no one ever did that before...
Heads up all in calls need to be much more conservative. Really, overall, it needs to have less weight put on the opponent's bet if it's going to survive any real players at all.
Also understand that poker solvers already exist, so there's no need to reinvent the wheel there. All the AI needs to do is take a broader look at equity: not just a single momentary hand but the totality of hands, and what the pot values are, because pot value has a whole lot to do with strategy. Think of a bet as buying a lottery ticket, where the cards in play determine the odds of winning. It appears to not be doing any of that.
Also understand that data is available on how professional players play certain hands, as is data on how people play online. That's a powerful tool that you don't have access to at the moment, but it's simple enough to get access: you just buy it. You could even have a way for your system to ingest that data, and a separate play mode for unknown players, such that you could eventually learn a given player and profile them.
Good luck.
Great work. Loved the video. It seems like your friends worked out how to read the AI by the last round. When it checked after seeing each Ace and then the King, Brandon correctly guessed it had a bad hand. A human in the AI's position probably would have bluffed.
If the AI is working correctly, this is similar to saying that you picked up on the strategy of a nash equilibrium bot in rock paper scissors - where the bot's strategy is to randomly pick each action 1/3 of the time.
So the whole time I've been playing like the AI's best performance?
8:12 LOL
What RL model did u use?
funny video. not sure who played worse: Bevio, Brandon or the AI :D
You need to rework your AI. Facing a river jam with bottom pair is a call at most maybe 5% not 30%.
Not if you have infinite chips or games, i.e. you can afford the consequences of losing a few times.
@@OmateYayami I've been playing poker wrong all this time. Turns out I just need infinite money!
@@cpatterson365 You've been playing different poker. Not necessarily wrong. You manage risk differently depending on the relative bet, stack and pot sizes. The bot didn't. Look, if you feel your chance of winning is 1 in 3 on a bluff, you won't be as likely to go for it when it's an all-in as when you have 10 more games ahead.
If you have infinite chips or games there is no difference.
I'm just saying that calling vs folding being 5 or 30% is wrong. This percentage should change depending on the chips involved.
@@OmateYayami Theoretically the Martingale system works on roulette. The reason it works is that it assumes you have infinite money and infinite bet sizes. For these things to have any relevance, you need to look at the finite case. And funnily enough, as soon as you look at the Martingale system with finite money, it theoretically always fails.
Creating a system on an assumption that is not real will produce a system that is not relatable. Even then, this bot would not stand a chance against a pro, even with infinite money and time.
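The Martingale claim can be checked with a few lines of exact arithmetic. The sketch below assumes a base bet of 1 unit that is doubled after every loss, a bankroll that covers at most `max_doublings` bets in a row, and an even-money bet on single-zero roulette (18 winning pockets out of 37). The `martingale_cycle_ev` helper is an illustrative name, not a library function.

```python
from fractions import Fraction


def martingale_cycle_ev(lose_prob, max_doublings):
    """Expected profit of one Martingale cycle with base bet 1.

    The cycle ends with +1 on the first win, or with the loss of
    1 + 2 + ... + 2**(max_doublings - 1) = 2**max_doublings - 1
    if every one of the max_doublings bets loses.
    """
    ruin = lose_prob ** max_doublings
    total_loss = 2 ** max_doublings - 1
    return (1 - ruin) * 1 - ruin * total_loss


fair_coin = Fraction(1, 2)
roulette = Fraction(19, 37)  # probability of losing an even-money bet, single zero

for k in (5, 10, 20):
    print(k, float(martingale_cycle_ev(fair_coin, k)), float(martingale_cycle_ev(roulette, k)))
```

Algebraically the expression collapses to 1 - (2q)^k for losing probability q, so with any finite bankroll the system is break-even at best on a fair game and loses on a real wheel; it only looks like a sure win when k is allowed to go to infinity.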
@@NSelsted I might have oversimplified, but you do not have infinite bet size here. There is always a limit. Despite the name "no limit", the cap on the pot is the sum of all stacks, which is always finite. You can't always place a bigger bet and reduce the game to needing a single win.
I oversimplified, which allowed you to argue that infinite aggression will always win in an infinite game, but that was not the point. The point was that finite aggression was too much and unstable. With infinite games it would have been alleviated, but that was too broad a relaxation, which you properly exploited.
I am not sure if a pro could reliably win against a properly written bot. I have my doubts.
Nice work !
Hi! I was wondering what tutorial you watched at 5:55, I'm super interested in this stuff lol
I would love to play your AI great job 🎉
Hi. i want to test this AI
Worth the wait
you should collab with some different poker creators and see how this actually stacks up to people who play the game lol
Good stuff
I cannot train it. It always fails on the turn or river. Please help me.
may I know how to install it?
Honestly it just played loose but still made good calls and good bets
12:51 Heads up, I wouldn't exactly call this a semi-bluff as BB's range should be way weaker than the AI's, so it's a value raise. Very standard.
13:26 This hand is atrocious. I'm guessing you guys are playing 50/100 this hand, hard to tell. In which case we're ~30bb effective and this should just be a fold for the AI as even against the bottom of SB's range, it's not looking too hot. Plus, AI blocks the more natural semi-bluffs like Qxhh for a second nut flush draw. If we can really call a 7.25x shove a natural semi-bluff.
Why does the AI call? If I *had* to rationalize a call, I would say that this shove is either the nuts or air and because we block the Qh, we have more equity against most air hands, except Khx. This is a horrible justification for a call but, it's what it might be thinking.
Interesting note, at 20bb effective, solver calls QhTd at ~10% frequency.
15:02 It might seem insane but this hand is not so horrible from your AI's perspective. It's actually correct from a GTO perspective. The reason for the call is that you are beating 137/185 hands (~74.1% equity) and getting ~1.53:1 pot odds (~40% equity required to call and break even). If we folded every time, we would be overfolding and losing money in the long run, given the price is just so good.
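The pot-odds arithmetic in this comment can be reproduced in a few lines. The sketch follows the comment in treating the share of hands beaten (137 of 185) as the caller's equity, and it uses hypothetical chip amounts (a 153-chip pot and a 100-chip call) chosen only to match the quoted 1.53:1 price. The function names are illustrative.

```python
def required_equity(odds_to_one):
    """Break-even equity when the pot lays `odds_to_one`:1 on a call."""
    return 1.0 / (1.0 + odds_to_one)


def call_ev(equity, pot, to_call):
    """EV in chips of calling: win `pot` with probability `equity`, else lose `to_call`."""
    return equity * pot - (1.0 - equity) * to_call


share_beaten = 137 / 185        # figure quoted in the comment, used here as equity
needed = required_equity(1.53)  # roughly 40%, as the comment says

# Hypothetical chip amounts that reproduce the 1.53:1 price.
ev = call_ev(share_beaten, pot=153, to_call=100)
print(f"need {needed:.3f}, have {share_beaten:.3f}, EV of call {ev:+.1f} chips")
```

Whether 137/185 is the right equity is exactly what other commenters dispute: it depends on which hands are assumed to be in the shover's range.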
I think your AI plays kinda okay, considering it's only learnt over 1m hands. The QT hand is crazy, one other consideration is that it's not gonna adjust to the mistakes made by other players so it's gonna make some whack decisions like this one. I'd be interested to play it and also see how it plays against GTO Wizard.
Please change the AI to have a Phil Hellmuth voice xD
looks like the AI developed its own ego😂
Lets go Steven!!
my number 1 supporter ❤️
what a sick video!
this AI plays worse than the biggest fishes I have ever seen lmao
as a player who's won +250k in earnings and a programmer, super entertaining
0:30 Please fix this issue in your audio where you played both channels at once and they got garbled.
It would be better if it could pick up exploits against the opposing player and apply them to the action percentages!
very cool
How can I run this code so that I can enter my hole cards and the board cards?
GTO bots already exist. Most of these poker sites don’t do anything to stop them.
GTO doesn't know what to do when a donkey overplays marginal hands and doesn't follow standard range conventions.
The strategy assumes both players play very rationally/reasonably with specific ranges of hands. Once you have a non-conformist player the strategy starts to show gaps. For example, OMC only plays AA and KK, GTO doesn't factor that in and therefore will consistently be at a disadvantage assuming the player is playing a more reasonable range (i.e. ~50% from button, ~10% UTG). If you adjust your bot on the fly to adjust for this it will perform as intended, but you'd need to know and make that adjustment for every player. Also when you watch top players, they play close to GTO but break away from it on occasion to exploit their opponents.
This AI tries to adapt to the player, which is interesting since it makes it closer to exploitative than GTO, which is more profitable if done correctly.
@@wesleykim1758 Of course, exploitative play will always be more profitable and deviating is the correct thing to do. GTO will always be net zero (or lose to rake) vs other GTO bots. However, an exploitative player will never beat a GTO bot (if it’s truly playing perfectly). Not to mention bots have no human emotions and will always be playing their A game.
With all that being said, most bots aren’t playing perfect GTO as of yet, and a good number of them are losing players that only profit thanks to rakeback/promos.
Sick video!
I'd say that you programmed it in a mathematical way using equity. I don't know much about programming, so please correct me if I am wrong, but a better way would be for the AI to narrow the opponent's range down using the information that the player gives, and use that to make accurate decisions.
You could have all kinds of heuristics for a player if you were recording their decisions, but that might contravene the rules of a provider. But for personal use you could do everything you suggested and go even further to exploit the Mersenne Twister algorithm that all casinos use to shuffle their poker decks.
Just tweak the algo. Tell it, when it faces elimination, it should tighten its range and only call when it's 90% and above, or whatever .... why am I helping you? lol
The problem is that calling there 30% of the time is WAY too much lol. It seems to provide a percentage chance to take actions that should be a 0% frequency (like the call with Q high). This AI will hemorrhage money. I'm guessing the issue is that it's not able to account for your opponent's range, and assumes they could have any 2 random cards for any situation, though I don't know the exact details of setting the AI up, so that may just be an assumption on my part.
how can I use this software?
I'm missing the point here, but so this is a 16-minute video on reinventing GTO Wizard / PIO Solver?
This bot isn’t even close to as sophisticated as those softwares
@@Gatorbait1869 I was kidding this is still a commendable endeavor.
a 4 person poker tournament is not a sample size to base any conclusions on
True but it’s not like the casinos will allow it
@@Tuturial464 That's, this is still a useless sample size
10:06 that card bending was so cringe
I watched and think you should look at poker math strategy books to improve your code
Neuralink, bionic eye, metal detector