The condition isn't per se that he will "try to win if the other outcome is a tie", but rather that any obvious conclusions will be responded to - ie, he will not fail to win, or allow himself to lose, _when it would be obvious to a 3-year-old._ The oversight was related to that, though, in that the calculations were based on random play but _his_ play was not random.
I guess the question is also whether someone who is 3 years old and learning something is actually acting randomly or pseudo-randomly at all as they learn the rules. Do humans ever act randomly or pseudo-randomly when learning? This is a question for psychology majors and other science buffs that like to bend their brains.
I mean, I'm being a little unfair to him; he's getting better at spotting winning moves and blocking winning moves. But still, I think random play is a big part of learning, because that's the way that you explore a possibility space
I believe another issue would be that the winning move is always played (assuming it's spotted), so at 6:51, it should be the case that the leftmost is guaranteed if you and your daughter always play the winning move.
I assume the final oversight is that while your three year old can’t recognize symmetry, you are able to, and so that changes the number of possible moves
I figured out the second oversight instantly but not the first. Was very confused when you highlighted a different issue, and relieved when you got to the next part 😂
I love your vids ❤ not so much tic tac but the fact that you brought in your son and wrote an entire program on how to lose 😅 The real question that begs an answer is, how do you know that your 3 year old wasn't just letting you win so you'll keep playing with him 😁
I have a question. In a game where the goal is that you get 3 in a row, would playing optimally bad in that game be the same as playing optimally in a game in which the goal is to force the other player to get 3 in a row? Here's a conceptual example I think i can offer. In chess, you can play in such a way that you are always putting your king at the most risk (optimally bad in the regular context) but the best "worst" play would be to somehow force the other player to checkmate you, because you're taking into account that the other player is also trying to lose, and would never play that move voluntarily. One strategy for this is by taking the opponent's pieces in order to make their only legal move be a checkmate. That strategy would likely never come about from simply choosing the "worst" move. There might be a difference between avoiding winning and forcing your opponent to win.
The optimal bad move would be found by a game tree search. On your moves, you always evaluate the position as the worst of all the choices. On opponent moves, in this description, you evaluate the position as the average of all next moves. Note that this is still up to some interpretation. For example, you may have one move that, say, gives you a 50% chance of a draw, and a 50% chance of a loss. You may have another move that gives a 25% chance of a win, and a 75% chance of a loss. Depending on exactly how bad you think a draw is compared to a win, which of these is the correct move will change. For example, if you want to minimise your chance of winning, choose the first. If you want to maximise your chance of losing, choose the second.
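The two example moves in the comment above can be checked with a few lines of Python. This is only a sketch: the function names and the `draw_value` knob are made up for illustration, with a win scored +1 and a loss scored -1.

```python
def expected_score(p_win, p_draw, p_loss, draw_value=0.0):
    """Expected score of a move: win = +1, loss = -1, draw = draw_value (assumed knob)."""
    return p_win * 1.0 + p_draw * draw_value + p_loss * -1.0

# The two moves from the comment, as (p_win, p_draw, p_loss).
MOVE_A = (0.0, 0.5, 0.5)    # 50% draw, 50% loss
MOVE_B = (0.25, 0.0, 0.75)  # 25% win, 75% loss

def worst_move(draw_value):
    """Name of the move with the lower expected score, or 'tie' if they match."""
    a = expected_score(*MOVE_A, draw_value=draw_value)
    b = expected_score(*MOVE_B, draw_value=draw_value)
    if a == b:
        return "tie"
    return "a" if a < b else "b"

print(worst_move(0.0))   # draw exactly halfway between win and loss
print(worst_move(0.5))   # draw treated as closer to a win
print(worst_move(-0.5))  # draw treated as closer to a loss
```

With the plain +1/0/-1 scoring both moves come out at -0.5, so the scoring alone cannot separate them. Nudging the draw value in either direction breaks the tie, which is the interpretation question the comment raises.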
first of all, when you calc, towards the end you have "forced moves" that need to be calced with minimax and not random. On top of that, since you are maximin-ing (as opposed to minimax-ing), you need to account for that in your evaluation as well, instead of random play transfers. These two combined are probably why the algorithm is only having a slight effect empirically.
In terms of pseudo-pseudo code, you need:
if you can win this turn, win
if you can't win and opp can win, block one
if neither apply and it's your opp's turn, assume random move (avg of all outcomes is score)
if it's your turn, assume worst move (maximin, worst score is score)
I think another oversight you made was that the play isn't random if it will be a winning move. Then that move that can cause a fork isn't worth -0.5, it's -1 because not even your son throws the game. I wonder how that changes the numbers. There's a big difference between not doing definitive winning moves and not doing multiple turn long strategies.
You should have considered in the way you calculate position scores the fact that you said you would always take a win if the opportunity occurs and always block wins if applicable
wondering if counting ties in this strategy as 0 instead of something like "0.1"/"-0.1" (as your son taking a minor loss) might be an issue (or at least change the strategy slightly in some cases)
Could you have adjusted the algorithm to assume that the players will always take the opportunity to complete 3 or block 3? Would that have been a more accurate representation of the real situation?
Also, in the examples you provided at 7:45 for example you are assigning a 1/3 probability of X winning. I'm assuming your 3 year old child is smart enough to see 3 in a row and you said yourself you won if you got the chance, so an immediate win should have a probability of 1 (unless there are 2 immediate wins which probably can't happen because of the next paragraph) Also also, you said you were blocking if you saw your son had an immediate win opportunity. Assuming your son also blocks you if you have an immediate win opportunity, a branch which blocks a win should also have a probability of 1 (unless there are 2 possible blocks aka a fork which shouldn't alter the expected value of the game)
Last time I mapped this game I only mapped the first 3-5 moves since at that point the game is always decided (at least if you assume that no player misses a 2 in a row.)
If you assume that both players always make a winning move if it's available, and block an opponent's winning move if that's possible, I wonder how that would change the numbers.
It's kinda hard to look at the numbers in the game. Like 0.343, 0.114 and 0.457. Instead something like "34.3%", "11.4%" and "45.7%" are much easier to see and understand. Add the option for colors and it will improve it even more. About the "Expected value" it can be a slider that moves depending on who's winning like in Chess. Once again, looking at "-1.30" and "1.30" is unusual and people would rather see something more clear. For example the simplification of "-1.3-" and "+1.3+". Or that the negative and positive markers appear depending on who's playing instead of just the first player that moves.
Edge
WTW
WTW
T*X*T
Total Spaces: 8
Unique Spaces: 5
Instant Win: 2/5
Near Win: 1/5 (counts as half a point)
Will Tie: 2/5
Final Score: 50% (4/8)

Middle
TWT
W*X*W
TWT
Total Spaces: 8
Unique Spaces: 2
Instant Win: 1/2
Near Win: 1/2 (counts as quarter of a point)
Will Tie: 0/2
Final Score: 62.5% (5/8)

Corner
WWW
WTW
*X*WW
Total Spaces: 8
Unique Spaces: 5
Instant Win: 4/5
Near Win: 1/5 (counts as quarter of a point)
Will Tie: 0/5
Final Score: 85% (6.8/8)

Author's Note: As a person who has memorized every position for Tic-Tac-Toe (Not including symmetries, since that's a waste of time), I can say right now that Edge is the most fun, Middle is the most boring, and Corner is the best. Also, no. It is impossible to force them to win at the FIRST move. It is possible in the second move of the game.
you considered your own moves as being random, which they weren't. you didn't account for the fact that you'd always play certain moves (three in a row and blocking three in a row) and play the rest according to the algorithm. if there was a position in which you could force either a win or a loss, the expected value can be equal to that of a position that may result in either a win or a loss depending on your opponent's moves, despite the fact that one is a guaranteed loss if you're following the algorithm
You are assuming all of those games are equally likely, but only your opponent's moves are random! I actually caught this one and thought I had it the first time, I hadn’t considered the probability issue 😅 Congrats @NStripleseven on being first :)
So I think the question you are trying to answer is this. You are playing a one-player game on a Tic-Tac-Toe board. Each turn, you choose an unoccupied space uniformly at random and place an X there. Then you place an O in an unoccupied space that makes three O's in a row. If there are none, you place an O in an unoccupied space that blocks three X's in a row. If there still are none, then you choose any unoccupied space and place an O there. If there are no unoccupied spaces at all, the game ends. If there are three X's or three O's in a row, the game also ends. Take turns until the game ends.
You lose if there are ever three O's in a row. You win if at the end of one of your turns, there are three X's in a row but not three O's in a row. Otherwise you draw. You want a strategy that maximizes the expected value of V, where V = 1 for a win, 0 for a draw, and -1 for a loss. (In this model, you play O in the real game and your son plays X. You "win" in my sense when your son wins. It can be modified slightly for the case where you play X.)
So to work out the optimal strategy in all positions (a strong solution), you need to run an expectimax algorithm at full depth. The algorithm works like this. First, start with the full game tree but prune any games that violate the rules for the placement of O. Now look at each leaf and assign it the value of 1 if it's a win, 0 if it's a draw, and -1 if it's a loss. Now repeat the following. Look at each parent node. For O moves, assign the parent the maximum value of its children and record one move achieving that value. But for X moves, assign it the expected value of its children. (Since X moves uniformly at random, this is just the arithmetic mean of the values of its children.) Now you have a database with every legal position and the optimal move in each position for O, which is a strong solution for O.
Your mistake was that you basically assumed uniformly random moves for both X and O up until the current node, then picked which child had the best value. This would be the optimal way to play if you only got to choose one move and moves were random from then on (but observing the restrictions on where O can go).
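The expectimax procedure described in this comment can be sketched in Python roughly as follows. This is one possible reading of the comment, not the video's actual program: X moves uniformly at random, O must win if it can, otherwise must block if it can, otherwise picks freely, and the value is +1 when X (the son) gets three in a row. The board encoding (a 9-character string with "." for empty) and all function names are assumptions for illustration.

```python
from functools import lru_cache

EMPTY = "."
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that mark has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != EMPTY and board[a] == board[b] == board[c]:
            return board[a]
    return None

def place(board, i, mark):
    return board[:i] + mark + board[i + 1:]

def empties(board):
    return [i for i in range(9) if board[i] == EMPTY]

def completing_moves(board, mark):
    """Empty squares where placing `mark` completes three in a row."""
    return [i for i in empties(board) if winner(place(board, i, mark)) == mark]

def legal_o_moves(board):
    """O's restricted move set: win if possible, else block, else anything."""
    return (completing_moves(board, "O")
            or completing_moves(board, "X")
            or empties(board))

@lru_cache(maxsize=None)
def value(board, to_move):
    """Expected outcome: +1 if X (the son) wins, -1 if O wins, 0 for a draw."""
    w = winner(board)
    if w == "X":
        return 1.0
    if w == "O":
        return -1.0
    if EMPTY not in board:
        return 0.0
    if to_move == "X":
        # X plays uniformly at random: average over all children.
        children = [value(place(board, i, "X"), "O") for i in empties(board)]
        return sum(children) / len(children)
    # O picks the allowed move that is best for X's expected result.
    return max(value(place(board, i, "O"), "X") for i in legal_o_moves(board))

def best_o_move(board):
    """One allowed O move that achieves the maximum expected value."""
    return max(legal_o_moves(board),
               key=lambda i: value(place(board, i, "O"), "X"))
```

Calling `value("." * 9, "X")` gives the expected result of a whole game under these assumptions when X starts, and `best_o_move(board)` looks up the move for any position where O is to play. Memoising with `lru_cache` stands in for the "database" the comment mentions.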
I've always hated tic tac toe since I was a kid. You don't need Python programs to notice that you and your opponent have been basically playing the same boxes and the same way and there's little to no difference between this game and the one before it. And that O has a massive disadvantage
Me, wanting to see this with chess so stockfish or any other engine will play so that the evaluation is equal. If you blunder, they'll try to bring it back to equal. If you are winning, they play stronger.
and since you won't usually have a mate in 1, the bot would force the mate if you let it have one, like a mate in 5 or fewer. If it's a mate in 6 then it'll ignore the move.
By your own playstyle, wouldnt it be more suboptimal to optimize for forks by the other player rather than endstates? Since there are many cases where the other player can win, but far fewer where they would win if you always block available wins
Yes, I think that's true. As usual, I set out to try to do a simple thing, but games are complicated, especially when you try to account for different play styles
You will block your opponent and win, and that means you will play better than randomly when they make two in a row, where a lot of random wins come from
Edge
WTW
WTW
T*X*T
Total Spaces: 8
Unique Spaces: 5
Instant Win: 2/5
Near Win: 1/5 (counts as half a point)
Will Tie: 2/5
Final Score: 50% (4/8)

Middle
TWT
W*X*W
TWT
Total Spaces: 8
Unique Spaces: 2
Instant Win: 1/2
Near Win: 1/2 (counts as quarter of a point)
Will Tie: 0/2
Final Score: 62.5% (5/8)

Corner
WWW
WTW
*X*WW
Total Spaces: 8
Unique Spaces: 5
Instant Win: 4/5
Near Win: 1/5 (counts as quarter of a point)
Will Tie: 0/5
Final Score: 85% (6.8/8)

Author's Note: As a person who has memorized every position for Tic-Tac-Toe (Not including symmetries, since that's a waste of time), I can say right now that Edge is the most fun, Middle is the most boring, and Corner is the best. I usually play with Edge, as you can truly say you aren't afraid of choosing the worst opening. Also, I am a person that takes Tic-Tac-Toe more seriously than needed. Thank you for your time.
I'm just finding out that there is an optimal way to play Tic-Tac-Toe. I play very randomly and tend to lose a lot at first before knowing how my opponent likes to play
You prefer the ad earlier? Or you just wish there was no ad? I can certainly understand; I'm not big on ads either. But FWIW, I genuinely love learning stuff on Brilliant. Learning about group theory right now, and it's just well done pedagogically.
@@marcevanstein Ending on an ad means that most viewers who reach the end of the video's main content will close the video before the end of its runtime, which heavily damages your metrics as youtube penalizes the viewcounts from viewers that don't watch the entire video. Placing the ad in the middle or beginning of the video has two benefits- fewer viewers will skip it, due to the increased effort required to do so, and more viewers will reach the end of the video's runtime. You will lose more viewers to people who immediately close videos when an ad starts, but your actual viewcount will be higher.
Seconded, while I typically let the end roll sponsor segments play out to look at comments on the video, a lot of people would probably just click off instead
The oversight you mentioned in your move-scoring algorithm (or at least the thing I think you’re talking about) is that you assume that in addition to your opponent playing randomly, you play randomly after the first move too. It would be better to take averages for your opponent’s moves, but decide on optimal (or suboptimal) responses when considering yours.
Yes, this is the one I was thinking of! I still think the measurement I came up with is kind of interesting, since it's a kind of average of all remaining paths left in the game. But I feel like the optimal solution for losing against a random player would be some sort of an altered Minimax algorithm... like a Randomax algorithm or something. Of course that would be a whole separate video!
@@marcevanstein You could also make a video on super tic-tac-toe
@@marcevanstein It seems like if you look at the problem from the end of the game, it seems more manageable. Because you know that your tactic is 1) block the move if you would otherwise lose immediately, 2) win if possible, 3) minimize based on expectation. And because expectation only depends on future moves, you move backwards, from the completed board (or the board that was lost)
But I think it would heavily use Bayes' rule or something like that because you can't just say directly in some games where you would need to go, only in forced ones.
Also the three year old will most likely spot the winning move and the losing move (not guaranteed but still) so games that have forcing winning/losing moves should end there instead of calculating different paths that could all happen.
I was screaming this the whole video...
I can't wait for your son to grow up enough to watch your videos and find out you made an entire algorithm just to lose optimally to him
Also to find out that his dad spent at LEAST 8 months straight obsessed with Tic Tac Toe
Optimally lose and suboptimally win.
Dude really be min maxing here.
No way it's the sumo Minecraft realm guy
I mean, if I found out now, at age 21, that my dad thought this thoroughly about how to give me the most fun yet educative experience while playing a silly pencil-and-paper game, I'd honestly find it very wholesome.
If I found out at age, like, 7, however, I might've felt upset because "I didn't win for real" lol
My 3 year old daughter just discovered tic tac toe a couple of days ago and I've been trying to lose ever since. Somehow the YouTube algorithm knew.
wow, that is eerily spot-on
It could have been that Google somehow found out you had an interest in tic tac toe and suggested a video based on your demographic. Or maybe they just spied on you
@@TheRenegade...yeah im pretty sure when they ask to track you and you say no they do it anyways because if im on a call and i talk about something im interested in, when i go into youtube the videos will be like "how to do pottery" or like "how to crochet"
@@jaxpianoteacher i took the psat the other day and didnt do anything on google or mention it with my computer open like at all and now my yt recommended has a ton of videos about how to study for the sat
@@nileprimewastaken don’t worry too much, YouTube thinks I’m an old woman and keeps giving me ads on senior centres and will writing services so it’s clearly not perfect at the whole spying thing
always starting with an edge teaches your son that starting at the edge must be good
it teaches his son that starting at the edge must be bad because the dad loses more than him when he does that
@@atlasxatlas Sadly, that's too high of a concept for a 3-year-old. Kids this small are still in the territory of "parent gotta be good".
Yep, it teaches them that edging is the best strategy to win🥰
I don't have words to describe how beautiful and poetic this comment as a response to the initial premise of the video is to me. I hope you live forever, but in a really good way.
Well once you learn an optimal tic tac toe strategy you don't wanna play the game
6:02 "I won five, despite my best efforts"
Hey, it's time to make a video on the younger but more complex brother of tic-tac-toe, _super_ tic-tac-toe
what about super super tic tac toe (took me over 2 hours to finish)
@@Nerdy1729 now introducing fractal tic tac toe
@@Nerdy1729 it doesn't take that long to play a super tic tac toe game💀💀
@@Godwars15 Not super tic tac toe, super super tic tac toe. The board has 81 boards
@@Nerdy1729 my bad...........What the-!?
81 boards!
>The year: 2031
>Location: edge of volcano
>Son sees video
>"SO IT WAS ALL A LIE FATHER" he shouts in an Anakin like scream
>"It was my only choice. I had to teach you son!"
>"YOU MADE ME A FOOL"
>"BECAUSE I LOVE YOU"
>"NO MORE FATHER, CHECKER IT SHALL BE!"
I've wondered about using similar strategies for chess ai. One thing that comes up with chess ai is that it assumes the other player is also a chess ai, so you end up with it sometimes rejecting a move because the opponent has a 20 move hard-to-find counter. The computer doesnt end up playing any fun gambits, even though it absolutely can afford to and doing so might even be optimal against weaker players
Leela Chess Zero has started using evaluations that give W/T/L percentage instead of just win percentage now for that reason. It can fairly accurately evaluate how dangerous a position is for the opponent, rather than simply what's best in a vacuum.
For chess in particular there's a simple and hilariously-named concept called the "Contempt Factor", in which you pick an artificial, non-zero value to assign to draws. Normally, the AI would happily play towards a draw if it evaluates its state to be losing (against its own AI). But perhaps a human player wouldn't be as capable of forcing that draw, so the contempt factor tells it that it should play to win unless the situation is particularly dire. This can affect its evaluation much higher up in the search tree, so it can noticeably affect the AI's behavior even if you're not currently close to a draw.
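As a toy illustration of the idea in this comment (not how any particular engine implements it), the sketch below scores a draw as `-contempt` from the engine's point of view and then picks the best available line. The names `pick_line`, `line_value` and the two example lines are hypothetical.

```python
DRAW = "draw"

def line_value(outcome, contempt):
    """Score a line for the engine; a forced draw is worth -contempt (assumed rule)."""
    return -contempt if outcome == DRAW else outcome

def pick_line(lines, contempt=0.0):
    """Return the name of the line with the highest value under this contempt."""
    return max(lines, key=lambda name: line_value(lines[name], contempt))

# Hypothetical choice: force a draw now, or keep playing a slightly worse position.
slightly_worse = {"force_draw": DRAW, "keep_playing": -0.2}
much_worse = {"force_draw": DRAW, "keep_playing": -0.5}

print(pick_line(slightly_worse, contempt=0.0))
print(pick_line(slightly_worse, contempt=0.3))
print(pick_line(much_worse, contempt=0.3))
```

With zero contempt the draw is taken. With a contempt of 0.3 the engine keeps playing the slightly worse position, but still settles for the draw when the alternative is bad enough, which matches the "unless the situation is particularly dire" part of the comment.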
@@an_asp I would like to see computers treat draws as losses unless there is no path to victory, because their current strategy is to force draws.
there are actually bots that do play in such a way to beat humans in a 'fun' way. i think it was called 'aggressive komodo' or something, but it was basically only allowed to play really dubious gambits and was encouraged to do unreasonable things early. but would then still claw back a victory due to basically perfect play. it was rated like 2700 and was waaay more interesting!
@@WoolyCow Honestly I think I'd have more fun with chess if I played it that way too! ...Not that I can pull off the "perfect play" part, but.
One huge oversight is that young children don't play as much randomly as imitatively. Your child would be far more likely to win if you took the optimal move every time you went first. "If Dad always plays center, maybe I should, too." Just that one move becoming a habit gives your child a fairly strong advantage on each game they go first, while making a smaller disadvantage on the games where you play first, as you're trying to be able to lose and he isn't. It may take more than 16 games to skew the numbers, but I imagine if you could instill just a tiny bit of opening strategy it would prepare your kid better for facing players that aren't as inclined to let him win, later in life. I'll have to try this with my 3 year old and see how she does.
This within the first five minutes has taught me that correctly sampling a value function is more important than the raw value of the number of samples. That is to say that a few Monte Carlo rollouts of some game states is better than one Monte Carlo rollout of a lot of states.
I will try to write a chess AI that takes a list of randomly generated boards makes predictions about the win probability along with an uncertainty and then according to UCB1 with respect to uncertainty samples Monte Carlo rollouts of the board.
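For reference, the UCB1 rule mentioned here can be sketched as below. This is the standard visit-count form (mean plus an exploration bonus that shrinks as a board is sampled more). Using visit counts as the stand-in for "uncertainty" is an assumption, since the comment does not say how the predictor's uncertainty would be plugged in, and the function names are made up.

```python
import math

def ucb1_index(mean_value, visits, total_visits, c=math.sqrt(2)):
    """UCB1 score: mean + c * sqrt(ln(total) / visits); unvisited items come first."""
    if visits == 0:
        return math.inf
    return mean_value + c * math.sqrt(math.log(total_visits) / visits)

def select(stats):
    """stats is a list of (mean_value, visits); return the index to roll out next."""
    total = sum(visits for _, visits in stats)
    return max(range(len(stats)),
               key=lambda i: ucb1_index(stats[i][0], stats[i][1], total))

# Three hypothetical candidate boards: (mean rollout result, rollouts so far).
candidates = [(0.6, 40), (0.5, 3), (0.9, 0)]
print(select(candidates))
```

A board with no rollouts is always picked first. After that, a rarely sampled board can outrank one with a better mean until its bonus shrinks.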
Yep. Gotta keep not only the number of different outcomes, but also the probabilities of each outcome. Some branches have different branching factors.
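A tiny made-up tree shows why the branching factors matter. Counting every finished game equally gives a different number from averaging level by level, which is what a uniformly random player actually produces. The nested-list tree encoding and function names are assumptions for illustration.

```python
def leaves(tree):
    """Yield every leaf value of a nested-list game tree."""
    if isinstance(tree, list):
        for child in tree:
            yield from leaves(child)
    else:
        yield tree

def path_average(tree):
    """Treat every complete game as equally likely (the flawed measure)."""
    values = list(leaves(tree))
    return sum(values) / len(values)

def expected_value(tree):
    """Each move at a node is equally likely, so average level by level."""
    if not isinstance(tree, list):
        return float(tree)
    return sum(expected_value(child) for child in tree) / len(tree)

# One move ends the game with a win (+1); the other leads to three losing replies.
tree = [1, [-1, -1, -1]]
print(path_average(tree))    # four games counted equally
print(expected_value(tree))  # two first moves counted equally
```

Here the path count says -0.5 while the probability-weighted value is 0.0, because the losing branch contributes three games but is only reached half the time.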
Alright, next do this for checkers, and then chess, so you are ready as the kid grows.
Checkers and chess both have enough possible game states to make this functionally impossible. Neither is mathematically solved, like tic tac toe.
Edit: checkers was in fact weakly solved in 2007. This basically means that, assuming zero mistakes, the game will always end in a draw. It took 18 years to brute force, apparently.
@@Ornithopter470 isn't checkers solved though
@@theblinkingbrownie4654 to my knowledge, no. It's not solved mathematically.
@@Ornithopter470 Checkers was solved in 2007 by Jonathan Schaeffer.
@Ornithopter470 i'm pretty sure checkers is a solved game
Now explain to a three year old why some of the expected values are negative zero.
Floating point has positive and negative zero. Think of it as "a little bit less than zero".
In floating point "0" actually means any number from 0 to roughly 2.47 * 10^-324, and negative zero is the opposite
@@Jason9637 Well you're explaining to a three year old lol. Not to me. I've had far too much experience with floating point myself, devilish tripe that it is.
@@Jason9637 It's a lot easier to explain to adults. In fact, many adults can probably intuit that -0 means something along the lines of "a negative number that was small enough to be rounded up to 0" without being told
@@delphicdescant with computers, a negative zero could mean zero, but it could also mean a really small negative number, and a positive zero could mean zero, but it could also mean a really small positive number
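For anyone curious, here is a short Python sketch of two ways a negative zero can appear. Whether either is how the "-0" values in the video arose is an assumption; the helper name `is_negative_zero` is made up.

```python
import math

def is_negative_zero(x):
    """True only for the float -0.0 (equal to zero but carrying a minus sign)."""
    return x == 0.0 and math.copysign(1.0, x) < 0

# 1) Underflow: the exact product is far too small for a double, so it
#    collapses to zero and keeps its sign.
tiny_negative = -1e-200 * 1e-200

# 2) Rounding a small negative average for display.
rounded = round(-0.0004, 2)

print(tiny_negative, rounded)
print(is_negative_zero(tiny_negative), is_negative_zero(rounded))
print(tiny_negative == 0.0)  # still compares equal to ordinary zero
```

Both values compare equal to `0.0`, and only the sign bit (visible through `math.copysign` or when printing) tells them apart.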
8:00 Does it have to do with intention? Since your play style focuses on intentionally choosing the least optimal play, that undermines the assumed normal/random probability of play?
Babe wake up the Tic Tac Toe guy is back
8:34 aren't those symmetric to each other too? diagonal reflection. there's only 1
YouTube algorithm has come up trumps! I literally was thinking about this problem yesterday for my four year old! He’s been slow to accept losing in snakes and ladders, but now he’s ready for tic-tac-toe basically if and only if I play like this! ❤
I see we are going through a similar stage of life! Snakes and ladders is part of my life too :-)
@@marcevanstein sounds like it! Simultaneously super exciting and rich ‘I can show them the concept of a game’ and seemingly completely Sisyphean ‘I can play snakes and ladders for a full hour’. The difficulty is in staying mindful enough to emphasise the former… hence turning it into a decision problem, I suppose!
You might enjoy the game "Walking Into It" by Andrew Schultz. The game is about enumerating all the ways to lose tic-tac-toe to a child without making any "obviously bad" moves.
btw misère is a french word pronounced more like "miz-air" (or "meez-air" if the sound of the i wasn't clear)
Oh interesting! My bad. I was thinking of it as Latin because of my musical background
That is actually completely wrong
@@marcevanstein latin doesn't use accents so it's a way to know the difference, also in music you'll see italian a lot and it's definitely close to french so i don't blame you lol
I once made a tic tac toe bot with the easiest setting being it just placing the symbols literally at random. Found out it is actually an interesting challenge to try to force a draw against it
Is the problem that you don't account for always taking a winning move/blocking a winning move?
That's actually also an issue!
This is a bigger issue, and I'm surprised more folks haven't pointed it out.
@@davidsarea51 Yeah, the son might just be a three-year old, but I think even he can identify "I can win in one move" already, so the usage of randomness to dictate his possible moves is flawed in this situation.
7:53 is the oversight related to the fact that you will try to win if the only other outcome is a tie, as shown at 5:46?
I was surprised when that wasn't the oversight he was referring to earlier in the video, since the one he actually brought up was more subtle.
The condition isn't per se that he will "try to win if the other outcome is a tie", but rather that any obvious conclusions will be responded to - ie, he will not fail to win, or allow himself to lose, _when it would be obvious to a 3-year-old._
The oversight was related to that, though, in that the calculations were based on random play but _his_ play was not random.
I guess the question is also whether someone who is 3yo and learning something is actually acting randomly or pseudo-randomly at all as they learn the rules.
Do humans ever act randomly or pseudo-randomly when learning? This is a question for psychology majors and other science buffs who like to bend their brains.
I mean, I'm being a little unfair to him; he's getting better at spotting winning moves and blocking winning moves. But still, I think random play is a big part of learning, because that's the way that you explore a possibility space
I nominate this for the best title on YouTube
I believe another issue would be that the winning move is always played (assuming it's spotted), so at 6:51, it should be the case that the leftmost is guaranteed if you and your daughter always play the winning move.
I assume the final oversight is that while your three year old can’t recognize symmetry, you are able to, and so that changes the number of possible moves
I figured out the second oversight instantly but not the first. Was very confused when you highlighted a different issue, and relieved when you got to the next part 😂
I love your vids ❤ not so much tic tac toe, but the fact that you brought in your son and wrote an entire program on how to lose 😅 The real question that begs an answer is: how do you know that your 3 year old wasn't just letting you win so you'll keep playing with him 😁
YFAI
I have a question.
In a game where the goal is that you get 3 in a row, would playing optimally bad in that game be the same as playing optimally in a game in which the goal is to force the other player to get 3 in a row?
Here's a conceptual example I think i can offer.
In chess, you can play in such a way that you are always putting your king at the most risk (optimally bad in the regular context) but the best "worst" play would be to somehow force the other player to checkmate you, because you're taking into account that the other player is also trying to lose, and would never play that move voluntarily.
One strategy for this is by taking the opponent's pieces in order to make their only legal move be a checkmate. That strategy would likely never come about from simply choosing the "worst" move.
There might be a difference between avoiding winning and forcing your opponent to win.
The optimal bad move would be found by a game tree search. On your moves, you always evaluate the position as the worst of all the choices. On opponent moves, in this description, you evaluate the position as the average of all next moves.
Note that this is still up to some interpretation. For example, you may have one move that, say, gives you a 50% chance of a draw, and a 50% chance of a loss. You may have another move that gives a 25% chance of a win, and a 75% chance of a loss. Depending on exactly how bad you think a draw is compared to a win, which of these is the correct move will change. For example, if you want to minimise your chance of winning, choose the first. If you want to maximise your chance of losing, choose the second.
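The two hypothetical moves above can be checked with a few lines of Python. This is only a sketch: the scoring (win = +1, loss = -1, draw = a tunable `draw_value`) and the names `expected_score` and `worst_move` are assumptions made for illustration.

```python
def expected_score(p_win: float, p_draw: float, p_loss: float, draw_value: float) -> float:
    """Expected score with win = +1, loss = -1, and a tunable value for a draw."""
    return p_win * 1.0 + p_draw * draw_value + p_loss * -1.0

# The two hypothetical moves from the comment: (P(win), P(draw), P(loss)).
move_a = (0.0, 0.5, 0.5)    # 50% draw, 50% loss
move_b = (0.25, 0.0, 0.75)  # 25% win, 75% loss

def worst_move(draw_value: float) -> str:
    """Pick the move with the lowest expected score (ties go to A)."""
    a = expected_score(*move_a, draw_value)
    b = expected_score(*move_b, draw_value)
    return "A" if a <= b else "B"

for d in (-0.5, 0.0, 0.5):
    print(d, expected_score(*move_a, d), expected_score(*move_b, d), worst_move(d))
```

With a draw worth exactly 0 the two moves tie at -0.5, so the choice really does hinge on whether you count a draw as slightly good or slightly bad.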
Yes I think you're right
You should make a tic-tac-toe handbook
why didn't you remove symmetrical positions but give more weight in accordance to the symmetries?
first of all, when you calc, towards the end you have "forced moves" that need to be calced with minimax and not random.
On top of that, since you are maximin-ing (as opposed to minimax-ing), you need to account for that in your evaluation as well, instead of random play transfers.
These two combined are probably why the algorithm is only having a slight effect empirically.
In terms of pseudo-pseudo code, you need:
if you can win this turn, win
if you can't win and opp can win, block one
if neither apply and it's your opp's turn, assume random move (avg of all outcomes is score)
if it's your turn, assume worst move (maximin, worst score is score)
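Here is one way those four rules could look as runnable Python. It is a sketch under my own reading of the pseudo-pseudocode: I assume the win and block rules apply to whoever is on move, that the opponent picks uniformly among whatever moves the rules allow, and that boards are 9-character strings. The names `score`, `winning_moves` and `place` are invented for this example.

```python
from functools import lru_cache

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(b):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for i, j, k in LINES:
        if b[i] != " " and b[i] == b[j] == b[k]:
            return b[i]
    return None

def place(b, i, p):
    """Return a new board string with player p on square i."""
    return b[:i] + p + b[i + 1:]

def winning_moves(b, p):
    """Empty squares where p would complete three in a row."""
    return [i for i in range(9) if b[i] == " " and winner(place(b, i, p)) == p]

@lru_cache(maxsize=None)
def score(b, mover, me):
    """Expected result for `me` (+1 win, 0 draw, -1 loss) with `mover` to play."""
    w = winner(b)
    if w is not None:
        return 1.0 if w == me else -1.0
    empty = [i for i in range(9) if b[i] == " "]
    if not empty:
        return 0.0
    other = "O" if mover == "X" else "X"
    # Rules 1 and 2: take a win if there is one, otherwise block if needed.
    candidates = winning_moves(b, mover) or winning_moves(b, other) or empty
    values = [score(place(b, i, mover), other, me) for i in candidates]
    if mover == me:
        return min(values)              # my turn: assume my worst allowed move
    return sum(values) / len(values)    # opponent: average over a random choice

print(score(" " * 9, "X", "O"))  # expected result for O when X opens
```

If you would rather have a purely random opponent who can miss wins and blocks, replace `candidates` with `empty` whenever `mover != me`.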
I feel so smart because I found the mistake at 3:29
Hey! Love this video, can you do a video in losing the Cracker Barrel Peg Game? I've lost with as many pegs as possible before. Very Fun
The goat is back
I think another oversight you made was that the play isn't random if it will be a winning move. Then that move that can cause a fork isn't worth -0.5, it's -1 because not even your son throws the game.
I wonder how that changes the numbers. There's a big difference between not doing definitive winning moves and not doing multiple turn long strategies.
Nah you just got beat by a 3 year old in tic tac toe and decided to make a video relating to it.
You should have considered in the way you calculate position scores the fact that you said you would always take a win if the opportunity occurs and always block wins if applicable
wondering if counting ties in this strategy as 0 instead of something like "0.1"/"-0.1" (as your son taking a minor loss) might be an issue (or at least change the strategy slightly in some cases)
we're so back!
Could you have adjusted the algorithm to assume that the players will always take the opportunity to complete 3 or block 3? Would that have been a more accurate representation of real situation?
amazing!
I just want the sound effects you used for symbol placing
Also, in the examples you provided at 7:45 for example you are assigning a 1/3 probability of X winning. I'm assuming your 3 year old child is smart enough to see 3 in a row and you said yourself you won if you got the chance, so an immediate win should have a probability of 1 (unless there are 2 immediate wins which probably can't happen because of the next paragraph)
Also also, you said you were blocking if you saw your son had an immediate win opportunity. Assuming your son also blocks you if you have an immediate win opportunity, a branch which blocks a win should also have a probability of 1 (unless there are 2 possible blocks aka a fork which shouldn't alter the expected value of the game)
Last time I mapped this game I only mapped the first 3-5 moves since at that point the game is always decided (at least if you assume that no player misses a 2 in a row.)
What happens if both players try to lose, but every player must block a win for the villain in the very next move?
The final vowel in «misère» is silent, by the way.
If you assume that both players always make a winning move if it's available, and block an opponent's winning move if that's possible, I wonder how that would change the numbers.
The last e in misère is silent. It's basically "miser". But pronounced like air in stair misair
no, it's a french word. it has three syllables.
@@MichaelDarrow-tr1mn no
@@MichaelDarrow-tr1mn exactly, it's a french word, the "e" is not a new syllable. it's not spanish or italian. "heure" vs "heureux"
It's kinda hard to look at the numbers in the game.
Like 0.343, 0.114 and 0.457.
Instead something like "34%", "11.4%" and "45.7%" are much easier to see and understand.
Add the option for colors and it will improve it even more.
About the "Expected value": it could be a slider that moves depending on who's winning, like in chess.
Once again, looking at "-1.30" and "1.30" is unusual and people would rather see something more clear.
For example, the simplification of "-1.3-" and "+1.3+". Or the negative and positive markers could appear depending on who's playing, instead of just the first player that moves.
so what is the losing strategy? or does this video not actually tell you how to lose, only how to figure out how to lose?
Make a super tic tac toe video
When are you covering Meta Tic Tac Toe?
A strange game. The only winning move is not to play
Is it possible to always force a draw/loss if your goal is not to win ever?
i love the noises they put down lol
i love this
I wonder if it's possible to force the the opponent to win?
Edge
WTW
WTW
T*X*T
Total Spaces: 8
Unique Spaces: 5
Instant Win: 2/5
Near Win: 1/5 (counts as half a point)
Will Tie: 2/5
Final Score: 50% (4/8)
Middle
TWT
W*X*W
TWT
Total Spaces: 8
Unique Spaces: 2
Instant Win: 1/2
Near Win: 1/2 (counts as quarter of a point)
Will Tie: 0/2
Final Score: 62.5% (5/8)
Corner
WWW
WTW
*X*WW
Total Spaces: 8
Unique Spaces: 5
Instant Win: 4/5
Near Win: 1/5 (counts as quarter of a point)
Will Tie: 0/5
Final Score: 85% (6.8/8)
Author's Note: As a person who has memorized every position for Tic-Tac-Toe (Not including symmetries, since that's a waste of time), I can say right now that Edge is the most fun, Middle is the most boring, and Corner is the best. Also, no. It is impossible to force them to win at the FIRST move. It is possible in the second move of the game.
Guy solved TicTacToe!
you considered your own moves as being random, which they weren't. you didn't account for the fact that you'd always play certain moves (three in a row and blocking three in a row) and play the rest according to the algorithm
if there was a position in which you could force either a win or a loss, its expected value can be equal to that of a position that may result in either a win or a loss depending on your opponent's moves, despite the fact that one is a guaranteed loss if you're following the algorithm
I'm quite interested in collaborating to pursue this further. During what hours is your son available?
In retrospect, that joke came out remarkably creepy! 😬
@@Rubrickety lol
Imagine the son grows up thinking he was good and watches this video
You are assuming all of those games are equally likely, but only your opponents moves are random!
I actually caught this one and thought I had it the first time, I hadn’t considered the probability issue 😅
Congrats @NStripleseven being first :)
So I think the question you are trying to answer is this. You are playing a one-player game on a Tic-Tac-Toe board. Each turn, you choose an unoccupied space uniformly at random and place an X there. Then you place an O in an unoccupied space that makes three O's in a row. If there are none, you place an O in an unoccupied space that blocks three X's in a row. If there still are none, then you choose any unoccupied space and place an O there. If there still are no unoccupied spaces at all, the game ends. If there are three X's or three O's in a row, the game also ends. Take turns until the game ends. You lose if there are ever three O's in a row. You win if at the end of one of your turns, there are three X's in a row but not three O's in a row. Otherwise you draw. You want a strategy that maximizes the expected value of X, where X = 1 for a win, 0 for a draw, and -1 for a loss. (In this model, you play O in the real game and your son plays X. You "win" in my sense when your son wins. It can be modified slightly for the case where you play X.)
So to work out the optimal strategy in all positions (a strong solution), you need to run an expectimax algorithm at full depth. The algorithm works like this. First, start with the full game tree but prune any games that violate the rules for the placement of O. Now look at each leaf and assign it the value of 1 if it's a win, 0 if it's a draw, and -1 if it's a loss. Now repeat the following. Look at each parent node. For O moves, assign the parent the maximum value of its children and record one move achieving that value. But for X moves, assign it the expected value of its children. (Since X moves uniformly at random, this is just the arithmetic mean of the values of its children.)
Now you have a database with every legal position and the optimal move in each position for O, which is a strong solution for O.
Your mistake was that you basically assumed uniformly random moves for both X and O up until the current node, then picked which child had the best value. This would be the optimal way to play if you only got to choose one move and moves were random from then on (but observing the restrictions on where O can go).
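The difference between the two ways of scoring can be seen on a toy tree. This is a hedged sketch, not the video's code: the tree is invented, a node is either a leaf value or a `(player, children)` pair, and "all random" is modelled as a plain mean at every node.

```python
from statistics import mean

# A node is either a leaf value (1 = win, 0 = draw, -1 = loss, in the sense
# defined above) or a pair (player_to_move, list_of_children).
def all_random(node):
    """Score a node as if both X and O moved uniformly at random."""
    if not isinstance(node, tuple):
        return float(node)
    _, children = node
    return mean(all_random(c) for c in children)

def expectimax(node):
    """Max over O's choices, arithmetic mean over X's random moves."""
    if not isinstance(node, tuple):
        return float(node)
    player, children = node
    values = [expectimax(c) for c in children]
    return max(values) if player == "O" else mean(values)

def best_child(root, evaluate):
    """Index of the child of an O node that looks best under `evaluate`."""
    _, children = root
    return max(range(len(children)), key=lambda i: evaluate(children[i]))

# Hypothetical position: after move A, O always gets a later choice between
# a win and a loss; after move B the game is decided by X's random move alone.
move_a = ("X", [("O", [1, -1]), ("O", [1, -1])])
move_b = ("X", [1, 1, 0, -1])
root = ("O", [move_a, move_b])

print(all_random(move_a), all_random(move_b), best_child(root, all_random))
print(expectimax(move_a), expectimax(move_b), best_child(root, expectimax))
```

Averaging over everything rates move A at 0 and prefers B, while expectimax sees that O can always steer A to the outcome it wants and prefers A.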
At 8:42 the positions are symmetrical diagonally so only 1 optimally bad game
no, the games aren't symmetrical
@@MichaelDarrow-tr1mn I guess
iirc you can always force a non draw.
"I lost at Tic Tac Toe against my 3 year old son so I spent several days writing this algorithm to help me.“
It's kinda like a naive evaluation bar from chess
What about the other 256128 games where O starts?
no, x starts. that's the rules of tic tac toe
@@MichaelDarrow-tr1mn nah Bro, u missed the update
X always starts, you absolute heathens!
lol
as a perfect player i need to use this
I would love to see this do with ultimate tic tac toe. But I don’t think that is reasonable.
I've always hated tic tac toe since I was a kid
You don't need python programs to notice that you and your opponent have been basically playing the same boxes the same way, and there's little to no difference between this game and the one before it
And that O has a massive disadvantage
Me, wanting to see this with chess so stockfish or any other engine will play so that the evaluation is equal. If you blunder, they'll try to bring it back to equal. If you are winning, they play stronger.
It'd be nice to also record the sections that are blunders so that the problematic line could be looked at
And since you won't usually have a mate in 1, the bot would only force a mate that you let it have, like a mate in 5 or less. If it's mate in 6, it'll ignore the move.
Hi
By your own playstyle, wouldn't it be more suboptimal to optimize for forks by the other player rather than endstates? Since there are many cases where the other player can win, but far fewer where they would win if you always block available wins
Yes, I think that's true. As usual, I set out to try to do a simple thing, but games are complicated, especially when you try to account for different play styles
Now do a video on the most average games of TTT
can you do a video about tick oat two?
Shame how many people dont educate their kids.
8:00 I literally paused the video to read it
You will block your opponent and win, and that means you will play better than randomly when they make two in a row, which is where a lot of random wins come from
your son is gonna see this in like 10 years and be traumatised
There will be a day he reinvents stockfish.
I mean, this IS basically how the earliest chess engines that could actually beat people worked.
The title has 2 spaces?
W son
Bro is taking suffering from success too literally
S tier segue
so every good tic tac toe game ends in a draw and every bad tic tac toe game also ends in a draw?
Misere tictactoe is also a forced draw
Dud played with a child that didn't develop a mind and lost
how about a tic tac toe where, before the second player takes their 2nd move, you flip a coin to invert the winning condition by chance, and then they take the move?
i have never seen someone who takes X&O too seriously
Edge
WTW
WTW
T*X*T
Total Spaces: 8
Unique Spaces: 5
Instant Win: 2/5
Near Win: 1/5 (counts as half a point)
Will Tie: 2/5
Final Score: 50% (4/8)
Middle
TWT
W*X*W
TWT
Total Spaces: 8
Unique Spaces: 2
Instant Win: 1/2
Near Win: 1/2 (counts as quarter of a point)
Will Tie: 0/2
Final Score: 62.5% (5/8)
Corner
WWW
WTW
*X*WW
Total Spaces: 8
Unique Spaces: 5
Instant Win: 4/5
Near Win: 1/5 (counts as quarter of a point)
Will Tie: 0/5
Final Score: 85% (6.8/8)
Author's Note: As a person who has memorized every position for Tic-Tac-Toe (Not including symmetries, since that's a waste of time), I can say right now that Edge is the most fun, Middle is the most boring, and Corner is the best. I usually play with Edge, as you can truly say you aren't afraid of choosing the worst opening. Also, I am a person that takes Tic-Tac-Toe more seriously than needed. Thank you for your time.
@@VoidfluxOrb get off me you psycho!
Why can't 0 be O winning? My brain could process it way easier that way xD
The math doesn't work if you do it that way
They'll find a way
They could change it to 0 1 2 and then adjust afterwards
More complicated but easier to understand
At least for my brain
Bwao bwow bwao bwow bwao bwow Bwaoooooo
excuses
My ears feel eaten by your voice.
8 months later and this man is STILL making videos about Tik-Tak-Toe lmao
I'm just finding out that there is an optimal way to play Tic-Tac-Toe.
I play very randomly and tend to lose a lot at first before knowing how my opponent likes to play
Don't end on an ad. Good video still
You prefer the ad earlier? Or you just wish there was no ad? I can certainly understand; I'm not big on ads either. But FWIW, I genuinely love learning stuff on Brilliant. Learning about group theory right now, and it's just well done pedagogically.
@@marcevanstein Ending on an ad means that most viewers who reach the end of the video's main content will close the video before the end of its runtime, which heavily damages your metrics as youtube penalizes the viewcounts from viewers that don't watch the entire video. Placing the ad in the middle or beginning of the video has two benefits- fewer viewers will skip it, due to the increased effort required to do so, and more viewers will reach the end of the video's runtime. You will lose more viewers to people who immediately close videos when an ad starts, but your actual viewcount will be higher.
Seconded, while I typically let the end roll sponsor segments play out to look at comments on the video, a lot of people would probably just click off instead