Inspired by this and a previous video, in a fit of boredom, I programmed a bot to play Nim and let it go second 300 times against a perfect opponent, and the only reason it wasn't infallible is because I wouldn't let the probability of any move drop to 0. But with only 11 possible board states, it made for a very easy introduction into learning programs vs. trying to teach it 300-some board states and how to recognize reflections and rotations.
I might have missed this in the video, but I think an important thing to mention is that the initial state of the boxes _isn't_ one bead of every possible colour, but instead 8 each in the first box, 4 for the second moves, 2 for the third and one each in the rest (something which isn't even covered in the blogpost in the description…). The way Matt explained the setup would have a high likelihood of dying out very quickly…
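A minimal sketch of the starting setup described in the comment above, assuming the 8/4/2/1 bead counts it quotes. The names `INITIAL_BEADS`, `new_box`, `draw_bead` and `punish` are hypothetical, and a box is modelled as a plain dict from square to bead count; none of this is taken from the video or the blog post.

```python
import random

# Beads per legal move, keyed by which of MENACE's own moves the box is for
# (0 = its first move). Counts are the ones quoted in the comment above;
# any later move gets 1 bead per legal square.
INITIAL_BEADS = {0: 8, 1: 4, 2: 2}

def new_box(free_squares, menace_move_index):
    """Create a box holding the same number of beads for every free square."""
    beads = INITIAL_BEADS.get(menace_move_index, 1)
    return {square: beads for square in free_squares}

def draw_bead(box, rng=random):
    """Pick a square with probability proportional to its bead count.

    Returns None when the box is empty, which is the case people in this
    thread call MENACE "dying" or resigning.
    """
    squares = [sq for sq, count in box.items() if count > 0]
    if not squares:
        return None
    weights = [box[sq] for sq in squares]
    return rng.choices(squares, weights=weights, k=1)[0]

def punish(box, square):
    """Remove one bead for a move that was played in a lost game."""
    box[square] = max(0, box[square] - 1)
```

With one bead per move, a single loss deletes that option for good; with 8 beads in the opening box the same move has to lose eight times before it disappears.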
Wow, this is, in a way, machine learning brought outside of the machine!! I am currently doing a project on Neural Networks for school and this fits so perfectly well with that project! It basically is machine learning! Love it, never thought it would be possible with matchboxes tho...
Probably does use something similar. Inputs are what you like, a few hidden layers perform calculations, and then the output is the type of ad. Your feedback rewards or punishes the network.
Hey Matt, I wrote a program in C# which simulates your matchbox MENACE. It's mostly a replica of the matchboxes but I made some adjustments, like a lower bound on how many beads of each color stay in the boxes so that it can't die. I also added an auto-learn function where MENACE plays against itself and learns that way.
By forfeit I think it's meant that there are no beads in the box. That indicates to the stack of matchboxes that all moves and their continuations are losing in that position therefore the game is lost. Basically a forfeit. If you relate that to chess it doesn't matter if it's a mate in 1 or a mate in 5. Either way the game is over so don't waste my time making me play out a formality. Basically you should resign/forfeit. At least then you can say that you saw the mate.
Guilherme Kobori I've seen that ratio in other simulations. It's probably the smallest set of prime integers that converge nicely, without wild gyrations, or risk of dying prematurely.
If this really proves anything, it is that anything can "learn" how to do anything as long as it gets some kind of feedback from its environment. This, more than anything, is simple, concrete proof that intelligence and understanding of abstract things can arise from simple physical items and processes. ... We all are just a pile of matchboxes.
At most, the number of pixels to the power of the number of colors to the power of the number of degrees of freedom the player has to the power of the number of possible in-game coordinates, or thereabouts.
Corner is better than center to start. Corner yields 7 wins and 1 possible draw on opponent's first turn. Center yields 4 won games and 4 possible drawn games on opponent's first move.
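The counts in the comment above can be checked with a brute-force search. This is a sketch under stated assumptions: the board is a 9-character string indexed 0 to 8 row by row, X opens, and both sides play perfectly after O's first reply. The helper names (`winner`, `place`, `value`, `reply_outcomes`) are hypothetical.

```python
from functools import lru_cache

# The eight winning lines on a board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return "X" or "O" if that side has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def place(board, square, mark):
    """Return a new board string with `mark` written on `square`."""
    return board[:square] + mark + board[square + 1:]

@lru_cache(maxsize=None)
def value(board, to_move):
    """Minimax value for X: +1 X wins, 0 draw, -1 O wins."""
    won = winner(board)
    if won is not None:
        return 1 if won == "X" else -1
    if " " not in board:
        return 0
    other = "O" if to_move == "X" else "X"
    results = [value(place(board, sq, to_move), other)
               for sq in range(9) if board[sq] == " "]
    return max(results) if to_move == "X" else min(results)

def reply_outcomes(opening):
    """Classify each of O's eight replies to X opening on `opening`."""
    board = place(" " * 9, opening, "X")
    counts = {"x_wins": 0, "draws": 0, "o_wins": 0}
    for sq in range(9):
        if board[sq] != " ":
            continue
        v = value(place(board, sq, "O"), "X")
        key = "x_wins" if v == 1 else "draws" if v == 0 else "o_wins"
        counts[key] += 1
    return counts

print(reply_outcomes(0))  # corner opening: 7 losing replies for O, 1 drawing reply
print(reply_outcomes(4))  # centre opening: 4 losing replies for O, 4 drawing replies
```

Note what this does and does not show: a corner opening leaves the opponent more ways to go wrong on their first reply, but the game is still a draw if they find the one safe square.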
Rai Car but they can be shaken up and the piece delivered without human influence, not to mention a bit of tape could cover the clear bits. The bigger issue would be size limitation as it would fill up quickly and as it approaches its limit the ability for the pieces to move freely and any piece be equally possible begins to drop to almost zero.
but, if you start with 1-1-1 in each box, doesn't that completely erase an option upon losing? instead of just lowering the odds? Also, Menace going 2nd should result in more interesting results, since the opening move is a variable (humans don't always start center), so the countermove will have more variety and as such the result may vary more.
But only the last box contains just 1 of each bead. Which is fine as a loss from there should be discarded immediately. The earlier boxes contain multiple copies of each bead.
* the best move if you are going first is corners (in which you can actually win most times playing optimally), and if you are going second it is the edge (in which you will draw versing an optimal player)
I wouldn't call those matchboxes sentient. The matchboxes simply store the learned information, the one doing the learning here is actually the human using the matchboxes.
+Toreno13 What if a different human did each move for MENACE? They would not even have to be told why they are getting a bead and drawing a circle, just the steps to follow. Would you say the crowd of humans involved are learning even though no one person knew what they were doing?
standupmaths yes, with "humans doing the learning" I meant, that they are the process which is responsible for the distribution of colored beads in each matchbox in the end. Or the instructions themselves are the process that's doing the learning. Like for a processor executing instructions (itself not knowing what it's actually doing), and the memory (where the information of the matchboxes is stored), I wouldn't say that the memory is sentient, but the processor is doing the learning and storing the progress in memory.
I don't think this system is conscious, but your reason given is rather silly. Whether machine learning happens via metal wires or humans counting beads is irrelevant.
I would agree it’s not sentient. To me the term machine “learning” implies sentience as I suppose it does to most people outside of computer science. Industries have a tendency to develop their own terms as a way to raise the barrier of entry and it can lead to real miscommunication with the public at large.
Over 40 years ago, following instructions from a popular science magazine, I built a similar machine out of matchboxes. It was for a different game with fewer states than tic-tac-toe (so I didn't require that many boxes). But the learning strategy was different: wins were never rewarded. For a loss, you'd remove the bead of the last move where the machine still had a choice left (more than one bead in the box). This, IMO, is a superior strategy for several reasons: 1) You don't need a large supply of beads, and ever expanding boxes. 2) The machine will "die" if and only if the start position is a losing position. (And not "about 10%" as it is for Menace). 3) The opponents cannot cheat. With the learning strategy of Menace, you can manipulate it into making a bad first move by first, on purpose, losing a bunch of games. Once it has a fondness for a bad first move, you can exploit that. And since the rewards for "wins" (3 more beads for every move) are much greater than for losses (lose a random bead), for such a machine, it's much harder to unlearn bad moves. (It doesn't apply that much for tic-tac-toe where the machine goes first, as there's no bad move -- but you can exploit that strategy if you'd use Menace to learn second-player tic-tac-toe).
The perfect first move response is still center on a "bad" first move, though. Heck, it's the _only_ winning response on first move to corner. The only thing that changes is the second move response, which only depends on the second move. You _could_ teach it a bad second move response, but only if you didn't allow rotation and reflection. Since this machine depends wholly on unique game states that don't affect other possibility trees, that's a non-issue.
MasterHigure a ‘metric’ is a generic term for measurement, the ‘metric system’ is the standard units for a distance using meters. So you can have metric smoots, metric universes, metric Pomeranians and it is referring to the standard set by the companion word. Metric meters I guess would be more accurate but not necessary as it is the common use and when not speaking of it, you add the secondary defining word to define the standard you are using.
I'm glad a video about machine learning was finally able to tell me how it is programmed to learn, at least at a basic level. (I know I could have googled it but I couldn't be bothered most of the time that it came up)
You know a person is an idiot if they don't: - place their opening mark in the corner when starting, or - in the centre when going second. Anything else betrays a complete lack of strategy... The corner square IS the strongest _(the centre square is poisonous and prevents hidden forks)_ ... but almost nobody realises this. If you make a rule that nobody can take the centre until they have a mark on the board, then every game can be won by force.
@@garychap8384 oh YES! i was hoping someone else realised! (idk but almost everyone i play with still plays the centre first it's annoying... haha i got bored once while waiting in the paediatrician back when i was 13 or something so i just started playing with myself)
people knowing how to play and it being a solved game definitely makes it tougher. it would be interesting to see this system vs only children (people who almost never have strategy) or a version of itself that does the other side
This isn't really neural networks, I'm sorry to say. It's just basic learning where the machine is aware of all the possible states ahead of time, and just assigns values to them based on past experiences. Neural networks are kinda based on this idea, but a bit more abstracted; they don't look at individual game states, and there are multiple 'layers' that each process information in a different way, influenced by their previous layer.
+Geogeo 3 Glad I could help! Remember this is only a first-order approximation and actual neural networks are much more complicated. But nothing a lot of matchboxes couldn’t do.
It seems like it would teach us a lot more about how the machine learns to program Menace B so that humans can move first. A deep analysis of the data generated by Menace A's centre-first strategy, compared to what Menace B does when the centre is left open by a human first move, might reveal some really interesting patterns. I think it could also learn very differently depending on how often it samples its results. Having its number of beads updated after _every_ player would respond in a different way than playing fifty games at a time and updating all those results simultaneously from the same start position. Or updating after every hundred games, two hundred, etc. Starting each box with multiples of each bead could also help smooth the numbers. A pre-learning state of having every option represented in triplicate might bring some insights. For example, those few anomalous blue beads in the opening-move box represented corner moves, and presumably led to some clever corner-based strategies that force mistakes. The machine might learn to give those strategies more weight if it has more opportunities to start with them.
I first came across this in Fred Saberhagen's 1963 short story "Without A Thought", the first of his Berserker stories. The (machine) Berserkers had a weapon which disabled higher brain function. The human pilot of a ship had to convince them it didn't work - which he did by teaching his pet to play the game using a Menace box/bead system.
The way I remember the learning algorithm from long ago was to remove the losing move from the last box. Only when that box is empty remove the losing move from the previous box recursively. That way it only prunes losing strategies and the only way it could die is if there was a strategy for the second player to always win. Regarding using different flavors of tic-tacs the winner could get to keep the tic-tacs as a reward.
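A rough sketch of the loss-only rule described above, as I read it: take the losing move out of the last box, and only if that leaves the box empty go back one box and do the same. The data layout (a dict from board state to a set of still-allowed moves, plus a list of the moves the machine played this game) and the name `prune_after_loss` are assumptions made for illustration.

```python
def prune_after_loss(boxes, history):
    """Apply the loss-only rule after the machine loses a game.

    boxes:   dict mapping a board state to the set of moves still allowed there
    history: list of (state, move) pairs the machine played, in order

    Returns the list of (state, move) pairs that were removed.
    """
    removed = []
    for state, move in reversed(history):
        boxes[state].discard(move)
        removed.append((state, move))
        # Stop as soon as a box still has an alternative left to try.
        if boxes[state]:
            break
    return removed


# Usage: box "b" had only one move, so the blame is passed back to box "a".
boxes = {"a": {1, 2}, "b": {3}}
prune_after_loss(boxes, [("a", 1), ("b", 3)])
print(boxes)  # {'a': {2}, 'b': set()}
```

Nothing is ever added for a win or a draw here, which is why other commenters note that a machine trained this way cannot tell a win from a draw.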
They work together. "Katie works for Think Maths with Matt Parker, giving talks in schools around the country about engaging off-curriculum mathematics. She also does admin and project management for Think Maths".... source: www.katiesteckles.co.uk/
Now I want to get back into my attempt at programming a neural network into a MUD engine... I mean, thinking in terms of my favorite (PennMUSH), I have rooms (containers that players and items can occupy) and exits which connect rooms. All object types have programmable attributes....and those could be weighted values. Such as "likelihood that this exit is used by a wandering object when it picks one at random" (a rat, maybe). But tic-tac-toe would be a much easier place to start...and the immersive quality of a MUD could make for some fun roleplay effects. A rat maze, however, would be way more in line with the dungeon crawler intention of a MUD.
I believe Scientific American published an article on this around 60 years ago. But they suggested that you use a lot fewer match boxes by allowing for reflections and rotations. I believe only two or three dozen boxes were needed. And they suggested that when the boxes lost, the last bead selected in the last box should be removed. Nothing was added for a win. After losing several dozen games to me, the match boxes became totally unbeatable if they had the first move. A lot simpler than what you are doing here. And it worked to perfection.
So you wanted to make matchboxes learn to win noughts and crosses but it only learnt to draw? That's a real Parker Square of a machine learning routine...
Daniel Titchener tic tac toe is a sufficiently easy game that each player can force a draw or win provided that one of the players uses the best strategy
You cannot win. The game is so simple that a human without a severe mental disability will always force a draw, no matter how much more intelligent or skilled you are.
10:10 "It's 10^27 universes, metric universes, across?" Good that you clarified so we wouldn't confuse it with imperial universes, which are way smaller.
SOOOOOOOOOO, ....... at the end of the match,the inanimate match wins the match!!?!?!?!?!?!?! That's MATCHLESS!!!!!!!!!!!!!!!!!!!!! (& menace says ; "YOU'VE MET YOUR MATCH!!!!!!!") hahaahaaaaa
"If the first box runs out, it has learnt to resign on the first move, and that is Bad......" BUT Wargames taught us that is the correct move! ruclips.net/video/6DGNZnfKYnU/видео.html
I can remember, in 1964, we had a publication called "L’Album des jeunes" from the Readers Digest Selection. There was a similar game. It was a "Hexapion". It was very easy to play even for a kid (like me) and needed only a few matchboxes. The principle was the same and it was very frustrating (for me) to fail to win against a couple of matchboxes. At the time I was thinking that the way the "machine" was punished was not a real punishment because this punishment was helping the machine to win... And on my side I had nothing to help...
So, when I lose a game I can honestly say "I am dumber than a box of matches"
pile of matchboxes
No, but clearly something _could_ be said about the arrangement of your _"marbles"_ ; )
Yes, but the pile of matchboxes has practiced more than you
Not necessarily @@Septimus_ii
"Can a Match Box?"
"No, but it can learn."
The secret alternative answer to the impossible quiz...
John Joubran No but a tin can
9:20 That suggests to build Menace A and Menace B - and have them both learn by only playing against each other
Would actually work, adversarial machine learning is quite interesting.
And don't forget to let Robert Miles know!
This is how we end up with the matrix.
This model of Menace just builds a [physical] FSM (a Finite-State-Machine knowing all game states) and slowly prunes edges that lead to known failure states.
In particular, this works on games that are trivial (we can iterate all the states and their moves), and it can be done by just tracing edges from each failure state back and removing that edge - something done faster without humans or matchboxes.
Also, it relies on human knowledge to solve the problem since so much is already represented by these connections (box-bead-box) - so it's really questionable if you can call this machine learning versus just filtering a state-machine.
[For example: if we have a phonebook of all numbers in the world and if we randomly call a number and remove it if it's disconnected, we will eventually get a phonebook of all connected numbers - does the phonebook learn?]
In larger real problems you need to both be able to explore the problem space, identify undesirable states and optimize at the same time, not just prune from all possible moves.
You make a valid point. And that's the challenge isn't it?
The only guaranteed optimal solution is to examine the entire possibility space of a problem and find the optimal point (or points, if there are solutions of equal weight) in that space.
Fine with small problem spaces, but impractical with larger ones, thus we need a way of getting a good (but not necessarily ideal) solution with less effort...
well, i know what im coding tonight
Nillie The whole point to getting good at coding is to first code what has already been coded. That way, you can then know lots of new stuff to use in your own projects.
You know what, I'm gonna try this too now.
That makes sense, Trey Atkins and Elf Friend. Thanks for taking the time to make me a bit less ignorant.
i was just kinda bored and wanted to code something...
Elf Friend coding algebra ( ͡° ͜ʖ ͡°)
I remember that Martin Gardner article (I believe he published it in Scientific American) and I built this and played it as a teenager in the 60s. This was one of the first steps I took toward becoming a Computer Scientist.
That was fun!
reading that book I do not remember that many boxes. I believe he removed the mirror layouts. not sure. but yet, got me into A.I. LOL
"I never thought i'd have a sense of pride over a sentient pile of matchboxes, but here we are."
This line was great enough by itself, but he really perfected it by saying "This must be what procreating feels like".
MENACE, for when the machine goes first, and
DENNIS, for when the human goes first
DENACE: Dueling, Educable Naughts-And-Crosses Engine.
When they're pitted against each other in true adversarial learning fashion, they're still DENACE the MENACE :)
Dennis liao
This is secretly one of the best and simplest videos explaining machine learning
Matt 'chbox' Parker
+Mister Apple Damn you!
Classic Parker box.
A Parker pun 👍
+
He is a Parker matchbox, basically.
"This must be what procreating feels like."
UM. Okay, Matt...
He's a mathematician, he wouldn't know otherwise
Quote is at 8:32, had the same reaction as you.
A real Parker analogy.
Trust me on this: It feels different.
Who is looking for backdoors in the AI then.......?
Menace doesn't die, it just learns that the only way to win is not to play :D
ruclips.net/video/6DGNZnfKYnU/видео.html
Or Menace loses all hope, poor thing
Obligatory quote "The only winning move is not to play"
"War Games" - a great movie.
How about a nice game of chess?
No. Let's play Global Thermonuclear War.
Hello Joshua
I've seen that AI's decision in some AI youtube
I'd say, start with like 4 of each color in each box, so it's harder to kill off routes early in development. It should learn a bit slower, therefore keeping it more fun at the convention, and it should end up knowing *all* Paths to Victory.
8:33 @Matt, that's kinda what programming feels like too! The satisfaction of watching your theory running autonomously, and correctly... Bliss!
Corner is by the way the best opening move against humans because it's an unusual move. It's still a drawn game if played right, but people who aren't familiar with it have a greater chance of making the wrong move.
Not because it's an unusual position but because it is mathematically the best starting position
Like the person before me said: corner is the best position to open with once you know the moves. If you start in the corner, you will win 100% of the time if your opponent goes anywhere but center. If they do take center, take the opposite corner and you still win 100% of the time your opponent doesn't take a side space; only in that situation are you forced to draw.
Sorry but center is the best move. What's the counter to your opponent going corner first? Go center. As long as you know that the center is the most important position, it's very hard to lose.
the game can reliably be won or tied starting in the corner, Menace gets to go first, it needs to take the corner.
Marcel is simply right. You will get a win (at least once) against most humans by giving them a chance to use their usual centerplay strategy in cornerplay. But you will get only draw after draw after draw if you play center.
As a dad, I can tell you that procreation carries a wide range of emotions, with pride being a small part. Fear and frustration are much more common.
You don't think searching for the right box so you can add or remove some beads all day would be frustrating?
andymcl92 I can't speak to that. He said it was like procreation, and it may be. I only know the procreation part
As your mum, I disapprove this comment.
You made a whole child? How many known universes could fit inside the sphere of radius in centimeters equal to the number of boxes that it took?
when two matchboxes love each other very much.......
I remember in elementary school, thinking myself pretty good at the tic-tac-toe. But then a friend beat me with a corner starting move. I was quite amazed and have played with a corner starting move ever since. I'm surprised at the disparity between greens and blues in the starting box. Corner move is pretty awesome...
Corner move is a very specific way to win that requires you to pick a specific corner relative to your starting corner in the second round. This is one of those "local minima" problems that crops up an awful lot in machine learning, and it's why you need very specific reward structures to teach the machine right. In this case, it doesn't make the reward structure particularly more difficult: you just need to punish it for picking center. But the problem expands exponentially, just like any problem involving decision trees not reduced by real intelligence.
This is absolutely amazing. I love the cross-over of high and low tech and this is the perfect synergy.
Parker sentient beings.
The human race is going to be destroyed by matchboxes!
TheTopazRobot they are just Parker sentient. They can learn how to draw with the human race only.
hes such a MES
EPMTUNES wrong channel, but nice to meet you. Here I prefer Parker Square jokes as you may have guessed already. So I would be considered a Parker MES.
achu11th good idea. I’m going to start to make Parker square references on mes’ vids
Back in the 1960s Reader's Digest had a "Book of Adventures" that had stories, puzzles, games and activities, all in hard bound.
One of the activities was building a "computer" that would play "Hex-a-pawn." This was a game that used the nine square board (3x3) and three pawns on each side. The pawns moved in the traditional way and the object was to get your color in your opponent's home row.
Like this experiment, you had matchboxes with the various board configurations on them and inside were colored beads to indicate the move. I came across this book in the 1970s (computers were becoming more of a reality by then) and spent a snowed-in weekend building the "machine" and playing the game. It was a lot of fun and taught me how programmes worked (basic anyway) and how a computer CAN make a mistake.
You could also teach matchboxes to play Dr. Nim.
I love this! When I was 9 or 10, I got a copy of Martin Gardner's "Mathematical Carnival", which contains his piece about matchbox computers, and I was absolutely fascinated by it, though I never tried to build one. Forty-something years on, it still sticks in my memory -- I know exactly where I was (in a dinner queue at school) when I read it! It's great to see it in action.
(Actually, I've been mourning for that book, unable to find it for years, and it's been out of print. Happily, a couple of years ago, an ex-colleague from my first job met my ex-partner, and returned it -- apparently I lent it to him sometime in the early 90s -- and I've very happily re-read it quite recently 🙂 )
The one dislike is from the person who lost to MENACE.
U Wot M8 now there are 7 of them
MENACE is getting better
As of Dec 26, 2018 it is 92 dislikes.
@@matthewwriter9539 ppl suck at tic tac toe lmao
I love that it can die out. The way to win is not to play at all.
ruclips.net/video/6DGNZnfKYnU/видео.html
No, no, no... Use Tic Tac boxes containing differently-colored toes!
DataCab1e that took me a second.
Abandoned for the lack of toe donations
Really puts the 'cure' in pedicure!
Ew
@@lucianodebenedictis6014 if the machine can't survive a lack of toes then could we say it is... lack-toes intolerant?
Ages ago I found a description of a similar learning pile of matchboxes in an old Soviet-era puzzle book. That game was different (a breakthrough of pawns on a 3x3 chessboard), but it inspired me to make a tic-tac-toe version. I took rotations and reflections into account and didn't need that many boxes (only about 20, I don't remember exactly how many); I also used a simpler algorithm where nothing was added: only in case of a loss was the last move indicator removed (and if this emptied a box, then the used move indicator from the previous box, etc.). The simpler algorithm was, of course, worse, because it didn't distinguish between wins and draws (this feature was carried over from the original pawn game, where draws were not possible), so in the end my fully trained machine mindlessly cruised into a draw even in a winning position. I think I grew bored before coming up with the idea of rewarding wins by adding indicators.
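The "rotations and reflections" trick above can be sketched in a few lines of Python. This is only an illustration, not the commenter's actual setup: the board is assumed to be a 9-character string read row by row, with "." for an empty square, and the function names are made up for the example. Each position is mapped to one representative of its 8 symmetric variants, so symmetric positions can share a single box.

```python
def rotate(board):
    """Rotate a 9-character board 90 degrees clockwise.

    Squares are indexed 0 1 2 / 3 4 5 / 6 7 8 (an assumed layout).
    """
    return "".join(board[i] for i in (6, 3, 0, 7, 4, 1, 8, 5, 2))


def reflect(board):
    """Mirror the board left to right."""
    return "".join(board[i] for i in (2, 1, 0, 5, 4, 3, 8, 7, 6))


def canonical(board):
    """Return the lexicographically smallest of the 8 symmetric variants."""
    variants = []
    current = board
    for _ in range(4):
        variants.append(current)
        variants.append(reflect(current))
        current = rotate(current)
    return min(variants)


# All four corner openings collapse to the same representative.
corner_openings = ["X........", "..X......", "......X..", "........X"]
assert len({canonical(b) for b in corner_openings}) == 1
```

With this mapping the nine possible opening moves reduce to three distinct ones: corner, edge and centre.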
A slight problem with this algorithm seems to be that it quickly becomes a fan of lines that have brought success. I don't think that corner opening is any worse than centre opening; one might say it is better (because it only leaves the opponent one non-losing move, while the centre opening leaves four in a way), but MENACE apparently happened to score its first win or two with centre opening and this filled the opening matchbox with green beads, after which it, of course, started to open with centre move and kept scoring its wins with that, and so it snowballed.
The path to victory in corner move first is much narrower than center move as well, though. The first move reduces the second move to one possibility as well, so both you and your opponent are stuck with one winning move on corner move. It's actually a fantastic example of a local minimum, and it's why ML models need good reward systems to achieve the right outcome.
How many matchboxes would be needed to learn Global Thermonuclear War?
And how much would it cost to buy enough for the nuclear winter DLC by EA?
settle down, joshua
Inspired by this and a previous video, in a fit of boredom, I programmed a bot to play Nim and let it go second 300 times against a perfect opponent, and the only reason it wasn't infallible is that I wouldn't let the probability of any move drop to 0. But with only 11 possible board states, it made for a very easy introduction to learning programs vs. trying to teach it 300-some board states and how to recognize reflections and rotations.
I might have missed this in the video, but I think an important thing to mention is that the initial state of the boxes _isn't_ one bead of every possible colour, but instead 8 each in the first box, 4 for the second moves, 2 for the third and one each in the rest (something which isn't even covered in the blogpost in the description…). The way Matt explained the setup would have a high likelihood of very quickly dying out…
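For anyone who wants to try those starting counts in code, here is a minimal sketch. It assumes the 8/4/2/1 numbers quoted in the comment above, a 9-character board string with "." for empty squares, and a hypothetical `new_box` helper; it ignores the rotation and reflection savings, so the opening box gets nine colours rather than three.

```python
# Beads per colour for MENACE's 1st, 2nd, 3rd and later moves,
# taken from the comment above (an assumption, not checked against the video).
INITIAL_BEADS = {1: 8, 2: 4, 3: 2}


def new_box(board, menace_move_number):
    """Build a fresh box: a dict mapping each empty square to a bead count."""
    per_colour = INITIAL_BEADS.get(menace_move_number, 1)
    return {i: per_colour for i, cell in enumerate(board) if cell == "."}


opening_box = new_box(".........", 1)   # nine colours, 8 beads each
last_box = new_box("XOXOXO...", 4)      # three colours, 1 bead each
```

Starting the early boxes with more beads means a single loss only lowers the odds of a move there, while in the late boxes a loss removes the move outright.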
Yeah... nice catch
I made one a long time ago.
This is amazing as it demonstrates the very basics of what we call "Artificial Intelligence" or Machine Learning.
“I am now joined by the guy whose fault it is!”
This is the reason I follow you. Well, that and computers made of matchboxes.
Wow, this is, in a way, machine learning brought outside of the machine!! I am currently doing a project on Neural Networks for school and this fits so perfectly well with that project! It basically is machine learning! Love it, never thought it would be possible with matchboxes tho...
It seems like a little part of you died when you called it "Tic Tac Toe"
ONE OF MY MOST FAVOURITE VIDEOS ON YouTube
"It's learned to resign on the first move."
So basically, all it's learned in that case is that it's bad at noughts and crosses.
Next episode. Teaching a pile of tic tacs and severed toes to play tic tac toes
I feel like this is the machine that YouTube uses for their adbot.
Nah. This is too advanced.
Probably does use something similar. Inputs are what you like, a few hidden layers perform calculations, and then the output is the type of ad. Your feedback rewards or punishes the network.
Machine learning only works when it makes mistakes. Google is unaware of that fact.
@@MrGeocidal when was the last time you rated an ad?
It feels quite strange how entertaining I find this video. Excellent work!
8:32 "This must be what procreating feels like." Lol.
Classic mathematician
Pretty sure he said "proof-creating" :'D
when you've gone too far down the nerd hole you start referring to your machine learning algorithms as "your babies"
Hey Matt, I wrote a program in C# which simulates your matchbox MENACE. It's mostly a replica of the matchboxes, but I made some adjustments, like a lower bound on how many beads of each color stay in the boxes so that it can't die. I also added an auto-learn function where MENACE plays against itself and learns that way.
Oh my god I was at the museum last week! I practically could’ve run into you!
Great fun, I did this at school at the end of the 1970's also inspired by the brilliant Martin Gardner :)
If the first box runs out, surely the solution is to put one of each bead back in and keep going?
Unsure of whether this would work, since you also removed beads further down the tree. I don't really want to think about it though.
GEM4sta And there might also be a halting problem here - how could it self diagnose to know what forfeits are justified, and what forfeits are not?
Simple: Forfeit means loss, so it shouldn't forfeit at any point.
By forfeit I think it's meant that there are no beads in the box. That indicates to the stack of matchboxes that all moves and their continuations are losing in that position therefore the game is lost. Basically a forfeit.
If you relate that to chess it doesn't matter if it's a mate in 1 or a mate in 5. Either way the game is over so don't waste my time making me play out a formality. Basically you should resign/forfeit. At least then you can say that you saw the mate.
I have never seen anyone forfeit a game of Tick-Tack-Toe. I say never give up.
favorite channel. keep up the good work. really love the mix of humour and information!
Now that Matt Scroggs "has" this contraption... He can't be blocked except by two or more creatures.
What is this, some kind of _magic?_ What would a _gathering_ of matchboxes do to help him with that?
0:25 Yay! Katie Steckles from the Puzzle Hunters on Only Connect!
Is there a reasoning behind the rewarding distribution being +3 win, +1 draw and -1 loss?
Guilherme Kobori I've seen that ratio in other simulations. It's probably the smallest set of prime integers that converge nicely, without wild gyrations, or risk of dying prematurely.
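To make the +3 / +1 / -1 scheme concrete, here is a rough sketch of the bead update. It assumes each box is a dict from move to bead count; `pick_move`, `update` and the `floor` parameter are names invented for the example (`floor` mimics the "lower bound" idea mentioned in another comment and is not part of the scheme described in the video).

```python
import random

# Bead change for the colour that was played, per game outcome.
REWARD = {"win": 3, "draw": 1, "loss": -1}


def pick_move(box, rng=random):
    """Draw a bead at random, weighted by count; None means the box is empty."""
    moves = [m for m, n in box.items() if n > 0]
    if not moves:
        return None  # nothing left to draw: MENACE resigns
    weights = [box[m] for m in moves]
    return rng.choices(moves, weights=weights, k=1)[0]


def update(history, outcome, floor=0):
    """Apply the reward to every (box, move) pair used in the finished game."""
    delta = REWARD[outcome]
    for box, move in history:
        box[move] = max(floor, box[move] + delta)


box = {0: 1, 4: 1}
update([(box, 4)], "win")    # centre bead count goes from 1 to 4
update([(box, 0)], "loss")   # corner bead count goes from 1 to 0
```

Because a win adds three beads and a loss removes only one, a move that wins early quickly dominates its box, which is the snowballing effect several comments describe.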
If this really proves anything, it is that anything can "learn" how to do anything as long as it gets some kind of feedback from its environment. This, more than anything, is simple, concrete proof that intelligence and understanding of abstract things can arise from simple physical items and processes.
... We all are just a pile of matchboxes.
From now on I'll be counting things in "metric universes" xD
As a future maths teacher with experience in museum design, this makes me itch to go get a big pile of matchboxes and build one of these stateside....
How many match boxes would it need to learn how to play Mario?
At most, the number of pixels to the power of the number of colors to the power of the number of degrees of freedom the player has to the power of the number of possible in-game coordinates, or thereabouts.
Atlas WalkedAway in other words, a big-ass number
Functionally (but not literally) infinite.
One step at a time. We need to get command blocks playing MarI/O first.
Atlas WalkedAway well, there is only one speed in Mario, right? You could divide it up in to steps, that would make it almost feasible
Corner is better than center to start. Corner yields 7 wins and 1 possible draw on opponent's first turn. Center yields 4 won games and 4 possible drawn games on opponent's first move.
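Those counts can be checked with a small brute-force search. The sketch below is not from the video; it assumes a 9-character board string with "." for empty squares, X moving first, and perfect play by both sides after the opponent's first reply. `value` and `reply_outcomes` are names made up for the example.

```python
from functools import lru_cache

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board, player):
    """Minimax value for X with `player` to move: +1 X wins, 0 draw, -1 O wins."""
    w = winner(board)
    if w is not None:
        return 1 if w == "X" else -1
    if "." not in board:
        return 0
    other = "O" if player == "X" else "X"
    results = [value(board[:i] + player + board[i + 1:], other)
               for i, cell in enumerate(board) if cell == "."]
    return max(results) if player == "X" else min(results)


def reply_outcomes(opening):
    """After X opens on `opening`, count O's replies by their perfect-play result."""
    board = "." * opening + "X" + "." * (8 - opening)
    names = {1: "X wins", 0: "draw", -1: "O wins"}
    counts = {"X wins": 0, "draw": 0, "O wins": 0}
    for i, cell in enumerate(board):
        if cell == ".":
            counts[names[value(board[:i] + "O" + board[i + 1:], "X")]] += 1
    return counts


corner = reply_outcomes(0)   # square 0 is a corner
centre = reply_outcomes(4)   # square 4 is the centre
```

Under these assumptions the search agrees with the comment: a corner opening leaves 7 losing replies and 1 drawing reply, the centre leaves 4 and 4. Both openings are still draws against a perfect opponent, so "better" here only means "more ways for the opponent to go wrong".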
Don’t beat yourself up about it. Tic Tac boxes are transparent
Rai Car but they can be shaken up and the piece delivered without human influence, not to mention a bit of tape could cover the clear bits. The bigger issue would be size limitation as it would fill up quickly and as it approaches its limit the ability for the pieces to move freely and any piece be equally possible begins to drop to almost zero.
1:16 "... the box that matches".
I see a connection.
but, if you start with 1-1-1 in each box, doesn't that completely erase an option upon losing? instead of just lowering the odds?
Also, Menace going 2nd should result in more interesting results, since the opening move is a variable (humans don't always start center), so the countermove will have more variety and as such the result may vary more.
But only the last box contains just 1 of each bead. Which is fine as a loss from there should be discarded immediately. The earlier boxes contain multiple copies of each bead.
@Damien Porter While that would make perfect sense, did he say so? If so, I missed that bit.
Joshua Rosen I don't think he says it, but it is in the description that he links to.
Damien Porter Which I didn't read. Thank you, I now shall.
* the best move if you are going first is corners (in which you can actually win most times playing optimally), and if you are going second it is the centre (in which you will draw against an optimal player)
I wouldn't call those matchboxes sentient. The matchboxes simply store the learned information, the one doing the learning here is actually the human using the matchboxes.
+Toreno13 What if a different human did each move for MENACE? They would not even have to be told why they are getting a bead and drawing a circle, just the steps to follow. Would you say the crowd of humans involved are learning even though no one person knew what they were doing?
standupmaths yes, with "humans doing the learning" I meant, that they are the process which is responsible for the distribution of colored beads in each matchbox in the end. Or the instructions themselves are the process that's doing the learning. Like for a processor executing instructions (itself not knowing what it's actually doing), and the memory (where the information of the matchboxes is stored), I wouldn't say that the memory is sentient, but the processor is doing the learning and storing the progress in memory.
I don't think this system is conscious, but your reason given is rather silly. Whether machine learning happens via metal wires or humans counting beads is irrelevant.
I would agree it’s not sentient. To me the term machine “learning” implies sentience as I suppose it does to most people outside of computer science. Industries have a tendency to develop their own terms as a way to raise the barrier of entry and it can lead to real miscommunication with the public at large.
This is brilliant, both the principle and its use at a science festival!
Thanks for the vid
+Starrgate Thanks for watching!
Over 40 years ago, following instructions from a popular science magazine, I built a similar machine out of matchboxes. It was for a different game with fewer states than tic-tac-toe (so I didn't require that many boxes). But the learning strategy was different: wins were never rewarded. For a loss, you'd remove the bead of the last move where the machine still had a choice left (more than one bead in the box). This, IMO, is a superior strategy for several reasons:
1) You don't need a large supply of beads, and ever-expanding boxes.
2) The machine will "die" if and only if the start position is a losing position. (And not "about 10%" as it is for Menace).
3) The opponents cannot cheat. With the learning strategy of Menace, you can manipulate it in making a bad first move by first, on purpose, losing a bunch of games. Once it has a fondness of a bad first move, you can exploit that. And since the rewards for "wins" (3 more beads for every move) are much greater than for losses (lose a random bead), for such a machine, it's much harder to unlearn bad moves. (It doesn't apply that much for tic-tac-toe where the machine goes first, as there's no bad move -- but you can exploit that strategy if you'd use Menace to learn second-player tic-tac-toe).
The perfect first move response is still center on a "bad" first move, though. Heck, it's the _only_ winning response on first move to corner. The only thing that changes is the second move response, which only depends on the second move. You _could_ teach it a bad second move response, but only if you didn't allow rotation and reflection. Since this machine depends wholly on unique game states that don't affect other possibility trees, that's a non-issue.
tic tac would have been more interesting esp cuz you can reward the winner with a tic tac
Yes, that was my thought, if the player won, let them have one of the tic tacs that was drawn.
I saw this a long time ago in a French science magazine. Thank you very much for bringing back this memory. I always thought it was from Von Neumann
+Fred G Glad I could remind you! Michie was the same era as Von Neumann but was over in Bletchley Park during WWII.
10:11 "Metric universes"
MasterHigure a ‘metric’ is a generic term for measurement, the ‘metric system’ is the standard units for a distance using meters. So you can have metric smoots, metric universes, metric Pomeranians and it is referring to the standard set by the companion word. Metric meters I guess would be more accurate but not necessary as it is the common use and when not speaking of it, you add the secondary defining word to define the standard you are using.
What a Parker square of a measurement unit
I'd like to see a Menace play against another Menace.
What happens if Menace plays Menace?
Edit: Also, extremely sensitive to initial conditions!
I'm glad a video about machine learning was finally able to tell me how it is programmed to learn, at least at a basic level. (I know I could have googled it but I couldn't be bothered most of the time that it came up)
8:32 "This must be what procreating feels like"
Oh Matt! I pity your better half.
If math nerds don't have sex, how do we get more math nerds?
We get MENACE.
It's interesting that MENACE learned risk aversion and tended towards the safer "draw" as opposed to the riskier attempt to win outright.
the only problem is every game of tic tac toe is a draw, unless one person is an idiot.
CormacMacCormac IKR
@@ilya8914 worst game ever... i play the infinite version tho, where the one who gets 5 in a row wins
You know a person is an idiot if they don't:
- place their opening mark in the corner when starting,
or
- take the centre when going second.
Anything else betrays a complete lack of strategy... The corner square IS the strongest _(the centre square is poisonous and prevents hidden forks)_ ... but almost nobody realises this.
If you make a rule that nobody can take the centre until they have a mark on the board, then every game can be won by force.
true and thats why i dont even count it as a game its just game for kids when u grow up it seems useless
@@garychap8384 oh YES! i was hoping someone else realised! (idk but almost everyone i play with still plays the centre first it's annoying... haha i got bored once while waiting in the paediatrician back when i was 13 or something so i just started playing with myself)
people knowing how to play and it being a solved game definitely makes it tougher. it would be interesting to see this system vs only children (people who almost never have strategy) or a version of itself that does the other side
Not a Parker Pile of matchboxes then?
It kinda is, because it's playing centre instead of corner.
'I've tossed a coin but I'm not going to tell you what it is'.
'Heads?'
'No'.
HOW ABOUT A NICE GAME OF CHESS?
Horsey to King Prawn 4.
@@senik_8766 d5
It's the first time I've listened to drum 'n bass in 5 years. Thanks Stand-up Maths. I needed that.
I thought, that the best way is to start with a corner.
Tazer If you do it right, but it is very unlikely to stumble across it by chance. Watch 3blue1brown's videos on the topic.
I actually discovered it :D
Am I the only one who finds the face he makes when he says "tic tac toe" absolutely hilarious?
To be honest this made me truly grasp neural networks. Thanks
This isn't really neural networks, I'm sorry to say. It's just basic learning where the machine is aware of all the possible states ahead of time, and just assigns values to them based on past experiences.
Neural networks are kinda based on this idea, but a bit more abstracted; they don't look at individual game states, and there are multiple 'layers' that each process information in a different way, influenced by their previous layer.
+Geogeo 3 Glad I could help! Remember this is only a first-order approximation and actual neural networks are much more complicated. But nothing a lot of matchboxes couldn’t do.
That's close to Q-learning (with discount factor equal zero). A Neural network would be different.
It seems like it would teach us a lot more about how the machine learns to program Menace B so that humans can move first. A deep analysis of the data generated by Menace A's centre-first strategy, compared to what Menace B does when the centre is left open by a human first move, might reveal some really interesting patterns.
I think it could also learn very differently depending on how often it samples its results. Having its number of beads updated after _every_ player would respond in a different way than playing fifty games at a time and updating all those results simultaneously from the same start position. Or updating after every hundred games, two hundred, etc.
Starting each box with multiples of each bead could also help smooth the numbers. A pre-learning state of having every option represented in triplicate might bring some insights. For example, those few anomalous blue beads in the opening-move box represented corner moves, and presumably led to some clever corner-based strategies that force mistakes. The machine might learn to give those strategies more weight if it has more opportunities to start with them.
really disappointed neither katie nor matt said "link in the dooblydoo"
I first came across this in Fred Saberhagen's 1963 short story "Without A Thought", the first of his Berserker stories. The (machine) Berserkers had a weapon which disabled higher brain function. the human pilot of a ship had to convince them it didn't work - which he did by teaching his pet to play the game using a Menace box/bead system
"It's 10 to the 27 metric universes across?"
What about the old imperial universes?
moogthedog
*imperial March plays*
The way I remember the learning algorithm from long ago was to remove the losing move from the last box. Only when that box is empty remove the losing move from the previous box recursively. That way it only prunes losing strategies and the only way it could die is if there was a strategy for the second player to always win.
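The loss-only scheme described above can be sketched like this. It is one reading of the comment, not a reconstruction of any published algorithm: each box is assumed to be a plain set of still-allowed moves, and `prune_after_loss` is a name invented for the example.

```python
def prune_after_loss(history):
    """Remove the losing move from the last box, backing up through emptied boxes.

    `history` is the list of (box, move) pairs the machine used, in play order.
    Returns True if the opening box ended up empty, i.e. the machine "died".
    """
    for box, move in reversed(history):
        box.discard(move)
        if box:
            break  # this box still has another move to try, so stop here
    return bool(history) and not history[0][0]


first_box = {0, 4}
second_box = {1}
died = prune_after_loss([(first_box, 0), (second_box, 1)])
# second_box is now empty, so the move that led to it (0) is removed as well,
# leaving first_box with only move 4.
```

Nothing is ever added, so there are no wins to reward, which matches the other comments saying this variant cannot tell a win from a draw.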
Regarding using different flavors of tic-tacs the winner could get to keep the tic-tacs as a reward.
"This must be what procreating feels like" -- Oh Matt...
I'm pretty sure that Martin Gardner did this in the 1950s, but I am glad that Matt Parker is keeping the tradition alive.
tic tacs learning tic tac toe tactics, has science gone too far?
I say it hasn't gone too far enough!
Tic tac toe tactics😄😄 oh man I love that
What a wonderful idea! And very engagingly done! Kudos to all concerned.
Why are Matt and Katie always together?
They work together.
"Katie works for Think Maths with Matt Parker, giving talks in schools around the country about engaging off-curriculum mathematics. She also does admin and project management for Think Maths"....
source: www.katiesteckles.co.uk/
They are Parker married. Katie even Parker took his surname (which means she didn't).
Shiny Swalot I hear their subjects are similar somehow but I have no idea how.
+Shiny Swalot We’re maths buddies!
To learn what procreating feels like.
Now I want to get back into my attempt at programming a neural network into a MUD engine... I mean, thinking in terms of my favorite (PennMUSH), I have rooms (containers that players and items can occupy) and exits which connect rooms. All object types have programmable attributes....and those could be weighted values. Such as "likelihood that this exit is used by a wandering object when it picks one at random" (a rat, maybe). But tic-tac-toe would be a much easier to start with...and the immersive quality of a MUD could make for some fun roleplay effects. A rat maze, however, would be way more inline with the dungeon crawler intention of a MUD.
I want all those matchboxes to be lit up at once. I feel like that would be so satisfying
But... no matches...
How would you decide which goes first, and the opening move?
There's a Mythbusters episode about that. Ended with them lighting 1 million match heads at once. I think you'll enjoy it.
Can a match box? No but a tin can
Link to the Mythbusters match-head bomb: ruclips.net/video/poV6lc2b070/видео.html
I believe Scientific American published an article on this around 60 years ago. But they suggested that you use a lot fewer match boxes by allowing for reflections and rotations. I believe only two or three dozen boxes were needed. And they suggested that when the boxes lost, the last bead selected in the last box should be removed. Nothing was added for a win. After losing several dozen games to me, the match boxes became totally unbeatable if they had the first move. A lot simpler than what you are doing here. And it worked to perfection.
So you wanted to make matchboxes learn to win noughts and crosses but it only learnt to draw? That's a real Parker Square of a machine learning routine...
Daniel Titchener tic tac toe is a sufficiently easy game that each player can force a draw or win provided that one of the players uses the best strategy
You cannot win. The game is so simple that a human without a severe mental disability will always force a draw, no matter how much more intelligent or skilled you are.
10:10 "It's 10^27 universes, metric universes, across?"
Good that you clarified so we wouldn't confuse it with imperial universes, which are way smaller.
SOOOOOOOOOO, ....... at the end of the match,the inanimate match wins the match!!?!?!?!?!?!?!
That's MATCHLESS!!!!!!!!!!!!!!!!!!!!!
(& menace says ; "YOU'VE MET YOUR MATCH!!!!!!!")
hahaahaaaaa
For go, that is an insane number of boxes.
That was so much fun!
"If the first box runs out, it has learnt to resign on the first move, and that is Bad......" BUT Wargames taught us that is the correct move!
ruclips.net/video/6DGNZnfKYnU/видео.html
*only winning move, not necessarily the best
I remember doing this when I was a kid based on a Martin Gardner article. Thanks for the memories.
I did the same thing.
"This is what procreation must feel like." 😂😂😂 Wow, this is one of the saddest sentences I've ever heard!
I can remember, in 1964, we had a publication called "L’Album des jeunes" from the Reader's Digest Selection.
There was a similar game in it, called "Hexapion". It was very easy to play, even for a kid (like me), and needed only a few matchboxes.
The principle was the same, and it was very frustrating (for me) to fail to win against a couple of matchboxes.
At the time I thought that the way the "machine" was punished was not a real punishment, because this punishment was helping the machine to win... And on my side I had nothing to help...
Matt Parker for Doctor Who anyone?
Aashish Nehete yeeeeeeeeeeeeeesss.
10:05. One query. How many imperial universes does go need if it requires 27 metric universes of match boxes?