Appreciate all the people coming in with the ‘small sample size’ comments, there’s definitely something to be said for needing proper accuracy with the way the video is presented. At the end of the day, though, it’s just supposed to be an entertaining video. Unfortunately, with how the games work, it would take *far* too long to setup/record large enough sizes just to get edited down in a short video (that also isn’t being submitted as academic record). Also, shoutout to Sime for the beastly effort in putting this together!
Fun fact for those who do not already know this, If you roll 3 sevens on triple dice, you will get 50 free coins. I am not lying, I witnessed it in a fairly recent game I've played.
Some of the "luck" minigames have subtle tells, which is why those ones have a positive bias for higher CPU while others don't. The idea is that your subconscious notices them in a way that gives players an edge without trivializing the entire minigame.
The chance of winning a luck-based minigame against Easy CPUs is about 21/50, and the chance of winning one against Master CPUs is about 15/50.
For Bowser's Bogus Bingo, you're playing against Bowser, not the other 3 CPU players, who are ostensibly your teammates. It stands to reason that Easy difficulty would make them lose *fewer* hearts, as would an Easy Bowser.
Timestamps:
0:33 Master CPU Bowser’s Big Blast
2:35 Easy CPU Bowser’s Big Blast
4:04 Hide and Sneak Master CPU
5:33 Easy CPU Hide and Sneak
6:51 Bowser’s Bogus Bingo Master CPU
7:59 Easy CPU Bowser’s Bogus Bingo
8:53 Mecha Choice Master CPU
10:23 Mecha Choice Easy CPU
11:13 Cut From The Team Master CPU
12:58 Cut From The Team Easy CPU
I would point out that 10 obviously isn't a big enough sample size, but everyone else has already done that, so I want to point out something else instead: on Cut from the Team the chance of winning isn't really a straight 1/4. It'd be constantly changing based on the current state of the minigame. If you go first, the chance you lose is 3/10. The next person would be 3/9, then 3/8. When someone gets knocked out it changes from 3 to 2, then 2 to 1, so the odds of each player losing the game would go something like:
P1: 3/10
P2: 3/9
P3: 3/8 (loses)
P4: 2/7
P1: 2/6
P2: 2/5 (loses)
P4: 1/4
P1: 1/3
P4: 1/2
P1: 1/1 (loses)
I'm sure that in practice it probably will balance out to being a roughly 1/4 chance to win, but theoretically certain player slots have better odds of winning in general. I would try and figure out which player slot has the best odds of winning overall but I've got other things to do lmao
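For anyone who wants to take the slot question further, here is a minimal sketch that works out the win chance for each turn slot exactly. It only models what the comment describes: 10 ropes, 3 of them bad, players cutting in a fixed order, and a bad rope knocking the cutter out. It is not taken from the game's code, and `win_odds` is a made-up name.

```python
from fractions import Fraction
from functools import lru_cache

def win_odds(players=4, ropes=10):
    """Win chance per turn slot, assuming players - 1 of the ropes are bad."""
    if ropes < players - 1:
        raise ValueError("need at least players - 1 ropes")

    @lru_cache(maxsize=None)
    def solve(alive, turn, ropes_left):
        # alive: slots still in the game; turn: index into alive of who cuts next
        if len(alive) == 1:
            return tuple(Fraction(int(p == alive[0])) for p in range(players))
        bad_left = len(alive) - 1              # one bad rope per player still to fall
        p_bad = Fraction(bad_left, ropes_left)
        remaining = alive[:turn] + alive[turn + 1:]
        knocked_out = solve(remaining, turn % len(remaining), ropes_left - 1)
        if p_bad == 1:                         # only bad ropes are left
            return knocked_out
        safe = solve(alive, (turn + 1) % len(alive), ropes_left - 1)
        return tuple(p_bad * k + (1 - p_bad) * s
                     for k, s in zip(knocked_out, safe))

    return solve(tuple(range(players)), 0, ropes)

if __name__ == "__main__":
    for slot, odds in enumerate(win_odds(), start=1):
        print(f"P{slot}: {odds} = {float(odds):.3f}")
```

Running it prints one exact fraction per slot, so you can compare the slots directly. If the real minigame uses a different rope count or turn order, the numbers would change with it.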
Does this idea technically work when he wasn't doing anything random, but instead picking the same choice over and over only letting the NPCs make different choices?
@@Dkgow I'm not entirely sure what you're asking but even if he's picking the same choices each time it'd still be random whether or not he picks the right or wrong one, so the odds would be the same. Again it's more of a theoretical, I'm sure the AI is probably programmed to pick the ones that are wrong and there are definitely other factors at play as well. Ignoring that the game is going to have biases helping the player win though, the expected random outcome should be something like in my original comment (unless I've missed something)
There is a modicum of intelligence for the heroes in Bogus Bingo, in choosing cards that have lower probabilities of bingo, like ones where no individual enemy can fire you. Also, you maybe should have played as Bowser, since you would be in direct opposition to the CPUs like in the other games. Fun video nonetheless!
With the small sample size, you can't really say if a difference in the number of wins is truly significant. Mecha Choice for instance, I am not certain if that is just coming down to you having a good round of luck or not. Statistically for that one, the chance of surviving goes as (2/3)^n, which means with three rounds, you have a 29.6% chance of making it to the end. This is then further complicated by the possibility of ties and wins that happen sooner than three rounds of guessing, but purely in terms of not picking the wrong door, it's 29.6%. However, this doesn't mean you necessarily will see that reflected in 10 rounds of the game. You can flip a coin 10 times and by chance you can get nothing but heads in those 10 tosses. It's only as we continue the tosses that we see the law of large numbers come into effect and we see the percentages approach their true values.
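To put numbers on this, here is a small sketch that checks the (2/3)^n figure and shows how spread out the win count can be over only 10 games. It assumes three doors with one wrong door per round and ignores ties and early wins, as the comment does. `survive` and `wins_distribution` are names made up for the example.

```python
from fractions import Fraction
from math import comb

def survive(rounds, doors=3):
    """Chance of never picking the one wrong door in any round."""
    return Fraction(doors - 1, doors) ** rounds

def wins_distribution(games, p):
    """Binomial chance of exactly k wins, for k = 0..games."""
    return [comb(games, k) * p**k * (1 - p)**(games - k)
            for k in range(games + 1)]

if __name__ == "__main__":
    p = survive(3)
    print(p, float(p))                       # 8/27, about 0.296
    for k, chance in enumerate(wins_distribution(10, p)):
        print(f"{k} wins in 10 games: {float(chance):.3f}")
    # All heads in 10 fair coin flips:
    print(wins_distribution(10, Fraction(1, 2))[10])   # 1/1024
```

The printed table makes the point about small samples: several different win counts out of 10 are all fairly likely even when the true chance is fixed.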
For future experiments like this, it might be worth it to mark down 2nd, 3rd, and 4th place separately. As it stands now, this data only shows whether or not you win, but maybe the easy CPUs make it more likely for you to get higher places
i was so excited to see you playing mario party 8 minigames! i grew up playing that, and i would love to see you play some actual boards of that one! :D
I would say that the evidence lends itself more to the null hypothesis: that there is no explicit handicap (AKA the little Timmy effect). However, given the stronger correlation of Hide and Seek future studies can be performed to study if the CPUs collude at various difficulties to make less optimal decisions. Similar games like Look Away could give us insight into the decision making process the CPUs undergo.
You need to test more rounds...also may I recommend simply having easy vs master on 2v2/duels while easy/normal/hard/master for 4p or battle minigames and just seeing if after a large number of minigames if they are close to having the same # of wins
The genius of Bowser Blast is that all players have the same chance to win, although it seems that player 1, who in the first phase chooses among 5 possibilities, would have a higher chance of getting to phase 2 than player 4, who has only 2 possibilities.
If I remember right, if the controller is never pointed at the screen during Cut From The Team and you allow the timer to run out, the game will default to the most center option (nearest to starting cursor position). I remember NCS literally not doing anything in that minigame and won a couple times in TheRunawayGuys' playthrough for some reason.
The issue with this test is that the CPUs on master will intentionally make better choices on mini games like Mecha Choice and Bowser’s Bogus Bingo. So Falc intentionally not allowing himself to make optimal decisions was what led to easy CPUs losing more often than the Masters
For Bowser’s Bogus Bingo, wouldn’t it make sense to change his difficulty and see what his rolls are on Master and on Easy, and how that affects the team?
Just because something has 1 less of something else doesn't mean that it is automatically less/more lucky. That's like saying something is less dangerous than something else because 10 fewer people die per year from it 💀 Edit: also if u want an accurate statement, maybe use the same game?
What would improved luck even mean in the hide and seek minigame? In that setup it would just mean them being more likely to choose hiding spaces on the left, which from a game design perspective does not make sense as something that would improve odds on lower difficulty as there's nothing to suggest that players would pick left more often Unless the hypothesis would be that the CPUs actually don't pick a spot beforehand and the game just randomly decides after you pick a spot on whether or not they're there, which seems unlikely
I think most likely, for luck games, the options for what makes you lose are predetermined, and the game just directs easy cpus to pick the wrong ones more often. They don't really make a random choice, whether they choose correctly or incorrectly is decided for them
From the perspective of a game designer here: most of these games likely predetermine which lever, rope, or door will contain what value before the game even starts. So in browsers big blast the game predetermines that this round the green lever will explode, and so on and so forth.
Where this question can get interesting is if the computer characters are programmed to have access to this value and choose the wrong answer on purpose.
So it's more about how long the game lasts than how often you win. Like how many times do the computers actually make it to the end of the hall, or how many rounds of Big Blast do you have to redo.
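A rough sketch of the mechanism described here, purely as an illustration and not the game's actual code: assume one lever is chosen as the bad one before the round starts, and a CPU has some chance of pulling it on purpose. The names `pick_lever`, `blowup_probability` and `throw_chance` are all made up.

```python
import random
from fractions import Fraction

def pick_lever(num_levers, bad_lever, throw_chance, rng):
    """Hypothetical CPU pick: sometimes pull the known bad lever on purpose."""
    if rng.random() < throw_chance:
        return bad_lever               # the CPU "throws" the round
    return rng.randrange(num_levers)   # otherwise an honest uniform guess

def blowup_probability(num_levers, throw_chance):
    """Exact chance that this hypothetical CPU pulls the bad lever."""
    throw = Fraction(throw_chance)
    return throw + (1 - throw) * Fraction(1, num_levers)

if __name__ == "__main__":
    rng = random.Random(0)
    bad = rng.randrange(5)             # decided before the round starts
    print(pick_lever(5, bad, Fraction(1, 2), rng))
    print(blowup_probability(5, 0))               # 1/5, a fair guess
    print(blowup_probability(5, Fraction(1, 2)))  # 3/5, throws half the time
```

If something like `throw_chance` did depend on difficulty, it would show up as CPUs dropping out earlier, which is why counting how long rounds last could say more than counting wins.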
Exactly what I was thinking. And because of this, the ways he tested the mini games weren’t completely reliable. For example, the chance of it being the same hallway every time is actually very low. It doesn’t really say much about the game at all if you lose most of the time while picking that; it’s just math.
*Bowser’s
@@meebolover1777 But if the hallways are truly random, any sequence of door picks will have the same odds of survival. A-A-A is the same as 2-A-1; neither is special.
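The claim in this reply can be checked by brute force. This is a minimal sketch, assuming three doors per round, three rounds, and one bad door per round chosen uniformly and independently. None of that is confirmed from the game, and `survival_odds` is a made-up name.

```python
from fractions import Fraction
from itertools import product

DOORS = 3    # assumed doors per round
ROUNDS = 3   # assumed rounds per hallway

def survival_odds(picks):
    """Chance of surviving a fixed pick sequence over all equally likely hallways."""
    layouts = list(product(range(DOORS), repeat=ROUNDS))  # bad door for each round
    survived = sum(
        all(pick != bad for pick, bad in zip(picks, layout))
        for layout in layouts
    )
    return Fraction(survived, len(layouts))

if __name__ == "__main__":
    print(survival_odds((0, 0, 0)))   # same door every round
    print(survival_odds((1, 0, 2)))   # mixed picks
```

Both calls print the same fraction, so under these assumptions always picking the same door is no better or worse than mixing it up.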
Right, if the purpose is to find out whether the CPUs are throwing games, I would have liked at the very least to throw out any games where you immediately lose, as that's based on your luck, not the CPUs.
Also while I'm aware people don't just have unlimited time, I think 10 games was far too few, I'd prefer closer to 100.
@@TroyVan6654 that’s a fair point, I hadn’t thought of that but in the end that doesn’t really change the fact that how he tested it was flawed. Just as the first comment said. It would have been more insightful to look at how often the cpu picked wrong almost immediately, or how often the masters were able to make it to the end despite those odds. But this is hard to test without cutting the game short because you failed too. If all characters could be put on CPU, this would be a bit more workable.
“Peach chooses her own color, and therefore automatically loses.”
Meanwhile him: *chooses his color 20 times.*
There's a pink one, and Peach pushed the purple one, so it's actually not even her colour-
@@ultimatebrainrot4674 true, true, but it’s the closest one to her besides red.
Wait what pink one?
Yoshi is green
He was choosing white
@@RepostCollection 0:50 you sure about that?
@@sbf4605 and red is Mario’s color
i think that the little timmy effect would have an effect on PARTIAL luck based games more so than on fully luck ones, as it tries to get you to win even when you’re doing bad, and the little timmy effect also appears on the board, which is not accounted for here
This video was really, really well done. Concept pulled me in cause I was genuinely curious about the subject, and then you got all intellectual with the test pools and graphs. Good job with the script and editing too!
Nobody would know better than the king of randomizers himself!
Double check mark lol
This was great! And editing was 10/10! I do have one criticism about the methodology: For Bowser's Bogus Bingo, wouldn't you want to test against an Easy and a Master Bowser as well? As Bowser's part is also entirely random, I wonder if Bowser somehow does better (say, with fewer duplicate rolls) on Master compared to Easy. He's more of your "adversary" in that minigame than the other CPUs, though comparing against the CPUs was also important to experiment on.
I just played a round of bowsers bogus bingo while bowser was on master, he rolled 4 boos and 1 koopa 💀
He may be *your* adversary, but he's my best friend 🫂
I have 1 criticism on the methodology: the testing in this video is trash, you need to test so much more in order to actually prove if something is luck based, this whole video could be a fluke, it's not even very unlikely
@@atruepanda1782 The only way to really know is to look at the code
@@atruepanda1782 Yeah, he should have played each minigame *100 times*, both on "Easy" CPU difficulty and on "Master"/"Expert" CPU difficulty, so 200 times altogether for each minigame. [That way, it is theoretically possible to reach exactly 25 wins (the average number of wins if it truly was an equal chance of winning for any given player, CPU or human) out of 100 possible tries for each minigame.]
This is a really interesting idea. But I have to say as someone who does a lot of statistics, 10 games per category is so small a sample size that most of the results are probably not statistically significant.
The editing is pretty slick, tho.
Imo doing 100 would show alot better results but ofc it's kind of unreasonable to ask for 100
Maybe something slightly bigger like 20? Or add the other difficulties?
@@gutsFunnyman I want to preface this by saying that I'm not a statistics person. 20 is still insignificant. If you want statistical accuracy, you're looking at 1000s to even begin to get accuracy. The higher the number, the more reliable the result. And when I say higher, I mean the more 0s we add.
I will say, the power of the test (how likely you are to obtain a statistically significant result) depends on the true effect size (how different the odds of winning are under different difficulties).
For example,
25-75 vs 40-60 (100 games apiece, 25% vs 40%) returns a p-value of .03414, whereas
25-75 vs 30-70 (100 games apiece, 25% vs 30%) returns a p-value of .5267.
Double the sample size, and
50-150 vs 80-120 (200 games apiece, 25% vs 40%) gives a p-value of .001906, whereas
50-150 vs 60-140 (200 games apiece, 25% vs 30%) gives a p-value of .3135.
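Here is a small sketch for reproducing comparisons like these with only the Python standard library. The comment doesn't say which test produced its p-values, so this assumes a 2x2 chi-square test with Yates' continuity correction. It should land close to the quoted figures, but the last digits can differ depending on the test used. `yates_p_value` is a made-up name.

```python
import math

def yates_p_value(wins_a, games_a, wins_b, games_b):
    """Two-sided p-value from a 2x2 chi-square test with Yates' correction."""
    a, b = wins_a, games_a - wins_a
    c, d = wins_b, games_b - wins_b
    n = a + b + c + d
    diff = max(abs(a * d - b * c) - n / 2, 0)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

if __name__ == "__main__":
    for wins_a, wins_b, games in [(25, 40, 100), (25, 30, 100),
                                  (50, 80, 200), (50, 60, 200)]:
        p = yates_p_value(wins_a, games, wins_b, games)
        print(f"{wins_a}/{games} vs {wins_b}/{games}: p = {p:.4f}")
```

The pattern is the point: at the same sample size the 25% vs 40% gap clears the usual 5% bar while 25% vs 30% does not, and doubling the games helps the big gap far more than the small one.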
@@RainbowDashShadesOfApproval Depending on effect size, 100 could work. You would only need 1000s if the effect size was very small but real and you needed to tease that out.
To be fair: 10 is not the best amount to usually use. There is this phenomenon in probability that the more you repeat the experiment the more accurate the result becomes.
Tho I can totally understand just going with 10. Other numbers would just take too long.
Yeah it's why 100 is usually used for tests.
@@wolfwarrior1176 40 is the minimum we use in things like this, but that's per group type; this video would only be the control group tbh
That's not exclusive to probability. It's just a reality of science and statistics.
@@ayajade6683 tbh probably can use 25 to use N over t dists
@@turtlecat3507 But these are proportions. You want to go with the expected value of np and n(1-p) both being greater than 10, which assuming a 1/4 win chance (which might not hold in games that can have multiple winners, but still), you'd want to use at least 40. Higher would be better for more significance, but 40 would be the minimum I'd consider here.
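The rule of thumb in this reply is easy to turn into a quick check. This sketch reads "greater than 10" as "at least 10", which is what gives the 40 quoted in the comment, and `min_games` is a made-up helper name.

```python
import math
from fractions import Fraction

def min_games(p, threshold=10):
    """Smallest n with n*p >= threshold and n*(1 - p) >= threshold."""
    p = Fraction(p)
    return max(math.ceil(threshold / p), math.ceil(threshold / (1 - p)))

if __name__ == "__main__":
    print(min_games(Fraction(1, 4)))   # 40, for a 1-in-4 win chance
    print(min_games(Fraction(1, 2)))   # 20, for a coin flip
```

The rarer outcome sets the requirement, so a game where the win chance is further from 1/2 needs more games per difficulty before the usual approximations are reasonable.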
This was a fun test! I think we can agree that this doesn’t prove a lot, but it seems to be that the chances are quite equal or at least very close to equal.
So to all you saying he needs to do more research: yes, 10 is not a large enough sample to draw any solid conclusions from. However, across all 5 games he never did worse on easy than expert, and with the variation I would expect, especially with such a small sample size, I would have expected him to do worse on easy than normal at least once across the games if his theory were false.
More research definitely needs to be conducted; however, there is definitely merit to the hypothesis.
I would also like to see, rather than 1st place or bust, how many of the loss games you got 2nd, 3rd, or last, and if that data would change anything....
Interesting video! If you do more with this I would like to see deeper dives into a single game, since I don't think it makes much sense to average different games together. For example, maybe you could save and reload one turn over and over to see how it plays out, or compare how often things spawn near you in games like Dizzy Dancing.
In Bogus Bingo, I don't think it makes much sense to compare "winning" or "losing" vs the CPUs, since you're all on the same team. It would probably be more interesting to play as Bowser and compare how many hearts they lose on Easy vs on Expert.
Fun vid! A lot of this is hard to prove without actually looking at the game's code, but it's cool to see someone experimenting.
Yes! Fully agree with everything in your comment! I wanted to recommend the mario party analysis series by ZoomZike on YouTube, he dives really deep into the mario party games, currently only covered 1 through 5
When considering games such as hide and seek, changing AI is probably just affecting things like grouping chance, which is the chance for the CPU to group up in one hiding spot.
I had a theory about hide and seek when I was a kid. When the curtains are closing you can see where the individuals are going, and they have an opportunity for ONE more input to mix it up. My theory is that on easy mode the CPUs were more likely to simply end up where they were headed when the curtains closed (if it appeared they were going for the mushroom house at the end, they probably were), while the Extreme CPUs were more likely to input an extreme juke and go to a completely different set piece than it appeared they were going to at curtain close.
You really should have kept track of placement, not just wins. You want the average placement in the average match to see anything of note. Looking only at the extremes says nothing. Remember, second place also gets coins, and third sometimes gets a pity coin. Only 4th is a true loss
That only applies to battle minigames.
Despite the small sample size, I think this is the best edited video that RedFalcon has EVER put out; there’s no doubt about that. Great video Falc!!!!!
I'm coming back to work after a long illness. It's very hard. Tired. Anxious. ... It's night and I'm feeling so bad. Then there's your video. Thank you. Love your editing btw
Great video. However, I do just want to remind people of the fallacy of small numbers. Sample size is way too small to make any real conclusions.
A collab with game theory with this would be amazing. We know Matpat can get huge numbers with his followers when it comes to surveys, so why not do it again with this experiment. Fantastic concept, but like others I agree more trials would help and this could possibly fix that concern. Just because I find it fun, I say random number generator to make decisions
8:22
I love how he says "it's a boo!"
DUDE THE FREAKING EDITING ON THIS EPISODE IS INSANE
SIME YOU BEAST
In Mario Party DS there is a minigame called Cheep Cheep Chance
That minigame is AWFUL and is completely based in luck with RNG
It's unfair in few words
Yeah, I would always die first when I played with my little brother for some reason 😅
Also, it gives different players different chances of winning.
Savestates: We Gotcha Covered. Just Use In The Instructions Screen Before Playing. Oh. Also Hit X To Enter Practice Mode. The RNG Actually Changes So You Can Maybe Win With Enough RNG Manipulations Via Practice Mode.
For bowser’s bingo, I feel like it would have made more sense to either change Bowser’s difficulty or to play as bowser himself and see if it was easier to best the easier CPUs vs the Masters. Or both actually! I would be interested in seeing what would happen in both of those scenarios :) Regardless, this is such a cool little experiment!
13:02 & 13:09
Mario & Luigi: AAAAAA-
Peach: Ow!
Yoshi: AIAIAIAI AIAIAIAI
13:14
Mario, Luigi & Peach: *Scream*
the editing in this one was really good!!!
Editing was clearly made by a master though!
Cut from the team is the most random luck based minigame with no indicator or difference in each string cut. I love it and hate it
Your editor really outdid themselves with this video!
I'm a math major, and I'm telling you that you might actually have enough evidence to conclude that you have better luck playing with easy cpus than with hard ones. You won 21 games against easy cpus and 15 games against hard cpus. According to my calculations, if the difficulty level didn't affect your luck, then there would be a 0.57% chance that you would win at least six more games against easy cpus than against hard ones. Generally, in statistics, we say there's sufficient evidence if that number is less than 5%, so this does seem to support the Little Timmy theory.
There are a few caveats. One is that Bowser's Bogus Bingo isn't entirely luck based because there is some skill involved in picking your card, but that was actually one of the games where you won the same number against each difficulty, so I don't think that's an issue. Another is that my calculations assumed that you have a 25% chance of winning each minigame, which is not the case for Hide and Seek, and it's also not the case in Mecha Choice because it's possible for multiple players to win. However, I don't think the results of the calculations would've been much different if I had accounted for those things. Finally, the fact that you had a pattern for the choices you made in each game might suggest that those particular choices are just more likely to be the right choice when the cpus are on easier levels, but I don't think that's likely. I think you have a strong case.
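For anyone who wants to check a figure like this, here is a minimal sketch of one way to compute it. It assumes 50 games per difficulty (five minigames at 10 games each) and the flat 25% win chance per game named in the comment, with the two win counts independent. The comment doesn't show its own method, so this may not be the same calculation, and the function names are made up.

```python
from math import comb

def binom_pmf(n, p):
    """Chance of exactly k wins out of n games, for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def prob_lead_at_least(lead, n=50, p=0.25):
    """Chance that one win count beats the other by at least `lead`."""
    pmf = binom_pmf(n, p)
    return sum(pmf[x] * pmf[y]
               for x in range(n + 1)
               for y in range(n + 1)
               if x - y >= lead)

if __name__ == "__main__":
    print(f"{prob_lead_at_least(6):.4f}")
```

Under these assumptions the result comes out at very roughly one in ten rather than 0.57%, so it may be worth rechecking which calculation produced that number before leaning on the 5% threshold.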
I'm a statistics Master's holder and I can conclusively say we can't conclude anything based on this, at least not without some very nuanced statistical modeling. The mere fact that it's possible for both the player and a CPU to win is enough to complicate things. Plus p-values are overhyped and 5% is an arbitrary cutoff (that's rarely used in practice).
aside from 10 being a small sample (for an understandable reason), it would be good to differentiate between the different places instead of a binary 1st or not 1st
8:42-8:44 When I got the joke the first time watching this 😂
Mecha Choice is apparently one of, if not the hardest Luck based minigame in Mario Party
Something something sample size something something.
Was a fun watch. Generally suggests that luck based games are luck based.
Easier CPUs tend to be dumber and tend to pick mostly the wrong answers, while Master CPUs are smarter and always know which is the correct one, so they have a smaller chance of picking the wrong one
RedFalcon: There's no luck-based Minigames in Super Mario Party...
Don't Wake Wiggler: ._.
Technically it's not complete luck since if you don't do anything you automatically lose.
@@JosiahRobert14 I guess you have a --> *.*
Wow, this is such an interesting video. I've been wondering this for a long time! I have a question, what would happen if everyone tried to tie in mario party? (In an actual board!)
It would be a tie
sime went OFF with the editing on this one bro . splendid content once again!
6:41 don’t wake wiggler!: am I a joke to you?
This would be interesting if the trials were played 100 or even 1000 times, you'd maybe have a more accurate result that way anyways
Appreciate all the people coming in with the ‘small sample size’ comments, there’s definitely something to be said for needing proper accuracy with the way the video is presented.
At the end of the day, though, it’s just supposed to be an entertaining video. Unfortunately, with how the games work, it would take *far* too long to set up/record large enough sample sizes just to get edited down in a short video (that also isn’t being submitted as academic record).
Also, shoutout to Sime for the beastly effort in putting this together!
Fun fact for those who do not already know this,
If you roll 3 sevens on triple dice, you will get 50 free coins. I am not lying, I witnessed it in a fairly recent game I've played.
It’s so funny that whenever he loses, he’s just like “YESS I LOST”
Beautiful.
And then for just ONE episode, he’s actually happy that he’s winning
Some of the "luck" minigames have subtle tells, which is why those ones have a positive bias for higher CPU while others don't.
The idea is that your subconscious notices them in a way that gives players an edge without trivializing the entire minigame.
The chance of winning a luck based minigame against easy cpus is about 21/50 (42%). And the chance of winning a luck based minigame against Master cpus is about 15/50 (30%).
Love how you gave Bowser in the thumbnail an "angry" eyebrow, when he already has an angry eyebrow 🤣
I really enjoyed the narration-style format of this video.
0:19
Mario with four eyebrows does not exist, he can't hurt you
Mario with four eyebrows:
The editing in this video was great!
For Bowser's Bogus Bingo, you're playing against Bowser, not the other 3 CPU players, who are ostensibly your teammates. It stands to reason that Easy difficulty would make them lose *fewer* hearts, as would an Easy Bowser.
Timestamps:
0:33 Master CPU Bowser’s Big Blast
2:35 Easy CPU Bowser’s Big Blast
4:04 Hide and Sneak Master CPU
5:33 Easy CPU Hide and Sneak
6:51 Bowser’s Bogus Bingo Master CPU
7:59 Easy CPU Bowser’s Bogus Bingo
8:53 Mecha Choice Master CPU
10:23 Mecha Choice Easy CPU
11:13 Cut From The Team Master CPU
12:58 Cut From The Team Easy CPU
6:50 The cpus can choose options, so it really isn't full luck
Epic editing throughout really made it appealing and interesting
Wow that's some great editing!
10 rounds seems statistically insignificant. I don't believe you could be certain in your results with such a small sample size.
7:59 I like sync
Wow, nice editing on this video!
I would point out that 10 obviously isn't a big enough sample size but everyone else has already done that, so I want to point out something else instead:
On Cut from the Team the chance of winning isn't really a straight 1/4. It'd be constantly changing based on the current state of the minigame - if you go first, the chance you lose is 3/10. The next person would be 3/9, then 3/8. When someone gets knocked out it changes from 3 to 2, then 2 to 1, so the odds of each player losing the game would go something like:
P1: 3/10
P2: 3/9
P3: 3/8 (loses)
P4: 2/7
P1: 2/6
P2: 2/5 (loses)
P4: 1/4
P1: 1/3
P4: 1/2
P1: 1/1 (loses)
I'm sure that in practice it probably will balance out to being a roughly 1/4 chance to win, but theoretically certain player slots have better odds of winning in general. I would try and figure out which player slot has the best odds of winning overall but I've got other things to do lmao
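For anyone who does want to work out the slot odds, here is a rough sketch. It assumes the model from the list above: 10 ropes, 3 of them losing, four players cutting in a fixed order, knocked-out players skipped, and every rope equally likely to be a losing one. Under those assumptions the 3 losing cuts fall on a uniformly random set of positions in the cutting order, so all 120 layouts can be enumerated exactly. The function names are invented for illustration, and this is not a claim about how the game itself decides anything.

```python
from fractions import Fraction
from itertools import combinations

def winner(bad_cuts, players=4, ropes=10):
    """Return the slot (0 = P1) left standing when the cuts listed in
    bad_cuts (0-based positions in cutting order) are the losing ropes."""
    alive = list(range(players))
    turn = 0  # index into `alive` of the player cutting next
    for cut in range(ropes):
        if cut in bad_cuts:
            alive.pop(turn)          # cutter is knocked out
            if len(alive) == 1:
                return alive[0]
            turn %= len(alive)       # next player slides into this index
        else:
            turn = (turn + 1) % len(alive)
    raise ValueError("need exactly players - 1 losing ropes")

def slot_win_odds(players=4, ropes=10):
    """Exact win probability per starting slot, over all equally likely layouts."""
    wins = {slot: 0 for slot in range(players)}
    layouts = list(combinations(range(ropes), players - 1))
    for bad in layouts:
        wins[winner(set(bad), players, ropes)] += 1
    return {slot: Fraction(n, len(layouts)) for slot, n in wins.items()}

if __name__ == "__main__":
    for slot, odds in slot_win_odds().items():
        print(f"P{slot + 1}: {odds} ({float(odds):.1%})")
```

The example sequence in the list (P3, then P2, then P1 knocked out) corresponds to `winner({2, 5, 9})`, which leaves P4 standing.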
Does this idea technically work when he wasn't doing anything random, but instead picking the same choice over and over only letting the NPCs make different choices?
@@Dkgow I'm not entirely sure what you're asking but even if he's picking the same choices each time it'd still be random whether or not he picks the right or wrong one, so the odds would be the same. Again it's more of a theoretical, I'm sure the AI is probably programmed to pick the ones that are wrong and there are definitely other factors at play as well. Ignoring that the game is going to have biases helping the player win though, the expected random outcome should be something like in my original comment (unless I've missed something)
Why am I thinking about doing my linear regression project on whether or not cpu difficulty affects “random” mini games in Mario Party
Nice editing! 🔥
There is a modicum of intelligence for the heroes in Bogus Bingo, in choosing cards that have lower probabilities of bingo, like ones where no individual enemy can fire you. Also, you maybe should have played as Bowser, since you would be in direct opposition to the CPUs like in the other games. Fun video nonetheless!
With the small sample size, you can't really say if a difference in the number of wins is truly significant. Mecha Choice for instance, I am not certain if that is just coming down to you having a good round of luck or not. Statistically for that one, the chance of surviving goes as (2/3)^n, which means with three rounds, you have a 29.6% chance of making it to the end. This is then further complicated with the possibility of ties and wins that happen sooner than three rounds of guessing, but with just pure not getting the wrong door, 29.6%. However, this doesn't mean you necessarily will see that reflected in 10 rounds of the game. You can flip a coin 10 times and by chance you can get nothing but heads in those 10 tosses. It's only as we continue the tosses that we see the law of large numbers come into effect and we see the percentages approach their true values.
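To put numbers on that, here is a small sketch. It assumes the simplified model above (three rounds, two safe doors out of three each round, ties and early wins ignored), and the helper names are just for illustration. It shows the 8/27 survival chance and how widely the win count can spread over only 10 games.

```python
from fractions import Fraction
from math import comb

def survival_chance(rounds, safe=2, doors=3):
    """Chance of never picking the wrong door in `rounds` independent rounds."""
    return Fraction(safe, doors) ** rounds

def wins_distribution(trials, p):
    """Binomial probabilities of getting 0..trials wins at win chance p."""
    return [comb(trials, k) * p**k * (1 - p)**(trials - k)
            for k in range(trials + 1)]

if __name__ == "__main__":
    p = survival_chance(3)                      # 8/27, about 29.6%
    print(f"survive 3 rounds: {p} = {float(p):.1%}")
    for wins, prob in enumerate(wins_distribution(10, float(p))):
        print(f"{wins:2d} wins in 10 games: {prob:.3f}")
    # The coin example: 10 heads in a row is rare but not impossible.
    print(f"10 heads in 10 flips: {Fraction(1, 2) ** 10}")
```

Raising the `10` in `wins_distribution` to 100 or 1000 narrows the spread around 29.6%, which is the law of large numbers point made above.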
For future experiments like this, it might be worth it to mark down 2nd, 3rd, and 4th place separately. As it stands now, this data only shows whether or not you win, but maybe the easy CPUs make it more likely for you to get higher places
I like the added theory at the end to wrap up the video on a peak.
This editing is soooo good!
Quality Editing On This One
Randomly stumbled upon your vids a few days ago and you are officially one of my favourite YouTubers 🤣
the scoreboard always makes the video more entertaining
Hide and Sneak at 4:00 is not 100% luck because you can see where the players go to when they press the button
Wow what a great idea for a video! Loved all 15 minutes!
You didnt even watch the full video
Why do people comment "Great video!" when the video just released?
When you say great video when it’s not even possible to watch it fully by the time you commented
Yall are right i didnt watch the whole video.. yall caught me. I usually skip the intro where he explains...
0:16 good bye Yoshi have nice trip from the mountain🏔 XD
Kudos to the editor, they did an amazing job :)
this couldve been a really confusing video if not for the great editing!! amazing job
Great editing
i was so excited to see you playing mario party 8 minigames! i grew up playing that, and i would love to see you play some actual boards of that one! :D
I would say that the evidence lends itself more to the null hypothesis: that there is no explicit handicap (AKA the little Timmy effect).
However, given the stronger correlation of Hide and Seek, future studies can be performed to study if the CPUs collude at various difficulties to make less optimal decisions. Similar games like Look Away could give us insight into the decision making process the CPUs undergo.
The editing is amazing! Thank you
8:35 i think there was no change because Bowser was on the same difficulty
10:42 it is rigged to have more chance to pick the unpicked ones maybe so picking the same thing is worse or sth-
At least in expert it does that
You need to test more rounds... also, may I recommend simply having easy vs master on 2v2/duels, and easy/normal/hard/master for 4p or battle minigames, and just seeing whether, after a large number of minigames, they are close to having the same # of wins
Continuing (305) to tell Falc that we all love him and respect him.
Easy Mario Party, since when?
I feel like to really get a good sense if there's any difference, you would need to play each game a couple hundred times on both difficulties.
Fans : Wow! This is a really good idea for a video!
Falc : *uses as an excuse to play mario party on easy*
Woah I never got that Mecha Choice pun either and I played that game quite a lot
Editing on point this time 👌
why is the editing so GOOD
The genius of Bowser's Big Blast is that all players have the same chance to win, even though it seems that player 1, who picks from 5 possibilities in the first phase, would have a higher chance to get to phase 2 than player 4, who has only 2 possibilities.
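The equal odds can be checked with a few lines. This is a sketch under one assumption: a phase has 5 levers with exactly one explosive, players pull in turn, and a pulled lever can't be pulled again. The function name is made up for illustration.

```python
from fractions import Fraction

def blast_odds(levers=5, players=4):
    """Chance that each player, in turn order, pulls the explosive lever,
    plus the chance that nobody does in this phase."""
    odds = []
    still_safe = Fraction(1)              # chance the bomb hasn't gone off yet
    for turn in range(players):
        remaining = levers - turn         # levers left when this player picks
        odds.append(still_safe * Fraction(1, remaining))
        still_safe *= Fraction(remaining - 1, remaining)
    return odds, still_safe

if __name__ == "__main__":
    per_player, nobody = blast_odds()
    for i, p in enumerate(per_player, start=1):
        print(f"P{i} blows up: {p}")
    print(f"nobody blows up: {nobody}")
```

Player 4 picks from only 2 levers, but only reaches that pick if the first three all survived, and the two effects cancel out to 1/5 for everyone.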
6:36 The chance to win for the team of 3 is anywhere between 25% to 75% depending on how well they play (I did the math)
Luigi tried to Death Stare that Chain Chomp. Unfortunately for him, it wasn't effective, and so he took it like a champ. XD
if the AI does know during luck based minigames...
we need a VS dream mode.
love how you edited this
If I remember right, if the controller is never pointed at the screen during Cut From The Team and you allow the timer to run out, the game will default to the most center option (nearest to starting cursor position). I remember NCS literally not doing anything in that minigame and won a couple times in TheRunawayGuys' playthrough for some reason.
I guess for Bowser's minigame, you should change Bowser's CPU level, because they're on your team, or not?
pov:
falcon: and now we’re doing easy cpu’s
peach: ah!
mario: oh!
luigi: WA-!
The issue with this test is that the CPUs on master will intentionally make better choices on minigames like Mecha Choice and Bowser’s Bogus Bingo. So Falc intentionally not allowing himself to make optimal decisions was what led to easy CPUs losing more often than the Masters
For Bowser’s Bogus Bingo, wouldn’t it make sense to change his difficulty and see what his rolls are if he was master and easy, and see how it affects the team
Just because something has 1 less of something else doesn't mean that it is automatically less/more lucky. That's like saying something is less dangerous than something else because 10 fewer people die per year from it 💀
Edit: also if u want an accurate statement, maybe use the same game?
No offense, but just using 10 repetitions on an event with 4 possible outcomes doesn't really give a result with much statistical significance
What would improved luck even mean in the hide and seek minigame? In that setup it would just mean them being more likely to choose hiding spaces on the left, which from a game design perspective does not make sense as something that would improve odds on lower difficulty as there's nothing to suggest that players would pick left more often
Unless the hypothesis would be that the CPUs actually don't pick a spot beforehand and the game just randomly decides after you pick a spot on whether or not they're there, which seems unlikely
I think most likely, for luck games, the options for what makes you lose are predetermined, and the game just directs easy cpus to pick the wrong ones more often. They don't really make a random choice, whether they choose correctly or incorrectly is decided for them
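To make that theory concrete, here is a purely hypothetical sketch of what such a rule could look like. Nothing here comes from the actual game: `cpu_pick`, `blunder_chance`, and the per-difficulty values are all invented for illustration.

```python
import random

def cpu_pick(options, losing_option, blunder_chance, rng=random):
    """Hypothetical CPU choice: the losing option is decided beforehand, and
    with probability `blunder_chance` the CPU is steered straight into it.
    Otherwise it picks uniformly at random among what is left."""
    if losing_option in options and rng.random() < blunder_chance:
        return losing_option
    return rng.choice(options)

def blunder_rate(blunder_chance, trials=10_000, seed=0):
    """Estimate how often a lone CPU picks the losing lever out of five."""
    rng = random.Random(seed)
    levers = ["red", "green", "blue", "yellow", "pink"]
    hits = 0
    for _ in range(trials):
        losing = rng.choice(levers)       # predetermined before anyone picks
        hits += cpu_pick(levers, losing, blunder_chance, rng) == losing
    return hits / trials

if __name__ == "__main__":
    # Made-up difficulty settings, only to show the effect of the knob.
    for label, chance in [("fair", 0.0), ("easy-ish", 0.3)]:
        print(label, blunder_rate(chance))
```

If something like this were in place, the player's own pick would stay fully random while the CPUs' failure rate shifted with difficulty, which is the pattern the theory predicts.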
For Bowser’s Bogus Bingo, I think the luck with bingos would only be logically changed based on Bowser’s difficulty level, and not the CPUs’.
Cut from the Team is a battle minigame, and one of the main piss offs of 8.
If I see my partner die, the video dies - Super Mario maker 2