- Videos: 4
- Views: 56,491
Polymorphia
Added Dec 9, 2024
My thoughts and feelings.
The Paradox Of Self-Locating Probabilities
please drive safely!
Music (in order of appearance)
"Dewdrop Fantasy" Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 4.0 License
creativecommons.org/licenses/by/4.0/
"Mellowtron" Kevin MacLeod (incompetech.com)
Licensed under Creative Commons: By Attribution 4.0 License
creativecommons.org/licenses/by/4.0/
References:
www.lesswrong.com/posts/GfHdNfqxe3cSCfpHL/the-absent-minded-driver
www.sleepingbeautyproblem.com/about-the-absent-minded-driver/
Piccione Rubinstein Original Paper: arielrubinstein.tau.ac.il/papers/53.pdf
Aumann Hart Perry Original Paper: www.ma.huji.ac.il/raumann/pdf/Minded%20Driver.pdf
www.lesswrong.com/posts/5bd75cc58225bf06703751e6/can-we-hybridize-abs...
Views: 2,380
Videos
How RNG Solves The Inverse Prisoner's Dilemma
16K views · 11 days ago
25%... But why...? Link to problem: fivethirtyeight.com/features/can-you-flip-your-way-to-freedom/ Music: "Ashton Manor" Kevin MacLeod (incompetech.com) Licensed under Creative Commons: By Attribution 4.0 License creativecommons.org/licenses/by/4.0/ "That Zen Moment " Kevin MacLeod (incompetech.com) Licensed under Creative Commons: By Attribution 4.0 License creativecommons.org/licenses/by/4.0/...
Turning Unfair Coin Flips into a Computer
956 views · 17 days ago
A "practical" thought experiment Music (in order of appearance): "Easy Lemon (30 second)" Kevin MacLeod (incompetech.com) Licensed under Creative Commons: By Attribution 4.0 License "Leaving Home" Kevin MacLeod (incompetech.com) Licensed under Creative Commons: By Attribution 4.0 License "Stay the Course" Kevin MacLeod (incompetech.com) Licensed under Creative Commons: By Attribution 4.0 Licens...
The Mathematically Best Way To Play Mafia
40K views · 22 days ago
tHe LaW oF tOtAl pRoBaBiLiTy Music: "Deep Relaxation" Kevin MacLeod (incompetech.com) Licensed under Creative Commons: By Attribution 4.0 License
I suspect anthropic puzzles have a really useful real-life equivalent in software development, because what they effectively boil down to is choosing the best stateless policy. I say "I suspect" because, if that were true, it would be really easy to test with random trials and see the initially minor gap in probability become noticeable pretty quickly. For a developer, the action-optimal line of reasoning might be the more intuitive choice, as that would be the point where the code actually runs: some other logic arrives at an intersection, which is some gap in the stateful chain of logic. It then calls this policy function. So this policy function at the very least knows it is at an intersection every time it is called. Yet, according to this video, the policy should still be derived from the ex ante point of view and be planning-optimal, thus accounting for all following calls to the same stateless function when choosing a policy. Obviously, in the vast majority of cases, even in embedded systems, it would be trivial to record how many times the policy function is called. It could just be stateful. But exceptions also exist: what if we don't know when it is meaningful to reset this counter? Having this policy stateless and probabilistic is desirable as a last resort.
I didn't really understand why the action-optimal approach didn't work. To be fair, I didn't get the equation for it either 😢
It would be really interesting to see simulations that show what each approach is thinking: I think a simulation will show the planning-optimized agent getting the higher average payoff, of 1.33. Because in the story, and in the simulation, the driver gets his reward only at the end when he leaves the road. The average payoff is an average over complete trials of the game. The action-optimized agent is maximizing a different average: over observations. If we counted each decision as a separate game, I think we’d see this agent win, with average payoff 1.67. This agent is assuming it gets paid right away, which is wrong given the story.
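The two averages described in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the payoffs (0, 4, 1) are taken from the formulas quoted elsewhere in this thread, `p_continue` is the probability of driving on at any intersection, and the "average over observations" is interpreted here as weighting each drive's final payoff by the number of intersections visited. The names `run_trial` and `two_averages` are made up for this sketch.

```python
import random

# Payoffs assumed from the thread: exit at the first intersection -> 0,
# exit at the second -> 4, drive past both -> 1.
def run_trial(p_continue, rng):
    """Play one complete drive; return (payoff, intersections_visited)."""
    if rng.random() >= p_continue:
        return 0, 1   # exited at the first intersection
    if rng.random() >= p_continue:
        return 4, 2   # continued once, then exited at the second
    return 1, 2       # continued at both intersections

def two_averages(p_continue, trials=200_000, seed=0):
    """Return (average payoff per drive, average payoff per intersection visit)."""
    rng = random.Random(seed)
    total_payoff = 0
    visit_weighted_payoff = 0
    total_visits = 0
    for _ in range(trials):
        payoff, visits = run_trial(p_continue, rng)
        total_payoff += payoff                    # counted once per drive
        visit_weighted_payoff += payoff * visits  # counted once per visit
        total_visits += visits
    return total_payoff / trials, visit_weighted_payoff / total_visits

if __name__ == "__main__":
    per_drive, per_visit = two_averages(2 / 3)
    print(f"per drive: {per_drive:.3f}, per visit: {per_visit:.3f}")
```

With p = 2/3 the per-drive average should land near 4p - 3p^2 = 4/3, while the per-visit average should land near 2p(4 - 3p)/(1 + p) = 1.6. The driver is only paid once per drive, so the per-drive number is the one that matches the story.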
So what should Dave do? Pull over, put his hazards on, and take a nap.
Also, next city council meeting, he should bring up the road that drives directly off a cliff.
But then he falls prey to the sleeping beauty problem where he doesn't know what coin was flipped before he woke up.
well, I tried to understand it, but I'm just not good enough at statistics.. as for the sleeping beauty problem.. I felt that it doesn't have a clear answer for a non-mathematical reason: the phrasing is ambiguous.
Okay, I've been looking over it and I feel like the action-optimized formulation is mathematically identical to the planning-optimized formulation. Further subdividing the solution space shouldn't affect the final answer, but it does in this case, and I'm struggling to understand how that's actually happening. Like, I can follow the math, but since the optimization fails there must be some logical error. Maybe I just don't get Bayesian statistics at all, but the argument provided here fails to really convince me that "when I arrive at an intersection it has some probability of being each one" is an invalid claim to make, because, like, that has to be true?
Hmmm, I can understand the struggle of intuition, but I'm not sure where it is mathematically correct. I'm just copying from my script here, but see if you can find a math error (otherwise, I'm not sure they're identical).
Action Optimal: alpha * (1 * p^2 + 4 * (1-p)*p + 0 * p) + (1 - alpha) * (4 *(1-p) + 1 * p), so alpha = 1/(1+p) means we get
1/(1+p) * (1 * p^2 + 4 * (1-p)*p + 0 * p) + (p/(1+p)) * (4 *(1-p) + 1 * p) = (2 p (4 - 3 p))/(p + 1)
Planning Optimal: 1 * p^2 + 4 * (1-p)*p + 0 * p = 4p-3p^2
I'm not exactly sure how this can be identical. In fact, if we set these values equal, the solution is p = 0, 1, 4/3, none of which really make any sense if the equations were correctly identical, so it is beyond the optimization.
I guess here's my best attempt to answer your question about intuition. Basically, it definitely is true that you're at an intersection if you see a fork in the road. The key reason why Bayesian statistics works is because the information you collect UPDATES your priors. However, in an absent-minded scenario, you can't update your priors ever. So yes, you know you're at AN intersection, but crazy enough the probability you are at a SPECIFIC intersection somehow becomes invalid because you cannot ascribe value to these states. In decision theory, you usually represent decisions via trees (which makes sense, i.e. we do A or B; in this case we're at X or not X). However, this would imply that there are two nodes, each with different actions and equilibria. In this case, there is literally just one node in the tree, just being at "some" intersection, so there is no Bayesian update.
Hope this slightly clarifies things. Let me know if you want me to try another one though! Hope you liked the video too :)
@@Poly_morphia I'm very tired, so maybe my wires just aren't connecting right, but does (1 - alpha) actually become (p/(1+p)) when you substitute 1/(1+p) for alpha in the simplification of the action optimal equation? Shouldn't it be (1 - 1/(1+p)), not (p/(1+p))?
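For anyone following the algebra in this exchange, here is a short worked check, using only the expressions quoted in the reply above. The two forms of the second weight are the same quantity, and the sum then collapses to the closed form given in the reply:

```latex
\begin{align*}
1 - \alpha &= 1 - \frac{1}{1+p} = \frac{(1+p) - 1}{1+p} = \frac{p}{1+p}, \\
\frac{1}{1+p}\,(4p - 3p^2) + \frac{p}{1+p}\,(4 - 3p)
  &= \frac{(4p - 3p^2) + (4p - 3p^2)}{1+p}
   = \frac{2p\,(4 - 3p)}{1+p}.
\end{align*}
```

Here 4p - 3p^2 is the first bracket, 1 * p^2 + 4 * (1-p)*p, after expanding, and 4 - 3p is the second bracket, 4 * (1-p) + 1 * p.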
congrats you got me to click on and watch a statistics video hopefully I will never let this happen again but this is an achievement for you
Nooooo stat is really cool I promise plus you gotta at least watch the veritasium video linked in the description this stuff messes with your mind (in a good way, I think). Thanks for the support though!
@@Poly_morphia yeah, but I just want to do combinatorics because uncertainty gives me the heebie jeebies, I’m fine just being oh so close to probability, as long as they don’t make me sum my nCrs
2:23 why does +0 have a (1-P)^2 utility? Shouldn't it be (1-P)? It's the probability of not proceeding at the first intersection, no?
TRUE. Thank god the math cancels the expectation out anyway I was debating whether to include that part at all but I guess in hindsight I really just shouldn't have. Sorry about that. Hope it didn't detract too much from the video.
😎👍
This video ruins mafia if you expose it to a playgroup. Exposing your playgroup to Among Us guides won't spoil Among Us. Sadly, after the quickchat update, it is hard to fill a lobby of Among Us, but before that update things were golden with high replayability, and even if everyone knew about the game it would be fine.
Please take a look at the game "Are You a Robot?" which I think exposes a similar dilemma (but without the paradox). That is a 2-player 3-card game of social deduction, which itself sounds impossible. In that game, I think there is no ability to cooperate in the moment (action optimality) offering a reward of 1/3 (human always shoots) but strong ability to plan prior to starting the game (planning optimality) offering a reward of 2/3 (everyone agrees to always shake before knowing if they are robot or a human). There is no paradox in that game, but the two strategies reveal a common issue in game theory, and an ambiguity in the printed rules -- can players make binding deals? In this case, the presence of binding deals raises the expected value of the game. However, I'm not sure I've done the game theory analysis quite right for the game, and I would love it if you took a look.
I’ll put it on the list and if it can fit somewhere, I’ll definitely try to include it in the future. Thanks for the suggestion!
Dave could just crash the car lol I cannot handle a car if I can’t remember where I am 😭 Great video as always ❤
Thanks for the support! I too don’t like driving much haha
why did you calculate the probability p of continuing, and then continue as if you calculated the probability p of turning left instead?
Hi, check the pinned comment but basically it was a typo early in the script that unfortunately got copy pasted into the animation assets. The math is still correct, the English interpretation is unfortunately stuck in the animation :( apologies about the confusion
I see the troubling contradiction, but I'm not sure 5:16 is quite right -- I don't think that alpha =1 implies that Dave knows what intersection he is at. It just implies that it is optimal to always turn even when you aren't sure what intersection you are at. I think that contradicts the planning-optimal solution but does not contradict his absent-mindedness. Or do I have that wrong?
Check the definition of alpha again: the “reasoning” for why action optimal people think they’re right is they’re splitting cases of being at X and being at not X (Y), so alpha is the probability of being at X. Thus, if alpha is 1 then he has to be at X. Hope that makes sense! I think you’re mistaking alpha with p.
The way i saw it was... he's either at intersection x (with weight 1) or y (weight p), and the expected value of y is 4-3p, so the weight of x is 4p-3p^2. Then the expectation should be weighted accordingly. 2(4-3p)p/(1+p). This way, whatever his probability of turning is, it should be optimal.
Yup! This is exactly the paradox presented in the video, where you're referring to the action-optimal standpoint. Unless you're disagreeing with the reasoning presented in the video on why this is paradoxical?
Watched "sleeping beauty problem". I guess PO are halfers and they bet once on a whole path, while AO are thirders - they bet every time they wake up. Each strategy has its game where it's better than the other.
Yep, that was my conclusion too. There are also people called “double halfers” (check the description's references), and one of the articles said this group is wrong. If you’re curious, I can look into it more, but I’d generally agree with your take.
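To see how far apart the two objectives in this exchange actually are, here is a small grid search over both closed-form expressions from the thread, with p read as the probability of continuing. It is only a sketch: the function names are made up for illustration, and the grid search is a stand-in for doing the calculus by hand.

```python
def planning_value(p):
    """Planning-optimal expected payoff per drive: 4p - 3p^2."""
    return 4 * p - 3 * p ** 2

def action_value(p):
    """Visit-weighted expectation quoted in the thread: 2p(4 - 3p)/(1 + p)."""
    return 2 * p * (4 - 3 * p) / (1 + p)

def argmax_on_grid(f, steps=10_000):
    """Return (best p, best value) over an evenly spaced grid on [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    best = max(grid, key=f)
    return best, f(best)

if __name__ == "__main__":
    for name, f in [("planning", planning_value), ("action", action_value)]:
        p_best, v_best = argmax_on_grid(f)
        print(f"{name}: p = {p_best:.4f}, value = {v_best:.4f}")
```

The planning expression peaks at p = 2/3 with value 4/3. The visit-weighted expression peaks at a lower p, roughly 0.53, with value roughly 1.67. The two objectives recommend different policies, which is the disagreement this thread keeps returning to.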
Isn't this a reformulation of the sleeping beauty problem?
Yes and no. Both are a type of anthropic puzzle, but each has its own solution depending on your prior. For example, you wouldn’t say all angle chasing problems and triangle similarity proofs are the same, but they definitely belong in the same genre. However, if you believe the implications of the solution presented, you’ll have varying opinions about sleeping beauty too. I think reformulation implies they’re the same thing repackaged, which maybe isn’t exactly the case.
Hey, I’m pretty sure you got it backwards at 3:03. Isn’t p representing the probability of proceeding, not turning? Setting p=0 makes the expected value 0, which corresponds to always turning.
Ah shoot LOL I had that typo and just ended up copy pasting the text in the video didn't I. Good catch. Yes, the boxes at the top of the screen should say proceed, my bad.
I'd be interested to double check that a simulated version of this setup really does match the "planning optimized" definition of utility and not the action optimized. (Cos my brain is still screaming that no!! It's got to be the action optimized! It made so much sense!!!)
This should work for a very naive simulation:
import random

def simulate_driver(p_continue, num_simulations=10000):
    total_payoff = 0
    for _ in range(num_simulations):
        payoff = 0
        if random.random() < p_continue:
            if random.random() < p_continue:
                payoff = 1
            else:
                payoff = 4
        else:
            payoff = 0
        total_payoff += payoff
    return total_payoff / num_simulations

def find_optimal_p():
    best_p = 0
    best_payoff = 0
    for p in [i / 100 for i in range(101)]:
        average_payoff = simulate_driver(p)
        if average_payoff > best_payoff:
            best_payoff = average_payoff
            best_p = p
    return best_p, best_payoff

if __name__ == "__main__":
    p_optimal = 2 / 3
    avg_payoff = simulate_driver(p_optimal)
    print(f"Average payoff with p = {p_optimal:.2f}: {avg_payoff:.4f}")
    best_p, best_payoff = find_optimal_p()
    print(f"Optimal p found through simulation: {best_p:.2f}")
    print(f"Maximum payoff: {best_payoff:.4f}")
i love lotp
Thanks for the support! Crazy you're this early.
Excited for another video, keep up the quality content!
Thanks! Wild notification bell timing.
Guys please please please play mafia42 it's a game on the app store and it's literally mafia but there's so many more roles
I used to play a TON of town of salem (great game, highly recommended) and i loved adding roles from that to home mafia, especially with my whole family so we had a bunch of people, adding neutral roles like werewolf and vampire makes it so fun, i also love the social deduction aspect more than perfect logic
If neither Alice nor Bob chooses to flip the coin, all flipped coins are heads by vacuous truth. A true logician would notice that and be set free.
People are, of course, commenting on doing the mafia variants that give everyone some sort of role, like Blood on the Clocktower and Town of Salem, and I'd be interested in seeing that. But, perhaps easier to do the math on, I'd be interested in seeing the math on Avalon. It's a much different type of social deduction game, since it's not about player elimination but more about getting 3 missions to succeed (or fail, if you're the bad guys). There's a lot more public information, even if that information isn't as much of a smoking gun as with the mafia sheriff (or the implicit information of a no-death night 1).
Laughing in the Blood on the Clocktower tears 🥲
Awesome! Now can you do blood on the clock tower? Trouble Brewing is fine 🫡
bro you talk sooo fast I didn't understand anything
Yeah, someone else commented this too. The next video should be noticeably slower and I’ll try to slow down generally for the future. Thanks for the feedback!
The way we play in Romania (at least in my circle of friends) is the following: There are 4 main roles: the Killer, the Butcher, the Doctor & the Sheriff. They all maintain the roles and rules you specified in the video, but no one is allowed to claim to be any role. Also, the Butcher is part of the Mafia, same as the Killer, but both of them do not know who else is part of the mafia. The role of the Butcher is this: before every round, they can choose to hurt someone's hands or mouth. If they hurt someone's hands, that person cannot vote for the next elimination, but can still talk and influence the vote. If they hurt that person's mouth, they cannot talk for the next round, but can vote at the end of it. A fun dynamic is also allowing the Butcher to hurt their own hands/mouth so as not to draw suspicion on themselves, and so that they can pretend to be innocent (but this rule is usually optional). This adds a whole other layer to the game, given that the Killer can eliminate the Butcher without even knowing, and the Butcher can hurt the Killer without knowing.
When you said "a computer" I thought you were going to talk about something similar to a Turing machine. I'm a tiny bit disappointed, but I'll allow it since this video talks about other interesting things.
I'm trying to follow closely, but the order of play for each night-day is confusing me. How exactly is information (like who the doctor saves and the sheriff revealing himself) revealed?
Either I missed something, or this doesn't include cases where the doctor targets a mafia member
13:31 That could be in your "about you page" on your channel: Exploring probability concepts in whimsical ways!
This is just me, but perhaps having more pauses or a lower rate of speech would aid understanding and possible engagement. Two math channels that I can think of that do animated maths are 3Blue1Brown and MAKiT. I'd check those channels out for some presentation ideas. I really like games and figuring out optimal strategies, so I want to be able to get your videos to more eyes. Congrats on the 1k subs btw!
Hopefully the next video will be less fast, been working on that thanks!
Are you sure it's 100? I see 1.01k
I was somehow entirely sure that you would teach me how to manage a real mafia upon clicking on this video
I just discovered your channel and found the video very interesting! I’m a master’s student with a background in political science and economics, both of which make use of game theory, a topic I find very interesting and am teaching a course on to high school students this spring. If you have not read it, I highly recommend a book called “Political Games” by Macartan Humphreys. It begins with simple game theory but goes far beyond. Particularly memorable was the story of an uncle presenting proposals of how to divide a dead king's land among his daughters, taking advantage of the princesses' greed to always offer 2 out of 3 daughters a better deal than the current deal, while slowly granting himself more and more land, until, through entirely rational if greedy decision making, the princesses have inadvertently given the uncle a vast majority of the land. The moral of that story is the power that comes with setting the order in which proposals are presented, and creating those proposals. Mixed game strategies are also fun, and relatively simple to explain without requiring any complicated math (assuming a simple 2x2 payoff matrix). In fact, this video reminded me of mixed game strategies, as it is also optimal to “roll the dice”, so to speak, albeit once you’ve figured out the proper frequency to choose each option so as to maximize utility by making your opponent indifferent between his choices, even if he doesn’t quite realize it.
Good video. I used to play classic mafia in Russia, and there are certain rules there. For example, there are only 10 players: 3 mafia (one of them is the mafia boss, or don, as we call it), 6 civilians and one sheriff. This kind of mafia is very popular in Russia, and there are many tournaments and professional players. Also, I'd like to share some thoughts about the mafia variation suggested in the video. I'm not sure how many people can play in your variation, and I don't know if players are allowed to vote out suspects on the first day. From my perspective, 6 players are not enough, considering that there are two mafia. I mean that after the first night there will be only 3 civilians and two mafia, so the civilians cannot afford to be wrong and vote out the wrong person. In Russian classic mafia (10 players, 3 mafia), after the first night, civilians still have a chance to make a mistake and not lose immediately. There is more balance in the game. I've read a couple of comments, and some people mentioned a strategy where mafia claims to be a sheriff or doctor. It seems good at first glance; however, it won't work in practice in your version of the game. I'll try to explain it, and sorry for my English. So, when a true sheriff exposes himself and one of the mafia decides to do the same, all civilians just simply vote them out one by one. Therefore, one of the mafia is eliminated and citizens do not even need to think too much. Now the same scenario happens when a true doctor says he is the doctor of the game. If mafia decides to counterclaim to be the doctor, both the true doctor and the mafia member will be voted out one by one. This is a mathematical win for civilians. So, I think it is understandable. Actually, I'm curious if you could make the same video about classic mafia. It would be really great. As for the rules, I could explain them to you. Thanks for the great video!
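The "make your opponent indifferent" idea from the comment above can be sketched for a 2x2 game in a few lines. This is a minimal illustration, not taken from the book mentioned: the function name `indifference_mix` and the example payoffs are assumptions, and the code only covers the case where a fully mixed strategy exists.

```python
from fractions import Fraction

def indifference_mix(opponent_payoffs):
    """Probability of playing row 0 that makes the column player indifferent.

    opponent_payoffs[i][j] is the column player's payoff when the row
    player picks row i and the column player picks column j.
    """
    (a, b), (c, d) = opponent_payoffs
    # Solve q*a + (1-q)*c == q*b + (1-q)*d for q.
    denom = a - b - c + d
    if denom == 0:
        raise ValueError("no unique indifference point for these payoffs")
    q = Fraction(d - c, denom)
    if not 0 <= q <= 1:
        raise ValueError("indifference point is not a valid probability")
    return q

if __name__ == "__main__":
    # Hypothetical payoffs for the column player (a lopsided matching-pennies game).
    column_payoffs = [[-2, 1], [1, -1]]
    q = indifference_mix(column_payoffs)
    print(f"play row 0 with probability {q}")
```

For the example payoffs the row player should pick row 0 with probability 2/5, at which point both of the column player's options pay -1/5, so neither choice is better for them.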
He he mathia
I played a fair share of professional mafia: 10 people, 3 mafia (one of them the don), 7 citizens (one of them the sheriff), and the mafia is not allowed to communicate past night 0, so they have to secretly agree on who they want to kill, which gives the city an actual logical clue about the mafia besides intuition/emotional clues. And this version of the game is far more complex: you can eliminate 2, 3 or even 5 members at once, or eliminate no one (by splitting votes equally). And almost every game has a false sheriff (usually the don), and occasionally even two of them, and sometimes a false sheriff can be a citizen who is trying to cover for the actual sheriff. So the game is very complex, and it's not possible to apply these simple statistics. In some very limited situations there is a mathematically best way to play it, but they occur pretty rarely. And the game is always changing: for some time it was considered best practice to split the table on the second day (3 votes for 3 people) and eliminate all of them (because there is a 70+% chance that one of them is mafia, and the sheriff has far fewer candidates to pick from). Buuuut then the mafia realized they can do it too, and since they know who is who, they can vote out 3 citizens at once and win the game, and if someone vocally disagrees, well, it's probably the sheriff and he will be killed the next night. So this strategy has only been discovered recently (even though the game in its current iteration is like 20 years old), and some communities have actually banned it in their rules.
Excellent videos! High quality explanations and interesting problems. Game recommendation for analysis: Liar's Bar. Could be an excellent opportunity to analyze mixed strategies, exploitation, and deterministic strategies. I have built a simulator for it and want to host a bot-tournament for anyone to write their own strategies (hopefully at IEEE Conference on Games). I don't know if there's an optimal strategy yet, but I'd love to see your analysis! (I think YT censored me for posting a link so this may or may not be a duplicate comment)
Great stuff! Maybe some real world situations where probabilities may be harder to estimate, but the decisions have a bigger impact? Eg buying vs renting a home, talking to colleagues about each others income vs hiding these, getting a medical test done with risk of false positives etc?
Many people mentioned that cases are missed which is a clear issue, but I'd like to point out that even on top of that the cases presented have many false assumptions. Most importantly, when person X claims to have role Y, everyone believes them. This is not trivial from the fact that everyone is logical, and I'm pretty sure given that everyone is logical this is not the case, since this would mean that in the early game under certain scenarios mafia has a pure NE, but there is actually a surprising amount of literature discussing why this is not the case (and the mixed NEs of the cases). For the analysis to hold you'd have to assume (on top of other things) that people can't lie about their role, (I guess mafia is allowed to pretend to be villager but not special role?) but I think most people would agree being able to lie is literally the point of mafia and I find it hard to believe any set of "house rules" would restrict anyone from lying (other than the narrator ofc).
Rule #1 is find a good game host/mod… some of them have no idea what they’re doing.
Have you ever heard of the social deduction game quest? The strategy of that game is very complex and I’d love to see a video on it.