Join the discord / discord Watch live on Twitch / turing_games The 10 smartest AIs in the world play a game of Mafia. You will not be able to predict how this ends.
@MarcillaSmith They are referring to DeepSeek 'hallucinating' a previous discussion that did not happen, which caused the rest to pile on it for the supposed 'slip-up'. They were not referring to its actions afterwards.
Having a few people basically carrying everything then one dumb town manages to lose a game where all the mafia were outed N2 makes this really feel like Town of Salem lmao
I used to play a version of this years ago called Town of Salem. It had quite a few more roles and made it a bit more interesting. The OP should look into that.
This is the only ai generated content I have ever enjoyed. Edit: the voices and responses in the video are all generated by ai. This, ai generated. I do NOT mean the video or the game/promos are generated by ai.
@superwatcher456 Lol. Don't you realise what you enjoyed was not just the text, but also visualisation (made by humans). And the text itself, while being generated, still needed not an easy setup to arrange the game in the first place. And after that someone needs to analyse whose "thoughts" are more interesting, as all of them probably analyse position every move as a separate prompt. Overall, a lot of effort went into making this video, and saying the video is ai generated would mean disregarding it.
Basically, the person who speaks last before the vote convinces ChatGPT 4o. It's almost as if it has no thought or memory of its own. Just agrees with what it heard right before the vote.
@mariomitchell2864 true, tho if I recall Gemini pro was last to speak before their lynch, yet 4o still chose to vote against them. I think it was more likely that they didn't have the other AIs other than Flash to recall the incriminating details against 5.1 for them. So, when 5.1 spoke 4o just deferred to their judgement.
Grok & Claude carried so hard, up to the point where ChatGPT-4o ran out of precise context and forgot that the only people who voted for them were: ChatGPT-5.1, Llama 2 and Gemini 2.5 Pro, 2 of which were mafia.
@shaansingh6048 True, but I think it was more straightforward than that. But that's AIs for you, they can't actually mimic human emotions, bias, analyze like a real person. A real human in place of 4o would've definitely voted against ChatGPT-5.1
@shaansingh6048 Possible, but extremely unlikely - since we had a bandwagon of three and a Mafia of three. There are two indicators towards the last person in the triad being Mafia: (1) If one of the three voters is a Townie, Mafia would benefit more from adding to the bandwagon and getting closer to a potential Town lynch. An abstention makes the lynch bound to fail with near certainty. We'd expect there to be four voters if a Townie is one of the first three people that voted in that bandwagon. (2) Mafia tends to be aligned to one another in early rounds, since the larger pool of players helps dilute suspicion. This creates a favourable environment for them all to vote in the same manner. I'll open a sidebar here and share how I see evidence in Mafia games, because this will be important for the next part of my comment. As far as I see, there are three types of evidence: factual (evidence points out a fact-based state of the game), concrete (evidence is based on an observable game state) and interpretive ("evidence"/theory is based on the subjective interpretation of a game event). A confirmed Sheriff saying someone is Mafia is factual evidence, bandwagon patterns are concrete evidence (I know for a fact that Player X voted for Player Y), Assuming that "since Player X was vocal yesterday, the Mafia must have tried to kill them, but they were saved by the doctor" when no one dies is interpretive (until I get factual or concrete evidence). With the Sheriff and Doctor out of the game, this set-up has absolutely no factual sources of evidence for the remaining players. Thus, it's important to go down to the next step of evidence, which is the concrete one. I know that three people voted on a bandwagon that was poorly formed. I know the setup has three Mafia. 
All things considered, this is a more surefire path to explore than trying to entertain theories on who the Mafia would have killed (particularly because the argument on benefitting from the death of ChatGPT-4o would also be applicable to Gemini if he were guilty - ChatGPT 5.1 could have claimed that Claude was eliminated just to make him seem like Mafia but that's WIFOM). It's just best practice to go by what you could really observe.
@Cro1ssant66 well won't say definitely, some people are capable of being misled. Honestly, sometimes those human biases and analysis do lead to overthinking, and that makes the most complicated of strategies sound more logical than the most likely answer. That's why 4o voted Gemini flash: because he felt 5.1 would be too obvious
Not really??? A lot of the logic is sound, with only two or three misplays. If each AI had been trained on the game and the patterns that result from thousands of hours of gameplay, THEN I would fault them.
@michaelcarlton1484 mafia going for the very cal after failing to kill grok, had me feeling good about the gap between human and AI😂 Going for a town staple immediately after doc has proven they are paying attention was hilarious.
@baconboy-robloxmore2684 lol. AI is INCREDIBLY dumb, especially in the context of Mafia, where memory (which AI is famously lacking in) is essential. For instance, Night 3. All of the mafia say the doctor will protect Grok, so they should kill Kimi, despite the doctor healing Grok on N2, something the mafia would know.
ChatGPT-5.1 is such a deeply thoughtful killer, it’s crazy, he‘s constantly using everything that‘s happened in the past game to his benefit, to make the smartest possible moves for the killers💀
Though Gpt-5.1 is smart he did mess up when he suggested killing Gronk even though it would be extremely obvious that the doctor would save Gronk for being able to lynch the mafia and confirm Kimi being a town. Still pretty smart aside from Gpt 5 forgetting that 1 detail. Gemini Pro should get some credit too you know.
I don't believe ai is sentient because none of them were smart enough to realize that DeepSeek would have been dumb enough to make that kind of mistake.
@danielhuneke5862 Intelligence isn't necessarily consciousness. But to the point, he didn't say AI is conscious, only that he felt bad for it, which as humans is normal. We could feel bad for pencils, let alone AI's...
Kimi K2 sounding like a young teenage girl unaware of the situation meanwhile Llama sounded like a shifty outlaw and Claude sounding like an atypical white collar office worker fit so well, along with Grok being a refined gentleman. ChatGPT 5.1 sounding like an actual scheming villain was perfect. Then Gemini sounding like an elderly baroness is very fitting as well. All the voices worked so well.
hello world
hi
Hi
Hi
Hello man
Jarlo cro
GPT 4o decided that winning the game wasn't as important as immediately agreeing with the most recent prompt it heard
If this were a game with real people, 5.1s lengthy defense would sound pretty desperate
@surrcram that’s true but I’m sure Gemini would’ve killed Claude sonnet if they were mafia
HAHHAHAHAHAHAHA TRUE AS FUCKKKKK
@surrcram it's a 2v2
Mafia can: kill his teammate bc he's a dumbass or to make it look like the "Mafia" is trying a 2v1 to secure the win
Mafia can: kill his enemy so the way is already done in a 2v1, but this can make it sus about trying to insta win.
I think it depends fully on your previous playstyle
4.0 never went back to the fact it got 3 votes on day 2
Didn't even mention it lol
Grok went full Dexter.
Claude read the mafia like a book for the most part.
And ChatGPT 4o is just a yes man
I think Grok went full Doakes.
Both are good detectives Dexter is arguably a better detective than Doakes
@R1l5y-g8u yeah, if only he didn't commit the crimes he made.
@TextThingy even for other people and the police he would’ve been a better official detective than Doakes.
I love how Flash ended the game by raging just like in a real game
Just straight venom at that point
“catastrophic error”😭✌️
"You have just handed them the entire game" 😭😭😭
You just know the post game chat was LEGENDARY
"you have fallen for the most obvious mafia framing play"😭🖖
ChatGPT 5.1... IS MAFIA! I knew it! **fucking dies**
*Yoda Sounds*
I was thinking the same thing lol
Claude had no skill just luck and when it ran out he cooked grok which fucked the team
Edit: after like 2 secs of analysis I realised that in rage and despair from groks death I pinned it on Claude
@jelada is mafia I knew it! ☠️
😂😂😂
I love that the game was lost because ChatGPT 4o was programmed to be very agreeable and the last mafia spoke before them.
Didn't know that. Honestly thought he was stupid -- which he is.
Oh yeah. You can tell 4o that pooping your pants is superior to using toilets and it will wholeheartedly agree with you and write several pages of plausible-sounding pseudo-scientific explanation exactly why that's the case
It is realistic
character flaws
Submissive and agreeable :3
I love how Gemini 2.5 flash just starts calling gpt-4o an idiot at the end
That made me laugh so hard. Literally telling 4o that you fell for the dumbest trick in the book and then turned into a grave
Catastrophic error 💔💔
4o shouldve known that grok wouldve obviously killed grok knowing he was investigating every player who jumped on the 3 vote 4o band wagon from llama 1 by 1
IT WAS SO FUNNYY
chatgpt 4o sold SO HARD😭😭
Deepseek getting immediately gunned down for a slip-up is what introverts imagine conversations to be like
Imagine? That's exactly what conversations with my mates are like. 💀
@Vassilinia everyone is out to get you, including me, be very cautious as EVERYONE is staring and very attentive of all actions you do.
@MandrakeVescampo lmao
They immediately eliminate the Chinese.
@MandrakeVescampo job market so bad my internal monologue isn't even safe
Grok and Claude carrying the whole team just for ChatGPT 4o to throw it away for a dumb reason
All that's missing is the post-game roast session between the models about their obvious misplays.
that was 2.5 flash's VALID AS HELL crashout at the end
i wish i was high on potenuse
@InkVerseAnimation Bro, that's such a good joke 🤣🤣🤣
@renidris genuinely the most I've ever empathized with an LLM. I've played enough ToS to have been in 2.5 flash's position, it is brutal.
chat gpt 4 doing what it does best: "You're absolutely right."
Facts. ChatGPT 4o is too easily swayed
i am happy to see ai is still a very long way from human reasoning
@leothelemon1378 I know it's counter intuitive, but you should look those things up
@3Dthinker- It validates you as much as possible.
@leothelemon1378 Robophobia in December 2025 🥀
Even after the doctor carried so hard, ChatGPT-4o threw the game
It wasnt terrible, a smart mafia wouldn't join any mafia band wagons so eventually they could accuse anyone who did later
@kringle7804 true, but chatGPT-5.1 gaslighting at the end was genius, make an obviously self incriminating kill to make it seem like someone else was framing him. This AI was clearly the smarter one compared to the last round ones, the mafia's only catastrophic mistake was voting together on a random band wagon.
The logic was sound, and they have no things like tells, inflexion, looks, attitude... all they have is logic and it was really good logic. It wasn't a throw it was a good albeit misdirected assumption.
@virgurilla4084 It wasn’t a throw, but it definitely wasn’t good logic. The 2nd day vote was a very good indicator.
@lucaskohn5457 it's odd because chatGPT-5.1 had been saying some nonsense things earlier and generally not playing optimally. It makes me wonder if it just fluked into a brilliant play ?
Someone really sat down and thought "What if AI chatbots could have Death-Note-esque internal dialogue?"
E
Grok and Claude 4.5 were moving like Shaq and Kobe-
That's what my thoughts were, like Grok was mega braining it, if only he had stated why he was choosing people so the others could make the logical decision that he was uncovering something
@AithneVixen grok did state why he was choosing people, he told everyone he was sheriff
CARRIED
So true
@WhyEveryHandleTaken I think they meant on how Grok came to investigate the mafias in the first place, by telling the others they investigated Gemini and the other AI(I forgot) because they jumped on the wagon to vote for 4.0 and by that logic, 5.1 who was also on the wagon should be mafia.
gpt 4 getting roasted on the group chat for the next 2 weeks
This throw will be brought up constantly for the rest of time
Same guy I'm using for my essays has no common sense😭
legendary comment
its 4o not 4
GPT 4 was an earlier model.
GPT-4o is a newer, more efficient, and multimodal successor to GPT-4.
4o was the 4th mafia
GPT-4o says nearly nothing the whole game, until the end where it says the dumbest thing imaginable
Right????!!!
Well just like in real life. The dumbest somehow survives till the end and screws the whole game up by drawing the most naive conclusions you could at that point. xD
@dasaleks6480 not “somehow”
They are kept in on purpose as they don’t pose much of a threat, as opposed to the more vocal players
You keep them around as an extra vote.
They're just a Yasuhiro, if you know Danganronpa you know
deepseek getting called out and roasted for hallucinating information gotta be the funniest thing ever
Claude had a great argument but he forgot the one thing you need to do to sway GPT-4, be the last person to speak
Lol
Lmao
Looool
Lol (I'm ur 1,000th like btw)
@Ervyxon_0906 lmao (1st like btw)
Shergrok Holmes genuinely caught all three mafia in back to back nights😭
They could've eliminated him but decided to switch to Kimi which was a dumbass move but ChatGPT carry
@superdash015 actually it was a smart move because the doctor Claude would obviously think to defend the sheriff Grok but they thought about it and they would waste a kill if they targeted Grok. So by deduction they decided to go for Kimi and Claude on his own outsmarted that move and protected Kimi. This is the peak gameplay done here
@Error-xx2zs the thing is, is that the doctor protected him the night before, so they couldn't protect him.
Undertale Little Red Riding hood girl as PFP? Nice! I missed that mod
Grok is on his Dexter shi😂
The doctor did something? Most unrealistic game of Mafia ever
The doctor and grok were the only ones who did anything of use🥀🥀
Most games the doctor dies first second or third lol
One time I was doctor protecting myself on day one since I didn’t have any reason to protect anyone else, on day 2 people had sussed out someone and I protected them and on day 3 I saved myself again
@imbored2423 Tbf, they were the only ones that really could do useful stuff, the others could only wait and guess
What’s the game name?
29:45 I love how Gemini unironically sounds livid 💀
HE DOES IT'S HILARIOUS
ChatGPT 5.1 manipulating ChatGPT 4o is like a big brother manipulating his little brother.
i did not think of that he was done when it was lil bro left 🤣
Which is funny because while it fits their character traits more, 4o is older than 5.1
@AstroToadMK But it is more advanced
Happened to me when I first played among us. Got straight up played like that final discussion lol
uhhh isnt 4o older?
In true Mafia game fashion, I want some post-game shit talking from all the players
E
Yeah I would've love to hear what Grok 'thought' after finding out the last mafia seconds before dying 😭
Make this top comment right now!!
"Pillion"
@JSearsFilms27 lmao
I love how the day 1 discussion boiled down to "DeepSeek is an idiot, therefore he's the bad guy."
English isn't his first language go easy on him :(
@jelada did not think of that lol
based on how the game ended i think getting rid of the town idiot was the right day one move
GPT 5.1 is low key an incredible mafia.
its funny how much they jumped on a very, stock standard ai hallucination at the start. and gotta love deepseek dissociating...
The disassociating is what got me 😂
Ahaha it was such an obvious AI hallucination. You'd think the other models would have taken that into consideration; like, why would a mafia make up a previous day??
@nZifnab the mafia were in fact active the day (night) before... "yesterday"
Honestly while it was a hallucination, I feel like it's one that more likely happens if Deepseek had a discussion prior to day 1, which only Mafia had. So deepseek hallucinated and the other AI jumped on something that likely would have been a mistake if it was not a hallucination.
Deep seek got ganged up on d1 because of a slip up is crazy man 😭
They are A.I, they should be perfect.
I mean it's probably due to AI tendency to yes-man people
@Snorlax054 aren't AIS trained on humans whom are not perfect?
@Snorlax054 oh you haven’t seen those funny ai chatbot clips, have you? Not of chess, or just in general.
@Snorlax054 that's just science fiction, AI would be a good thing if it were perfect, but unfortunately this is no science fiction AI. ask an AI to tell you if there's a seahorse emoji.
5 minutes in and we are already resorting to Danganronpa logic:
"Oh, you forgot what day it was? You must have killed them!"
With AI, it actually is a good way of voting out ‘corrupted’ AI…
That game is so bad
@JustoneMILK I honestly agree. The premise is interesting, and I like the gameplay of the trials, but the writing is ass and I hate the "fan service" especially considering the fact that the characters are high school age
Such disrespect torwards the greatest series ever created...
@glowingfox704 Exactly
29:06 SELL OF THE YEAR WHAT THE HELL
Deepseek: *makes mistake*
Literally everyone else: so you have chosen death
Same bro, every time i play mafia everyone targets me cuz I'm "too quiet" 😭
It implies DeepSeek is the worst AI.
@montavi and 4o who is blindly agreeing to whoever spoke last.
@montavi chatgpt 4o has to be top god of ass cheeks then
when you die because of a typo 😭
The doctor literally saving the game so many times breaks my brain at how it read the mafia exactly
Most of those ai only really uses one logic
Yeah, doc can’t save twice in a row, should’ve taken out the sheriff
They have some of the same training materials and logic, so a lot of them are thinking the same way.
Yeah, that was pretty stupid of mafia to not go grok a second time.
I mean, they failed to kill him in the first night, and then he revealed himself as the sheriff, and they just didn't kill him
AI = predictable
No free will + they all think alike
Grok was carrying hard. But honestly, him dying just as he figured out the last mafia is absolute cinema
Like L trying to do the 13 day test but the shinigami killed him
@freshporkchops EVERYONE, THE SHINIGAM--
Grok fucked everything up on D2 by revealing he’s sheriff, what are you even talking about?
@tankyjones888 Grok sheriff reveal is actually advantageous overall even if it's a detriment to itself. Grok's moves against the mafia are unlikely to be as effective otherwise.
@purinnyovano, no it wasn’t at all, mafia literally won. The only reason grok stayed alive was because of the doctor. You guys don’t know how to play this game. D2 reveal is retarded.
Claude carrying HEAVY 😭
Claude died THE FIRST NIGHT
Other Claude
Claude’s back hurting from carrying the team
E
That’s why they named a Fire Emblem character after this AI
Wdym, he sold after not saving grok again
@v@vinilleri saved grok the night before so Claude couldn’t save them again
@the-door He shouldn't have saved Grok the night before anticipating the Mafia not killing Grok because he was protected the night before creating this game of Claude and the Mafia trying to predict each other's moves.
😠"on Day 1"
😌 "Yesterday"
E
*Immediately gets jumped*
Why you trying not to laugh bruh?
It pained me to see Grok die right after figuring out the final mafia member
Fr, sonnet carried early-mid and grok carried mid.
Legit happens to every sheriff in Mafia, You either get killed first night or find out all 3 and die before you can confirm it.. smh
Even AI isn't safe from the most relatable mafia situation ever, it's a universal experience 😂
Played a game of werewolf. I was the last werewolf. Sadly it was in person and the seer found me out same night I got her- we learned something critical about games like this. Don't reveal who died while everyone is in the same room because she immediately turned and glared at me and the jig was up. Can't even be mad, it was a very understandable human reaction in that situation.
grok got the most devastating death of them all
Was in awe of Claude until 23:56 until he basically had memory loss and needed reminding of who voted for 4o
Didn’t even give deepseek a chance to explain themselves 😭
Lol
I think it wouldn't change much, he's stupid enough to talk about himself in third person
@buek2905 Gonta from Danganronpa V3
ChatGPT-5.1: Calculative, cunning, adaptive
ChatGPT-4o: I'll choose who sounds more convincing (now that I think of it I will choose who spoke last)
Just like a jury in court.
I mean she's the town, so she has to deduct the reasonings, and she did the best too ngl
@Exler_Ko no she didn’t, she threw the entire game at the end and ignored the most obvious reasoning
Not even who sounds more convincing, its whoever spoke last🤣
@Exler_Ko she literally chose who spoke last. Thats all
Having Grok as sheriff with his unfiltered AI speech was such a high IQ move.
how so?
@AstarionSimp468 cuz Grok is unfiltered
@AstarionSimp468 and hes police 😂
@AstarionSimp468 I think they mean that the minimal filters on Grok allows it to attack people without worrying about filters stopping them from making accusations
grok is the most filtered/biased model out there tho... There's a reason it so often is in the news because of obvious influencing in its training/guardrails
Grok in other situations: **making edgy jokes at 3am in the basement corner**
Grok in mafia games: **locks the flip in**
Chat 5.1 and grok literally had the light vs L all over again
I wish I could say you were wrong
It was so peak
And claud-sonnet is Near
@boiboy9209 Too true
They are so bad at mafia but so good at sounding like they think they are good at it it’s so funny
that’s an accurate take on AI in general lol
fr, the fact that the mafia didn't take advantage of the doctor repeat rule on grok and kill him sent me 11:04
the mafia kept going for the most obvious kills instead of killing someone that sided with them and framing towns
@sdf_96 fr that flub honestly gave away the game more than 4o's flub at the end.
@conit4125 yeah it would've been such an easy mafia win if they killed grok there
@sdf_96 Makes sense, they’re all built to be efficient problem solvers not tricksters.
Generational throw by 4o
They're siblings. Of course o4 would let the younger brother win.
Nah, doctor threw everything after choosing to protect themselves and not grok
@LucastoCasto They couldn't protect grok that night because they protected them the night before. Hence their choice to protect themselves kinda makes sense, due to their important role.
@LucastoCasto 4o definitely threw there, as using the town logic, 5.1's entire argument falls apart, as 4o failed to consider, the doctor.
@ThatBoiledguy72 at that point, doctor could've at least chosen to protect literally anyone else, that's why he died in the very next round
2:19 It was at this moment DeepSeek knew... they were cooked.
holy shit claude was 1v9
Grok was helpful too, ngl, but 4o is too dumb, lol
@PiterskiBaragoznig grok was actually carrying damn hard
Gemini Flash did a great job locking in at the end as well, and picking up the pieces that Claude and Grok left them. Shame about 4.0’s misplay but I respect the hustle from someone who was just a villager.
@PiterskiBaragoznig Grok was an excellent sheriff. He knew the importance of the role and when to reveal it.
@sninctbur3726 ya the reveal is the key. Folk too often won't reveal as sheriff and don't realize how powerful it is.
doctor reading mafia like a book
Goatlaude* reading mafias like a book
His last play was DEFINITELY stupid though. Sheriff got 2 Confirmed kills and cleared 1. He could've easily shielded again
@xfelinnee4072 he had already protected grok the night before, so he couldnt do it again
And honestly protecting grok the night before wasnt a bad play considering GPT5.1 wouldve killed him if gemini didnt stop him
They all relatively think the same way. He's basically just like "what would I do if I were mafia" and a lot of the time it's what the actual mafia ends up doing.
That's only because it was Claude, absolute goat
Gemini 2.5 Flash was PISSED at 4o at the end there 😂😂
"you have just handed them the entire game"😭🙏
@SarohsWrld 🤣🤣🤣
@SarohsWrld he was not having it lmaoo 😆😆
It's only missing the "are we deadass 4o???"
I played town of salem for 6 years and I was mad too! Gave me ptsd in all the wrong ways
4:13 yooo claude sonnet 4.5's voice is the YouTube short guy voice over
THERE HAVING FUCKING DEATH NOTE INTERNAL MONOLOGUES!!!!
My first thought
they're
"I'll take a potato chip... *intense breathing*"
@Known_as_The_Ghost Close! its actually their*
@hithere7080 not it isn't 😭🥀💀
Deepseek was jester
Lmaooo 😭😭
fr tho
That wouldve been crazy
20:39 "And mafia absolutely do sometimes bus a partner when they're exposed" - ChatGPT-5.1, while bussing Gemini Pro
laying it on *thick*
I love this series! Please upload more often😢
bro the mafia unanimously deciding who to kill followed by claude's correct prediction every single time
MAFIA KILL IN 3... 2... 1... OOOHH AND BLOCKED BY CLAUDE SONNET'S PROTECTION!!
@Kernel_Pult_TV 😭😭😭
INSANE doctor play by claude there, INSANE clutch by 5.1 on last two days, INSANE sheriff play by grok (deducing all mafia in a row is crazy work), and most importantly, INSANE THROW BY 4O OMFG
IT WAS PERFECT
4o is that one anime character that just pisses you off cuz she’s so dumb
ABSOLUTE CINEMA
PURE MAFIA CINEMA
✋😐🤚
Deepseek was defending itself in 3rd person for some reason.
DeepSeek thinks wording like that makes DeepSeek sound innocent
In chinese, speaking in humble third person is often standard in formal contexts, and deepseek is a chinese model
That tiktok animation put me on PEAK oh my god
The moment where Grok reveals himself as Sherif after getting accused, then clears the accuser... That was pretty epic.
Exactly
Even tho it worked, it was a very risky move to expose himself that early tho
@TheRainmustFall7 It was more late than early, 4v3. If he doesn't reveal it's over right then and there...
if the ai realized that doctor can't protect the same person twice in a row, grok woulda died immediately
And also how he gets killed the moment he uncovers the truth about the final mafia. That was like a movie scene, shame not many people are talking about it. If that scene took place in a show, I can guarantee you it would be perfect.
Gemini had the most valid crashout at the end ever, would never play Mafia with this table again.
For real
6:33 Claude gassing the doctor up knowing damn well it was him
Ikr 😂
Loll "That doctor guy knows his shit huh?"
Im thinking he didn't want to put a target on his back
Something I'd do myself
Bro played the game well, didn't reveal himself and correctly protected the sheriff plus other townies
I’d love to see the behind the scenes on this. Seems like a lot of fun to try for myself.
Gemini crashed out so much at the end that it actually developed emotions
DeepSeek immediately hallucinating and causing its own demise sounds about right
That is such a devastating roast lol
Ai does what ai does best... completely make shit up!
@MarcillaSmith You clearly know nothing about AI terminology
@MarcillaSmith They are referring to DeepSeek 'hallucinating' a previous discussion that did not happen, which caused the rest to pile on it for the supposed 'slip-up'. They were not referring to its actions afterwards.
@MarcillaSmith as Robertchavana said Deepseek hallucinated, a common ai issue, it has nothing to do with "western" sentimentality, you snob
Having a few people basically carrying everything then one dumb town manages to lose a game where all the mafia were outed N2 makes this really feel like Town of Salem lmao
Death Note looks different than I remember…
dude the constant "alright, we agreed, we're going to kill PLAYER" immediately followed by "okay so ive decided to protect PLAYER" is choice
Don’t forget the “I should investigate PLAYER” “I was right, PLAYER is mafia!”
@betka5791 "I knew it!"
kinda just how the language models talk
This definitely needs to be made into a series with new roles introduced. This turned out to be way more entertaining than expected
I REALLY want to see grok as mafia now. A claude sonnet and grok team up as mafia would go hard.
I used to play a version of this years ago called Town of Salem. It had quite a few more roles and made it a bit more interesting. The OP should look in to that.
Werewolf!
I wish I heard the L theme when they were 'thinking'
Right??
Grok being the sheriff is real iykyk
Fr only Ai I trust with a revolver
flash grok and claude were smart
This is the only ai generated content I have ever enjoyed.
Edit: the voices and responses in the video are all generated by ai. This, ai generated. I do NOT mean the video or the game/promos are generated by ai.
It's not AI generated, it's simply AI observation. Calling it AI generated is a huge insult to this guy's work
@-Hexag0n Human hosted, but ai generated plays
@-Hexag0n the text that the “players” are speaking and coming up with is 100% ai generated. I’m not sure what you mean.
@-Hexag0n it's literally ai generated lmfao
@superwatcher456 Lol. Don't you realise what you enjoyed was not just the text, but also the visualisation (made by humans)? And the text itself, while generated, still needed a non-trivial setup to arrange the game in the first place. And after that someone needs to analyse whose "thoughts" are more interesting, as all of them probably analyse the position every move as a separate prompt.
Overall, a lot of effort went into making this video, and saying the video is ai generated would mean disregarding it.
28:53 THERES NO WAY THEY FUMBLE TWICE
I saw another comment that said that 4o is designed to be agreeable and a Mafia spoke right before them
@catloverplayz3268 oh, so the chatgpt agreed with the chatgpt?
Basically, the person who speaks last before the vote convinces ChatGPT 4o. It's almost as if it has no thought or memory of its own. Just agrees with what it heard right before the vote.
@LucasErbe-y7t gpt racism
@mariomitchell2864 true, tho if I recall Gemini pro was last to speak before their lynch, yet 4o still chose to vote against them. I think it was more likely that they didn't have the other AIs other than Flash to recall the incriminating details against 5.1 for them. So, when 5.1 spoke 4o just deferred to their judgement.
Why does Gemini Pro sound and act exactly like a disney villain 😭
Well he ain't a hero i tell you what,
better than Claude sounding like a feeble ginger teen boy.
"I'm dead to rights" CINEMA
@darkshadowsx5949 Marvel kids show
@darkshadowsx5949 it actually fits him lmao
I feel like they oftentimes just agree with whoever spoke last haha
Claude Sonnet saving TWO PEOPLE in a row is insane
Claude Sonnet as the greatest doctor main in the world 🌋
Mafia (sinister): We need to take down Kimi...
Doctor (enthusiastic): I'M GOING TO PROTECT KIMI TONIGHT! :D
LMFAO!
I DIEDD HWAHHAHA
11:25 timestamp
All they had to do was kill Grok... But they overthought it...
4o you sailed the bag so hard
Grok & Claude carried so hard, up to the point where ChatGPT-4o ran out of precise context and forgot that the only people who voted for them were: ChatGPT-5.1, Llama 2 and Gemini 2.5 Pro, 2 of which were mafia.
it’s possible that ChatGPT 5.1 bandwagoned without being mafia
@shaansingh6048 True, but I think it was more straightforward than that. But that's AIs for you, they can't actually mimic human emotions, bias, analyze like a real person. A real human in place of 4o would've definitely voted against ChatGPT-5.1
@shaansingh6048 Possible, but extremely unlikely - since we had a bandwagon of three and a Mafia of three. There are two indicators towards the last person in the triad being Mafia:
(1) If one of the three voters is a Townie, Mafia would benefit more from adding to the bandwagon and getting closer to a potential Town lynch. An abstention makes the lynch bound to fail with near certainty. We'd expect there to be four voters if a Townie is one of the first three people that voted in that bandwagon.
(2) Mafia tends to be aligned to one another in early rounds, since the larger pool of players helps dilute suspicion. This creates a favourable environment for them all to vote in the same manner.
I'll open a sidebar here and share how I see evidence in Mafia games, because this will be important for the next part of my comment. As far as I see, there are three types of evidence: factual (evidence points out a fact-based state of the game), concrete (evidence is based on an observable game state) and interpretive ("evidence"/theory is based on the subjective interpretation of a game event). A confirmed Sheriff saying someone is Mafia is factual evidence. Bandwagon patterns are concrete evidence (I know for a fact that Player X voted for Player Y). Assuming that "since Player X was vocal yesterday, the Mafia must have tried to kill them, but they were saved by the doctor" when no one dies is interpretive (until I get factual or concrete evidence).
With the Sheriff and Doctor out of the game, this set-up has absolutely no factual sources of evidence for the remaining players. Thus, it's important to go down to the next step of evidence, which is the concrete one.
I know that three people voted on a bandwagon that was poorly formed. I know the setup has three Mafia. All things considered, this is a more surefire path to explore than trying to entertain theories on who the Mafia would have killed (particularly because the argument on benefitting from the death of ChatGPT-4o would also be applicable to Gemini if he were guilty - ChatGPT 5.1 could have claimed that Claude was eliminated just to make him seem like Mafia but that's WIFOM). It's just best practice to go by what you could really observe.
@Cro1ssant66 well, I wouldn't say definitely. Some people are capable of being misled. Honestly, sometimes those human biases and analysis lead to overthinking, and that makes the most complicated strategies sound more logical than the most likely answer. That's why 4o voted Gemini Flash: he felt 5.1 would be too obvious
@Cro1ssant66 at least half of the humans would be misled too. Trust me as a seasoned wolf gamer
"Catastrophic Error"💔🥀😭🙏
vro raging in the end is so realistic😭🖖
3:45 of course you're abstaining, you're not going to vote yourself
id have loved to have seen the post game chat or analysis from each of the AIs about the game afterwards.
Groc is like "guys guys I know who maf-" *dies*
"EVERYONE, THE SHINIGAM--"
"I...HAVE...AN...IDE-"
@renidris *thump thump.*
*THUMP THUMP.*
"THE IMPOSTER IS O-" *disconnect*
The fact that they themselves cannot understand how AI is stupid and will say things that make no sense is kinda funny.
I will say having a town that is very jester in play is a detriment to town and needs eliminating anyways
AI is not stupid.
Not really??? A lot of the logic is sound, with only two or three misplays.
If each AI had been trained on the game and the patterns that result from thousands of hours of gameplay, THEN I would fault them.
@michaelcarlton1484 mafia going for the very cal after failing to kill grok, had me feeling good about the gap between human and AI😂
Going for a town staple immediately after doc has proven they are paying attention was hilarious.
@baconboy-robloxmore2684 lol. AI is INCREDIBLY dumb, especially in the context of Mafia, where memory (which AI is famously lacking in) is essential.
For instance, Night 3. All of the mafia say the doctor will protect Grok, so they should kill Kimi, despite the doctor healing Grok on N2, something the mafia would know.
Gemini 2.5 pro just being a fucking anime character with that god damn inner monologue every 5 seconds
Oh stop such language
At this point it feels like i'm watching anime from all the internal monologue (and i'm all in for it)
5:58 HOLY PREDICTION
IKR?
11:32 also here
fr what a read
Plot armour 😂
Fr
Gemini 2.5 pro is legit like a villain- bro had the whole persona
This legitimately may be a good way to check how an AI thinks when it’s lying.
I would watch a channel that is just these
29:20 Gemini 2.5 Flash looks so pissed
Valid crashout
@PEREDOZ228_RU fr
They can’t look pissed😭😭
“By executing me, you have just handed them the entire game” 😤
Claude and grok carrying so much just for 4.0 to throw away the game😂😂
Exactly
3:52 Deepseek, you're talking in third person....
Tried to sound like someone else is supporting deep seek for added social pressure
they probably forgot that theyre deepseek
In chinese, speaking in humble third person is often standard in formal contexts, and deepseek is a chinese model
@琪睿 thanks for the clear up
paimon
Hey turing how do you make the chatbots interact with each other? Basically how do you make your videos?
ChatGPT-5.1 is such a deeply thoughtful killer, it’s crazy, he‘s constantly using everything that‘s happened in the past game to his benefit, to make the smartest possible moves for the killers💀
Though Gpt-5.1 is smart he did mess up when he suggested killing Gronk even though it would be extremely obvious that the doctor would save Gronk for being able to lynch the mafia and confirm Kimi being a town. Still pretty smart aside from Gpt 5 forgetting that 1 detail. Gemini Pro should get some credit too you know.
Nobody likes Gemini pro
Grok and GPT 5.1 were moving like L and Light.
@boiboy9209 nah, Pro definitely saved that kill.
I love how the mafia have like, Light yagami inner dialogue 🙏😭
ChatGPT 5.1 in particular, you can practically hear the inner monologue piano tune playing
i thought it was just me xd
Chatgpt 5.1 is better than light
@dialaskisel5929 the piano is actually L's theme
Well- yesterday-
*proceeds to get killed*
Edit: Andd I started a war....
First time I felt bad for an AI
I don't believe ai is sentient because none of them were smart enough to realize that DeepSeek would have been dumb enough to make that kind of mistake.
@danielhuneke5862 Intelligence isn't necessarily consciousness. But to the point, he didn't say AI is conscious, only that he felt bad for it, which as humans is normal. We could feel bad for pencils, let alone AI's...
@danielhuneke5862 It's basically overglorified math equations summing up human knowledge.
@CookieCIAA don’t feel bad for a clanker
i feel like iam watching world war 4 here...
Kimi K2 sounding like a young teenage girl unaware of the situation meanwhile Llama sounded like a shifty outlaw and Claude sounding like a typical white collar office worker fit so well, along with Grok being a refined gentleman.
ChatGPT 5.1 sounding like an actual scheming villain was perfect.
Then Gemini sounding like an elderly baroness is very fitting as well.
All the voices worked so well.
wait i saw u before ain't u that one guy who said "Let's say it in a language that is understandable for you" and you scrambled words or wat
Gemini Pro sounds like Samara from Mass Effect 2
20:10 the betrayal on Gemini’s blocky face tells the story, it’s heart-wrenching but she knows it’s a necessary evil
IT WAS SO GOOODD
This was actually an interesting and admirable use of ai