researchers find an unfixable bug in EVERY ARM cpu
- Published: Jul 9, 2024
- ARM is a great computer architecture with some great security features. In this video we talk about TikTag, a new attack that shows how one can use speculative execution to see the future.
arxiv.org/pdf/2406.08719
🏫 COURSES 🏫 Learn to code in C at lowlevel.academy
🛒 GREAT BOOKS FOR THE LOWEST LEVEL🛒
Blue Fox: Arm Assembly Internals and Reverse Engineering: amzn.to/4394t87
Practical Reverse Engineering: x86, x64, ARM, Windows Kernel, Reversing Tools, and Obfuscation : amzn.to/3C1z4sk
Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software : amzn.to/3C1daFy
The Ghidra Book: The Definitive Guide: amzn.to/3WC2Vkg
🔥 SOCIALS 🔥
Come hang out at lowlevel.tv - Science
haha wow that lowlevel.academy guy seemed pretty cool huh?
Whos that?
Never heard of that guy... Does anyone know that guy?
Yeah, I like his hair
😮 Very tempted by this assembly course. I’ve done a bit of assembly in some really low-level optimisation work (comparing what different Rust functions compile to), very nice very cool
my bitdefender gives a warning on that website.
Modern day computing is too unsafe lets all go be amish.
lmfao yea
when i retire i'm building chairs in a log cabin
@@WarDucc amish computing is too unsafe, let's go back to stone tablets 😅
@@LowLevelLearning i will be reinventing the wheel see you when you retire!
You are confusing the Amish with Luddites.
Every time I hear the phrase 'speculative execution', I am reminded of what a late friend of mine used to say: "CPU designs should never incorporate speculative execution or branch prediction. They will inevitably lead to security vulnerabilities." He was also a big fan of the ARM architecture, because it did not use to do this thing. He passed away about fifteen years ago, but as it turns out he was right...
Only in architectures where it was added long after the instruction set was finalized. The problem is not that CPUs have speculative execution, but that the 8080 they're based on didn't.
the problem is that speculative execution / branch prediction brings huge performance benefits, there is a reason why we have it and still use it
@@darrennew8211 Not true. The ARM ISA is not based on the 8080 architecture and now also seems to suffer from it.
My friend was very adamant about this at the time, that this would not be restricted to architectures that weren't built around it.
@@juhotuho10 That is the counterargument that I put before him all those years ago and I was treated to a lecture about why the benefits could never outweigh the costs and why especially in multiprocessor/multicore systems this would lead to all kinds of security vulnerabilities. And he pointed out exactly the kind of security vulnerabilities that were discovered in the past decade or so.
@@juhotuho10 It brings huge performance benefits if your architecture is such that it pretends to execute one instruction at a time in order. You don't need it if your instruction set is designed from the ground up to keep every computational unit busy all the time. You need it because you execute one load instruction then one add instruction and then one multiply instruction then one store instruction and expect the CPU to behave like it's not doing all that in parallel.
people that figure this stuff out are so amazing. like I understand it, after you explain it, and am like "yep I get it," but I could never actually figure it out beforehand or even consider that it exists.
@@c.ladimore1237 I’m not claiming that it is easy by any means, but these people spend every day searching for bugs like these. Surely, at some point, they develop some kind of intuition.
That's also part of the skill of the presenter. A good presenter can easily make you feel like you know more than you do.
@@c.ladimore1237 I don’t professionally find exploits, but I have found unique ways of using things in unintended ways.
My understanding is exploits like this are either people looking at how things work and being like “wait, that means theoretically it will do this thing too” or people being like “I wonder if it will also do this thing too” and trying it.
So to me, it seems more akin to educated experimentation with the scientific method, while software development (although there is experimentation) is more akin to writing a book.
Because it was a team of hundreds of people working on it
If you know how a cpu works on the low level, I guess you can think up these things?
"There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors." (Leon Bambrick)
Let me add two other hard problems. Memory allocation and bounds checking, hunter2
What a quote lmao
Don't forget cache invalidation
@@BobFlats7 cache invalidation is 0th in the list!
Funny, but naming things isn't hard at all.
Weeks ago UEFI, now ARM. Last year I joked about hardware backdoors, this year...
STOP JOKING! :D
THANKS FOR JINXING IT XD
Please stop helping...
Except neither were backdoors. In the first case it's just a standard buffer overflow bug, except because you're running directly in ring -3 there's no ASLR to save you. The ARM bug is actually a feature that speeds up the CPU, which is good, but accidentally was implemented wrong. The difference is that buffer overflows can be patched by a software update (if you haven't downloaded the UEFI security update please do so right now), but a bug in the CPU itself means you need a new CPU.
You are the guy that says "q***t day" in the office/chat aren't you
My God. I guess time to check off "security vulnerability found in something you worked on" off my bucket list.
I was an intern at Arm, on the team that worked on MTE. I did some work around the generation of the tags, and on simulating the overhead they would have in caches and memory.
I have such mixed feelings right now. :D
This seems like something we could have thought of. Meltdown and Spectre were fresh on our minds and a major topic of discussion in the company. I can imagine an alternate universe where I told my manager (or someone else on the team) "hey, have we thought about if tag mismatches could be a cache side channel?" Yet I don't think we ever discussed anything related to this? At least not in any of the meetings I was in.
But hindsight is 20/20. In retrospect, these things always seem obvious.
We were mostly focused on minimizing the performance overhead of memory tagging, because we were worried it would get in the way of adoption. We wanted our new optional security features to be supported by hardware manufacturers, who might not be happy if there was too much perf or memory overhead, extra hardware complexity, or cost / die area increase.
Though, I guess, despite this new vulnerability, it still delivers on its goals. MTE was supposed to be something that offers substantial security improvements for cheap. A "better than nothing" optional feature which, when enabled, has a good chance of catching some bugs that might not be found otherwise. It is probabilistic: even if it worked perfectly, there is still a small chance a memory bug might go undetected by it (if different allocations happen to be assigned the same tag by chance). It was not meant to be perfect, or any sort of bulletproof defense. Just a way to hopefully catch more bugs in the wild. If a vulnerability makes it less effective, that's still better than every other CPU that does not have something like MTE at all.
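To put a rough number on the "small chance" mentioned above: MTE tags are 4 bits wide, so there are 16 possible values. The sketch below is a toy simulation, not anything from Arm or the paper. It assumes tags are drawn uniformly and independently, which real allocators need not do (some reserve tags or force neighbouring allocations to differ), and the function name `undetected_fraction` is made up for illustration.

```python
import random

TAG_BITS = 4                 # MTE uses 4-bit tags
NUM_TAGS = 1 << TAG_BITS     # 16 possible tag values


def undetected_fraction(trials: int, seed: int = 0) -> float:
    """Fraction of simulated bad accesses whose tags match by chance.

    Assumption (not from the source): both tags are uniform and independent.
    """
    rng = random.Random(seed)
    undetected = 0
    for _ in range(trials):
        pointer_tag = rng.randrange(NUM_TAGS)  # tag carried in the stray pointer
        memory_tag = rng.randrange(NUM_TAGS)   # tag of the memory it lands in
        if pointer_tag == memory_tag:
            undetected += 1                    # tag check passes, bug goes unnoticed
    return undetected / trials


if __name__ == "__main__":
    print(f"expected:  {1 / NUM_TAGS:.4f}")
    print(f"simulated: {undetected_fraction(200_000):.4f}")
```

Under those assumptions roughly one bad access in sixteen slips through, which fits the "better than nothing, not bulletproof" framing.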
It has its value as a hardware address sanitizer. I used it on C code within an Android App on the Google Pixel 8, which supports MTE, and it helped to figure out and fix a hidden memory management bug (a use after free).
@@olafschluter706 Yep. "Hardware ASAN" is pretty much how we thought of it when designing it. The motivation for MTE was "imagine ASAN but with low enough overhead that you could deploy it in release/production builds and just enable it everywhere, and hopefully also catch bugs in the wild instead of just during development."
@@inodedentry8887 yeah. Arm have said that the tags aren't secret. The video is somewhat misleading. Not all arm CPUs have mte and it isn't used much it seems
That’s good to hear. It’s a somewhat obvious exploit in the context of meltdown and spectre so the question of potential value is a business decision (as you reference) and not an engineering one. And I assume intern means you were young and less experienced so you certainly aren’t at fault.
Is it better than nothing at all? That’s the hard question.
@@HayesHaugen I think if it helps to catch memory management bugs, it helps to reduce the attack surface and the number of possible exploits of software checked by it.
I am a (retired) professional programmer. I never wanted my programs to run as fast as possible. I wanted them to run as reliably as possible, i.e. rock-solid reliably. I have seen countless examples of programmers being led astray by the siren song of premature optimization.
It depends. ARM processors are often used in embedded devices with few resources and hard real-time requirements, and programs that are not as efficient as possible may not be appropriate.
@@NoSpeechForTheDumb This is a hard blanket statement to make because a lot of embedded systems will prefer stability over speed. You don't want life critical systems failing due to software bugs that can be mitigated at compile time.
@@TheMixedupstuff there are some instances of embedded systems where reliability is most important, of course. That's why I said it depends. The blanket statement was made by OP who said he ALWAYS wants his systems as reliable as possible when for some applications this may not be necessary or possible.
Great breakdown! Not surprised to see that speculative execution is causing vulnerabilities on more than just x86 - really feels like it was only a matter of time before something like this was uncovered. The way it was done, though, is absolutely wild.
Let's wait for dozens of fixes that will decrease performance compared to leaving the feature off. No lessons learned whatsoever.
@@alexturnbackthearmy1907 Not doing speculative execution isn't really an option though...
That would cause a FULL pipeline stall after every branch. And not doing prefetching is even worse.
Complex problems require complex solutions and those oversights are sadly the cost of that.
We can only hope that most things are found and fixed before they can turn into widespread exploits in the wild or hope for memory to suddenly get 1000x faster without any other downsides.
@@Momi_V Eh, if things were actually done the right way, we wouldn't be having this conversation at all. At least there is hope that they don't sweep it under the rug (just like "superior" Windows ARM hardware, which really isn't).
@@alexturnbackthearmy1907 modern cpus without any branch prediction wont stand a chance in terms of performance to one that has all mitigations enabled, even the non applicable ones
I did not expect to find a MY here 😂😂
The way you explain in these videos even a golden retriever can grok these topics. No pun intended
Misleading title, there are ARM "chips" that do not have these extension, a lot of them even do not support virtual memory
You have in my opinion some of the best content ever hosted on YouTube. If this existed in 2004 my early programmer self would have had a much easier time learning how to exploit for fun ;).
Every time I hear "speculative execution", it is about a security vulnerability
i mean, when else are most people profoundly affected by low-level cpu optimizations
@@rccli Rowhammer. A brute-force trick that we will have a hard time dealing with for as long as DRAM exists.
Yep it's a complex system for a complex problem. It's been around for decades, but it's still not perfect.
If we were to completely disable it now, we'd see processor speeds jump back 2 generations across the board. (very rough guesstimate)
So yeah... fun world we live in, huh?
What's crazy to me is that these CPU optimizations have basically existed since the 1990s. When I heard about the first speculative execution vulnerability, it immediately reminded me of a presentation I gave as an undergrad student at the end of the '90s: 'RISC Processors - Pipelining' ... and all those optimizations like speculative execution and branch prediction were part of my talk. But back then the idea of looking at that from a security perspective would never have come to my mind. All you thought about and talked about was how it improved performance. So the Meltdown and Spectre vulnerabilities were only found decades after those optimizations were introduced in the first processors.
So basically you can say all those vulnerabilities are out there because these optimization techniques have been developed and have gotten more and more sophisticated over several decades of processor development, starting with RISC processors in the nineties. But the awareness to look at things like that as a possible vulnerability and attack point was non-existent ... I'd say as cyber security research progressed and looked at similar mechanisms in software and elsewhere, the researchers suddenly became aware that there is also this huge problem with CPUs, turning all these awesome optimizations suddenly into security vulnerabilities, and only then did everyone start looking into it, after decades of not thinking of that at all.
CPU vulnerabilities usually need relatively low hardware access in order to work.
But when I heard you saying somebody managed to exploit it from within V8 (being a web dev) it literally just hit me - We're f**d.
JS isn't as much of a toy these days. You can easily manipulate raw binary data in JavaScript. Some more tinkering and this would easily escalate to a sandbox escape and really, really low-level code injection... From within a browser...
reject modernity, let's go back to monke! err... I mean DHTML
Tbh v8 0-days are being discovered every week now. It's easy to get RCE without some crazy CPU bug.
@@theairaccumulator7144 Yes, but for good results you'd need to escalate privileges, injecting direct CPU instructions omits that completely.
@@theairaccumulator7144 Yes, but in general, first you have to escape the sandbox, then find a way to execute your code in something like a shell, and then gain admin access.
The paper covered in this video describes how it was done all in one step.
also: does web assembly still exist? This is lower level than js so it should be easier to predict which wasm instruction transpiles to native machine code, making side-channel attacks even easier & more reliable than using js.
My jaw dropped when you said it works inside the V8 sandbox. Bless the researchers for finding this.
I think Spectre and Meltdown also worked in JS, in the browser. The speculation engine will see any code that runs on the cpu...
V8 engine screams to me : "you can do this on your phone right now"
OK, interesting, but this is a way to defeat a secondary defence. The program still has to contain an exploitable memory corruption in the first place. I think describing it as an unfixable bug is to some extent click-bait.
@@sylviaelse5086 I agree. It's also not close to EVERY ARM CPU. Only newer Cortex-A CPUs, no M devices at all. Seems like a bad bug, but color me underwhelmed after that title.
Given how many "unfixable bugs" have been found and, voilà, fixed in one way or another, yeah, clickbait.
Clickbait doesn't win subscriptions, it wins unsubscriptions.
from what i understand, you need to achieve arbitrary code execution to achieve arbitrary code execution. it is a little silly.
@@nocakewalk the M chips already have their own vulnerability lmao, they don't need this one
@@not_kode_kun which vulnerability?
"EVERY ARM cpu" article shows that it was introduced in arm v8.5
And everyone talks about Cortex A and forgets that Cortex R and Cortex M realtime and microcontrollers are massively different.
IA64 had a ton of problems, but I really believe that explicit speculation was a great idea. So many of these attacks would be impossible on Itanium. (Insert joke about them not being attacked because no one used them)
What is explicit execution?
@@deusexaethera IA64 puts the work of avoiding problems due to parallel execution in the hands of the compiler. I.e., no mechanism to back out unexplored paths like with speculative execution. The idea was to run the CPU fast and loose, and just force compiler writers to deal with the burden to take advantage of full speed. Problem is, there are lots of languages and compilers, and not everyone wants to incorporate this stuff into code generation, and not everyone is good at it.
so the "feature" was it didn't do anything special?
@@MadsterV more correctly, the CPU didn’t hard-code any of the behaviors: the pathways existed in similar ways to x86, but required explicit control via ultra-wide instructions (VLIW architecture) which meant explicit, multi-instruction parallelism. In some ways, this arguably complicated the CPU as it made instruction parsing many times more complicated; on other archs those features would run mostly on autopilot while the instructions remained easy to parse and prevent collisions/weird behavior.
Hitachi SH5 also had a very nice explicit branch prediction architecture. Unfortunately that went nowhere :/
the "hats off" right after talking about a hair cut was accidentally brilliant 😂
This reminds me of PAC introduced in iOS 14 that made jailbreaking very difficult. Eventually a couple Chinese researchers found a way to sign the pointers themselves to bypass it, but I still was fascinated enough by it that I did a college presentation on it in my computer architecture class.
OMG it's amazing! When you said they did it in V8 I was... OMG, incredible! How many layers of security they managed to bypass!
Access to leaked tags doesn't ensure exploitation. It simply means that an attacker capable of exploiting a particular memory bug on an affected device wouldn't be thwarted by MTE.
But since this re-opens the door for buffer overflows, which after all is the most commonly found attack vector, we're basically back to square one. If someone finds an exploitable buffer overflow bug in the V8 sandbox, then you're looking at unprivileged code execution, which can be problematic enough. If someone finds one in both V8 and a kernel call then you have complete device pwnage. This smells a lot like how the PS3 was pwned.
@@andersjjensen or uglier, crash-o-matic, one runs into race conditions if the software didn't return a clean abort.
Still, code should be able to work around, like all of the other "unfixable bugs" over the years.
I am Pentium of Borg, you will be approximated.
The door was never "shut" to buffer overflows by MTE, it's a second line of defence, and to breach it you still need a memory vulnerability in a target program (which MTE in this specific case will never catch anyway, it's not designed to be perfect) and an incredibly niche one at that for this exploit. Problems like this can be better prevented when we move towards safer languages for userspace like rust and the lot.
As is usual with security, you can't rely on any one countermeasure, you need defense in depth.
Damn this is such a good video, thanks for explanation. I have only recently started learning stuff abt comp architecture and security and this video is still explaining the paper in the most crystal clear way possible that even I understood it.
The first sponsorship I’ll click and use in my life 😆 thanks for your awesome content! 💪
Found Ed thru John Hammond, but since John doesn't seem to do vids that aren't just straight ads anymore, I'm excited this is still here to learn from. Thank you, sir!
Yea John hasn't been a reliable source of info in years, bros sold for real.
If you can run arbitrary tik tag code on the cpu, you don't need to break the memory tagging, just run whatever arbitrary code you want on the cpu.
Half true, this can be used for privilege escalation.
Sending my appreciation. Sometimes when searching for work you have a not so wonderful interview for various reasons including just forgetting a term you couldn't recall in a moment. Sometimes a few can affect your mental health especially if not handled with understanding that it has nothing to do with your worth. I had known and worked with assembly. I had known and worked with memory, pointers, understanding buffer overflows, operating systems, and so on building up to a good, extensive software engineering mastery, ethics, and leadership. All of the concepts you mentioned as part of my education. I felt so let down as it seemed no one cared that I knew this stuff and it made me question if I should have specialized in a different path (CE, CS, EE even, physics, etc) when feeling like things weren't working out. I was lifted up as I could follow everything you noted and that I was able to see how worthwhile my time and degree were at my university. I just mean to say I appreciated so much having a reminder when you feel a job struggle to see that you have value and no one can take that away, including in this small way like having an education even if no one is acknowledging it yet. 🙌🏾
spec. execution is not only about filling up the cache to be ready, it can actually execute part of the code in different execution units but later either keep or discard the results depending on the path taken
Tf is your pfp
Exactly. See Lex Fridman's first podcast with Jim Keller for a really good explanation of how modern processors work in this way.
it's not every ARM processor, only V9? so title is kinda clickbait
nice sponsor, heard good things about that dude
Remember Pointer is the variable holding the address not the address itself, Dope content, massive respect …
I find amazing that the people can speak about such advanced subjects, while I try simple to fit an excess 127 code for a normal overflow fix in a vhdl dsp fpu unit. My God, where do you have the time to read these subjects?
Thank you for your vids. Any update on that php vulnerability? Couldn't find further info on the details of it, beyond being related to language/encoding.
@@kiverismusic iconv chinese extended character bug, the fix is with a glibc update
I suspect we're heading towards a fundamentally unpatchable, ubiquitous and catastrophically effective exploit that forces us to fundamentally re-think chip design.
With software moving faster than hardware this has always been inevitable but it's still crazy to think this is probably coming in my lifetime.
Even crazier to think that the chip that's supposed to solve all these problems may end up being the Mark of the Beast described in the Bible
This just defeats a defense in depth measure. The computer is still secure.
The answer is rust. Rust all the way down.
@@mfaizsyahmi If an r0 exploit can for example manipulate any memory, nothing running on that system is secure, at any level. Not rust, not other drivers, literally every computer state can be manipulated - the entire stack even the bios.
@@74Gee A vulnerability is not automatically an exploit. If your computer only ran rust programs compiled with a trusted compiler, the chance of an r0 vulnerability leading to an exploit would be drastically reduced. Similarly, if I had a fully secure interpreter I could run untrusted interpreted programs on a CPU architecture without any hardware/firmware security features at all and still be secure.
Ergo any hardware vulnerability can theoretically be patched in software, with a certain performance penalty. In practice, any sufficiently severe exploit could take down the internet causing untold damage.
@LowLevelLearning
Just because I'm not sure if I've understood everything correctly.
This memory tagging is just an additional security mechanism in ARM processors and not the only one?
So this design flaw doesn't make ARM processors less secure than other processor architectures, it just makes them less secure than intended. Correct?
Or do ARM processors lack other security mechanisms that other architectures have?
Thanks for the video and book suggestions 👍
2024 - The year of the backdoor and the vulnerability
hold your popcorn... AI is coming hard
Spectre broke literally nothing. It was a hype wave that lingered for a couple weeks and went away. Nothing ever was heard about any hacks exploiting it after. I expect the same is going to happen to this bug too.
Thank you, very distinctive explanation ! Keep up ! Good luck ! I have some different CPU boards (AllWinners family) but luckily they are v6 and v7.
Amazing find by these researchers! This is the beauty of our community: ppl take time and try new things and find these bugs like this!
Jeez. What's up with all of those serious recent exploits?
honestly this is common, i'm just making more people aware of it. bugs are everywhere
Probably recency bias. Exploits come out all the time, but due to the big ones early this year people are on edge and more of them go mainstream.
@@LowLevelLearning all these code issues is why I'm waiting for the day computers program computers. Humans arguably suck at it, as we've seen.
@@IncertusetNescio this kid really thinks AI is going to take over😂😂😂
@@IncertusetNescio I don't think that's happening anytime soon. AI is trained off human data, and thus makes just as many errors as the average human, if not more
The pacman vulnerability has existed for a few years, the big take away from this paper is that they found a pattern to exploit it in other code.
Love that they’re called gadgets, like in hardness proofs
Great video and information!
I just assume that all computers are inherently insecure and act accordingly
Spectre and meltdown did not break the internet.
It's a classic side-channel attack, more exactly a timing attack. It's pretty well-known in cryptography. Nice work, in a way. That's hardly a bug, but I suppose the title is more catchy.
Love this guy. Incredibly smart, incredibly articulate. Really impressed. An inspiration to us all.
Somewhere I read and/or saw John Hennessy and David Patterson. They discussed the limitations of current processor designs, emphasizing that security vulnerabilities like Spectre and Meltdown, as well as diminishing performance returns, stem from reliance on techniques such as speculative execution. They propose a shift towards domain-specific architectures (DSAs) and processors capable of executing high-level language constructs directly. This approach would enhance security, performance, and energy efficiency by reducing the need for complex compiler translations and leveraging the open-source ecosystem for rapid innovation. But then legacy support as we have it now digging back to the 70s would be hard to maintain .. ;)
0:09 You know that there's three computers in the term "ARM computer"?
First, the obvious "computer". Second, "ARM" stands for "ACORN RISC Machine", "Machine" referring to a computer. Third, "RISC" stands for "Reduced Instruction Set Computer", revealing the third computer.
Almost blew my mind when I first realized that XD
@@Lampe2020 so spell it out, Acorn Reduced Instruction Set Computer Machine Computer 😂
@@nicholasvinen
Exactly.
That brings to mind the people who say things like, "ATM machine" and "PIN number".
Arm no longer stands for anything.
It stopped standing for Acorn and moved to Advanced RISC Machine in the mid 90s. And in 2017 moved from ARM to Arm.
(Source: I'm an employee.)
@@m1geo Your message explains a lot.
TMA = Too Many Acronyms
Apple: "It's not our Apple Silicon ARM chip, you're using your Macbook wrong"
Pretty awesome find by the team
There's a lot of 'IF's in there. If you can find the right code, if you can find the tag , if you can change it, if... if.. if...
Whilst this is a possible route for an attack has anyone actually used this in the real world, not just in the research lab.
@@kevintedder4202 if anyone did, it would probably be state level threat actors. These are the kind of zero days that sell for tens of millions.
Broken arm
Because of you I am more interested in assembly language and CPU architecture
This is fundamentally similar to a hash collision exploit, so the solution is the same. Increase the entropy on the memory tags so that the reuse is practically impossible.
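A quick back-of-the-envelope sketch of what more tag entropy would buy. Only the 4-bit row corresponds to real MTE; the 8-bit and 16-bit rows are hypothetical widths chosen for illustration, and both helper names are made up. Note the caveat it exposes: against an attacker who has a crash-free way to test guesses, which is what the video says the paper describes, wider tags only raise the number of probes; they do not close the leak itself.

```python
def blind_guess_success(tag_bits: int) -> float:
    """Chance that a single blind guess of a uniformly random tag is right."""
    return 1 / (1 << tag_bits)


def max_oracle_probes(tag_bits: int) -> int:
    """Worst-case number of guesses if every guess can be tested without a crash."""
    return 1 << tag_bits


if __name__ == "__main__":
    # 4 bits is what MTE uses; 8 and 16 are hypothetical for comparison.
    for bits in (4, 8, 16):
        print(
            f"{bits:2d}-bit tags: blind guess succeeds with p={blind_guess_success(bits):.6f}, "
            f"oracle needs at most {max_oracle_probes(bits)} probes"
        )
```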
Who would've thought that doing insane things just so you wouldn't have to admit to yourself that Moore's Law has been dead for a lot longer than people imagine would've caused so many security issues?
Seriously underrated comment.
wow, only option now is templeos
Always has been
Time Cube security is unmatched
Will this require a hardware level redesign or can it be fixed with compiler patches?
Kinda neat explanation of virtual memory, wish had it when wrote driver for Armv8 MMU. Also not the speculative execution exploit again
Reminds me of my introduction to Java. How to get rid of most security holes? Bounds checking. References, not pointers. Fantastic I thought. Security built into the virtual machine! But here we are literally decades later and we're still in the C/C++ paradigm. Billions of dollars a year this costs, yet we're unwilling to abandon thinking in terms of pointers and unwilling to make things like runtime bounds checking mandatory.
"We are unwilling to take 3 to 10 times performance impact"? I wonder why.
@@denysvlasenko1865 That's ancient news.
@@denysvlasenko1865 Not to mention Java's propensity to just not properly garbage collect.
but the JVM is based on C/C++ ain't it ?
@@denysvlasenko1865 ur paying the performance impact with the cpu trying to fix ur mistakes, bounds checks can also be compiled away in a lot of cases
also i feel like u made up the 3 to 10 times number, if the bounds checking always succeeds then isnt branch prediction just gonna be always right and u would have no impact? in hot loops at least
It’s NOT in every ARM CPU! Change this clickbait title. 😒
@@ArnaudMEURET in which arm cpus are they in? the snapdragon x cpus?
@@be8090
From what I could gather from Wikipedia, it’s in ARM Cortex X2 through X4, which means ALL Android-based smartphones of 2024 and 2023 and a good number of Android-based smartphones from 2022 (especially Samsung Galaxy S22 and co.). Note: usually only the performance cores are X2 or later.
Interestingly, MTE was introduced with ARMv8.5-A (so really all architecture revisions from 8.5-A through 9.4-A have MTE (though 9.0-A is really just 8.5-A with additional features); whether this bug was ever patched in any of the later revisions, I do not know). This means MTE has been on Apple A-series SoC since A14 Bionic and on EVERY Apple M-series SoC since the first. This means for Apple smartphones *and tablets,* it’s been present since iPhone 12, 3rd gen iPhone SE, 10th gen iPad, 4th gen iPad Air, 6th gen iPad Mini and 5th gen iPad Pro. For Macs, it’s been present since 2020 for MacBook Air, MacBook Pro and Mac Mini, 2021 for iMac, 2022 for Mac Studio, 2023 for Mac Pro and 2024 for Vision Pro.
Assembly code since the 70s here .. and yes, we're still longhaired and play music .. approaching 62 :)
I think calling speculative execution "execution in the future" is misleading as it conveys the idea of a "front-running thread", which is a very distinct and different thing.
The processor simply runs a program and if it needs to make a branch/turn and does not know which way to go, it speculates.
To keep a proper program state, this speculative execution cannot do certain things, but once the speculation is confirmed to be correct, the accumulated speculated results can be committed.
From the processor's perspective, running the program is just executing current code, only on a speculated branch.
There is of course a lagging program-state that represents the validated non-speculative outcomes.
It can restart from this state when the speculated code turned out to be the wrong code and resume with the correct code instead.
A processor is thus not "executing future code".
It might run the wrong code and discard the results, but it's not running ahead of the actual program.
That is a lot less mystic and magical to me.
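A rough way to picture the commit/discard behaviour described above is a toy model with a committed state and a speculative buffer. This is only a sketch: `ToyCore` and its methods are made-up names, and real cores track this with reorder buffers and register renaming rather than dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCore:
    """Toy model: validated (committed) state plus unconfirmed speculative results."""
    committed: dict = field(default_factory=dict)
    speculative: dict = field(default_factory=dict)

    def speculate(self, writes: dict) -> None:
        # Run the guessed branch: results only land in the speculative buffer.
        self.speculative.update(writes)

    def resolve(self, guess_was_right: bool) -> None:
        if guess_was_right:
            # Speculation confirmed: commit the accumulated results.
            self.committed.update(self.speculative)
        # On a wrong guess the results are discarded and execution
        # resumes from the committed state.
        self.speculative.clear()


core = ToyCore(committed={"x": 1})

core.speculate({"x": 2, "y": 3})
core.resolve(guess_was_right=False)
after_wrong_guess = dict(core.committed)

core.speculate({"x": 2, "y": 3})
core.resolve(guess_was_right=True)
after_right_guess = dict(core.committed)

print(after_wrong_guess, after_right_guess)
```

The model deliberately leaves out the cache. In real hardware the discarded work can still leave traces there, which is what the attacks discussed in this thread measure.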
holy authentication man, just tried to enroll in your arm course and I had to log in like 5 times. Would be worthwhile for you to look into that.
we are overhauling the auth
@@LowLevelLearning time to roll out Firebase Auth haha
@@LowLevelLearning after purchase I also had to log out and log back in
Explanation starts @5:50
thanks
zoomers focus attention for 10 minutes challenge [IMPOSSIBLE]
@@agibitable people have a life to live
@@rian0xFFF You would be better off just not consuming media at all, then. Especially if you aren't even going to engage with it in good faith.
That is a super cool exploit.
I remember reading that from Aleph1 back in the day 😯 Seeing that paper just took me way back!
It's another "we speculated, rewound and forgot to invalidate the cache" error. When will CPU designers learn to make cache invalidation the default behavior on a speculation rewind if there was a cache swap during the speculative block?
@@fluffy_tail4365 except they never got cached, and that's how they figure out what the memory tag is: they iterate through the numbers and see which one ended up in the cache, because that's the real one. The real exploit here is the side-channel memory access.
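The "iterate through the numbers" idea can be sketched as a toy simulation. Everything here is an assumption made for illustration: `speculative_probe` and `leak_tag` are made-up names, the cache is a plain Python set, and "a matching tag leaves the probe line cached" is a simplification of the paper's gadgets, not a copy of them. What is certain is that MTE tags are 4 bits wide, so there are only 16 candidates to try.

```python
TAG_BITS = 4  # MTE tags are 4 bits wide, so there are only 16 candidates


def speculative_probe(secret_tag: int, guessed_tag: int, cache: set) -> None:
    """Toy gadget: a mispredicted branch dereferences a pointer carrying guessed_tag.

    No fault is ever raised because the access is squashed. The only trace
    left behind is whether the probe line was pulled into the cache.
    """
    if guessed_tag == secret_tag:
        cache.add("probe_line")


def leak_tag(secret_tag: int) -> int:
    """Try every tag and report the one that left a trace in the cache."""
    for guess in range(1 << TAG_BITS):
        cache = set()                  # stands in for flushing the probe line
        speculative_probe(secret_tag, guess, cache)
        if "probe_line" in cache:      # stands in for a fast timed reload
            return guess
    raise RuntimeError("no tag matched")


recovered = leak_tag(secret_tag=0xB)
print(hex(recovered))
```

On real hardware the membership test would be a timing measurement, and it would need to be repeated to filter out noise.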
performance hit from failed speculations would be a dog
The issue here is that no cache fill happens for the speculated code, and that absence can be detected later on.
And since wrongly speculated code generates no error, they can keep trying new tags until they find the correct one.
For me the real question is how they consistently fool the branch predictor to speculatively execute code for a branch never taken!
Because that is what bypasses the security here.
I would not call this a timing attack, but a branch algorithm attack.
@@TheEVEInspiration It's in the paper. You can see it in the short glimpse you see of the page before he zooms in (around 6:48). It says that they run the code multiple times with correct pointers and *cond_ptr true, to condition the branch predictor. They then make one guess with *cond_ptr false that triggers the speculative execution.
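The training trick can be shown with the textbook 2-bit saturating counter. This is a sketch under a loud assumption: real ARM branch predictors are far more elaborate and undocumented, and `run_gadget` is a made-up stand-in for the paper's gadget, not its actual code.

```python
class TwoBitPredictor:
    """Textbook 2-bit saturating counter: 0-1 predicts not-taken, 2-3 predicts taken."""

    def __init__(self) -> None:
        self.counter = 0

    def predict(self) -> bool:
        return self.counter >= 2

    def update(self, taken: bool) -> None:
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)


def run_gadget(predictor: TwoBitPredictor, cond: bool) -> bool:
    """Return True if the branch body ran speculatively although cond was false."""
    predicted_taken = predictor.predict()
    wrongly_speculated = predicted_taken and not cond
    predictor.update(cond)  # the predictor learns the real outcome afterwards
    return wrongly_speculated


predictor = TwoBitPredictor()

# Training phase: the condition is true every time, so the counter saturates at "taken".
training_runs = [run_gadget(predictor, cond=True) for _ in range(8)]

# Attack run: the condition is now false, but the predictor still guesses "taken".
attack_run = run_gadget(predictor, cond=False)

print(training_runs, attack_run)
```

The body is never architecturally executed in the attack run, so no fault is raised, but it did run speculatively for a short window.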
@@HerrNilssonOmJagFarBe Interesting, that is just changing data out after a few tries, so simple.
The existence of these kinds of bugs reinforces why most of these hardware security features are often not worthwhile. All these "secure enclaves", "secure boot" and such are just waiting to be exploited and broken, and the fixes just make things even more complicated or slow. In the past we had viruses and such, but at least that was just software that could be fixed with patches and, at most, reinstallation. Now we have hardware that will be perpetually flawed, and even closing some of the bugs through microcode updates might not be 100% effective. Now we have to live in fear that something has permanently exploited our systems because the hardware itself is breakable.
Yup, overthink the plumbing making it easier to stop up the drain - to paraphrase a particular engineer. I think you hit a key point that these are permanently baked-in features. Zero day one of these and let the fun begin! 😲
@@meltysquirrel2919 speculative execution specifically has such a massive impact on performance that not doing it just ain't an option.
It was to the point where users would go out of their way to disable spectre/meltdown patches and see a *significant performance increase* until the patches were improved.
And it's not like speculative execution was disabled, it was just reduced. And even that was noticeable enough to be a concern.
So yeah, in a case like this, the plumbing is simply complex, no way around it. That's just how computers are at the lower levels.
You aren't piping a sink to a drain. You're piping a thousand sinks to a thousand drains in real time according to a set of given instructions.
And as it turns out, it isn't easy. And the incentive for breaking that plumbing is massive, so a lot of people are working on doing so.
The end result is what we're seeing here. Complex plumbing getting broken by people with massive interest in doing so. Fun...
There's always a trade-off. You can have a simple, provably safe hardware architecture if you're willing to accept an arbitrary performance impact in return. You can have fast, secure hardware if you're willing to pay significantly more for overprovisioned hardware. You can run on insecure hardware with no risk if you airgap your system, drastically crippling its usefulness.
Sure, you can cut out a feature you think is unsafe. But what are you willing to sacrifice in exchange - security? Performance? Flexibility? Compatibility? The tradeoffs are where the real engineering happens.
All those hardware vulnerabilities require a software vulnerability first. That software vulnerability would still exist even if the hardware had no security measures to speak of.
At worst, the hardware security features do nothing and lull you into a false sense of security. However, they never directly decrease security.
Persistent (against re-install) viruses can only be stopped if you make the firmware read-only at a hardware level. That is one area where I agree with your assessment. A little toggle switch to write-protect all firmware would go a long way. Then if you think the hardware security does more harm than good you can still permanently disable firmware updates making persistence impossible.
Uefi has entered the chat
Seriously, the people behind that paper need to be praised as heroes.
It's been a bad few months for security vulnerabilities
Speculative execution really is a double edged sword. On the one hand it made x86 what it is today (performance wise) but on the other hand introduces a lot of complexity and attack surface. And now ARM is affected too. Although this is not nearly as bad as Spectre/Meltdown.
Isn't this the exact thing that happened with Apple Silicon?
Or at least very similar?
Apple Silicon is Arm
Well, it is ARM, so I'm assuming yes.
@@gljames24 not really. Arm based but very custom.
Thank you, nice and simple. Not so much about the hack, but rather the details of the limitations of the hardware implementation. We need better hardware developers. Which is to fire the crappy software developers. What a wasted effort, on the part of ARM, in the realm of address security. So remove the tag, remove the interrupt, or remove the look forward. We should quit worrying about speed, and actually do the job that is required. But no, OMG we used 4 bits more than before, we used 3 clock cycles. I believe in perfection before speed or space. Anyway thank you so much for the details that you supplied, I really enjoyed your talk. Keep up the good work.
It seems like this is very similar to PACMAN, except that paper breaks pointer authentication codes instead of memory tags. Both take the approach of brute-forcing a small secret (a 16-bit PAC there, a 4-bit MTE tag here) by abusing speculation.
I bet it has something to do with pointer authentication (control flow).
3:55 I wasn't that far off LOL
@@tablettablete186 you were actually on the mark since tags are used to authenticate pointers.
Is this the year of exploits?
sick video thanks
The mere mention of "speculative" and "prediction" already makes my neck hair stand up...
Unfixable bug? More like NSA engineered backdoor 😂
Are you surprised? All CPUs have back doors. x86 has them, so ARM having them is no surprise at all to me; it makes sense 🤣🤣
This is why I use an abacus. Granted, AR/VR apps are tricky, but no viruses!
The JavaScript V8 engine uses a technique called NaN Boxing and Pointer Tagging which attaches the variable type inside the pointer address
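For anyone who hasn't seen pointer tagging, here is a minimal sketch of the general idea: aligned addresses have spare low bits, and a small type tag can ride along in them. The tag values and function names are hypothetical and are not V8's actual layout.

```python
TAG_MASK = 0b111  # 8-byte aligned addresses leave the low 3 bits free

# Hypothetical tag values, chosen only for this example.
TYPE_TAGS = {"object": 0b001, "string": 0b010}


def tag_pointer(address: int, type_name: str) -> int:
    """Pack a small type tag into the unused low bits of an aligned address."""
    if address & TAG_MASK:
        raise ValueError("address must be 8-byte aligned")
    return address | TYPE_TAGS[type_name]


def untag_pointer(tagged: int):
    """Split a tagged pointer back into (address, tag)."""
    return tagged & ~TAG_MASK, tagged & TAG_MASK


tagged = tag_pointer(0x7F001000, "string")
address, tag = untag_pointer(tagged)
print(hex(tagged), hex(address), bin(tag))
```

MTE works on the same principle but keeps its 4-bit tag in the otherwise unused top bits of a 64-bit pointer instead of the low bits.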
Apple also has a vulnerability in its M series
probably because it's ARM.
Snapdragon also. And pretty much every recent major ARM CPU, save the lower-end models (e.g. Cortex-M).
Should've used rust /j
:D unfortunately side-channel attacks are impervious to whatever rust throws at it if the hardware is unfit to provide for such security.
@@FrankHarwald Ah but in this case the vulnerability only bypasses a security system used to mitigate memory corruption vulnerabilities. If your program is written in rust chances are that there are no memory corruption vulnerabilities to begin with, so the attack is possible but useless.
Edit: Changed "prevent" to "mitigate".
copium
So this is brute-forcing the tag by exploiting the CPU's assumption about the outcome of code that is yet to be run? Brilliant!
awesome content
x86, M1, ARM, we're just building a collection of vulnerabilities.
How the hell do they find these?
Automated tools
@@trens1005 they apparently created fuzzers you can run to find these, but it is also a challenge to even know what you are looking for. But who would have thought that MTE is vulnerable. This was probably months of research
@@trens1005 They did create fuzzers true, but it is also a challenge to even know what you are looking for, they probably did months of research, like who would have thought MTE was vulnerable
fr
why did 2 of my own replies get deleted
Hi @LowLevelLearning I just took your course from Low-level academy... Would be great if u can add a detailed OS course to that... Also add more content for ARM and C
Pro tip: show hex values (like pointers with embedded info for tags or virtual memory) in a monospaced font. Programmers can visually parse the fields much more easily. Thanks.
Man, please let RISC-V be somehow safer...
If they don't include the feature... but they definitely will, because it gives a much-needed boost in performance. At first.
Speculative execution was a mistake
I do not like the death penalty in general, but they should at least properly trial people first.
A poison tree
This is crazy smart
hey man, great courses you got there.. I am sure they would be worth the bucks..
though, I am not enrolling.. I hope many others enroll.. cheers..
I mean this was all fixed in the '80s with capability systems, but then C programmers wouldn't have the ability to fuck themselves over, so here we are... Absolutely no reason for software to have access to pointers. Just... lol man
In a system designed... not by a C programmer... even if the pointers were printed out it wouldn't help, because you can't address memory directly via pointers. It's not a thing there are ISA commands for.
Lol man... this tag thing is kind of a capability system...
Also, for a capability system to work efficiently you need caching and speculation too...
@@AK-vx4dy It's a patch trying to be a capability system... but it's not, which is why it got crushed like tin foil.
Could you please elaborate on such ISAs ? I find that quite interesting, but from a quick search nothing quite like it came up. (either now classical ISAs or capability based security disconnected from ISAs)
@@filip0x0a98 I don't have a link, but I saw a PDF study about implementing capabilities on current processors with changes to the compiler and kernel, and even some possible compatibility with older software
@@AK-vx4dy If it is done in software it's broken.
ARM, x86, and I think RISC-V all grant you access to anything you want if you have the magic memory address.
The Flex System, Tendra, and others used things like object-addressed memory, where you ask for a memory object, so you can't use after free, go out of bounds etc., as it's all mediated by hardware ensuring the interaction is correct and permitted.
The other advantage of hardware memory management and scheduling is that you don't spend thousands of cycles context switching as you negotiate with the OS; you just focus on computation the whole time.
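To make "object-addressed memory" a bit more concrete, here is a toy software model of the behaviour described above. It is only an illustration of the access rules: `ObjectMemory` and its methods are invented names, and the comment's whole point is that real systems enforce this in hardware, which a Python class obviously does not.

```python
class ObjectMemory:
    """Toy object-addressed memory: callers hold opaque handles, never raw addresses."""

    def __init__(self) -> None:
        self._objects = {}
        self._next_handle = 1

    def allocate(self, size: int) -> int:
        handle = self._next_handle
        self._next_handle += 1  # handles are never reused, so stale ones stay invalid
        self._objects[handle] = bytearray(size)
        return handle

    def free(self, handle: int) -> None:
        self._objects.pop(handle, None)

    def _check(self, handle: int, offset: int) -> bytearray:
        obj = self._objects.get(handle)
        if obj is None:
            raise PermissionError("stale or forged handle")
        if not 0 <= offset < len(obj):
            raise IndexError("offset outside the object")
        return obj

    def load(self, handle: int, offset: int) -> int:
        return self._check(handle, offset)[offset]

    def store(self, handle: int, offset: int, value: int) -> None:
        self._check(handle, offset)[offset] = value


def outcome(action):
    """Run action and return its result, or the name of the error it raised."""
    try:
        return action()
    except (PermissionError, IndexError) as exc:
        return type(exc).__name__


mem = ObjectMemory()
buf = mem.allocate(4)
mem.store(buf, 0, 42)

in_bounds = outcome(lambda: mem.load(buf, 0))
out_of_bounds = outcome(lambda: mem.load(buf, 4))
mem.free(buf)
after_free = outcome(lambda: mem.load(buf, 0))

print(in_bounds, out_of_bounds, after_free)
```

Every access goes through the check, so an out-of-bounds offset or a freed handle is rejected instead of silently touching someone else's memory.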
How is it possibly less intensive to try to predict the future and have it loaded than it is to just do the thing you want to do when you decide to do it? That makes no sense... Sounds like speculative execution is a built-in attack vector for anything running on the device, one that is meant to have some plausible deniability... Like, oops, sorry, the entire system is a giant security vulnerability... Why don't they hire the people that find this stuff and have them make an OS that doesn't have these problems ffs
Because of parallelism? It traverses multiple paths, and keeps going with the required path. It's kind of genius, except super exploitable apparently.
@@WaltH-sv6to You go to McDonald's, and you order a Big Mac. What's faster, them starting to cook it when you arrive at the counter, or them realizing there is a line of 20 people and deciding to bulk crank out burgers ahead of time?
Speculative execution is one of the major architectural speed improvements of modern CPU design. The fact you claim it "makes no sense", suggests you haven't even taken a few seconds to understand its purpose. Engineers didn't just add it in for funsies.
@@adamsoft7831 good analogy.
@@adamsoft7831 I’d say it’s more like you see someone heading towards the entrance, so you decide to start making a Big Mac, but there is a chance the customer will order chicken nuggets instead, in which case you will discard the Big Mac and start making nuggets from scratch
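The burger analogies boil down to simple arithmetic. The numbers below are invented purely for illustration (they are not measurements of any real CPU): waiting costs a stall on every branch, while guessing only costs something on the branches that were guessed wrong.

```python
def cycles_if_always_waiting(branches: int, resolve_latency: int) -> int:
    """Stall on every branch until its condition is known."""
    return branches * resolve_latency


def cycles_if_speculating(mispredicted_branches: int, mispredict_penalty: int) -> int:
    """Only wrongly guessed branches cost anything: the work is thrown away and redone."""
    return mispredicted_branches * mispredict_penalty


# Made-up example: 1000 branches, 15 cycles to resolve a branch,
# 95% prediction accuracy (50 wrong guesses), 15 cycles lost per wrong guess.
waiting = cycles_if_always_waiting(branches=1000, resolve_latency=15)
speculating = cycles_if_speculating(mispredicted_branches=50, mispredict_penalty=15)

print(waiting, speculating)
```

With those assumed numbers, speculation spends 750 cycles on branch overhead where always waiting would spend 15000, which is why nobody wants to give it up.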
Not *EVERY* ARM cpu! I moved into developing 32 bit asm on the ARM2 and even had a go at an original ARM1 BBC Micro cheese wedge which never was really a product, just a dev system.
I can categorically say that this exploit will not work on either of those CPUs as they had exactly zero kilobytes of cache :) With 4k cache on the ARM3, and a 24/26 bit address bus and processor status stuffed into the remaining 6/8 of 32 bits... I still think you'd find it impossible.
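For readers who never met the 26-bit ARMs: the program counter and the status bits shared register R15. A small sketch of that layout follows. The bit positions match the classic 26-bit architecture (NZCV flags in bits 31-28, IRQ/FIQ disable in bits 27-26, word-aligned PC in bits 25-2, mode in bits 1-0); `unpack_r15` is just a name made up for this example.

```python
def unpack_r15(r15: int) -> dict:
    """Split a 26-bit-mode ARM R15 value into its PC and status fields."""
    return {
        "flags_nzcv": (r15 >> 28) & 0xF,   # N, Z, C, V condition flags
        "irq_disable": (r15 >> 27) & 1,    # I bit
        "fiq_disable": (r15 >> 26) & 1,    # F bit
        "pc": r15 & 0x03FFFFFC,            # 24-bit word address = 26-bit byte address
        "mode": r15 & 0b11,                # processor mode bits
    }


# Example value: N flag set, PC at 0x8000, mode bits 0b11.
fields = unpack_r15(0x80000000 | 0x8000 | 0b11)
print(fields)
```

That leaves exactly the 6 + 2 status and mode bits mentioned above, with no spare pointer bits for anything like a memory tag.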
It may do wonders for performance and optimisation, but nondeterministic processing is abysmal in terms of security. Cache management, branch prediction, and speculative execution, what an unholy trinity.