As the old joke goes: There are two hard things in computing: Naming things, cache invalidation, and off by one errors.
Related to this: only 10 types of people, those who know binary and those who don't.
Bro being cheeky with the off by one pun
there are only 10 types of people, those who know binary, those who don't and those who didn't expect a ternary joke
I have this on a t-shirt, I am in 2 minds about how it should be numbered (1,2,3 as current; or 0,1,2 which feels more appropriate, sort-of)
Works also as There are three hard things in computing: Naming things, cache invalidation, and off by one errors. Shit I fu up the joke.
As a cpp user, I like putting constexpr everywhere so that 100% of my code executes at compile time and I achieve infinite fps
If you constexpr everything, then the moment you run the game it immediately ends so you can move on to another one
Efficiency over usability! I like it.
By C++48 everything in C++ will be constexpr and the compiler will simply have the equivalent of the Java JVM built in
Constexpr doesn't guarantee compile-time execution, and it will only work on simple stuff anyway (see the sketch after this thread)
Now that's a compiler that can *actually* see into the future (unlike the fake see-into-the-future that the speculative execution is).
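For anyone following along at home, a minimal sketch of the distinction this thread is riffing on: `constexpr` only *permits* compile-time execution, while C++20's `consteval` forces it. The `fib` function is just an illustrative stand-in.

```cpp
#include <cstdio>

// constexpr: MAY run at compile time, but only must when the result is
// needed in a constant expression (e.g. initializing a constexpr variable).
constexpr long fib(int n) {
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

// consteval (C++20): MUST run at compile time, or the program is ill-formed.
consteval long fib_forced(int n) { return fib(n); }

int main() {
    constexpr long a = fib(20);   // compile time: initializes a constexpr variable
    long b = fib(20);             // compiler may legally evaluate this at runtime
    long c = fib_forced(20);      // always evaluated at compile time
    std::printf("%ld %ld %ld\n", a, b, c);
}
```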
Speculative execution has been around since the '80s, since RISC enabled pipelining. The reason is that pipelining gives a speedup of m, where m is the length of the pipeline, as long as the pipeline is full. The cost is a startup latency in the so-called "windup" phase, the phase where you fill the pipeline. You pay this price every time you branch, since you have to refill the pipeline. Without branch prediction and speculative execution you would have to refill the pipeline at the final branch of every for loop, fully defeating the advantages of pipelining and actually making it worse than not pipelining at all (there's a worked formula after this thread). I have an exam on programming techniques for supercomputers in a few days, life is hard
Btw, the exam went great, if anybody on this planet was wondering!
@@indiano-99 You're built different my dude.
@@indiano-99 I had just finished reading the first post, and indeed, I was curious how it went.
Well done, you! I hope you've treated yourself to an ice cream.
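To put the pipelining argument from the top of this thread in textbook form: an m-stage pipeline finishes n instructions in m + (n - 1) cycles instead of m·n, so

```latex
\text{speedup} = \frac{m \cdot n}{m + (n - 1)} \;\longrightarrow\; m
\quad (n \to \infty),
```

and every flush costs roughly m - 1 cycles of refill, which is exactly the windup price paid at each mispredicted branch.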
if i had a dollar for every low level bug/cpu security feature discovered this year i'd have like 20
Good money
Yeah, I thought this already happened last year and this year; now twice this year for the ARM.
But still not enough to buy a carton of eggs 😔
If I had 2 to the power of the number of security bugs found this year, I would have around a million.
5:25 "I am not a hacker, I just ship bugs to production" said the hacker's best friend =)))
That joke just flew past me initally
She speculates on my execution 'til I crash.
Is that a buffer overflow in your pants or are you just happy to see me?
*[EXTREMELY LOUD STREAM ALERT]*
Speculative execution in 1970 on an IBM 360/75. A program I was testing had a S0C4 exception, basically an illegal address. When I reviewed the crash dump, I observed that maybe 5-6 of the following register-to-register instructions had been performed. No virtual memory, no page tables, just raw memory. So the machine was running ahead of the slowish instruction that accessed raw memory.
Sir, were you working with COBOL?
I started with COBOL but a few months later was transferred to the "systems" department and did nothing but assembler work for the next several years. A couple weeks playing with PL/I. Worked on auto answers to outstanding WTORs, patched the MVT open routines so the operators log would write to a tape, improved performance of long running programs... normal stuff at that time.
Watching Prime spend 5 minutes trying to dig out Memory Management Units (MMUs) and Translation Lookaside Buffers (TLBs) from his school days memory is wild.
Props to you man. Streaming live content is tough!
Just need to remember that without TLB cache the MMU sh*t wouldn’t execute fast enough on its own to be worth a d*mn
@@TheSulross Could you say it's not worth a DIMM? Sorry i'll see myself out
@@TheArrowedKnee Well, we all know how touchy YT can be with those explicit words - that was very brave of you to go there
Me trying to sleep: zzzz
That one mosquito: 5:36
Underrated
I'm dying
You should interview Christopher Domas.
The TLB, the virtual cache table, makes memory storage more efficient, only storing the actual bytes in any register. The remaining bits can be used to store other pieces of information, having two or more items in a single register. Before virtual paging, you could only store, say, 4 bits in one register; in 32-bit registers that left 28 bits of empty, unused space.
3:40
Aleph_0 is the size of the natural numbers. Aleph_1, well, that gets complicated fast. The full answer requires somewhat deep infinitary set theory. It also requires that you get somewhat in the weeds with which set-theoretical axioms imply what. Here's an attempt to summarize:
Assuming merely the Zermelo-Fraenkel axioms, Aleph_1 is the second smallest infinity which can be well-ordered*.
The axiom of choice** is equivalent to saying all sets can be well-ordered, so if you assume that, this simplifies to 'aleph_1 is the second smallest infinity'.
The idea that Aleph_1 is the size of the real numbers is known as the continuum hypothesis. It was originally discussed by Cantor who invented infinitary set theory.
Hilbert then made proving or disproving the continuum hypothesis one of the problems of the next century in the year 1900.
Later Gödel proved the continuum hypothesis was consistent with the Zermelo-Fraenkel axioms and the axiom of choice (ZFC for short)***
Even later Cohen proved the negation of the continuum hypothesis was also consistent with ZFC.***
This means that the continuum hypothesis is independent of ZFC. Simply put, we can't prove or disprove it, without making assumptions we don't usually make.
This means that beyond saying that aleph_1 is the second smallest infinity (that can be well-ordered) we don't exactly know how big it is.
It might be the size of the real numbers, but it might be the size of some subset of the real numbers that's neither the size of the natural numbers nor the real numbers.
It's perhaps worth noting that without the axiom of choice the real numbers might not be well-orderable at all, and hence would not be any aleph.
* A well-order is a set joined with some kind of 'smaller than' operation under which every non-empty subset has a least element.
Short version: Aleph-1 is the cardinality of the set of countable ordinal numbers. The ordinal numbers represent the order of something. For example, 1st, 2nd, 3rd. You may think that the number of ordinal numbers and the number of cardinal numbers (positive integers) is the same, but it's actually not. For example, you can define an ordinal number called omega, which comes right after all of the natural numbers. Finishing in omegath place in a race means that an infinite number of people finished the race, and then you finished.
That's as far as I can get without getting into the weeds of it lol
it's a big sign of respect when Prime highlights your whole sentence
Prime just casually insulting top-notch cyber security researchers as basement dwellers. 😂
They won, but at what cost.
Win some, lose some.
Top-notch? When has he covered any?
26:42
One way to mitigate "the weather man problem" is to set up your chair and your camera slightly to the left of your monitor, that way when you look at the screen you are always looking to your right.
It's called "Aleph null" (pointer exception)
Speculative execution is vital for processor performance, but its repercussions seem to be too hard for humans to really fully comprehend. We're truly borked nowadays.
Starting with the Pentium 4 (!) we had branch prediction. At that time 50% of the silicon was just to detect if the prediction was wrong 🎉🎉
The difference between a programmer and a hacker. One creates exploits and the other exploits them.
@@Telhias that's more like the difference between a developer and a hacker. both are programmers.
It's like translation: translating is hard - well, not anymore, but it used to be - while finding mistakes or arguing over translations is easy, aka finding bugs or exploits. But one needs to produce the code/language, the other only proof-reads it.
take my muney for the "felt cute, might execute later" shirt
@18:06 "We still get like 15% per year"
Prime: "Yeah but 15% is nothing like it was in 2000s... like it 4x-ed over 10 years"
BUT PRIME 15% increase per year for 10 years is almost exactly a 4x !
1.15^10 is a 4.04x increase... (worked out after this thread)
actually from 2000 to 2004 the common clock went from 800MHz to 3.4GHz. Computing power was doubling every 2 years for decades. Now we're far from that, since the Core family was introduced in 2006. The number of cores and transistors still increases a lot, but the computing power doesn't scale that much.
“… to 2004” - did you mean 2024?
@@meowsqueak he did mean 2004 as he’s talking about the time of the Pentium 4
@@meowsqueak no. I think around 2006 there were 4GHz Pentium 4s (I was playing BF2 with one :-) ). The Core architecture brought lower clocks for years; only recently have they gone back to high frequencies, because they can't gain in architecture anymore
Ok, I didn’t realise clocks hit 3 GHz+ by 2004, but it was a long time ago…
@@meowsqueak maybe you were barely born :-), yes, young people probably don't realize how little tech has progressed since that time. Only core count, probably cycles per instruction, and vectored instructions account for any computing-power increase, along with parallel processing used by software whenever it makes sense (not that often, sadly), and a bit from lower-latency caches and memory bandwidth (frequency and bus width in bits). I have a laptop from 2008 with a quad-core Core i7.
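For anyone wanting to check the 15%-per-year figure from the top of this thread:

```latex
1.15^{10} = e^{10 \ln 1.15} \approx e^{1.398} \approx 4.05,
```

so a steady 15% per year does compound to roughly a 4x over a decade.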
How do you follow that chat stream? It's like a gaggle of chickens clucking
basedment++
About the pointer authentication: you can build a pretty simple chip to encrypt/decrypt pointers with very little overhead. Dedicated chips that do just one thing, like encrypting, can finish in a "single" clock tick (see the sketch after this thread).
PS: just realized that ASIC miners are exactly that, a dedicated encryption chip that is able to perform those encryptions orders of magnitude faster than any CPU.
Then how you can do offsets?
@@hanifarroisimukhlis5989 Honestly, I don't know. I never thought about implementing one, I just remembered my FPGA class where we had to code some hardware for specific problems. But I'm sure we could come up with a solution like we have now, something with vtables.
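A toy version of the idea in this thread. This is not ARM's actual PAC scheme (which uses the QARMA cipher with per-context keys in hardware); the key and the mixing function here are made up purely to illustrate stuffing a tag into a pointer's unused high bits.

```cpp
#include <cstdint>
#include <cstdlib>

// Toy pointer "authentication": on typical 64-bit systems only the low
// 48 bits of a user-space pointer are significant, so the top 16 bits
// can hold a keyed tag. The mixer below is a stand-in, NOT cryptography.
static const uint64_t kKey = 0x9e3779b97f4a7c15ull;  // hypothetical secret key

static uint16_t tag_of(uint64_t p) {
    uint64_t x = (p ^ kKey) * 0xff51afd7ed558ccdull; // cheap bit mixing
    return static_cast<uint16_t>(x >> 48);
}

void* sign_ptr(void* p) {                            // stash tag in high bits
    uint64_t raw = reinterpret_cast<uint64_t>(p) & 0x0000ffffffffffffull;
    return reinterpret_cast<void*>(raw | (uint64_t{tag_of(raw)} << 48));
}

void* auth_ptr(void* p) {                            // verify tag, strip it
    uint64_t v = reinterpret_cast<uint64_t>(p);
    uint64_t raw = v & 0x0000ffffffffffffull;
    if ((v >> 48) != tag_of(raw)) std::abort();      // corrupted/forged pointer
    return reinterpret_cast<void*>(raw);
}
```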
“All hackers are cool…ummmmm!!” Had me lol and smashing a like😂👍🏼
"Im not a hacker, I just ship bugs to production"
That got me
If a bug is not fixable, it’s not a bug, it’s a feature.
Yeah 😅🫠😬
That moment when all the researchers have Korean names: Oh no....
"likely" is mostly just about ICACHE - code that's likely branch is there to be loaded and "unlikely" somewhere else behind branch. It makes you take branch on unlikely code and have next likely code in ICACHE.
Aleph 1 = Cardinality of the smallest uncountable set
isn't it aleph null?
@@forest6008 No that's the cardinality of the smallest infinite set (countable infinity). Smallest uncountable set is the second smallest infinite set
@@TianYuanEX ohh thanks
@@TianYuanEX they're just sets, set can be defined as a number, thus you can type the type.
you can have a set of sets, but category theory is abstract non-sense.
if you can write it in mathematical notation, it is not infinite, isn't it ? it just discrete representation of a possibly infinitude, but it still not the biggest infinite thing.
here's a new concept. y = Aleph(Aleph(x))
Because Gödel's Aleph number wasn't infinite enough.
@@monad_tcp I'm gonna be honest with you, nothing you wrote makes any sense
I think there was something called PAE (Physical Address Extension) that let you access more than 4GB of memory on 32-bit Intel processors.
That wasn’t a bug. It was an attempt by Intel to extend the life of 32-bit software, thinking that 64-bit wasn’t ready yet.
IIRC, it allowed the 32-bit memory space to be mapped into a 36-bit physical space. No one program would have more than 4GB*, but many programs could be run with their own 4GB space. This reduced paging pressure on the system, something that was becoming a serious issue for operating system performance.
* In theory a program could have supported PAE directly and use more than 4GB of RAM. Similar to extended memory in the DOS days. In practice, I don’t think that ever happened.
@@thewiirocks I didn't say that its a bug. Prime mentioned that 32-bit programs cannot access more than 4GB of memory which is technically true but not so much.
@@thewiirocks Also useful to allow a 32-bit program to access 4GB of RAM on a 64-bit system, handy if you're modding games so they need more RAM than they did originally.
It basically allowed the system to address multiple 4GB memory spaces by widening physical addresses with extra bits, 4 to be specific. You could therefore have 2^4 * 4GB = 64GB of addressable physical memory, while each process still saw its own 4GB virtual space (see the arithmetic below).
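The arithmetic behind PAE's widened physical addresses:

```latex
2^{32}\,\text{B} = 4\,\text{GB (per-process virtual space)},\qquad
2^{36}\,\text{B} = 2^{4} \times 4\,\text{GB} = 64\,\text{GB (PAE physical space)}.
```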
V8 is ARMv8, the successor to ARMv7
1:58 They are in
nested mother's basement😂
runahead optimizations were already a thing in the early 2000s, at least in research, and they also enhance branch prediction greatly. But cache misses and memory latency were already big issues back in the '90s; there were just not enough resources in the CPU to do anything about it
Aleph 0 is the cardinality of the set of whole numbers. Aleph 1 is presumed (though not provable either way) to also be the cardinality of the set of real numbers; that presumption is the continuum hypothesis: no set is larger than Aleph 0 but smaller than the reals.
The reason we aren't sure is that the existence of sets with cardinality larger than Aleph 0 but smaller than the reals wouldn't break set theory, nor would their nonexistence: Gödel and Cohen showed the continuum hypothesis is independent of the usual ZFC axioms, so it can't be proven or refuted without assuming new axioms. (It has nothing to do with P = NP or the Riemann Hypothesis, which are separate open problems.)
I learned about this number theory in discrete mathematics. Aleph null is the set of all sets. I loved it because it introduced the concept of infinite infinities and counting infinity, and how some infinities are larger than others... I love this stuff.
No, with ZFC there isn't any "set of all sets". Aleph null is cardinality of a countably infinite set.
How do you count an infinite set? By its definition it will take an infinite time, unless the per-step counting time is literally zero (not even "effectively" zero).
7:45 All that a memory mapper does is translate a virtual address (limited only by disk storage capacity, from the programmer's point of view) into a real address by means of a very fast hardware table array. If the virtual address is not in memory at the current time, the page is brought in from disk storage or wherever it lives into the least recently used spot in real memory, and the memory mapper's table entry for that virtual address is updated to point at the newly paged-in program or data location in real memory. It's just a cool means of caching your disk space as virtual memory for program code or data.
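A minimal software model of the mechanism described above, with invented sizes and names; a real MMU does this in hardware with multi-level page tables, and the TLB caches recent translations:

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>

// Toy model of the translation just described: virtual page -> physical
// frame, page fault on a miss, evict the least recently used page.
constexpr uint64_t PAGE_SIZE  = 4096;
constexpr size_t   NUM_FRAMES = 64;               // tiny pretend "RAM"

struct Mapper {
    std::unordered_map<uint64_t, uint64_t> table; // vpage -> frame
    std::list<uint64_t> lru;                      // front = most recent vpage

    uint64_t translate(uint64_t vaddr) {
        uint64_t vpage = vaddr / PAGE_SIZE;
        auto it = table.find(vpage);
        if (it == table.end()) {                  // page fault
            uint64_t frame;
            if (table.size() < NUM_FRAMES) {
                frame = table.size();             // a free frame is available
            } else {
                uint64_t victim = lru.back();     // least recently used page
                lru.pop_back();
                frame = table[victim];
                table.erase(victim);              // (would write back to disk)
            }
            // (would read the page in from disk here)
            it = table.emplace(vpage, frame).first;
        } else {
            lru.remove(vpage);                    // O(n), fine for a sketch
        }
        lru.push_front(vpage);                    // mark as most recently used
        return it->second * PAGE_SIZE + vaddr % PAGE_SIZE;
    }
};
```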
When you try to do something crazy to improve performance, you often take on the risk that it's easy to mess up and break something, or open a security hole.
"a little exciting, but mostly dangerous"
double basement of knowledge is my new quality metric for these types of topics
"Gate all around - the future of transistors."
Good youtube video.
5:57 in chat "Yea its the rainbow hats" has got to be the funniest tech joke I've seen in a minute.
so was the joke at 4:57 funnier
4:50 Self modifying code. It's as cool as it is scary.
5:38, I legit thought he was doing the Captain Crunch whistle phone hack tone.
Very interesting that stack smashing is seen as a security issue - well - it _is_ a security issue - no doubt about that, but some systems/languages actually use this type of manipulation as a feature! In the Forth programming language, the return stack is often manipulated to execute code! What an interesting domain we work in!
8:15 it's a vpt prime, a virtual page table. the 'translation' is named virtual address translation, accelerated via translation look-aside buffers.
You can't do that sort of research in a basement; you need like an abandoned factory, maybe an entire underground complex.
Did I just hear mention of the cardinality of different infinities? This is why I pay the man money.
I would have thought we'd have transitioned from silicon to synthetic diamond for the substrate already
@26:06 ultimate bumper sticker
The golden era of CPU acceleration wasn't in the 2000s at all, more from the 70s to the 90s.
you mean when they went from wardrobe to printer size?
@@Kiyuja yes, when they went from decahertz to kilohertz to megahertz, that was amazing
obviously not, because it hadn't achieved gigahertz yet, which is where the gap is big
@@retropaganda8442There were already computers running at 1MHz before 1970...
1965-2013 was about exponential acceleration.
i wonder if just submitting both execution paths to different cores would be faster than branch prediction
Speculative execution came way before the CPU speed wall. It was already there in Pentium Pro processors in the 90's. It's nothing new.
"we dont wanna touch eachs others memory"
/proc/*/mem:
One game on, like, PlayStation or something used buffer overflows to do the first OTA updates. It had a pre-screen that showed game news. They realised they could send the game news with a bunch of extra data that would patch the game live...
This whole thing flew over my head.
7:45 no, memory mapping didn't come from 32bit. It probably came from when computers were running one program at a time and wanted to introduce multitasking.
Top quote here: russian-doll basement.
Async socket code in C is messy. The Windows implementation of I/O completion ports takes in structs which can be filled differently depending on what mode you want them to run in. E.g. you can get notifications as a Windows event-loop message, a WaitForSingleObject blocking call, or a GetQueuedCompletionStatus blocking call. By a blocking call I mean something similar to how pselect works: it blocks until one of the multiplexed sockets has an event.
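For reference, the minimal shape of the GetQueuedCompletionStatus mode mentioned above (Windows-only; error handling and the actual overlapped socket I/O submission are elided):

```cpp
#include <windows.h>
#include <cstdio>

// Skeleton of an I/O completion port loop (the blocking
// GetQueuedCompletionStatus mode). The WSARecv/WSASend submissions that
// would feed completions into the port are elided.
int main() {
    HANDLE iocp = CreateIoCompletionPort(INVALID_HANDLE_VALUE, nullptr, 0, 0);

    // ... associate each socket with the port via CreateIoCompletionPort,
    //     passing the socket handle and a per-connection key, then post
    //     overlapped reads/writes on it ...

    for (;;) {
        DWORD bytes = 0;
        ULONG_PTR key = 0;          // identifies which connection completed
        OVERLAPPED* ov = nullptr;   // the operation that completed
        // Blocks, like pselect, until one of the multiplexed handles is ready.
        if (!GetQueuedCompletionStatus(iocp, &bytes, &key, &ov, INFINITE))
            break;
        std::printf("completion: key=%zu bytes=%lu\n", (size_t)key, bytes);
    }
    CloseHandle(iocp);
}
```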
lmao the Russian doll knowledge/basement had me rolling
I first read about Aleph in a book called "Mathematics and the Imagination" by Edward Kasner and James Newman, 1949. It also introduced the infamous 'Googolplex'.
The concept of infinity was very popular at the time, though nobody really understood it, if anybody actually does now. The book illustrates this fact by leveraging finite numbers that are so big that they are more or less impossible outside of the imagination, whereas (regular!) infinity dwarfs even that into nothingness (the equation of size and distance is a misnomer; it can't actually apply).
It talks about a number of other fun and paradoxical mathematical subjects. It's one of the best books I have read and is still available for free online.
8:07 It was 2GB, or 3GB with Large Address Aware.
All these newer bugs around speculative execution make me scared about all the times I implemented it. Some of these are just too hard to catch.
You never tried? I still have some memories of a school project writing a stack overflow vulnerable kernel driver ...
thinking 10 seconds ahead is what you do on LSD lol!
No wonder i seem drawn to cybersecurity stuff
If I had a dime for every bug in "ALL ARM PROCESSORS" which is actually a f**k-up by Apple, I would have the market capitalisation of ARM.
GaN-based processors could go to the hundreds of GHz. P-type GaN transistors are just really difficult.
ℵ₀ is the cardinality of the set of natural numbers, the largest countable set.
ℵ₁ is the next cardinality, the cardinality of the smallest uncountable set.
𝔠 is the cardinality of the real numbers.
The Continuum Hypothesis (CH) states that ℵ₁ == 𝔠. This cannot be proven in ZFC because it has been shown to be independent of ZFC, that is, both it and its negation are compatible with ZFC.
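In symbols, what this comment (and the longer thread further up) establishes:

```latex
\aleph_0 < \aleph_1 \le 2^{\aleph_0} = \mathfrak{c} \quad\text{(provable in ZFC)},
\qquad
\text{CH:}\;\; \mathfrak{c} = \aleph_1 \quad\text{(independent of ZFC)}.
```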
4:02 wild "yeh"
"A russian doll basement situation to have that level of knowledge" Wait, isn't that just Mr. Zozin, aka Tsoding? 😆
Stack smashing can be prevented by a double-stack design, but nobody cares to implement this in current processors
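The "double stack" idea is usually called a shadow stack, and hardware versions do exist (e.g. Intel CET). A hand-rolled concept sketch using the GCC/Clang __builtin_return_address, for illustration only:

```cpp
#include <cassert>
#include <vector>

// Concept sketch of a shadow stack: every call pushes a second copy of its
// return address; every return checks the copy first. Real versions need
// compiler or hardware support; doing it by hand only shows the bookkeeping.
static std::vector<void*> g_shadow;

int vulnerable(int x) {
    g_shadow.push_back(__builtin_return_address(0)); // entry: save a copy
    int result = x * 2;                              // ...body that might
                                                     // smash the stack...
    // exit: the copy must still match the live return address
    assert(g_shadow.back() == __builtin_return_address(0));
    g_shadow.pop_back();
    return result;
}

int main() { return vulnerable(21) == 42 ? 0 : 1; }
```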
So the mechanism is basically Spectre but for memory tags? Yeah that sounds bad.
Branch prediction was a mistake...
...which is a HUGE problem since we don't seem to have any other way to speed up CPUs.
"I smashed some stacks, but only unintentionally". Is that dev speak for "I smoked, but didn't inhale"? I did smoke some chips, unintentionally... Or was it unsmoked, because they let the magic smoke out, thus stopped working... Lack of magic issues...
ℵ[1] = cardinality of the continuum is the content of the continuum hypothesis, which Gödel and Cohen showed is independent of the standard axioms of set theory (it can be either true [Gödel] or false [Cohen] in various models of set theory). What is always true is that pow(2, ℵ[0]) = cardinality of the continuum (by the binary representation of reals). en.wikipedia.org/wiki/Continuum_hypothesis
While I was trying to predict the future of this video, it got really easy after he said the thing about his haircut. Has my predictive algorithm been hacked?
Speculative execution is not some hack to try to increase CPU performance after clock speed increases slowed down in the 2000s. Speculative execution dates back at least to the mid-'90s and was needed to make up for the issue that CPU speed was growing exponentially while memory speed was growing more linearly.
As CPUs would 2x their performance, memory speed might only grow by 20%, and latency hit a floor long ago.
A CPU waiting for data from main memory would cycle more than 200 times waiting for the data. If every instruction required a main memory data retrieval, the CPU would be idle at least 99.5% of the time.
This is why speculative execution is so important to modern processors.
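The idle figure follows directly: if every instruction stalled for a 200-cycle memory access,

```latex
\frac{200 - 1}{200} = 99.5\%
```

of cycles would be spent waiting, hence the floor of "at least 99.5% idle".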
Basements all the way down.
14:20 learning moment for me. Aware
2:01 😂 Double basement
I think the guys joke at 22:00 was digging at himself, hope it gets cleared up before the video ends :')
Things like this don’t get thought up “whole cloth”… Still pretty genius, but it takes a lot of experimentation and “99% perspiration” to get there. 😅
It's not you who smashes stacks, it's stacks that smash you
I think Primeagen never took an OS course in college, because they would teach you how virtual memory actually works.
Accuuually!
Oh yeah, ... ARM is the perfect CPU to have this type of bug, given that close to every instruction can be executed conditionally.
Ever heard about c3 lang?
9:47 it’s not costly in performance because the HW handles it in parallel
Hackers begat furries, in the same way eggs pre-dated modern avian dinosaurs and all chickens. Evolution, ya heard?
I love these videos
Every CPU is vulnerable ... No matter what.
so is this basically just spectre & meltdown for ARM?
You should definitely learn assembly... Sure way to go insane
Crazy how hardware limitations are able to be exploited so hard even when it’s theoretically safe
5:42 it’s called furry swag
on top of that, you can search for gadgets and use a genetic evolution algorithm to generate ROP chain attacks
Prime's alien!
yea, he really has absolutely no clue whatsoever on CPU advancements; he literally stated the opposite of the truth. It is insane to me. He knows nothing about this.
@@JohnSmith-pn2vl He is a web dev, what did you expect lol
Didn't this already happen? It wasn't that long ago, maybe 1.5-2 years or so, that there was a vulnerability in the architecture that could only be software-patched
"Just saying.. it seems weird I'm seeing my own thing." It's ok dude, you get used to it over time. I'm seeing my own thing at least twice a day. 😏
Yikes! I have been trying to learn ARM for the last few months now (fun by the way). Yeah. Gotta watch!
5:40 I don’t like that group, but I gotta admit they have a really fun name.