Interesting but I don’t understand how in the past there was a race with CPU manufacturers to have the highest number of MHz and that basically said how fast the CPU is. Why is that no longer a thing?
It was mostly just marketing. The same CPU running at double the MHz can theoretically perform calculations twice as fast. But you can also design the CPU to do more at slower speeds. It's no longer a thing because a) consumers realised that MHz only tell part of the story, and b) higher frequencies are exponentially harder to keep cool and stable
Basically to increase clock speed you need to pack transistors closer to each other, as well as evacuate the heat from smaller volume, and physical limits have been approached
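To put rough numbers on the "MHz only tell part of the story" point: throughput is roughly clock rate times instructions completed per cycle (IPC). A minimal sketch; both chips and their figures are made up for illustration, not measurements of any real CPU.

```python
def instructions_per_second(clock_hz, ipc):
    """Rough throughput: clock rate times average instructions per cycle."""
    return clock_hz * ipc

# Hypothetical chips (assumed numbers, not real products):
fast_clock = instructions_per_second(3.0e9, 1.0)  # 3 GHz, 1 instruction per cycle
wide_core = instructions_per_second(2.0e9, 2.0)   # 2 GHz, 2 instructions per cycle

print(wide_core > fast_clock)  # the slower-clocked chip gets more done
```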
Is this pipeline ‘bubble’ dilemma something quantum computing could solve, with its ability to compute instructions simultaneously regardless of a cycle?
But in order to execute it you have to know what it does. Is it a math operation? Is it a compare? Is it testing for flag settings? Is it accessing a memory location?
arcuesfanatic that makes sense to me. Since you seem to know a lot about this, why is it that when an assembly instruction uses for example the hex code 22FF, in assembled machine code it becomes FF22? Things are written backwards for some reason
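The byte swap asked about here is what a little-endian machine does with multi-byte values: the least significant byte is stored first. A small sketch using Python's standard struct module, assuming the 22FF in the question is a 16-bit operand:

```python
import struct

value = 0x22FF  # 16-bit operand as written in the assembly source

little = struct.pack("<H", value)  # little-endian: low byte first
big = struct.pack(">H", value)     # big-endian: high byte first

print(little.hex().upper())  # FF22
print(big.hex().upper())     # 22FF
```

So the assembler is not writing things backwards at random; it stores the low byte at the lower address because that is the order the CPU expects.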
It really depends on the architecture, but the general idea is the instruction might not necessarily be an atomic instruction. For example: var += 5 in most C family languages translates to var = var + 5. Some CPUs might implement a += operator in addition to the regular + and = operators. This allows the program's footprint in memory to be effectively two instructions shorter. The CPU would probably have the operation implemented in its decoder by inserting load(var) -> add(5) -> save(var) atomic instructions into the instruction queue, rather than implementing a circuit for doing all three at once in the arithmetic operators unit.
I'd like to know how the design affects performance, and why AMD has trouble competing with Intel, why the improvement has slowed down in the last 5 years, what the challenges are in making a better processor etc. Is the whole approach to the way the CPU is designed wrong? Not wrong, but are there different ways not explored yet? Does functional programming have anything to do with it?
@davidwebb4755 You are correct. The problem is that integrated circuit components have reached the point where their transistors consist of so few atoms that to make them any smaller would introduce too much electrical resistance to allow the circuit to work. Much of the work now is going into packing more and more CPUs into a single chip so that the computer can literally do many things at once. In some applications, such as 3D rendering or spreadsheet calculations, this speeds up overall processing speed by a huge amount. In others, not so much.
I always love this low level computer stuff. I would like to see things go even a little bit lower, like what exactly executing a command looks like in mathematical terms though
If you really want to know, then check out Ben Eater. He builds the various pieces of a computer on breadboards, and over several videos you start to see and understand how computers do their thing. After watching and realizing how much goes into the simplest of tasks, it really puts into perspective how amazing something like the smartphone is. It's truly magic.
lower? so...Algebra
10101010110011001010111011101100
You know, I think at the level of binary digits these things have the structure of Boolean algebra, which is a kind of abstraction over the physical device (as math usually is).
Some of it is Boolean algebra, which is then broken down to the transistor level. Most transistors use silicon in their construction, hence "Silicon Valley". Other parts use more complex circuits, which can in turn be broken down to Boolean algebra, etc.
I’m watching many of these years after publishing and extremely grateful for these explanations! You truly have a talent for teaching.
ok?
Sir, thank you for doing this lesson. I'm sitting the BCS HEQ exams this November and this channel is my source of knowledge. I always found it difficult to understand that bubble concept in the pipeline, but now I do. Thanks again.
I have a Computer Architecture exam tomorrow. I am so glad that youtube recommended me to watch this. Thanks Computerphile
If anyone wants to understand this stuff at a very fundamental level, I would highly recommend Ben Eater's series on building a breadboard computer.
aye i was just watching that
@Supreme the large-ish black boxes used in the breadboard computer series are integrated circuits, essentially a packaging around the actual logic gates to make it possible to handle as a human. There's a tiny assembly somewhere in there with the actual logic gates, connected to the IC's outside connectors via tiny wires.
The logic gates in a modern CPU are even more miniaturized, of course. As for where they are in a CPU, well, they pretty much make up the entire thing, storage (lots of caching going on in a modern CPU, also some microcode) and wiring connections (most of the cpu's package is filled up with wiring that connects to the connectors on the bottom; the actual cpu is only a few square centimeters in the center) aside.
He’s fake. He admits he uses boards.
??
Thanks, Dr. Bagley, you are an excellent public speaker and explained the CPU cycle quite clearly.
I hope this turns into a series. More on this topic please!
ok?
That on the pen distracted me for about 2 minutes
He probably broke some rule by writing something other than C with it.
Autism doesn't have medicine, KiloSierra.
C sharpie
i feel u man :P
??
Love the computerphile logo on the end of the marker.
Some lovely computers in the background
I think it would have been worth at least mentioning that an executed instruction may also cause the program to jump to a different part of the code. Executing the next command that is already in the pipeline would then be invalid, and the whole pipeline needs to be flushed before continuing.
Tiikoni And maybe also that the execution of a pipelined instruction may overwrite the value of a memory address already read by the fetch module? In this case a partial flush is also necessary, I think. Although it must be noted that changing the code section of memory is not a "usual" thing to do.
This is exactly why it is a bad habit to use "goto" in programming languages....
That's what branch prediction is for
Neural networks ftw
@Flankymanga That's not the problem with goto. Need an if() statement? That's a jump. Need a for() loop? That's a jump. Need to call a function? That's also a jump. The problem with "goto" is that it makes code hard to read for puny humans.
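To make the flush cost in this thread concrete, here is a rough cycle count for a toy 3-stage fetch/decode/execute pipeline. It assumes branches are only resolved in the execute stage and that there is no branch prediction, so every taken branch throws away the instructions already sitting behind it; the function name and the trace format are made up for the example.

```python
def pipeline_cycles(trace, stages=3):
    """Count cycles for a trace on a simple in-order pipeline.

    trace: list of bools, True where the instruction is a taken branch.
    Assumes the branch resolves in the last stage and nothing is predicted.
    """
    if not trace:
        return 0
    flush_penalty = stages - 1           # younger instructions discarded per taken branch
    cycles = stages + len(trace) - 1     # ideal time with the pipeline always full
    # A taken branch at the very end has nothing after it to refetch.
    cycles += flush_penalty * sum(trace[:-1])
    return cycles

straight_line = pipeline_cycles([False, False, False])
with_jump = pipeline_cycles([False, True, False])
print(straight_line, with_jump)  # 5 7
```

The two extra cycles in the second case are the bubble left by the flush, which is what branch prediction tries to avoid.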
I noticed your Amiga 1000 in the corner. I also had owned this machine back in 85/86. Good times!
You folks should insert a link to Ben Eater's "Building an 8-bit Breadboard Computer" series right here on YT. He's brilliant at simplifying the complications of a CPU to a level which the ordinary person can understand.
The Breadboard Computer which Ben Eater builds and explains over the course of the series can be built by anyone. The only really big complication is finding all of the parts because some of them have become quite scarce since the book which Ben used as his guide was written.
Thank you Bilbo. You are my IT mithril.
Great video! Could you guys also cover on different instruction sets for processors? Since I always found the difference between x64, x86 , RISC to be confusing. Would be awesome if you could make a video on it :)
Next you introduce memory interleave architectures.
Enjoyable video - takes me back - wrote my first program in '69
oh boy I hope there is an extended or part two, so many interesting options in CPU functioning, also a vid for GPUs and openCL (GPU for non graphics computation) to contrast with the CPU.
Serious bonus for using the SGI/Irix buttonfly buttons in your animation!
I like how he points with his middle finger.
after all it is usually the longest one. back when i was in school some teachers would do this and it was awkward or funny at first but it makes some sense.
GoodOlKuro it's also held to be rude to point with the index finger
also is in the middle of the hand.
At least he didn't point with the V sign.
I used to point at things on paper with my middle finger, until my friend started correcting me every time I used the middle finger, and now I no longer do it lol
Respected Sir.
Your explanation is very amazing. I have a great interest in low level computer stuff. Keep making these kind of videos. 👍👍👍
Is there an animation of this anywhere? The mechanism that fetches the bits and bytes, how it corrals them and brings them back. What is moving, electrons? If they move, how does a copy stay behind? Or is it like Morse code where a signal is sent by the storage using some sort of transmitter that reads the info and sends out what it reads. I'm trying to imagine this tiny world where nothing moves, but a lot happens.
Look up Ben Eater here on YT and look for his Breadboard Computer series of videos. He's fantastic at simplifying the complicated.
3:15, that's a very big "ish"
false.
Generally a cool video, but I think you should've split it into one about pipelining (there are other hazards as well, like conditions). And then you (or Steve actually) could have expanded on the actual steps inside the CPU. I saw a video recently of someone building a CPU (+ memory) on breadboards using separate chips for the registers and so on. And I found that video really helpful for seeing what "fetching", "decoding" and "executing" actually mean.
Fabian Neundorf, can you link that video? I'd be very interested to see it.
please link that video ;-;
PureMotionHD well in my reply I linked it. if it doesn't show up, search for "Ben Eater" who is the uploader.
ok?
BBC Micro with a zip drive attached? What sorcery is this?!! Also, nice shelf usage, much better than the other guy! :)
and being displayed on an Atari monitor, sweet.
Oh the times of the 6510 on the C64.
I could read (and disassemble) hex code like reading a book. Modifying the OS, writing cracks and using the space taken up by the copy protection code for more useful stuff was so easy back then.
What assembler did you use? I didn't think the Commodore 64 had an assembly monitor in rom like the Apple 2 did.
A CPU is way more complicated than the video implies. Modern processors do all types of things to keep the pipeline filled, such as pulling more data than required. For example: rather than fetching one instruction at a time, it could fetch an arbitrary volume of memory (let's say 64 bytes). It would then decode as many instructions as it could from that block and prefetch the next block when the fetch bus isn't in use. Then, it can cache the decoded instructions to avoid re-fetching and re-decoding recent code. The CPU can also make notes about the prefetched instructions to predetermine holes in the pipeline and try to fill them by reorganizing instructions. This also helps with branch prediction and register renaming.
Woweee..This is like seeing your street on TV.. Ex-intel cpu and current ARM cpu group engineer checking in..Any fellow computer uarch/arch folks?
skeptic youravg so where did you study and what (from someone genuinely interested in the career)
Electrical Engineering (courses in computer Arch & Digital design).. Arizona state univ..
Cheers! I worked for ARM Germany about 7 years ago, in the simulator group. They had a just-in-time ARM to x86 translator (done by a group in the UK) and a GUI tool to build entire SoCs from ARM parts (done by us). You stitched your SoC together in the GUI and a simulator of that system was built by generating code and compiling it to an executable.
Interesting stuff. Sadly they closed the site.
ARM is hiring like crazy right now..plans to double in the UK in the next few years..You can still make it back :)
What are "threads" and how do they emulate a CPU on a virtual machine?
So..next: Hyper Threading?
More universally, Symmetric Multi-Threading (SMT).
Edit: sorry, *Simultaneous* multithreading (SMT)
Yeah, but for SMT to make sense you'd need a CPU with more instruction-level parallelism than what this example provides.
SMT means Simultaneous multithreading
Xei Yes, thanks for the correction!
I'd suggest out-of-order execution
Each cycle:
Fetch instruction from memory
Decode instruction
Execute instruction
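The three steps above can be sketched as a tiny interpreter loop. This is a toy, not a real 6502: it borrows three genuine 6502 opcode values (A9 = LDA immediate, 69 = ADC immediate, 00 = BRK), but the carry flag is ignored and BRK is simply treated as "stop", both of which are simplifications made for the example.

```python
def run(memory):
    """Toy fetch-decode-execute loop over a list of byte values."""
    pc = 0  # program counter
    a = 0   # accumulator
    while True:
        opcode = memory[pc]              # fetch
        pc += 1
        if opcode == 0xA9:               # decode: LDA #imm
            a = memory[pc]               # execute: load the operand byte
            pc += 1
        elif opcode == 0x69:             # decode: ADC #imm (carry ignored here)
            a = (a + memory[pc]) & 0xFF  # execute: 8-bit add
            pc += 1
        elif opcode == 0x00:             # decode: BRK, treated as halt in this toy
            return a
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at {pc - 1:#06x}")

program = [0xA9, 0x43, 0x69, 0x01, 0x00]  # LDA #$43 / ADC #$01 / BRK
result = run(program)
print(hex(result))  # 0x44
```

A real CPU does the same thing in hardware, and a pipelined one overlaps the fetch of the next instruction with the decode and execute of the current ones.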
Best video to explain execution of process in cpu
I still have a question. How does assembly language, which is software, communicate with the silicon chip, which is hardware? I.e., how is assembly language converted into something the chip understands?
What about the Memory Access and Write Back phases?
That green barred line printer paper is a blast from the past! Is it manufactured seriously any more, or is that just for fun?
I used to load deafening band printers with that stuff, and it would frequently mash it all up, and the whole print job would have to be redone.
Isn't it possible today, with common dual-channel DDR RAM, to have multiple parallel accesses?
What's up with that intro? Looking away and then at the camera? Is that some kind of cinematography trick? It just looks kinda awkward to be honest.
Interesting that you give an example of a system where most of the parts of the CPU are idle, then compare it to a 6502...
Which does instruction decoding and execution in parallel. (it's like a short, 2 stage pipeline, but not quite.)
compared to some other processors from the era the instruction times for 6502 code were very short and consistent.
I miss the 65x family. But it died out because its entire design is built around having RAM that is faster than the processor.
And since the mid-90s it's pretty much guaranteed that the processor is faster than RAM.
That's why cache memory exists. If your main memory was fast enough you wouldn't bother implementing a cache, because it would be redundant. But... When main memory is slow... Cache helps keep the CPU busy...
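The usual back-of-the-envelope way to see this is the average memory access time: hit time plus miss rate times miss penalty. A small sketch; the latencies and miss rate are assumed round numbers for illustration, not figures for any real machine.

```python
def average_access_time(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time_ns + miss_rate * miss_penalty_ns

# Assumed numbers: 1 ns cache hit, 100 ns trip to main memory, 5% of accesses miss.
with_cache = average_access_time(1.0, 0.05, 100.0)
no_cache = 100.0  # every access goes to main memory

print(with_cache, no_cache)  # roughly 6 ns versus 100 ns
```

If main memory were as fast as the cache, the miss penalty would vanish and the cache would buy nothing, which is the situation the 6502 was designed for.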
Can't the bubbles be removed if the instruction memory and data memory were separate? The structural hazards can be avoided that way, since we can access both at the same time.
This is actually why CPUs have multiple caches in series and parallel; the instructions and stack variables tend to be towards one end (or both ends) of allocated memory, while heap variables are towards the other end (or the middle). While the bubble can still exist, it's toned down by many orders of magnitude, and it still allows the possibility of treating instructions and data interchangeably.
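A rough way to count what separate instruction and data memories save. The model is deliberately simple and is an assumption of this sketch: a 3-stage pipeline, one instruction fetch per cycle, and each load or store costing exactly one bubble when fetch and data access have to share a single memory port.

```python
def total_cycles(n_instructions, n_memory_ops, stages=3, separate_memories=False):
    """Cycle count for a simple pipeline with or without a shared memory port."""
    ideal = stages + n_instructions - 1           # pipeline always full
    bubbles = 0 if separate_memories else n_memory_ops
    return ideal + bubbles

shared = total_cycles(10, 3)                        # one port for code and data
split = total_cycles(10, 3, separate_memories=True) # separate code and data paths
print(shared, split)  # 15 12
```

Split level-1 instruction and data caches give most of this benefit while main memory stays unified.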
When multiple steps are occurring in the CPU at the same time, is that how simultaneous multi-threading works on an AMD CPU (or hyper-threading on Intel)?
excellent! animation and description wise..
I still have a big question: how does this translate into transistors? The piece of the puzzle I am missing is how adding more transistors increases the speed, especially knowing that there are tasks that have to be sequential.
Just adding more transistors doesn't increase the speed, but some programs can be structured so that several parts of the program can be run at the same time, only in different parts of the CPU. In that case, "more transistors" equals more "cores" in the CPU, allowing it to literally do more than one thing at a time.
But as you guessed, this does not increase speed where all parts of the program have to be run in sequential order.
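This limit is usually written down as Amdahl's law: if only a fraction of the work can run in parallel, the sequential remainder caps the speedup no matter how many cores are added. A short sketch; the 90% figure is just an example value.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: overall speedup when only part of the work runs in parallel."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

# Example: a program that is 90% parallelisable (assumed figure).
for cores in (1, 2, 8, 64):
    print(cores, round(amdahl_speedup(0.9, cores), 2))

# A fully sequential program gains nothing from extra cores.
print(amdahl_speedup(0.0, 64))  # 1.0
```

Even with unlimited cores the 90% example can never exceed a 10x speedup, because the remaining 10% still runs one step at a time.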
The part where he mentions the 15-byte instructions on x86 reminded me of ROP. I guess this is why ARM is so much more secure.
Thanks a lot for sharing the knowledge, about the necessity for Harvard architecture.
Very HELPFUL video...thanks so much !!!
Excellent presentation!!
Why not enhance the staccato speaking manner with staccato jump cuts?
Nice video! It would be interesting if he went more in depth into pipelining hazards.
So I cannot estimate his age by looking at him, at all. I googled "Dr Steve Bagley" and it auto-completed to "Dr Steve Bagley age", so clearly someone else thought the same thing.
Every time he touched and left a fingerprint on the monitor my soul hurt :) In all seriousness though, great video!
1h30 min of lecture compressed into 11min15s.
You forgot to mention the Prock Architecture... oh wait, I haven't released it yet. It's better than anything out there!
2:15 "You could do this, or this... Although it would probably crash *furrowed brow*." Lol
Can you explain the "on an ARM processor all instructions are 32 bits long"? I'm going to take that to mean "the same bit length" vs 32 bits. But besides that, I remember doing ASM on an NXP chip and some instructions take a few cycles. But I could've sworn some of the Java and Thumb-2 stuff had a different instruction length...?
Cool, so I was half remembering correctly... It's been a while, so that's kinda cool to know :)
Btw, I searched on the ARM infocenter site first, but the Keil explanation was the best I found.
The (original) ARM processor used 32 bits (4 bytes) to encode each instruction, unlike the 6502, which used 8 bits (1 byte) for the opcode, followed by up to two operand bytes (a lot of bit patterns were not used).
The 6502 used synchronous memory access - every clock cycle it read/wrote memory, and so used a single byte program counter pipeline - the next byte of the program (instruction/data) was being read: in his example of A9 43 when the A9 was being decoded the 43 was loaded, executing the instruction shifted the 43 to the A reg so whilst that was being done the next instruction 20 was being read in, and so on.
The biggest problem with bubbles (in the pipeline of the 6502) was branch instructions - they could take 2, 3 or 4 clock ticks to execute (2 if no branch, 3 if branch to memory with the same page number (top 8 bits of address), 4 to a different page).
Most instructions take 2 clock ticks plus extra tick if absolute addressing (3 bytes long) plus extra ticks for different addressing modes (causing bubbles).
The ARM processor executes near enough 1 instruction per clock tick by avoiding branching (bubbles in its pipeline) by using some of the 32 bits of the instruction as a condition - the instruction only executes if the condition is met (there is the condition TRUE which means the instruction is always executed).
I'm trying to remember but i think the 32 bits may have included memory addresses and immediate data.
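For the condition part: in the classic 32-bit ARM encoding the condition sits in the top four bits of every instruction, with 1110 meaning "always". A small decoder sketch; the two example words are hand-assembled here (MOV r0, #0x43 and its EQ-conditional form), so treat them as illustrative rather than taken from the video.

```python
# Condition mnemonics indexed by the 4-bit field in bits 31..28.
# 0b1111 was "NV" (never) on the original ARM and is repurposed on later cores.
CONDITIONS = ["EQ", "NE", "CS", "CC", "MI", "PL", "VS", "VC",
              "HI", "LS", "GE", "LT", "GT", "LE", "AL", "NV"]

def condition_of(instruction):
    """Return the condition mnemonic encoded in a 32-bit ARM instruction word."""
    return CONDITIONS[(instruction >> 28) & 0xF]

always = condition_of(0xE3A00043)    # MOV   r0, #0x43  (always executed)
if_equal = condition_of(0x03A00043)  # MOVEQ r0, #0x43  (only if the Z flag is set)
print(always, if_equal)  # AL EQ
```

Because a failed condition just turns the instruction into a no-op, short if/else sequences can run straight through the pipeline without a branch.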
Please, can anyone explain how the CPU outputs it on the screen? I can't find this anywhere!
It is nice to learn the basics of a computer. It gives you the confidence to use your computer. I think I was born too soon and I am playing catch up.
Really good explanations.
Cool, is the register a kind of memory, a kind of cache?
so if the pipeline infrastructure can't fetch a command and execute a command at the same time,
doesn't that just mean you need another data bus?
It seems to me that if you need one data bus for fetching instructions and another for accessing memory, that you should have every possible part necessary to execute a command redundant and in parallel, basically one bus for accessing memory, one for fetching instructions, a decoder for both, and then no matter what the instruction says, you always have a bus ready for it to be used on the next tick, so you always have an incoming pipeline and a parallel pipeline for things required in the actual instruction.
If you have pipeline flow issues, make the pipe bigger or in parallel :P
So how does the cpu get the first address loaded on its program counter in the first place?
It's always starts at address 0 or some predefined address I guess
Awesome video!
Well my wonderful peoples, I've been searching far and wide and I am yet to find an answer: how does the computer actually generate the clock pulse that determines the speed? Is it a tiny capacitor being charged and discharged as I suspect, or am I completely wrong and is it something entirely different? The internet seems stumped by this and I can only seem to find videos like this telling me the software side of things. I would be much obliged to receive any information about this subject and would greatly appreciate some further reading links.
-yours sincerely, some random internet person
Great refresher.
Thumb-2 mixes 16-bit and 32-bit instructions; the original Thumb instructions are 16 bits long.
Request: please do a separate video on the two types of KERNEL...
I still don't get what's actually happening inside the CPU. How does it "know" to put a value in the program counter? How do the CPU and memory "talk" so that the memory knows, or is forced to send, an instruction from a specific address? Why does running two voltages (1s and 0s) through a CPU do anything? Seems like the CPU "knows" certain instructions, but where does the "knowing" come from?
How does the CPU know how to draw the character? Is it coded within the CPU itself?
The CPU doesn't "know" anything. Usually, in a Windows system, the CPU has to run a sub-program which translates the numeric value into the graphic for that letter or other character (eg 65=A).
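To make the "65=A" idea concrete, here is a rough Python sketch of a lookup from a character code to a picture of the character. The 5x5 bitmap and the names `FONT` and `render` are invented for illustration; real systems use font files with far more detailed glyphs.

```python
# Hypothetical one-glyph bitmap font: character code -> rows of pixels (1 = lit).
FONT = {
    65: ["01110",
         "10001",
         "11111",
         "10001",
         "10001"],
}


def render(code: int) -> str:
    """Turn the stored bitmap for a character code into printable rows."""
    rows = FONT[code]
    return "\n".join(row.replace("1", "#").replace("0", " ") for row in rows)


print(chr(65))     # A  (the numeric value 65 stands for the letter A)
print(render(65))  # draws a small A out of '#' characters
```

The point is that the letter shape lives in data that a program looks up, not in the CPU itself.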
Does anyone know how many reviews does it take for subs to get approved? Or how it even works?
I got some ppl to review my Spanish subs, but they don't appear yet (in this video nor the MegaProcessor one).
This video needs 65536 views.
Excellent explanation! Thanks!
If we have a pipelined CPU, and the instruction needs to access something that will cause a bubble, why don't we just add some address buses?
Interesting but I don’t understand how in the past there was a race with CPU manufacturers to have the highest number of MHz and that basically said how fast the CPU is.
Why is that no longer a thing?
It was mostly just marketing. The same CPU running at double the MHz can theoretically perform calculations twice as fast. But you can also design the CPU to do more at slower speeds. It's no longer a thing because a) consumers realised that MHz only tell part of the story, and b) higher frequencies are exponentially harder to keep cool and stable
Basically to increase clock speed you need to pack transistors closer to each other, as well as evacuate the heat from smaller volume, and physical limits have been approached
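A back-of-the-envelope way to see why MHz only tells part of the story, sketched in Python. The clock speeds and instructions-per-cycle figures below are made-up numbers, not measurements of any real CPU, and `instructions_per_second` is just a name chosen for this sketch.

```python
def instructions_per_second(clock_hz: int, instructions_per_cycle: int) -> int:
    """Rough throughput: clock ticks per second times work done per tick."""
    return clock_hz * instructions_per_cycle


# Invented example: a 3 GHz design doing 1 instruction per cycle
# versus a 2 GHz design doing 2 instructions per cycle.
high_clock = instructions_per_second(3_000_000_000, 1)
more_per_tick = instructions_per_second(2_000_000_000, 2)

print(high_clock)                  # 3000000000
print(more_per_tick)               # 4000000000
print(more_per_tick > high_clock)  # True
```

With these assumed figures the slower-clocked design gets more done per second, which is the "do more at slower speeds" point made above.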
You can try this with the Johnny-Simulator. Nice tool
Informative video, and interesting. Thanks.
Can this pipeline ‘bubble’ dilemma be a solution quantum computing can solve with its ability to compute instructions simultaneously regardless of a cycle?
why do these videos not have captions?
An explanation of how CPUs using the compliance model of everything.
Doesn't the CU fetch and decode, and the ALU execute? Then the CU stores back in RAM or cache.
Register not cache
"Register not cache". registers can be considered cache too.
It will be alright, yeah.
Rock the cache-bah.
I don't really get what the decoding is supposed to do, can't you just execute a piece of code after you fetched it?
But in order to execute it you have to know what it does. Is it a math operation? Is it a compare? Is it testing for flag settings? Is it accessing a memory location?
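Here is a rough Python sketch of where decoding fits between fetching and executing. It is a toy, not a real CPU model: the opcodes A9 and 69 are borrowed from the 6502, the names in `DECODE_TABLE` are invented, the carry flag is ignored, and treating 00 as "halt" is a simplification.

```python
# Opcode -> (what kind of operation it is, how many operand bytes follow).
DECODE_TABLE = {
    0xA9: ("LOAD_IMMEDIATE", 1),  # put the next byte into register A
    0x69: ("ADD_IMMEDIATE", 1),   # add the next byte to register A
    0x00: ("HALT", 0),            # stop (simplified for this sketch)
}


def run(program: bytes) -> int:
    """Run the toy program and return the final value of register A."""
    a = 0
    pc = 0
    while True:
        opcode = program[pc]                      # fetch
        name, n_operands = DECODE_TABLE[opcode]   # decode: what is it, how long is it?
        operands = program[pc + 1 : pc + 1 + n_operands]
        pc += 1 + n_operands
        if name == "LOAD_IMMEDIATE":              # execute
            a = operands[0]
        elif name == "ADD_IMMEDIATE":
            a = (a + operands[0]) & 0xFF          # keep A to 8 bits
        elif name == "HALT":
            return a


print(hex(run(bytes([0xA9, 0x43, 0x69, 0x01, 0x00]))))  # 0x44
```

Without the decode step the loop would not know whether 43 is an instruction or data, or how far to advance the program counter.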
arcuesfanatic that makes sense to me. Since you seem to know a lot about this, why is it that when an assembly instruction uses for example the hex code 22FF, in assembled machine code it becomes FF22? Things are written backwards for some reason
It really depends on the architecture, but the general idea is the instruction might not necessarily be an atomic instruction. For example: var += 5 in most C family languages translates to var = var + 5. Some CPUs might implement a += operator in addition to the regular + and = operators. This allows the program's footprint in memory to be effectively two instructions shorter. The CPU would probably have the operation implemented in its decoder by inserting load(var) -> add(5) -> save(var) atomic instructions into the instruction queue, rather than implementing a circuit for doing all three at once in the arithmetic operators unit.
I suggest you read about endianness en.wikipedia.org/wiki/Endianness (:
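The 22FF / FF22 swap from the question above can be reproduced with Python's standard `struct` module. This is just a quick demonstration of byte order, not tied to any particular assembler.

```python
import struct

value = 0x22FF

little = struct.pack("<H", value)  # little-endian: least significant byte first
big = struct.pack(">H", value)     # big-endian: most significant byte first

print(little.hex())  # ff22
print(big.hex())     # 22ff
```

On a little-endian machine the low byte (FF) is stored at the lower address, which is why the bytes look "backwards" in a memory dump.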
anyone allergic to the sound of that sketch pen sounding "shhhhh shhhhh shh"?
Steve is a rock star of computing
Before CPUs, there was nothing. I want to know how they went from nothing to something as complicated as a CPU.
thankyou for sharing the video its very helpful
Love this guy.
What makes m4 so much faster than m1?
The CPU wires make me uncomfortable for some reason (6:04)
Because of you, I made a computer! Check out the video on my page. I made it control servers, LEDs, sounds and more! I just had to say thank you.
It'd be interesting to know how some modern CPUs are able to achieve 16 instructions per cycle
Why can't they create a CPU the size of a graphics card, so they dont have to worry about size issues?
Thanks for the video
Funny, I gave a presentation on this topic in school today.
Cool how I already learned pipelining basics before this video. Too bad he didn't go into branching, although that's a bit much for one video.
Do one on vault 7 backdoor in cpu's
I'd like to know how the design affects performance, why AMD has trouble competing with Intel, why the improvement has slowed down in the last 5 years, what the challenges are in making a better processor, etc. Is the whole approach to the way the CPU is designed wrong? Not wrong, but are there different ways not explored yet? Does functional programming have anything to do with it?
Biggest challenge is making smaller and smaller transistors, I think.
@@davidwebb4755 You are correct. The problem is that integrated circuit components have reached the point where their transistors consist of so few atoms that to make them any smaller would introduce too much electrical resistance to allow the circuit to work.
Much of the work now is going into packing more and more CPUs into a single chip so that the computer can literally do many things at once. In some applications, such as 3D rendering or spreadsheet calculations, this speeds up overall processing speed by a huge amount. In others, not so much.
Inside the CPU huh? Can this help me find my Xeon processor that I lost?
Why would you need 8*15 (120) bit long instructions? I knew x86 was somehow suboptimal by today's standards, but that is just crazy!
Assembly language which we as humans can understand...ish. 😂
A 15 byte instruction? Holy cow.
@4:40 am I high or did he explain the same thing like 5 times in 5 different ways? 🤔
Can someone recommend me a good Computer Architecture book?
I'd suggest as a primer, Ben Eater's Breadboard Computer series here on YT.