CPU Pipelining - The cool way your CPU avoids idle time!

  • Published: 13 Jul 2024
  • The CPU is complex, so as you can imagine, optimizations exist to ensure that it runs as efficiently as possible without idling. In today's episode, we look at the pipeline - An ingenious optimization technique, but also one that comes with a set of caveats and gotchas!
    = CONTENTS PAGE =
    00:00 Opening
    01:07 CPU Basics - Instructions
    01:22 Stages of an Instruction
    03:32 Idle Time
    04:16 Introduction to Pipelining
    05:09 Introduction to Hazards
    05:51 Example: Read-After-Write Hazard
    06:41 Pipeline Stalls
    07:24 Operand Forwarding
    08:18 Out-of-Order Execution
    10:20 Dealing with Branching
    11:27 The Problem and Pipeline Flush
    12:14 Branch Prediction
    14:20 Conclusion
    -----
    Attribution: My thanks extend to the creators who have kindly placed their work in the public domain:
    Backdrop loop: pixabay.com/videos/particles-...
    CPU Removal: pixabay.com/videos/cpu-cpu-re...
    CPU Spin: pixabay.com/videos/cpu-intel-...
    Abstract: pixabay.com/videos/octagon-ab...
    Sci-fi Future: pixabay.com/videos/sci-fi-sci...
    freepd.com/music/Driving%20Co...
    freepd.com/Page2/music/Rap%20...
    freepd.com/Page2/music/Urban%...
    freepd.com/Page2/music/Rap%20...
    freepd.com/Page2/music/Rap%20...
    freepd.com/Page2/music/Rap%20...
    freepd.com/Page2/music/Rap%20...
    -----
    Want to contribute to the channel? Consider using the "Super Thanks" feature above, or visit my website at nerdfirst.net/donate to find alternative ways to donate. Thank you!
    -----
    Disclaimer: Please note that any information is provided on this channel in good faith, but I cannot guarantee 100% accuracy / correctness on all content. Contributors to this channel are not to be held responsible for any possible outcomes from your use of the information.
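    The chapters above on the read-after-write hazard, pipeline stalls and operand forwarding (05:51 to 07:24) lend themselves to a small worked example. Below is a minimal Python sketch of the idea; the five-stage layout, register names and timing rules are illustrative assumptions rather than anything specified in the video.

    # Toy model of a read-after-write (RAW) hazard in a generic 5-stage pipeline.
    # Assumed stage order: IF, ID, EX, MEM, WB (illustrative only).
    def schedule(prog, forwarding):
        """prog: list of (dest_reg, [source_regs]) ALU instructions.
        Returns the cycle in which each instruction reaches WB, assuming
        in-order issue and stalling until source operands are available."""
        avail = {}                 # register -> first cycle it can be consumed
        wb_cycles = []
        next_if = 0
        for dest, srcs in prog:
            if_c = next_if
            id_c = if_c + 1
            ex_c = id_c + 1
            if forwarding:
                # A forwarded operand only has to be ready when EX starts.
                for r in srcs:
                    ex_c = max(ex_c, avail.get(r, 0))
            else:
                # Without forwarding, operands come from the register file
                # in ID, so ID is what gets delayed.
                for r in srcs:
                    id_c = max(id_c, avail.get(r, 0))
                ex_c = id_c + 1
            wb_c = ex_c + 2        # MEM, then WB
            # When can a later instruction consume this result?
            avail[dest] = ex_c + 1 if forwarding else wb_c + 1
            wb_cycles.append(wb_c)
            next_if += 1
        return wb_cycles

    # r2 = r0 + r1, then r3 = r2 + r1 (the second reads what the first writes)
    prog = [("r2", ["r0", "r1"]), ("r3", ["r2", "r1"])]
    print("no forwarding:", schedule(prog, forwarding=False))   # [4, 8] -> three stall cycles
    print("forwarding:   ", schedule(prog, forwarding=True))    # [4, 5] -> no stall

    The three-cycle gap in the no-forwarding case corresponds to the pipeline stall discussed at 06:41, and operand forwarding (07:24) removes it entirely for this back-to-back pair of ALU instructions.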

Comments • 63

  • @Ans3lm0777 2 years ago +10

    The explanations in your videos are so crisp. Really appreciate the quality of these - keep it up :)

    • @NERDfirst 2 years ago

      Hello and thank you very much for your comment! Glad you liked the video =)

  • @Eujinv 1 year ago +1

    I have a computer architecture exam late this morning; I woke up extra early to go to the hospital for a visit, and I'm watching this video while I wait 🙌🏻

    • @NERDfirst 1 year ago

      Hello and thank you for your comment! Do take care and all the best for your exam =)

  • @DReam-mn4mj 6 months ago

    Great video, keep it up!

    • @NERDfirst 6 months ago

      Hello and thank you very much for your comment! Glad you liked the video :)

  • @KumaAdventure 1 year ago +1

    Thank you, this helped clarify some things I came across for the CompTIA A+ exam. Much appreciated.

    • @NERDfirst 1 year ago

      You're welcome! Very happy to be of help :)

  • @dimnai 2 years ago

    Great video, well done!

    • @NERDfirst 2 years ago

      Hello and thank you very much for your comment! Glad you liked the video :)

  • @-TWFydGlu 2 years ago

    Really informative video, thanks :D

    • @NERDfirst 2 years ago

      You're welcome! Glad you liked my work :)

  • @AshtonvanNiekerk 1 year ago

    Very well explained.

    • @NERDfirst 1 year ago +1

      Hello and thank you for your comment! Very happy to be of help =)

  • @itznukeey 1 year ago

    Great explanation, thanks

    • @NERDfirst 1 year ago

      You're welcome! Glad to be of help =)

  • @mill4340 1 year ago

    I had completely forgotten all of this despite having studied it for my Comp Arch class. Your video was the introductory refresher I needed. Thank you.

    • @NERDfirst 1 year ago

      You're welcome! Glad to be of help =)

  • @awayfrom90 4 months ago

    Superb explanation 🎉

    • @NERDfirst 4 months ago +1

      Hello and thank you very much for your comment! Very happy to be of help :)

  • @LegonTW0 9 months ago

    Thanks boss, crystal clear, love you

    • @NERDfirst 9 months ago

      Hello and thank you for your comment! Glad to be of help =)

  • @Brekstahkid 26 days ago

    Good stuff

    • @NERDfirst 25 days ago

      Thank you! Glad you liked the video :)

  • @memeingthroughenglish7221 17 days ago

    Damn, your videos are so nice!!!

    • @NERDfirst 17 days ago

      Thank you very much! I remember your comment on another one of my videos as well, glad to know you like my work =)

  • @Atharv0812 2 years ago +5

    Your content is so professional. Can you also make videos on modern microprocessor architectures like the i3, i5, i7, etc.?

    • @NERDfirst 2 years ago

      Hello and thank you for your comment! Unfortunately those architectures are far more complex (some modern architectures have twenty or more pipeline stages) so I haven't gotten round to learning about them.

  • @robot67799 2 years ago

    Great content 👍

    • @NERDfirst 2 years ago

      Hello and thank you very much for your comment! Glad you liked the video =)

  • @juanmanuelserna7692 7 months ago

    Great quality video, easy to understand for people who don't come from the computer science world. Great job!

    • @NERDfirst 7 months ago

      Hello and thank you very much for your comment! Glad you liked the video :)

  • @cyprienvilleret2266 1 year ago

    great thanks

    • @NERDfirst 1 year ago

      You're welcome! Glad to be of help :)

  • @fraewn2617 2 years ago

    well put

    • @NERDfirst 2 years ago

      Thank you very much! Glad you liked the video :)

  • @jefferybarnett1849 1 year ago

    Thanks for enlightening me about heuristics. I loved the graphical representation of the "shifts" in your presentation on pipelines, the "stalls" that happen, and how to avoid them along the way. I knew just a moment before you showed us that the instructions were about to be reordered. My understanding has been improved. My knowledge of assembly language helped a lot; I just never bothered to look into the matter as you have done. Thanks a lot.

    • @NERDfirst 1 year ago

      Hello and thank you very much for your comment! Glad you enjoyed the video, and really appreciate you sharing your "aha" moment - That's one of the things I live for as an educator =)

    • @ArneChristianRosenfeldt 1 year ago

      Heuristics make me want to see a CPU (simulation) where the scalar CPU splits up into two threads at every branch (becomes superscalar). Store commands write into a FIFO! Then, when the branch condition is clear, a whole tree of threads is flushed, and the store FIFO of the taken branch is flushed to memory. This might be a useful operation mode for those 16-core RISC-V chips.
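      A toy rendering of that idea in Python, purely as a sketch of the scheme described in this comment (all names here are made up): run both sides of a branch speculatively, buffer each side's stores in its own FIFO, and only drain the FIFO of the path the branch actually takes.

      from collections import deque

      def speculate_both_paths(path_if_taken, path_if_not_taken, branch_taken, memory):
          """Toy model of the comment's scheme: stores from each speculative
          path go into a per-path FIFO instead of memory; once the branch
          resolves, the winning FIFO is drained to memory and the loser is
          discarded."""
          store_fifo = {True: deque(), False: deque()}
          for taken, path in ((True, path_if_taken), (False, path_if_not_taken)):
              for addr, value in path:              # each entry models one store
                  store_fifo[taken].append((addr, value))
          for addr, value in store_fifo[branch_taken]:
              memory[addr] = value                  # commit the taken path's stores
          store_fifo[not branch_taken].clear()      # the other path never reaches memory
          return memory

      mem = speculate_both_paths(path_if_taken=[(0x10, 1), (0x14, 2)],
                                 path_if_not_taken=[(0x10, 99)],
                                 branch_taken=True,
                                 memory={})
      print(mem)                                    # {16: 1, 20: 2}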

  • @Epic-so3ek 2 years ago

    these videos are really good

    • @NERDfirst 2 years ago

      Hello and thank you very much for your comment! Glad you liked the video =)

  • @123jimenez99 2 years ago +2

    Amazing video, it really made me understand why the PPE cores used in both CELL and Xenon were so underwhelming; they really suffered from all the bad stuff mentioned in this video: long pipelines, lots of stalls, lack of out-of-order execution, and more. It also made me realize how important it was to rely on the SPEs as much as possible in CELL's case, which BTW was a big PITA. Cool stuff.

    • @NERDfirst 2 years ago +1

      Oh wow, this is a great case study, thank you for sharing! Its pipeline is 23 stages! Really interesting to read about.

    • @123jimenez99 2 years ago +1

      @NERDfirst Prescott P4: Hold my beer!

    • @NERDfirst 2 years ago +1

      At least that's x86 - a CISC instruction set so it's less out of place!

  • @akioasakura3624 10 months ago

    THANK YOU SIR!! I made many Minecraft CPUs when I was 13. Back then there weren't many videos or resources that explained pipelining in anything other than "car assembly lines" or "laundry", or 4000-page university PDFs from the 90s. Thank you so much, good sir.

    • @NERDfirst 10 months ago +1

      You're welcome! Very happy to be of help =) I think those are fairly textbook explanations so it's no wonder you see them a lot. Analogies are good too I suppose, but I guess nothing beats visualizing it properly!

    • @akioasakura3624 10 months ago

      @NERDfirst I struggled with this for so long, but thanks to you maybe I can try playing Minecraft again. Have a good day!!

    • @NERDfirst 10 months ago +1

      Good luck! Consider planning out your design first using actual logic components before doing it in game. Redstone is a whole different level of complexity!

    • @akioasakura3624 10 months ago

      @NERDfirst Ohh alright, thanks!!

  • @cheenoong9228 1 year ago

    Why do I see in some materials that the order of the stages is IF (Instruction Fetch) -> ID (Instruction Decode) -> EX (Instruction Execute) -> MEM (Access Memory Operand) -> WB (Write Back)?

    • @NERDfirst 1 year ago +1

      Hello and thank you for your comment! If I'm not wrong, what you've described is specifically the MIPS pipeline. Different architectures can have a different number and order of pipeline stages, so this isn't universal. What I've shown in the video isn't linked to any specific assembly architecture, it's just a generic abstract pipeline to make understanding things easier.
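      As a rough picture of that classic IF/ID/EX/MEM/WB arrangement, here is a small Python sketch (a generic illustration, not tied to MIPS or any real CPU) that prints which stage each of a few independent instructions occupies in each cycle.

      # Cycle-by-cycle chart of a classic 5-stage pipeline with no hazards:
      # each instruction enters IF one cycle after the previous one.
      STAGES = ["IF", "ID", "EX", "MEM", "WB"]
      N = 4                                  # number of instructions to show

      n_cycles = N + len(STAGES) - 1
      print("     " + " ".join(f"c{c:<3}" for c in range(n_cycles)))
      for i in range(N):
          row = ["    "] * n_cycles
          for s, stage in enumerate(STAGES):
              row[i + s] = f"{stage:<4}"     # instruction i is in stage s at cycle i + s
          print(f"I{i}:  " + " ".join(row))

      Once the pipeline fills (cycle 4 here), one instruction completes every cycle, even though each individual instruction still takes five cycles from IF to WB.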

  • @akkudakkupl 8 months ago +4

    That's not the only reason for pipelining. You could build a CPU that does the whole instruction in one clock (one rising, one falling edge). But you still have propagation time that limits the max clock speed (and computation speed); pipelining lets you break propagation up into smaller chunks and raise clock speeds.

    • @NERDfirst 8 months ago

      Hello and thank you for your comment! To be fair, increasing clock speed this way isn't going to increase the overall speed of computation - No point getting your clock speeds up to 20GHz if every instruction has to make its way through 100 pipeline stages!
      Ultimately it's less about managing propagation delay - In fact having multiple pipeline stages _increases_ the total per-instruction propagation delay since it makes the circuitry more complex. The advantage comes about from the "parallelism" where we essentially start on the next instruction before the last one is complete.

    • @akkudakkupl 8 months ago

      @NERDfirst Let's say you have an ALU that has 100 ns propagation. Now you split that up into two 50 ns steps with some latches in between. You just almost doubled your instructions per second due to doubling the clock rate. This is pipelining, and that is its most important reason for existing.
      What you are referencing is superscalarity and out-of-order execution - the use of multiple execution units to their full extent.

    • @NERDfirst 8 months ago

      I think we're talking about the same things using different words, or maybe I just wasn't explicit enough on the point. My way of explaining it (at 3:32) assumes that pipeline stages exist but instructions are processed to completion before the next instruction enters the pipeline. Your way of explaining it does away with the pipeline model and considers the execution of an instruction as a single large step.
      I didn't explicitly mention propagation delay by name to reduce cognitive load, but I do believe the understanding conveyed is the same. If I understand your explanation correctly, you get a doubling of instructions per second _because_ of instruction-level parallelism. At the end of the day, if you double the clock speed but each instruction takes two clock cycles to complete, the number of instructions per second is exactly the same. It is because of superscalarity allowing you to have multiple instructions in the ALU at once that you can have a performance benefit.
      Do let me know if I'm understanding you wrongly. It's been a while since I did this stuff.

    • @akkudakkupl 8 months ago

      @NERDfirst In my example, my single ALU can be in two discrete steps of executing two instructions - the first half of a new instruction and the second half of an older instruction. You can imagine my pipeline like this (a modification of the classic RISC pipeline):
      Fetch
      Decode
      Execution 1
      Execution 2
      Memory
      Write Back
      I have divided the execution stage in two. This is because my hypothetical ALU would have 100 ns of propagation and would limit the clock to 10 MHz. By splitting it up I now have a slightly longer pipeline, but my largest propagation went down to, let's say, 55 ns (because we had to add latches in between stages, it's not ideally half). Now my CPU can run at 18 MHz. Both of those frequencies roughly translate to instructions per second because in both cases the instructions complete "in a single cycle" due to pipelining. This is the advantage of longer pipelines - as long as you get an uninterrupted stream of instructions you can get a boost in IPS because you have a higher max clock. This is of course not ideal because you have branches in the code and that stalls or flushes the pipeline.
      You are executing multiple instructions at a time because the result of one step is transferred further on to be computed in the next - basically it's an improvement over very old CPUs, which executed those steps one after another (pipelining needs additional circuitry they didn't have), so you got one instruction in, for example, 4 clocks.
      But you can't compute more instructions at a time than you have pipeline stages. For that you need superscalarity - having multiple ALUs, multiple address generation units, etc. working at the same time - and to make it work right you also use out-of-order execution, so you can fill up those elements' pipelines (yes, everything is pipelined in a modern CPU).
      What I was implying earlier was that a Harvard architecture CPU could execute a full instruction in a single clock - because both instruction and data are supplied at the same time - but it might not run at a very fast clock because data has to propagate through the whole datapath in that one clock cycle.
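      Putting rough numbers on the exchange above, a short Python check using the thread's hypothetical figures (a 100 ns ALU versus two roughly 55 ns stages; none of these are real measurements):

      # Hypothetical numbers from the thread: a 100 ns ALU kept as one stage,
      # versus the same work split into two ~55 ns stages (the extra delay
      # stands in for the latches added between them).
      single_ns = 100    # slowest stage if the ALU is not split
      split_ns = 55      # slowest stage after splitting the ALU in two

      # The clock period is set by the slowest stage.
      print(f"max clock: {1e3 / single_ns:.1f} MHz vs {1e3 / split_ns:.1f} MHz")

      # Once the pipeline is full, one instruction completes per cycle either way,
      # so throughput tracks the clock; that is the win being described here.
      print(f"throughput: {1e9 / single_ns:,.0f} vs {1e9 / split_ns:,.0f} instructions/s")

      # Per-instruction latency through the ALU gets slightly worse, though.
      print(f"ALU latency: {single_ns} ns vs {2 * split_ns} ns")

      This prints roughly 10 MHz vs 18.2 MHz and 10 vs 18 million instructions per second, matching the figures in the thread, and it also shows the caveat raised in the replies above: the gain only holds while the pipeline stays full, and each instruction's own latency goes up slightly.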

  • @adamchalkley956 8 months ago

    I have a question: not all instructions have a write back, i.e. they don't write results back to registers, memory, etc. For example, on the 8080, JMP instructions do not write back to anywhere. Another example would be a MOV instruction, which moves data from memory/registers to registers/memory.
    So what happens when an instruction has no write back? Does it execute a noop?
    Again, I'm still quite the novice. Thanks

    • @NERDfirst 8 months ago +1

      Hello and thank you for your comment! Yes, instructions that don't require any action to be taken on any stage would still have to go through the stage, but will do nothing there.
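      To picture that, here is a minimal Python sketch (the class and field names are made up for illustration, not from any real design): every instruction flows through the write-back stage, but a flag decides whether the stage actually does anything.

      from dataclasses import dataclass
      from typing import Optional

      @dataclass
      class Instr:
          name: str
          writes_back: bool            # does this instruction produce a register result?
          dest: Optional[str] = None
          value: Optional[int] = None

      def write_back_stage(instr, regs):
          """Every instruction spends a cycle here, but only those that
          actually produce a result touch the register file."""
          if instr.writes_back:
              regs[instr.dest] = instr.value
          # otherwise the stage simply does nothing this cycle (a no-op)

      regs = {}
      for instr in (Instr("ADD r1, r2, r3", True, dest="r1", value=42),
                    Instr("JMP label", False)):
          write_back_stage(instr, regs)
      print(regs)                      # {'r1': 42}; the JMP passed through WB untouched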

    • @adamchalkley956 8 months ago

      @NERDfirst Thanks, that makes sense

  • @bahrikeskin5824 1 year ago

    Could you change the song please? My brain is burning because of it :(
    But I understand the concept, thanks :) like

    • @NERDfirst 1 year ago +2

      Oh sorry about that! I compared levels with popular YouTubers and realized my BGM was turned down much lower than theirs. I'd hoped for it to be out of the way, but it looks like you still picked up on it. I'll see what I can do for future videos!

    • @bahrikeskin5824 1 year ago

      @NERDfirst Thanks