5-Stage Pipeline Processor Execution Example

  • Published: 26 Nov 2024

Comments • 66

  • @richardhall9815 5 years ago +122

    I didn't know Tom Hanks made videos about instruction pipelining in his free time!

    • @zackjohnson9387 4 years ago +3

      tony stark*

    • @glykeriatheodorou7586 4 years ago +2

      hahahaha actually their voices are very similar and I just noticed it 😂😂😂

    • @jilha1122 3 years ago +2

      Wow now that you mention it...

    • @Theehannle 3 years ago +2

      @@zackjohnson9387 Tony Hanks

  • @Spinogrl2000 2 years ago +20

    Thank you so much! YouTube has taught me more about pipelines and data paths in an hour than my prof. has this whole month. They make it so much harder than it has to be!! Again, thank you!

    • @ad.i 9 months ago +2

      I know it's been a whole year, but do you recall any specific videos that helped explain the topic that you think are definitely worth looking at? If not, that's all good but thanks a lot!

    • @darianxd5508 6 months ago +1

      @@ad.i david and sarah harris

  • @MrICH99999 10 months ago +3

    Still helping me in 2024 - big thanks!

  • @Vwcz 6 years ago +5

    Great video, and active in comments section. Excellent content creator! This is what we need. Thanks

  • @kevinle4766 6 years ago +6

    I am a bit confused as to how the iteration is from 5 to 18?

  • @selvalooks 5 years ago +3

    This is wonderful, a fantastic pipeline explanation!!!!

  • @ahmadahm3513 3 years ago +3

    In the case of forwarding, beq should not wait for data, because it can get the data passed through in the same cycle, unless you are assuming that the EX stage for slt and the ID stage for beq are not happening in the same cycle :)

    • @codewithven9391 2 months ago

      This isn't true. In the SLT instruction, the arithmetic is performed at the beginning of cycle 3, so the result can't be forwarded until the end of cycle 3. BEQ will therefore not have the correct value in cycle 3, but in cycle 4 it can, after the result has been forwarded from SLT (E/M).

  • @MrXinchuan 7 years ago +3

    I don't understand why you stall on the first beq (second instruction) but don't stall the lw (fourth instruction) and instead let forwarding take care of it. The previous instructions, slt and add, both have their results ready after the execute stage.

    • @matthewwatkins88 7 years ago +4

      In the processor this is dealing with, branches are resolved in the decode stage. In this case that means that the value of $t0 is needed in the decode stage. Since the instruction before the branch (the slt) writes to $t0, the value needs to come from the slt instruction. The result of the slt isn't available until the end of the EX stage, so the first beq has to stall a cycle so that it can get the correct value forwarded from EX into the ID stage. The lw doesn't need to stall because it doesn't need the value of $t0 until the EX stage (whereas the branch needed it in the ID stage). In this case, the add instruction has completed the EX stage before the lw enters the EX stage, so no stalling is needed (the value is just directly forwarded).

  • @martint5340 6 years ago +2

    This is awesome. Thank you very much!

  • @Simppi96 7 years ago +7

    This is a really great video, thanks! But I am still not sure on how data dependencies work. How do you know when a command has the data ready for another to use? For example, the BEQ command needs $t0 and it can get it after the SLT command has executed, but the next BEQ command has to wait for the LW command to get to the memory clock cycle. I would be very grateful for an answer, thanks in advance!

    • @matthewwatkins88 7 years ago +13

      This partially depends on the implementation of the pipelined processor. For this example it is assumed that for all instructions that produce data, except for load instructions, the data is available as the instruction moves from the execute (E) to the memory (M) stage. This means that for the slt/beq combination, the SLT produces the data in execute, so that data can be forwarded from the beginning of the memory stage to the decode (D) stage (where it is needed for branches). For load instructions, the data is not available until after the instruction accesses the data memory, which means it is only available as the instruction moves from memory (M) to writeback (W). This is why the lw/beq combination has to wait another cycle: it is only as the lw moves into writeback that the data is available to forward to decode.

    • @rekr6381 5 years ago +2

      @@matthewwatkins88 Thanks for this response, very helpful!!
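
The timing rules in the reply above can be sketched as a small Python model. This is only an illustration under the assumptions stated in the thread (full forwarding, ALU results forwardable after E, load results forwardable after M, branches needing operands in D); the function name `stall_cycles` and its parameters are made up for this sketch and do not come from the video.

```python
# Stage positions in the classic 5-stage pipeline: F, D, E, M, W.
STAGE = {"F": 0, "D": 1, "E": 2, "M": 3, "W": 4}


def stall_cycles(producer_is_load: bool, consumer_needs_in: str, distance: int = 1) -> int:
    """Stall cycles for a consumer issued `distance` instructions after its producer.

    Assumption (from the thread): a value becomes forwardable in the cycle
    after the stage that produces it (E for ALU results, M for loads).
    """
    produced_in = STAGE["M"] if producer_is_load else STAGE["E"]
    ready_cycle = produced_in + 1  # the producer is fetched in cycle 0
    # Cycle in which the consumer would reach the stage that needs the
    # value if it never stalled.
    needed_cycle = distance + STAGE[consumer_needs_in]
    return max(0, ready_cycle - needed_cycle)


print(stall_cycles(False, "D"))  # slt -> beq, branch resolved in decode: 1
print(stall_cycles(True, "D"))   # lw  -> beq, branch resolved in decode: 2
print(stall_cycles(False, "E"))  # add -> lw, address needed in execute: 0
```

Under the same model, passing `consumer_needs_in="E"` for a branch shows why a processor that resolves branches in execute would not need the slt/beq stall.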

  • @boathecat919 6 years ago +2

    For when neither branch taken, why does the last instruction "add $v0, $s0, S0" have no cycle?

    • @codewithven9391 2 months ago

      Because it's outside the loop. Only the ones inside the loop are considered for this problem. We are determining the overall CPI for the loop

  • @_nognom 6 years ago +3

    The value for $t0 from the SLT instruction should be ready to forward at the later half of stage E, which is right before the early half of stage E for the BEQ instruction, which suggests that value for $t0 will be forwarded to the ALU instead of requiring a stall. Is this not correct?

    • @matthewwatkins88 6 years ago +3

      No, this is not correct. The result of the SLT (or any instruction computed in the execute stage) is only ready at the end of the cycle and so can really only be forwarded at the beginning of the next stage (the memory stage). Additionally, the BEQ needs the value for $t0 in the decode stage since it resolves the branch in this stage. This means the branch can not properly complete the decode stage until the previous SLT has completed the execute stage. *If* the branch was resolved in the execute stage (which is not the case here), then a stall would not be necessary as forwarding would take care of the dependency.

    • @trumpetperson11 4 years ago +1

      @@matthewwatkins88 I had a similar thought. It seems that I have been told that you can forward the data directly to the ALU (or more precisely, the register in between the D and E stages) for the calculation (overwriting the data received from the register in the D stage). This would give you the time to not require a stall there.
      Is this just not correct?

    • @ahmadahm3513 3 years ago

      @@matthewwatkins88 that's not really true, because the result of each stage could be ready in the first half of the cycle, like the WB stage

  • @wendyli6238 6 years ago +4

    Are we using forwarding in this problem? I'm confused on when the next instruction should start if we are using forwarding

    • @matthewwatkins88 6 years ago

      The example definitely assumes forwarding. I'm not 100% sure what you mean by "start." The processor fetches the next instruction the next cycle. If there is a dependency that forwarding can't handle, then the processor will stall the necessary stages (stalling is shown in the example by stages shown in '()', such as (F)).

    • @wendyli6238 6 years ago

      What I meant by start was where the next instruction would begin F,D,E... if we didn't use forwarding but needed information from the previous instruction.
      If we were not using forwarding and needed information from a current register in the next instruction, we wouldn't decode the next instruction until after the current instruction finished its memory stage?

    • @matthewwatkins88 6 years ago +1

      If there was no forwarding at all, the dependent instruction wouldn't truly start decode until the previous one was in writeback (assuming writes to the register file appear to happen before reads, which is what is assumed in the video). Data is only written to the register file in writeback, so, without forwarding, it wouldn't be available until then.

    • @wendyli6238 6 years ago

      Thank you that is very helpful! :D
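
The no-forwarding case discussed in this thread can be sketched the same way. This is a minimal Python illustration, assuming no forwarding at all and a register file whose write appears to happen before the read in the same cycle (the assumption named in the reply above). The helper names `no_forwarding_stalls` and `timeline` are hypothetical, and the parenthesized stall notation is borrowed from the `(F)` convention mentioned earlier in the thread.

```python
STAGES = ["F", "D", "E", "M", "W"]


def no_forwarding_stalls(distance: int) -> int:
    """Stalls for a consumer `distance` instructions behind its producer.

    Assumption: the consumer may read the register file in the same
    cycle in which the producer is in writeback, but no earlier.
    """
    writeback_cycle = STAGES.index("W")            # producer fetched in cycle 0
    natural_decode = distance + STAGES.index("D")  # consumer's decode if unstalled
    return max(0, writeback_cycle - natural_decode)


def timeline(distance: int) -> list:
    """Consumer's stage in each cycle after its fetch, stalls in parentheses."""
    stalls = no_forwarding_stalls(distance)
    return ["F"] + ["(D)"] * stalls + ["D", "E", "M", "W"]


print(no_forwarding_stalls(1))  # back-to-back dependency: 2
print(" ".join(timeline(1)))    # F (D) (D) D E M W
```

With three or more instructions between producer and consumer, the model reports no stalls, since the producer has already reached writeback by the time the consumer decodes.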

  • @albaraam1873 1 year ago

    I think in the third case you meant first branch (beq $t0,$0, end) is taken only

  • @CrashOverride332 2 years ago +1

    This went way too fast for me. I kept having to rewind.

  • @owenzhang7503 7 months ago

    If we don't have the last line, what will the pipeline be? Can we begin the IF of the first loop line directly in cycle 14?

    • @matthewwatkins88 7 months ago

      The last line, as I interpret it anyway, is never executed, so removing it really wouldn't change anything.

    • @owenzhang7503 7 months ago

      @@matthewwatkins88 I see. Thank you very much!

  • @Manas09rai 3 years ago

    Hey, I just wanted to ask: if an add instruction was dependent on an ld or lw instruction prior to it, would there be the same 2-cycle stall as there was for the beq instruction that was dependent on the lw instruction?

  • @perionan7281 7 years ago +4

    OH MY GOD!! THANK YOU

  • @motorheadbanger90 6 years ago +3

    There are a lot of things you are saying that contradict my teachings and readings on this matter. Can you please explain to me what you define as the following:
    1) What is "branch taken/not taken"
    2) What is forwarding
    Additionally, are you saying that the resource in t0 cannot be accessed by the subsequent instruction until the memory stage of the previous instruction? And we have forwarding in this problem? Assuming yes, then your understanding of forwarding, and my understanding of forwarding contradict. Can you help explain?

  • @mahanteshmise6930 5 years ago

    Between instructions 3 and 4, there must be a stall at decode for instruction 4. Correct me if I am wrong.

  • @jayz6698 6 years ago

    Why does the iteration of 14 include the first W but not the last W (between 5 and 17)?

    • @matthewwatkins88 6 years ago +3

      As is noted in the comment for the video, there is a slightly updated version of this video (ruclips.net/video/Bj_BZ_d0OkU/видео.html). The CPI calculation shown is correct, but, as you note, the line at ~7:00 should extend to cycle 18, for a total of 14 cycles. (Also, the W in cycle 18 for the slt should really be an M.)
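
The cycle count in the reply above is an inclusive range, which is easy to get off by one. A short Python sketch of the arithmetic follows. The cycle numbers 5 and 18 come from the thread; the instruction count of 7 is purely hypothetical and is not taken from the video, so the printed CPI is only an illustration of the formula.

```python
def cycles_inclusive(first_cycle: int, last_cycle: int) -> int:
    """Number of cycles in the range [first_cycle, last_cycle], both ends counted."""
    return last_cycle - first_cycle + 1


def cpi(cycles: int, instructions: int) -> float:
    """Cycles per instruction for one loop iteration."""
    return cycles / instructions


loop_cycles = cycles_inclusive(5, 18)
print(loop_cycles)  # 14, matching the total given in the reply

hypothetical_instruction_count = 7  # assumption for illustration only
print(cpi(loop_cycles, hypothetical_instruction_count))  # 2.0
```

Counting only to cycle 17 would give 13 cycles, which is the off-by-one raised in the question.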

  • @yogeshbalbehra8930 6 years ago

    What are the stages in a typical four-stage CPU pipeline, and what's the purpose of each stage? This question was on my exam. Can you help me with the answer?

  • @pacifiky 8 months ago

    This is so cool

  • @a96185e 5 years ago +2

    this is fantastic :)

  • @zachnanabooboo517 6 years ago +1

    Didn't know mike greenberg knew mips

  • @mohammadrezabaqery7492 5 years ago

    did you forget to resolve a dependency between add and lw?
    add $t0, $s3, $s4
    lw $t0, 0($t0)

    • @dmm2708 4 years ago

      there is a dependency but it doesn't change the outcome

    • @pavuluriviratchowdary4480 4 years ago

      Because t0 is already executed in first instruction so there is no need for the processor to run it second time.

  • @eduardomiguelsalaspalacios3325 5 years ago

    My buddy says you made a mistake. Is that true? What do you think?

  • @mehmetb8703 6 years ago

    nice tutorial

  • @profitjourneywithsk2136 5 years ago

    Good video

  • @x3axDev 6 months ago

    thank you tony stark

  • @PEGuyMadison 3 months ago

    Oh come on throw those branch delays in and show how inefficient RISC code is.

    • @matthewwatkins88 3 months ago

      When you say "RISC" do you mean actual RISC? If so, actual RISC code is equivalent to what is shown. If you are referring to the MIPS code, then yes, real MIPS code would change the performance, but it doesn't necessarily destroy it.

  • @ağırsağlam 6 years ago +1

    Bro, we'd sacrifice ourselves for your balls :D

  • @sukrusekeroglu 4 years ago +6

    ohh no offence but i am happy to hear non-indian accent, I said oh god thanks in the beginning of the video

    • @FelixTheForgotten 1 year ago +1

      Sometimes I am so desperate I try to understand Vietnamese videos to study. Always feels good to find an English video even though it isn't my first language