Distributed Systems 6.2: Raft

  • Published: 2 Jan 2025

Comments • 41

  • @renanreismartins
    @renanreismartins 1 month ago

    This subject could not have been explained better, Martin. Thank you for your service to the computer science community. This is true gold!

  • @lenni8545
    @lenni8545 2 years ago +18

    Thank you so much for sharing your lectures. I took a similar course at my university and your videos have been a great supplement and you helped me get a top grade! So thank you so much Martin for sharing your knowledge and explaining concepts in a simple way without skipping the details. It is much appreciated! :-)

  • @tanmaymehrotra86
    @tanmaymehrotra86 2 years ago +11

    There are many Raft videos on the internet which show you some fancy animations, but none of them is even close to this. This is purely brilliant. Yes, it may happen that one cannot get the entire content in one shot. I watched it multiple times and I am pretty confident that I can now explain Raft to others. Thanks a lot, Martin, for this lecture series. It answered many questions that I had been trying to get answers to for a long time.

  • @JohnCDSMB
    @JohnCDSMB 4 months ago

    After watching many videos, I finally found the explanation of an understandable protocol given in a very understandable way. Thank you very, very, very much

  • @mohamed-gara
    @mohamed-gara 1 year ago +1

    After reading the Raft paper and watching the original videos presenting the algorithm, I started to look for a basic implementation in Java or any other language. But the pseudocode in this video is by far the best approach.

  • @programming6881
    @programming6881 1 year ago +2

    It is a very tricky algorithm with lots of tricky cases. You have done an excellent job of explaining it. Thank you.

  • @chasing_the_horizon
    @chasing_the_horizon 2 years ago +1

    It was an absolutely marvelous explanation of the algorithm!

  • @yujiaqiao4885
    @yujiaqiao4885 2 years ago +1

    33:07 The line `ack >= ...` reminds me that we don't assume the link that messages are sent on is FIFO, do we? I was just under the impression that TCP is used.

    • @gauravkondhare3605
      @gauravkondhare3605 7 months ago

      I've had the same thought, and figured that since TCP is reliable and delivers data in order, we can assume that the order will be maintained. Do you have any explanation as to why the pseudocode has been designed this way?

    • @ArsyadKamili
      @ArsyadKamili 5 months ago

      @gauravkondhare3605 The strictest network model used in the course is Reliable Network, which only guarantees that message m is received iff it is sent, but messages may be reordered.
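
A minimal sketch of the reordering point discussed in this thread, assuming the leader keeps one acknowledged log length per follower. The names `on_log_response` and `acked_length` are hypothetical and only loosely follow the lecture's pseudocode. Under a network that may reorder messages, an old acknowledgement can arrive after a newer one, so the `ack >= ...` comparison keeps the recorded value from moving backwards.

```python
from typing import Dict


def on_log_response(acked_length: Dict[str, int], follower: str, ack: int) -> bool:
    """Record an acknowledgement unless it is older than one already seen."""
    if ack >= acked_length.get(follower, 0):
        acked_length[follower] = ack
        return True
    return False  # stale ack that was overtaken on the network


acked: Dict[str, int] = {}
# The ack for 3 entries overtakes the earlier ack for 1 entry.
results = [on_log_response(acked, "B", 3), on_log_response(acked, "B", 1)]
print(acked, results)
```

With a FIFO transport such as a single TCP connection the second call would not normally occur, but the guard costs nothing and also covers reconnects and retransmitted responses.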

  • @wakandavernon1412
    @wakandavernon1412 4 months ago

    Can we use this algorithm in wireless sensor networks?

  • @tarunpahuja3443
    @tarunpahuja3443 8 days ago

    Let's say there are three nodes A, B and C, with A as the leader. If there is a partition such as (A, B) and (C), C will keep incrementing its term number without becoming a leader, whereas (A, B) will keep committing log entries. What happens when the partition resolves? Since C's term is higher, A will become a follower. Does that mean A will lose its committed entries? I can't find the answer in the codebase; can someone please point it out?

  • @tysonliu2833
    @tysonliu2833 7 months ago +1

    Why does the leader need a quorum to deliver on slide 9, while on slide 7 the follower can just commit?

    • @jl1835
      @jl1835 7 months ago +1

      Because the leader is the only node that can decide which log entry is ready to be committed, by checking whether more than half of the nodes have already acknowledged this log entry (quorum).

    • @tysonliu2833
      @tysonliu2833 4 months ago

      OK, it seems slide 7 delivers log entries up to the leader's commit point, while slide 9 delivers the log entries for the first time.
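
The split described in this thread can be sketched as follows. This is a rough illustration, not the lecture's code: the function names are hypothetical, and it assumes `acked_length` has one entry per node, including the leader itself. The leader advances its commit length only when a majority has acknowledged that prefix; a follower simply delivers up to the commit length the leader reports.

```python
from typing import Dict, List


def leader_commit_length(acked_length: Dict[str, int], commit_length: int,
                         log_length: int) -> int:
    """Largest log prefix acknowledged by a majority of all nodes."""
    quorum = len(acked_length) // 2 + 1
    new_commit = commit_length
    for length in range(commit_length + 1, log_length + 1):
        acks = sum(1 for a in acked_length.values() if a >= length)
        if acks >= quorum:
            new_commit = length
    return new_commit


def follower_deliver(log: List[str], commit_length: int,
                     leader_commit: int) -> List[str]:
    """Entries a follower delivers once the leader reports a larger commit length."""
    if leader_commit > commit_length:
        return log[commit_length:leader_commit]
    return []


acked = {"A": 5, "B": 3, "C": 1}  # A is the leader with 5 log entries
new_commit = leader_commit_length(acked, commit_length=0, log_length=5)
delivered = follower_deliver(["a", "b", "c"], commit_length=0, leader_commit=new_commit)
print(new_commit, delivered)
```

In this example two of three nodes hold the first three entries, so the leader commits exactly those, and follower B delivers them without running any quorum check of its own.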

  • @salad7389
    @salad7389 8 months ago

    super well made, thank you!

  • @oz5219
    @oz5219 3 months ago

    I don't know what I'm missing, but AppendLog seems to be a bit problematic, since a replica will deliver a log entry to its application, which can potentially change the storage state? Imagine there are n replicas and the leader is sending a message to them; after the 1st and 2nd replicas successfully append the log entry, all the other n-2 replicas plus the leader suddenly die. Then the message that was delivered to the application on the 1st and 2nd nodes would be invalid, right?

  • @tysonliu2833
    @tysonliu2833 7 months ago

    What if a node that puts itself forward as a candidate only has a log older than that of half, or even a handful, of the nodes? Since a node only gets to vote once each term, a relatively old node could be elected leader as long as it has votes from some older nodes and the other candidates unfortunately have fewer votes (probably because they became candidates later).

    • @tysonliu2833
      @tysonliu2833 7 months ago

      oh nvm when it sends its advocation to a node with newer data, it will be demoted to follower
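
A rough sketch of the vote-granting rule from the Raft paper that is relevant to this thread. The function and parameter names are hypothetical; logs are represented only by the terms of their entries. A voter refuses a candidate whose log is less up to date than its own, even when the candidate's term number is higher, so a node with a stale log cannot collect a majority from nodes holding newer entries.

```python
from typing import List, Optional


def log_ok(voter_log_terms: List[int], cand_log_length: int,
           cand_last_term: int) -> bool:
    """True if the candidate's log is at least as up to date as the voter's."""
    voter_last_term = voter_log_terms[-1] if voter_log_terms else 0
    return cand_last_term > voter_last_term or (
        cand_last_term == voter_last_term
        and cand_log_length >= len(voter_log_terms)
    )


def grant_vote(current_term: int, voted_for: Optional[str],
               voter_log_terms: List[int], cand_id: str, cand_term: int,
               cand_log_length: int, cand_last_term: int) -> bool:
    """Vote only once per term, and only for a sufficiently up-to-date log."""
    term_ok = cand_term > current_term or (
        cand_term == current_term and voted_for in (None, cand_id)
    )
    return term_ok and log_ok(voter_log_terms, cand_log_length, cand_last_term)


# C was partitioned away: high term, but only one old log entry.
stale = grant_vote(current_term=1, voted_for=None, voter_log_terms=[1, 1, 1],
                   cand_id="C", cand_term=5, cand_log_length=1, cand_last_term=1)
# B has the same log as the voter and asks for a vote in a newer term.
fresh = grant_vote(current_term=1, voted_for=None, voter_log_terms=[1, 1, 1],
                   cand_id="B", cand_term=2, cand_log_length=3, cand_last_term=1)
print(stale, fresh)
```

The same rule bears on the partition question asked earlier in the comments: seeing a higher term makes the old leader step down, but the returning node still cannot win an election without a log at least as up to date as a majority's.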

  • @antonpuhach8005
    @antonpuhach8005 2 years ago +2

    It seems like CommitLogEntries lacks a current-term check, which should prevent the leader from committing entries based only on entries from previous terms (see Figure 8 in the extended version of the original Raft paper).
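
The check this comment refers to can be sketched as below. This is an illustration under assumptions, not the lecture's pseudocode: the function name is hypothetical, the log is represented by the terms of its entries, and `acked_length` is assumed to include the leader. The only change from a plain quorum count is the extra condition that the last entry of the prefix being committed was created in the leader's current term.

```python
from typing import Dict, List


def commit_length_with_term_check(log_terms: List[int],
                                  acked_length: Dict[str, int],
                                  commit_length: int,
                                  current_term: int) -> int:
    """Advance the commit length only through an entry from the current term."""
    quorum = len(acked_length) // 2 + 1
    new_commit = commit_length
    for length in range(commit_length + 1, len(log_terms) + 1):
        acks = sum(1 for a in acked_length.values() if a >= length)
        # Raft paper, Figure 8: a quorum on an old-term entry is not enough.
        if acks >= quorum and log_terms[length - 1] == current_term:
            new_commit = length
    return new_commit


# Leader in term 4 has replicated a term-2 entry to a majority of 5 nodes.
before = commit_length_with_term_check(
    [1, 2], {"S1": 2, "S2": 2, "S3": 2, "S4": 1, "S5": 1},
    commit_length=1, current_term=4)
# After a term-4 entry reaches a majority, everything before it commits too.
after = commit_length_with_term_check(
    [1, 2, 4], {"S1": 3, "S2": 3, "S3": 3, "S4": 1, "S5": 1},
    commit_length=1, current_term=4)
print(before, after)
```

In the first call the term-2 entry stays uncommitted despite its majority; in the second, committing the term-4 entry commits the term-2 entry indirectly along with it.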

  • @dehghanym
    @dehghanym 3 months ago

    Amazing!!

  • @tarunnurat
    @tarunnurat 2 years ago +2

    Thanks for this great lecture! I'm slowly getting a good understanding of Raft now.
    In slide (6/9), I had a question about the 2nd if condition, i.e., the "if term = currentTerm then" condition. You said that this exists because the receiving node might have been a candidate in the same term, and it's now receiving a msg from the leader in that same term, and needs to update itself to be a follower and set its current leader to be the id of the node that it received the msg from.
    Is there any reason this recipient candidate node doesn't set its own 'votedFor' to null and cancel its own election timer, just as it did in the previous if condition? Is this because, as a candidate, the only node you would've voted for is your own node id, and because your own election timer running in the background despite having a leader doesn't have any harmful effects? I would've assumed that, from a practical standpoint, having your own election timer running in the background when you already know there is a leader for the current term would take up unnecessary processing power.

  • @default2117
    @default2117 2 years ago

    Thank you so much for the clear and concise explanation. Really appreciate it.

  • @tysonliu2833
    @tysonliu2833 7 months ago

    Is it possible that a leader commits a change, say [1,2], given a quorum, then goes down; a follower that has yet to commit the change, or even to vote yes to the commit, becomes the new leader with the log [2,1]; and it now advocates committing [2,1] while [1,2] has already been committed by some nodes?

    • @tysonliu2833
      @tysonliu2833 4 months ago

      in that case the new leader will be considered to have the correct value, by slide 7

  • @anrikezeroti4680
    @anrikezeroti4680 2 years ago

    I wonder what the design process of such a complex algorithm looks like.

  • @BlakeDeFi
    @BlakeDeFi 3 years ago +1

    Martin, a question: if I wanted to use Isabelle as a Haskell proof assistant, could I transcribe all the operators and symbols?

    • @kleppmann
      @kleppmann 3 years ago +4

      I'm not a Haskell user so I'm afraid I don't know how it works in conjunction with Isabelle. I believe Isabelle can generate formally verified Haskell code from your Isabelle/HOL definitions, but that's all I know.

  • @mmfStudent
    @mmfStudent 2 years ago +7

    well, this lecture is complex....

  • @akhtarandroid
    @akhtarandroid 2 years ago

    Awesome lecture. It must have taken a lot of trial and error to develop this algorithm right and deal with all the possible edge cases/failure points.

  • @LL-ol8gr
    @LL-ol8gr 3 years ago +1

    Thanks, but I wonder if the lecture can be better presented (like other videos in this series) than explaining the algorithm line by line.

    • @LL-ol8gr
      @LL-ol8gr 3 years ago

      With the current format, it is just hard to get a big picture of how it works. Anyway, it is never an easy task to explain complicated things. I appreciate you making it accessible; big fan of your DDIA book.

  • @yoyocswpg
    @yoyocswpg 7 months ago

    I don't understand why I am paying 2000 for a uni course as an international student when I get to learn all this sh🎉 here with much better quality

  • @lespukh
    @lespukh 2 years ago

    Hi Martin! Could you elaborate on why the leader and also all followers deliver messages to the application? The leader does that on commit, which makes sense. But a follower does that when appending entries to the log, which is confusing.

    • @Rbkbadass
      @Rbkbadass 2 years ago

      I would imagine it is because the followers have their own clients (distributed systems), and when they commit, they update only their own clients. Since the leader is not connected to the followers' clients, each follower needs to update its own clients, but only after the leader has committed, as this is needed to ensure total ordering. Think of it as the different nodes being connected to different datacenters around the world. The leader is in the US, and one of the followers is in Europe. When the leader commits, the changes are only visible to clients in the US. However, when the follower in Europe commits, they become visible in Europe as well.

    • @lespukh
      @lespukh 2 years ago

      @Rbkbadass oh, so that's why. Thank you!

    • @gauravkondhare3605
      @gauravkondhare3605 7 months ago

      I think of it with the example of KV store clients running on different nodes that use the Raft algorithm. So whenever we are ready to commit, we are basically adding the entry (delivering the log) to the KV store client on that node.
      Am I correct to assume the above?

  • @n_fan329
    @n_fan329 2 years ago +1

    My Brain is spinning 🤯

  • @avejantzero9090
    @avejantzero9090 3 years ago

    There is a problem with the video timestamps: they are missing for Raft 1/9, Raft 5/9 and Raft 9/9.

  • @weiboliu6095
    @weiboliu6095 2 years ago

    I believe that at 31:31 the 5th line, `ackedLength[follower] := ack`, should be `ackedLength[follower] := ack - 1`.
    My code works with the `ack - 1` solution.
    Thanks for sharing. :-)