I can see how passionate you are about the subject and how well you know it. Great job referring to the paper and exploring the idea all throughout. Thank you.
Glad you enjoyed it!
It's easy to get lost in the details when you are studying Paxos. This video strikes the right balance between depth and breadth. Great video, Chris!
Thank you Chris, you are a great teacher! I love how you make it more human and easy to understand.
I appreciate that!
This is fantastic! Thank you so much for creating this class Chris!
The only video that I not only followed but also comprehended.
Glad it was helpful!
Thanks for the video!
I have a question: at timestamp 24:36, it is mentioned that if an Acceptor has heard an Accepted message before, it piggybacks that value in the response when some other node tries to propose a value. Why can't we do this in the prepare phase instead? We could piggyback node 1's value at timestamp 23:05 as well, right? Since node 2 and node 3 hear a prepare message from node 1, when node 2 tries to send the prepare message to node 3 in this example, could we piggyback on the promise response from node 3, saying "I heard another proposer propose a different value in the last prepare message"?
@DistributedSystems
I think explaining distributed systems in the context of Blockchain, Cryptocurrency, DAPPS and DAO can be really attractive to a lot of prospective viewers.
Hey. I am really thankful for this. I wanted to publicly comment. Really good video.
Thanks!
This is best explanation of Paxos. Thank you.
Glad it was helpful!
"The Part-Time Parliament" is actually the better paper to read. It's more long-winded, but it explains the algorithm much more thoroughly and comprehensively, and also builds intuition for why it works.
Best example of a distributed systems algorithm!
Thanks man, it means a lot: "Just get the essence; the rest is easy to follow." Thanks man, I really mean it.
Holy hell, this video cleared so much up for me. Thank you very much.
Wonderful presentation! Loved the personification of the nodes in the cluster. :)
Thanks! I recommend watchers have some coffee and then you are good to go! So clear!
Except for the fact that he just showed the system's vulnerabilities at 17:30: identifiers must be unique, otherwise the protocol stops working. So what if a malicious system uses the same identifier? Wouldn't that make a single system able to crash the entire network?
Supremax67, yes, recall that Paxos explicitly depends on fail-stop behavior of its participants. This means that a node fails only by crashing. A node that fails by corrupting metadata or by no longer adhering to the protocol can break the whole system. These kinds of (Byzantine) failures are indistinguishable from a purposefully malicious participant. Byzantine failure consensus algorithms are available, but are much more expensive to run.
@@RobertLeeAtYT -- I wouldn't say they are more expensive to run. In fact, there's an aBFT algo available this year that does it at micro cost.
Always on the lookout for other projects; unfortunately, not many thus far inspire a long and bright future.
Supremax67, can you share a link to the paper?
Hi! Thanks for the fantastic video! At 22:01, shouldn't there be a unidirectional arrow from 1 -> 3 in the Accepted(42.1, "burger") phase instead of a bidirectional one between 1 and 3? Since 3 was never able to promise and thus could not accept, it shouldn't be able to send Accepted(42.1, "burger"), no?
Start @12:00
At 23:48 you said that the Accepts are getting sent to all other nodes too. That means that Node 2 would get the info from Node 3 even if Node 1 stops... why didn't you mention that?
What a fantastic explanation. You're the man 💪
I have checked out a couple of videos, and this is the only one that is in plain English without any complicated words.
Do you have any content about Raft?
Not sure about the marching band, but you, my friend, definitely deserve a burger and a pizza. Thanks for teaching.
Any time!
Thanks for this wonderful explanation of this algorithm. I was really struggling to understand it, but now I have a good idea of it (Multi-Paxos is still confusing, but at least basic Paxos is clear to me).
Glad it was helpful!
Thank you for sharing your knowledge and insights.
My pleasure!
thank you! the analogy makes it much easier to understand this concept!
Glad it was helpful!
Incredibly simple explanation!!
Thanks!
Thank you so much, this video was very useful.
Great video! But I think you'd better introduce the "Learner" role too, otherwise I don't know who makes the consistency decision. Don't you think?
I'm attempting to simplify things to make it easier to get the essentials -- but it is quite likely my explanation can be improved (perhaps by not simplifying as much?).
Would you mind giving some advice on other papers about how to implement Paxos in a state machine, or Multi-Paxos? As you said in the video, the paper "Paxos Made Simple" doesn't give many details about that. Thank you so much. Your video is great! I am very interested in distributed systems and binge computer science content; your videos are very helpful.
If I knew some good implementation papers I'd point you at them. But sadly I don't (not that they don't exist, but because I've not read as much of the literature as I'd like to). Most of the highest performing Paxos implementations have been done in industry environments, and their best performance tuning techniques are either specific to their proprietary execution environments, trade secrets, or both. In general most performance optimizations come down to: (a) study your workload; (b) observe how Paxos tries to solve all problems for everyone, but you only have performance difficulties with a small subset of the problem space it covers; and (c) find a clever way of making your common case go faster, perhaps at the expense of correctness in a not-going-to-happen-to-you corner case, or at the expense of a less common case.
One example (which I've not studied in detail, but is written by folks I know and trust) is here, where these researchers show how you can optimize for read performance at the cost of availability and write latency: www.pdl.cmu.edu/PDL-FTP/associated/moraru-socc14.pdf
If your goal is to implement your own consensus protocol, I'd actually point you at Raft. Partially because it is easier to understand and implement, and also because there already exist strong open source implementations you can study, learn from, and improve.
Thank you so much for your very detailed answer. I will continue watching your videos. They're really helpful. I have been struggling to understand Paxos for a month. Your explanation is very clear and I love it.
Very good explanation, congratulations
Thanks for the video, it was very clear!
Very nicely explained. Thank you so much. One minor query: why do we need to run multiple proposers? Meaning, what is the benefit of running two proposers? Maybe to expedite progress, or something else?
Typical implementations of Paxos use a slightly more complex protocol that includes optimizations such as the one you propose. You are right: you can save some serious time by having only one proposer which keeps on issuing proposals (the leader), and only having other participants propose when they don't hear from that leader for a while. They rely on the basic Paxos protocol outlined here to ensure correctness (they revert to Paxos if machines fail, for example) but otherwise take shortcuts to make it go faster and have lower network overheads in the common case.
If you want to read up on this, start by Googling "Multi Paxos".
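In case a sketch helps: here is roughly what that steady-state shortcut could look like (illustrative Python only, not a real implementation; `broadcast` and the message tuples are placeholders I made up):

```python
class MultiPaxosLeader:
    """Sketch: the leader runs Prepare once, then reuses its leadership
    for many values ("slots") by skipping straight to the Accept phase."""

    def __init__(self, node_id, peers):
        self.peers = peers
        self.round = (1, node_id)  # (counter, server id), as in the video
        self.slot = 0              # index of the next value to agree on

    def become_leader(self):
        # Phase 1 (Prepare/Promise) runs once, not once per value.
        promises = broadcast(self.peers, ("prepare", self.round))
        return len(promises) > len(self.peers) // 2

    def propose(self, value):
        # Steady state: go straight to Phase 2 for each new slot.
        acks = broadcast(self.peers, ("accept", self.round, self.slot, value))
        if len(acks) > len(self.peers) // 2:
            self.slot += 1
            return True
        return False  # lost leadership: fall back to full basic Paxos
```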
@@DistributedSystems Thanks for the prompt reply. I will do some more reading on multi-paxos
Hello, Thank you so much for this explanation.
Really made it clear. :)
Much better than the paper I read.
If the paper was "Paxos Made Simple", then yes, that paper is legendarily difficult to read. My goal was to come up with an easier-to-digest version of the same material, and I'm glad I hit the mark here.
Hi, I just noticed in your third example: shouldn't the second proposer stick to pizza, since (43.2) > (42.1)?
I am assuming that by third example you mean the "Failures: Proposer in Accept Phase" example.
This is actually one of the places where the magic of Paxos comes in. You would think that a proposer should always stick with what they were proposing. But with the wrong combination of failures, this could result in a situation where the system deadlocks, and no agreement is ever reached.
To avoid this deadlock, if a proposer finds that a prior proposer has already gotten any nodes to agree on an outcome, it simply tries to keep getting more nodes to agree to that same outcome, instead of proposing something new. That way, even if proposer nodes keep on repeatedly failing (say, if they have a bug in the code...), the system keeps moving forward and getting closer to consensus, instead of repeatedly starting the process over.
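A compact way to state that rule, as a Python sketch (my own naming, not from the paper):

```python
def choose_value(promises, my_value):
    """Each Promise carries either None or the (round_id, value) that the
    acceptor last accepted. If any prior acceptance exists, the proposer
    must adopt the value from the highest-numbered one."""
    prior = [p for p in promises if p is not None]
    if not prior:
        return my_value  # nobody accepted anything yet: free choice
    _, adopted = max(prior, key=lambda p: p[0])
    return adopted       # re-propose the prior value under our own round
```

Note that the round ID in the subsequent Accept message is still the proposer's own (e.g. 43.2); only the value is inherited.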
In the Paxos algorithm the system will agree with the majority, but in the real world sometimes the majority could be wrong. Assume a banking system which needs to balance the customers' accounts: if the proposer sent the wrong value and died (which is the case you talked about above), and then another server becomes the leader/proposer and sends the right value, will the wrong value still be chosen by the learners?
I get your point, because I was confused about him saying it would pick the biggest ID. But after repeating this part a couple of times, I found that what he meant by "it needs to look at all the promise responses it got" is that the proposer will compare all previous IDs, excluding its own, and pick the greatest one. Since here we only have 42.1, it adopts that proposal's value under its own round 43.2. A more general example: if PrevProposals = {41.3, 42.1, 40.4} and the current proposal is 43.2, then it will pick 42.1 from the PrevProposals set and re-propose its value as its own proposal 43.2.
On the 'Alternatives to Paxos' slide, Byzantine fault tolerance is said to require an exponential number of messages in the number of nodes. A blockchain-based approach seems to require only linear scaling of messages with nodes (which you do touch on). Is there something else being traded off in this case?
And thanks for the great video!
It is hard to directly compare proof-of-work blockchain and BFT consensus (as I'm talking about here), since they are really solving different problems. BFT consensus is saying "if I have a number of nodes under my control, and assume that some small number of them go insane and are out to get me, can I still reach consensus?" And the answer is yes -- and the solution is pretty darn fast (but not as fast as Paxos, which just assumes that nodes die instead of going insane). Proof-of-work blockchain is saying "if I have a huge number of nodes, of which only a tiny number are under my control, can I eventually get everyone to agree if I provide the right system of incentives, and use up so many resources that no evil-doer could possibly use more resources than me to mess me up?" And the answer is closer to "if you wait a massively long time (at least, compared to how long Paxos or BFT would take to reach consensus) and use an unimaginable amount of compute power (once again, compared to Paxos and BFT solutions) it works almost all the time."
In most problem domains, that is not a great answer, and Paxos is a vastly simpler and better solution. But, if you are really trying to build a cryptocurrency with distributed trust, proof-of-work blockchains will work, while Paxos/BFT will not.
For more details, see my other videos. I describe Byzantine Fault Tolerance in more detail in one. I also have a series of videos explaining how blockchains can be used to achieve consensus (and comparing them to Paxos/BFT).
@@DistributedSystems Thanks, I'll give your other videos a watch too. It's interesting because I imagine that although they seem like a different set of problems, there's some scale of tradeoffs across dimensions such as latency, effort, 'secureness', etc.
At 24:45, isn't the largest N 43.2? Doesn't that mean it should pick pizza and NOT a burger? I'm confused.
At 24:45 proposer 2 has sent out a Prepare message with round ID 43.2, and gotten a Promise response from node 3 saying "hey, we already started Paxos in round 42.1 and decided on burgers!" Because of this, as I explain at 25:00, proposer 2 changes from trying to get consensus on pizza to trying to get consensus on burgers. It does this by sending out an Accept message (continuing round 43.2) with burgers as the meal to be eaten.
@@DistributedSystems Interesting, so at this point I guess Node 2 is only playing the role of a Proposer. Not proposer AND acceptor?
@@mohdirteza6260 At that point Node 2 is playing both the Proposer and Acceptor roles (although I don't show the messages the machine sends to itself). So Node 2 will also get an Accept(43.2, burgers) message (or act as if it has gotten such a message, without actually sending it).
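In code, that "act as if it has gotten such a message" shortcut is usually just local delivery. A sketch with made-up helper names (`send`, `on_accept`):

```python
def broadcast_accept(node, round_id, value):
    msg = ("accept", round_id, value)
    for peer in node.peers:
        node.send(peer, msg)  # real network messages to the other nodes
    node.on_accept(msg)       # local delivery: this node is an Acceptor too
```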
Thanks for the fantastic explanation!
You're very welcome!
Good explanation, but the pizza and burgers images made me quite hungry
Thanks a lot! Great class.
Very thankful for this video!!
Thanks!
Really great video, thank you, sir.
Very welcome
Hi. Great video!
When you talk about the bully algorithm, you introduce it by saying that if a proposer wants to propose something, it starts an election. But by using the bully algorithm, that node might never win and might never get to propose its values. Any thoughts on how this can be mitigated? I see others suggest using an exponential back-off to fix the multiple-proposers problem, but that could potentially lead to quite a bit of performance degradation, I guess.
The bully algorithm is not meant to be fair -- you are right, the biggest bully may always win and nobody else gets to win. But that is okay, since the purpose of using the algorithm is to simply ensure there is *some* winner. The losers can simply forward any values they want proposed to the winner and get them recorded by Paxos. (Paxos assumes that the nodes are cooperating with each other and want to help each other -- it is not meant for an adversarial system. See the video on Byzantine Fault tolerance or the Bitcoin Consensus videos for algorithms which cover those cases.)
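A minimal sketch of that division of labor (all helper names here, like `send_election_and_wait` and `forward_to`, are hypothetical):

```python
def run_election(my_id, all_ids):
    # Bully election: challenge everyone with a bigger ID. If none of them
    # answers, I win; otherwise some bigger bully will announce victory.
    higher = [i for i in all_ids if i > my_id]
    if not any(send_election_and_wait(i) for i in higher):
        return my_id
    return wait_for_victory()

def submit(value, my_id, leader_id):
    if leader_id == my_id:
        propose_via_paxos(value)      # the winner proposes directly
    else:
        forward_to(leader_id, value)  # losers hand values to the winner
```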
24:05 If the promise response has extra information, is it a must to override with the previous proposal's value?
Yes. This is what allows Paxos to successfully terminate if the proposer repeatedly crashes, as it continues to propagate a previous decision to more acceptors.
Note that if more than one promise response is received with extra information, the proposer has to decide which one to use.
@@DistributedSystems awesomE!
Super nice explanation - thanks.
Thanks!
Hi Chris, at 29:52 in the video you talked about server IDs. Are those the same IDs as 42.1 or 43.2? Those unique IDs...
42.1 or 43.2 are examples of "round identifiers". Every time you run the Paxos algorithm to come to consensus on a new thing you choose a new round identifier. But as I've created them, the round identifiers have two parts -- a unique id (such as 42, or 43) concatenated with a server id (server 1, server 2, server 3...).
So when I talk about a server id at 29:52, what I'm talking about is some unique number or string which identifies each server. Could be an IP address, could be a hostname, could be an IP+pid (if you run more than one Paxos instance per machine for some reason). The important properties being that these IDs have to be unique, must not change (be reassigned) while you run the bully algorithm, and they have to be ordered (so you can compare them and see who is biggest).
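One concrete way to get identifiers with exactly those properties is to represent them as (counter, server_id) pairs; for instance, Python tuples already compare element-wise, so the counter dominates and the server ID breaks ties:

```python
r1 = (42, 1)  # "42.1": counter 42, proposed by server 1
r2 = (43, 2)  # "43.2": counter 43, proposed by server 2

assert r2 > r1                        # 43.2 beats 42.1
assert (42, 2) > (42, 1)              # same counter: bigger server id wins
next_round = (max(r1, r2)[0] + 1, 1)  # server 1 starts a newer round, "44.1"
```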
Thanks Chris for the clarification... It's clear now...
What if your server ID is smaller than the others'? Can you still be the leader? In this situation, how is the leader selected?
Hello
Thank you for sharing the video. It was great
Glad you liked it!
Assuming you start a new network and all nodes have ID 1: how will Paxos decide on a proposer?
Every node would send out prepare messages and also return promises, since there is no node with a higher ID, leading to every node believing it's a proposer. What does the protocol say in that case? I know Raft solves this with randomized timers on when nodes send out prepare messages, to avoid electing multiple proposers. Thanks in advance!
The short answer: you don't.
One of the assumptions that Paxos has is that your nodes all have unique IDs assigned to them. If you violate that assumption, the algorithm doesn't work. Fortunately, coming up with a technique for assigning unique IDs to nodes is fairly easy. (For example, if you are on an IP network, then you could just use the IP addresses of each machine as an ID. You could also use the MAC address of an ethernet card, the serial number of a CPU or motherboard, or even just manually configure your servers (if that is acceptable for your application).)
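For instance, here is one quick way to grab a usable ID on an IP network (plain standard-library Python; 8.8.8.8 is just an arbitrary routable address, and no packets are actually sent):

```python
import socket

def node_id():
    # Connecting a UDP socket sends nothing; it only makes the OS choose
    # the local address it would use, which we then read back.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect(("8.8.8.8", 80))
        return s.getsockname()[0]  # e.g. "192.168.1.7"
    finally:
        s.close()
```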
Learned a lot .... thank you so much
You're welcome!
Hey there, I noticed at ~10:38 on the slide you've written that Paxos sends a linear number of messages (n) with n nodes. I also noticed, though, that during a round all nodes will send an `accept()` packet to all other nodes in the cluster. Isn't this therefore n^2 messaging?
I was also curious about this.
Found this online cited to Lamport's Fast Paxos paper: "Instead of each acceptor sending Accepted messages to each learner, acceptors can send their Accepted messages to the leader and the leader can inform the learners when a value has been chosen. However, this adds an extra message delay"
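For concreteness, the arithmetic behind that trade-off (cluster sizes made up):

```python
a, l = 5, 100  # acceptors, learners
print(a * l)   # 500 messages, one delay: every acceptor tells every learner
print(a + l)   # 105 messages, two delays: acceptors -> leader -> learners
```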
Why do some models of Paxos explicitly list a "Learner" as a third role? (i.e. Proposer, Acceptor, and Learner).
The nodes which adopt the Proposer and Acceptor roles store the state of the system. They have to have storage, be under your administrative control, and if you want to achieve consensus quickly they have to have a fast network connection to the other nodes. As you add more acceptors, you improve your resistance to failures -- but you also make the system slower (since you have to send messages to more nodes, and have a greater chance of encountering a long tail "slow node" which holds you up).
Sometimes you want more nodes. Nodes which know the state of the system, but don't have to bother storing that state and serving it to others. Or perhaps those additional nodes belong to someone else who you don't fully trust to not get hacked. Or perhaps they have cruddy net connections (such as due to being mobile devices, or being far away). Any node which simply needs to know the state of the system, but isn't participating in and storing the consensus itself, can use the Learner role to get that state.
To give a concrete contrived example: let's say you ran a super-reliable chess service. Every time you get a move from a player, you run a round of Paxos on your servers, and all N of your servers store the new board state. If any one of your N servers goes down, the chess games can still be played with the remaining N-1 servers while you try to bring the dead server back to life. All is good. Except -- the chess players need to know the state of the game so they can submit new moves to your service. And any audience members also need to know the game state. One way to implement this would be to have the players and audience members act in the Learner role, so they know exactly what is going on, but they can't cheat and try to change the board state, as they are not in a Proposer or Acceptor role.
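A toy sketch of what such a Learner could look like (my own message format, purely illustrative):

```python
class Learner:
    """Passively watches Accepted messages; never proposes or accepts."""

    def __init__(self, n_acceptors):
        self.n = n_acceptors
        self.votes = {}     # (round_id, value) -> set of acceptor ids
        self.chosen = None  # e.g. the latest agreed-upon board state

    def on_accepted(self, acceptor_id, round_id, value):
        key = (round_id, value)
        self.votes.setdefault(key, set()).add(acceptor_id)
        if len(self.votes[key]) > self.n // 2:
            self.chosen = value  # a majority accepted it: safe to display
```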
@@DistributedSystems Great, thank you for the response!
Very helpful, super like!
Hello Chris! You are doing an amazing job! I would love to see you do a lesson on the CAP Theorem. Have a nice day!
Here you go: ruclips.net/video/k-Yaq8AHlFA/видео.html
Why is the "really good burger place" showing a picture of a big mac...
Good question. I don't have a great answer, sadly, as I agree with you. :-)
Wow, everything is so clear.
Thanks!
Automated democracy is the future
ty
This is great!
Thank you!
Amazing! Thanks so much for making these videos
Very clear
Thanks!
Nice!
Put the server in space with SpaceX.
I'm here after reading Paxos Made Simple.
Hopefully this helps make it easier to grok the paper. (The paper is quite challenging, in spite of its title.)
The story around 5:00 is incorrect to my knowledge.
Oh no! Which parts did I get wrong? (I based my story partly on Lamport's telling here: lamport.azurewebsites.net/pubs/pubs.html#lamport-paxos).
@@DistributedSystems I was wrong, you were right :)
@@pawelszulc84 Whew. Thanks for keeping me honest!
Disclaimer: the Paxos algorithm doesn’t help girls figure out what they wanna eat 😩😅
wtf
A pig that goes for a burger is a bit odd xD
Yeah! I ride Raft instead of Paxos haha!
Heh. Use whatever works for you.
Distributed Systems Course Hi, which paper should I read for implementing Paxos? Just "Paxos Made Simple"?
@@visionlee4587 My first reaction: don't. ;-) Implementing Paxos, getting all the corner cases right, and implementing optimizations to make it usable in any particular system is notoriously hard. If you are building a real system (instead of doing this for school or fun), use someone else's implementation of Paxos or Raft.
If you are going to build it, then that paper will certainly tell you about the base algorithm (although it is notoriously hard to read and fully grok -- one reason why I made this video). There are loads of papers others have written which suggest optimizations and improvements for different situations. Depending on what you are building, you'll certainly want some of those. The Wikipedia Paxos page points to some of the more popular variants, and it would be worth studying some of those too.
Good luck!
Hey, Are You The Hamburger?
Videos like these get you more good karma than giving money to charities, because instead of giving the child a fish, you are teaching them how to catch one.
Thanks!
Why can't you just get to the point? I don't want to hear analogies or funny jokes, for god's sake.
Sorry this was not the video for you! There are multiple other videos explaining Paxos on RUclips, perhaps one of them is what you need?