Starts at 16:00 with LSM-trees, if you're already familiar with sequential vs. random access.
You are a saint.
Not for a beginner. Good talk to revise the concepts. Highly recommend reading Chapter 3 of the DDIA book.
What's the DDIA book, please?
@@rapoliit Designing Data-Intensive Applications
One of the best talks about database reads and writes.
It was a joy to watch.
Enjoyed the talk. Thanks.
Great talk, much appreciated! :)
At 19:44 it is assumed that SSTables have a synchronized clock, because their entries have a key and a timestamp. What method is used to synchronize the clocks of the separate nodes that hold SSTables?
Use NTP
It's handy to think about what it actually looks like for this to matter: you have multiple nodes being written to with different values for the same key at close to the same time, so this is essentially just the multi-primary (multi-master) problem. Either it's fine for one of those writes to win, or you already need something like an optimistic update mechanism, where the nodes can agree on which existing value is the latest one being replaced, and that the incoming write came from a client that knew about it.
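To make the optimistic update idea concrete, here is a minimal sketch, assuming a single in-memory store. The `VersionedStore` class and its method names are made up for illustration and are not from the talk: a write is accepted only if the client says which version it read.

```python
class VersionedStore:
    """Toy in-memory store (hypothetical): each key maps to (version, value)."""

    def __init__(self):
        self._data = {}

    def read(self, key):
        # Returns (version, value); version 0 means the key is absent.
        return self._data.get(key, (0, None))

    def compare_and_set(self, key, expected_version, new_value):
        current_version, _ = self.read(key)
        if current_version != expected_version:
            # The writer was working from a stale read, so reject the write.
            return False
        self._data[key] = (current_version + 1, new_value)
        return True


store = VersionedStore()
version, _ = store.read("user:1")
first = store.compare_and_set("user:1", version, "alice")  # accepted
second = store.compare_and_set("user:1", version, "bob")   # rejected, stale version
```

The second writer has to re-read and retry, which is the "client that knew about it" part.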
The simple answer is to have a single primary node that all writes go to, and use its clock. You can be more clever and pick a different primary for each key based on a hash of the key to spread the load, with each primary then replicating to the other nodes for resilience. You can still have multiple primaries for a key, but generally that involves them knowing about each other and pushing any received updates to each other, along with the common timestamp, so that communication has to take clock differences, time lag, and concurrency issues into account.
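A rough sketch of picking a primary per key by hash, assuming a fixed list of nodes. The node names and the `primary_for` / `replicas_for` helpers are hypothetical:

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names


def primary_for(key, nodes=NODES):
    # Use a stable hash so every client maps the same key to the same node.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


def replicas_for(key, nodes=NODES):
    # Every node other than the primary receives a replicated copy.
    primary = primary_for(key, nodes)
    return [node for node in nodes if node != primary]
```

Plain modulo hashing remaps most keys when the node list changes; consistent hashing is the usual refinement if nodes come and go.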
Note that it doesn't even need to be a timestamp: an auto-incrementing version number works too with most of these approaches, though a timestamp can be handy.
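For example, merging two tables only needs the numbers to be comparable. A small sketch, assuming each table is a plain dict of `key -> (version, value)`, which is a simplification and not the real on-disk SSTable format:

```python
def merge_tables(*tables):
    """Merge dicts of {key: (version, value)}, keeping the highest version per key."""
    merged = {}
    for table in tables:
        for key, (version, value) in table.items():
            if key not in merged or version > merged[key][0]:
                merged[key] = (version, value)
    return merged


older = {"a": (1, "x"), "b": (2, "y")}
newer = {"a": (3, "z"), "c": (1, "w")}
result = merge_tables(older, newer)
# result == {"a": (3, "z"), "b": (2, "y"), "c": (1, "w")}
```

Nothing here cares whether the number is a wall-clock timestamp or a counter handed out by the primary.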
Amazing talk
Good talk, but it has one flaw: he has too few slides. It is not good (for beginners/learners) if too much is explained without being backed by slides.
Yeah, true. With a lot of talking over the same slide, the slide just becomes a distraction, and boring too, I guess. I can see people bringing up the lack of visualizations whenever databases are discussed. There's a lot of scope for improvement and for more database content, I guess! ;)
Awesome
Great talk