Leaderless Replication Introduction | Systems Design 0 to 1 with Ex-Google SWE

  • Published: Nov 24, 2024

Comments • 37

  • @leftyhero147 24 days ago +1

    Man U R the GOAT, I was looking for companion lectures for DDIA and found your playlist

  • @jamesOwanga 10 months ago +8

    You genuinely have the best system design videos I’ve ever encountered, by a significant margin.

  • @KENTOSI 1 year ago +5

    Hey thanks Jordan you explained this really well.

  • @ashoke8031 1 month ago +1

    I was expecting Sequence CRDTs next after seeing the previous video 😅. But this one is cool

  • @WoahImNIce112 11 months ago +4

    Being insane is part of what makes up a good engineer

  • @andriidanylov9453 1 year ago +2

    Cool. This kind of tree is surprising. Thanks for sharing

  • @siddharthsingh7281 1 year ago +4

    Loved it, let's implement it on postgres make two databases and ...

  • @the1anonymouse 8 months ago +3

    Wouldn't there be a significant risk of data mismatch in the tree? What if one tree has 762 and 127 and the other has 761 and 128? Wouldn't that lead to a condition where one database mistakenly thinks the other has data that it doesn't?

    • @jordanhasnolife5163 8 months ago

      Well, I'd say that as long as the way we create the tree/hashes is deterministic, two nodes with the same data should not have that issue.
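
A minimal sketch of the determinism point, under assumptions that are not from the video: one leaf per row, leaves ordered by sorted key, SHA-256 as the hash, and an odd level padded by repeating its last hash. The names `merkle_root`, `replica_a`, `replica_b`, and `replica_c` are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(rows: dict[str, str]) -> str:
    """Root hash over rows, independent of the order the rows were inserted in."""
    # Sorting by key fixes the leaf order, which is what makes the tree deterministic.
    level = [h(f"{k}={v}".encode()) for k, v in sorted(rows.items())]
    if not level:
        return h(b"").hex()
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: pad an odd level with its last hash
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()


replica_a = {"k1": "apple", "k2": "banana", "k3": "cherry"}
replica_b = {"k3": "cherry", "k1": "apple", "k2": "banana"}  # same rows, different insertion order
replica_c = {"k1": "apple", "k2": "banana", "k3": "CHERRY"}  # one row differs

print(merkle_root(replica_a) == merkle_root(replica_b))  # True
print(merkle_root(replica_a) == merkle_root(replica_c))  # False
```

Two replicas holding the same rows get the same root because both derive the leaf order from the keys alone. If each replica grouped rows into buckets differently, the hashes could disagree even for identical data.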

  • @pintoo...387 4 months ago +3

    Hey Jordan,
    when the Merkle trees mismatch and we find the row that is mismatching, which row of the two DBs do we pick?

  • @shineinusa 4 months ago +2

    Multi-leader and leaderless replication look almost the same in the sense that we would be writing data to multiple nodes. My question is: can we use CRDTs for replication in a leaderless system as well?

    • @jordanhasnolife5163 4 months ago +1

      I guess technically the Merkle tree that we use in leaderless replication for anti-entropy is kind of a type of CRDT.

  • @rahullingala7311 9 months ago +3

    How do we have access to the other node's Merkle tree while doing the comparison?
    Would it be sent over the network? Then it kind of defeats the purpose.
    Also, the Merkle tree can be huge if I understand correctly, kind of 2x the size of the DB.
    Isn't that a lot of data to communicate over the network from one node to another?
    Maybe I didn't understand correctly, please correct me if I'm wrong.

    • @jordanhasnolife5163 9 months ago

      Hey Rahul - you send the root hash of the Merkle tree over the network first (small), which tells you if you have differences, and then just traverse down the Merkle tree where the hashes are not the same to see the differences in rows.
      A Merkle tree is just a tree of hashes. I have a dedicated video on them on my channel from a couple of years ago, maybe worth checking that one out!
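
The root-first comparison described above can be sketched as follows. This assumes both nodes split their rows into the same fixed, power-of-two number of buckets so the two trees have the same shape. `build_levels`, `diff_buckets`, and the bucket contents are hypothetical. In a real system each comparison would be a hash sent over the network, not a local list lookup.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(buckets: list[str]) -> list[list[bytes]]:
    """Tree as a list of levels: leaves first, root last. len(buckets) must be a power of two."""
    levels = [[h(b.encode()) for b in buckets]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def diff_buckets(a: list[list[bytes]], b: list[list[bytes]]) -> tuple[list[int], int]:
    """Start at the root and descend only into subtrees whose hashes differ.

    Returns the mismatched leaf indices and how many hash comparisons were made.
    """
    mismatched, compared = [], 0
    stack = [(len(a) - 1, 0)]  # (level, index), starting at the root
    while stack:
        level, idx = stack.pop()
        compared += 1
        if a[level][idx] == b[level][idx]:
            continue  # identical subtree: nothing below it needs to be exchanged
        if level == 0:
            mismatched.append(idx)
        else:
            stack.append((level - 1, 2 * idx + 1))
            stack.append((level - 1, 2 * idx))
    return sorted(mismatched), compared


node_a = ["r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7"]
node_b = list(node_a)
node_b[5] = "r5-stale"  # a single bucket is out of sync

tree_a, tree_b = build_levels(node_a), build_levels(node_b)
print(diff_buckets(tree_a, tree_b))  # ([5], 7)
```

With 8 buckets the full tree has 15 hashes, but finding the one stale bucket takes only 7 comparisons, and when the roots match it takes just 1.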

    • @rahullingala7311 9 months ago

      Ahh understood how that helps, thanks a lot!
      Will also check out that video.

  • @timothyh1965 3 months ago +1

    Jordan, when you mention snapshot versioning right at the beginning, are you talking about vector versioning? Or are you simply referring to how a DB may store the previous write alongside the current write?

    • @jordanhasnolife5163 3 months ago +1

      Would you mind giving me a timestamp?

    • @timothyh1965 3 months ago +2

      @jordanhasnolife5163 Around 2:12. I think the main idea is that every write has a version number associated with it. In DDIA, they give an example in the leaderless replication chapter (pg 189) of two clients trying to write to a single replica.
      I think vector versioning (later on, in that same chapter) is this same concept but applied to multiple replicas. Does that make sense?

    • @jordanhasnolife5163 3 months ago +1

      @timothyh1965 So yeah, in this case that number would basically be a timestamp assigned by whatever node "coordinates" the write. Cassandra can also do version vectors IIRC, but the example I'm showing here is simple "last write wins" resolution.
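
A minimal sketch of "last write wins" resolution with coordinator-assigned timestamps. `Versioned`, `lww_merge`, and `repair` are hypothetical names, and breaking timestamp ties on the value is an assumption of this sketch, not something stated in the video.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # assumption: assigned by whichever node coordinates the write


def lww_merge(a: Versioned, b: Versioned) -> Versioned:
    """Keep the write with the larger timestamp.

    Ties are broken on the value (an assumption of this sketch) so that
    every replica independently picks the same winner.
    """
    return max(a, b, key=lambda w: (w.timestamp, w.value))


def repair(replicas: list[dict[str, Versioned]], key: str) -> Versioned:
    """Pick the winning version of `key` and write it back to every replica."""
    candidates = [r[key] for r in replicas if key in r]
    if not candidates:
        raise KeyError(key)
    winner = reduce(lww_merge, candidates)
    for r in replicas:
        r[key] = winner
    return winner


replicas = [
    {"x": Versioned("old", 100)},
    {"x": Versioned("new", 200)},
    {},  # this replica missed the write entirely
]
print(repair(replicas, "x"))  # Versioned(value='new', timestamp=200)
```

Under this scheme, when two replicas disagree about a row, the copy with the higher timestamp wins. The write with the lower timestamp is silently discarded, so concurrent writes can be lost if clocks are skewed.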

    • @timothyh1965 3 months ago

      @jordanhasnolife5163 Awesome. Thanks man, love your videos

  • @dibll 1 year ago +2

    Could you pls do a video on Data Mesh technology? Thanks!

    • @jordanhasnolife5163 1 year ago

      Yeah I've gotta research this a bit more, not sure how relevant it is to the systems design interview.

  • @sakshibhasin96 8 months ago +2

    If both the databases have multiple mismatched writes, then would the complexity still remain O(log n)? It could be possible that both the left and right nodes are mismatched, in which case both sub-trees will need to be traversed, thus making the time complexity greater than log n

    • @jordanhasnolife5163 8 months ago

      That's correct, in the worst case everything is different and now we have to linearly traverse the tree.
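
As a rough worked bound, assume a balanced binary Merkle tree with $n$ leaf buckets ($n$ a power of two), $k$ of which differ, and no hash collisions. Let $C(k)$ be the number of hash comparisons made during a top-down walk. These symbols are introduced here for illustration.

```latex
% Each level above the leaves has at most k mismatched nodes, and each
% mismatched internal node triggers 2 child comparisons; add 1 for the root.
C(k) \;\le\; \min\bigl(2n - 1,\; 1 + 2k\log_2 n\bigr)

% One differing bucket:   C(1) \le 1 + 2\log_2 n = O(\log n)
% Every bucket differs:   C(n) = 2n - 1 = O(n)
```

So the cost grows with the number of diverged buckets: logarithmic when replicas are nearly in sync, and linear in the tree size when everything differs.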

    • @sakshibhasin96 8 months ago +1

      @jordanhasnolife5163 Thank you! And I love your videos!!

  • @anshulkatare 2 months ago +1

    Hey Jordan, is the video on sequence CRDTs missing from the playlist, or was it never made?

    • @jordanhasnolife5163 2 months ago

      It's in the Google Docs system design video

    • @anshulkatare 2 months ago

      @jordanhasnolife5163 Awesome, thanks.

  • @josepha8415 1 year ago +4

    Did you pass a systems design interview to get your current job?

    • @jordanhasnolife5163 1 year ago +8

      Nope! Ironic isn't it? Passed other systems design rounds for different quant firms during the interview process

  • @ShivangiSingh-wc3gk 4 months ago +1

    Jordan, how are you learning these concepts? Do you read papers or are you referencing a book?

  • @sankalpsharma1755 5 months ago +1

    Learned about binary search trees in 2016
    And I am learning about their real use cases in 2024 :O
    And yes I'm old :/

  • @chaitanyatanwar8151 5 months ago +1

    Thanks Jordan!