21: Distributed Locking | Systems Design Interview Questions With Ex-Google SWE

  • Published: Dec 31, 2024

Comments • 84

  • @Swole_coder
    @Swole_coder 8 months ago +18

    I am a software engineer at Amazon, and your videos have helped me a lot. I was recently able to get senior and principal level offers from Oracle, Microsoft, and a few others. Thanks again. Appreciate it.

  • @pratyushkumarsingh6161
    @pratyushkumarsingh6161 7 months ago +7

    Once you read Designing Data-Intensive Applications, all the videos become a great source for revision. Great quality videos, keep up the good work!!!!

  • @TheSdl79
    @TheSdl79 8 months ago +10

    The intro is hilarious 😂 P.S. congrats on the 30k milestone!

  • @ShreeharshaV
    @ShreeharshaV 1 day ago

    Thanks for the great video, Jordan. Follow-up question:
    In the partial failures section around 12:00, I didn't quite understand how A grabbed the lock, because the write did not succeed on at least 2 nodes, which is needed for quorum, right? I understood it as the write succeeding on only one node and not on the other. So unless 2 nodes ack it, the write is not successful, is it? Which means A can't claim it has the lock yet?

  • @ziyuchen3112
    @ziyuchen3112 28 days ago +1

    Thank you for the wonderful video!

  • @RS7-123
    @RS7-123 26 days ago +1

    Like your other videos, really top quality stuff, man. Question about partial failures around 13 minutes: why can't the node at the top accept requests and coordinate writes amongst all the other nodes in the cluster? Doesn't Cassandra do it like this? So the coordinator would know if the write request to all nodes was successful or not, and if not would revert its state.

    • @jordanhasnolife5163
      @jordanhasnolife5163  23 days ago +1

      Yeah, the main thing is you just can't rely on reverts, because what if your coordinator goes down before it's able to revert the data?

  • @adrian333dev
    @adrian333dev 8 months ago +4

    Thanks for the content! Preparing for an upcoming Mid level role, hopefully your videos will help me

  • @Rambo0524g
    @Rambo0524g 1 month ago +1

    Hi Jordan - Another great video. For single-leader synchronous replication: if the replica dies, wouldn't there be some zk to tell the leader how many acks to wait for, instead of the leader waiting for a response from the dead replica? If this makes sense, is there any other reason why sync replication in single leader is not good besides latency? It seems to me it is fault tolerant like distributed consensus.

    • @jordanhasnolife5163
      @jordanhasnolife5163  1 month ago

      Well then you need zookeeper, which isn't even single leader lol. But if the one replica that we're synchronously replicating to dies, presumably we'd stop writing, because the whole point of synchronous replication is that we're 100% sure that all committed data is replicated.

  • @SohailKhan-gu2du
    @SohailKhan-gu2du 8 months ago +1

    Hey, I love the way you teach. Can you also cover these concepts in a hands-on project in Spring Boot or something, so that we can improve our coding and also learn how to test such scenarios in real life?

    • @jordanhasnolife5163
      @jordanhasnolife5163  8 months ago

      This is something I've thought about, but realistically would take me a very long time to do haha. In my current state, it's unlikely I can, but maybe if I have some significant life changes

  • @oren23198
    @oren23198 7 months ago +2

    Hi Jordan, thanks for these videos, very helpful,
    but something isn't clear to me: how does S3 handle the fencing tokens? Doesn't the application require another layer to solve this?

    • @jordanhasnolife5163
      @jordanhasnolife5163  7 months ago

      S3 is a bit of a bad example because we don't own s3. But imagine you own whatever data source you're sinking to, you can just build this into the logic there.

  • @Keira77L-t3b
    @Keira77L-t3b 4 months ago +1

    In the linearization problem example, can't B do a read repair and only grab the lock after the repair to mitigate it?

    • @jordanhasnolife5163
      @jordanhasnolife5163  4 months ago

      Sure, but how does B know that this is the state it's supposed to read repair?

    • @Keira77L-t3b
      @Keira77L-t3b 4 months ago +1

      @@jordanhasnolife5163 Because B reads B:2 from one node and B:1 from the other, so it knows to 'fix' the other node with B:2 (assuming quorum).

    • @jordanhasnolife5163
      @jordanhasnolife5163  4 months ago +1

      @@Keira77L-t3b In this particular case, yes. However, there are unfortunately cases in leaderless replication where, for example with N=3, R=2, W=2 and 3 clients writing, they can all achieve quorum and all 3 nodes end up with different values.
      See my most recent dynamo video.
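
      The case in this reply can be sketched as a toy simulation. It assumes (this is not stated in the thread) that each replica simply keeps whichever write arrived last and that the three concurrent writes reach the replicas in different orders; the node names and the `simulate` helper are hypothetical.

      ```python
      # Toy model: N = 3 replicas, write quorum W = 2, three concurrent writers.
      # Assumption: a replica keeps the last value it received (arrival order wins).
      N, W = 3, 2

      # Order in which each replica happens to receive the three writes.
      arrival_order = {
          "node1": ["A", "B", "C"],
          "node2": ["B", "C", "A"],
          "node3": ["C", "A", "B"],
      }

      def simulate(arrival_order):
          final_value = {}  # replica -> value it ends up storing
          acks = {}         # client -> number of replicas that acknowledged its write
          for node, writes in arrival_order.items():
              for client in writes:
                  final_value[node] = client  # a later arrival overwrites an earlier one
                  acks[client] = acks.get(client, 0) + 1
          return final_value, acks

      final_value, acks = simulate(arrival_order)
      quorum_reached = {client: count >= W for client, count in acks.items()}
      ```

      Every writer collects at least W acknowledgements, yet the three replicas finish with three different values, so a later quorum read has no obvious winner to repair towards.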

  • @visheshchanana5658
    @visheshchanana5658 8 months ago +1

    In the distributed consensus section, we had 5 nodes (1 leader, 4 followers). When the leader went down, we chose the follower that was up to date. What if the follower went down as soon as it was chosen as leader? Now we have 2 nodes with old data and 1 with new. What happens in this case?

    • @jordanhasnolife5163
      @jordanhasnolife5163  8 months ago

      The one with new data must become the leader. If it goes down, we can't proceed as we can no longer reach a majority of nodes.
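
      The majority arithmetic behind this reply, as a minimal sketch. The helper names `majority` and `can_make_progress` are made up for illustration and are not part of any consensus library.

      ```python
      def majority(cluster_size: int) -> int:
          """Smallest number of nodes that forms a strict majority."""
          return cluster_size // 2 + 1

      def can_make_progress(cluster_size: int, nodes_up: int) -> bool:
          """A consensus group can only commit while a majority is reachable."""
          return nodes_up >= majority(cluster_size)

      # The 5-node cluster from the question:
      after_two_leaders_fail = can_make_progress(5, nodes_up=3)    # 2 stale nodes + 1 up to date
      after_third_failure = can_make_progress(5, nodes_up=2)       # only the 2 stale nodes remain
      ```

      With 3 of 5 nodes alive the cluster can still elect the up-to-date node; once that node is gone only 2 of 5 remain, which is below the majority of 3.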

  • @rishabhsaxena8096
    @rishabhsaxena8096 8 months ago +1

    Hey Jordan, could you please create a video on stock exchange system design that mainly focuses on users getting a real-time notification when the stocks they have subscribed to go up or down based on some parameters?
    Thanks

  • @kevinburke3941
    @kevinburke3941 3 months ago +1

    I'm confused about something. You mentioned around 4:20 that client 1's lock will expire due to the garbage collection, but then around 5:00 you say that when client 1 comes back they "still have the lock." I thought it expired for them?

    • @jordanhasnolife5163
      @jordanhasnolife5163  3 months ago

      Client 1 just *thinks* it has the lock. It doesn't actually.
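
      A minimal sketch of why client 1 only *thinks* it holds the lock, assuming the lock is a lease with a time-to-live. `LeaseLock` is a hypothetical in-memory stand-in for the lock service, and time is passed in explicitly so the example is deterministic.

      ```python
      class LeaseLock:
          """Toy lock whose ownership silently expires after `ttl` seconds."""

          def __init__(self, ttl: float):
              self.ttl = ttl
              self.holder = None
              self.expires_at = 0.0

          def acquire(self, client: str, now: float) -> bool:
              # The lock is free if nobody holds it or the previous lease has run out.
              if self.holder is None or now >= self.expires_at:
                  self.holder = client
                  self.expires_at = now + self.ttl
                  return True
              return False

          def is_held_by(self, client: str, now: float) -> bool:
              return self.holder == client and now < self.expires_at

      lock = LeaseLock(ttl=10.0)
      client1_got_lock = lock.acquire("client1", now=0.0)
      # client1 now stalls (for example in a long GC pause) past the 10 second lease.
      client2_got_lock = lock.acquire("client2", now=12.0)
      client1_still_holds = lock.is_held_by("client1", now=15.0)
      ```

      Client 1's local belief (`client1_got_lock`) is never updated during the pause, while the lock service has already handed the lease to client 2.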

  • @chaitanyatanwar8151
    @chaitanyatanwar8151 1 month ago +1

    Thank you!

  • @nguyentrunghieu6200
    @nguyentrunghieu6200 5 months ago +1

    I'm kinda curious about the fencing token. The destination of the write (in your video it's S3) has to know/store the highest fencing token used so far. Is that possible? I think we might have to communicate with many third parties where we do not have the "right" to do that check. How can we resolve it?

    • @jordanhasnolife5163
      @jordanhasnolife5163  5 months ago

      I'm not sure what you mean here. A fencing token is just a number that you can pass to an external service, as long as the external service allows it in their API. Then the external service will only accept writes in an increasing order of that number.

    • @nguyentrunghieu6200
      @nguyentrunghieu6200 5 months ago +1

      @@jordanhasnolife5163 "as long as the external service allows it in their API" - that's my point. I meant, we can't be sure that the external service we want to use will always support fencing tokens (or have something similar). What should we do in that case?

    • @jordanhasnolife5163
      @jordanhasnolife5163  5 months ago

      @@nguyentrunghieu6200 Use a different external service or add a stateful proxy in front of it
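
      One way to picture the "stateful proxy" suggestion is the sketch below. `FencedStore` is a hypothetical class, the dictionary stands in for the real external service, and the token values are arbitrary; the only rule taken from the thread is that writes are accepted in increasing order of the fencing token.

      ```python
      class FencedStore:
          """Hypothetical proxy that rejects writes carrying a stale fencing token."""

          def __init__(self):
              self.highest_token = -1  # largest fencing token accepted so far
              self.data = {}           # stand-in for the real backing service

          def write(self, key: str, value: str, token: int) -> bool:
              # A token lower than one already seen belongs to an expired lock holder.
              if token < self.highest_token:
                  return False
              self.highest_token = token
              self.data[key] = value
              return True

      store = FencedStore()
      new_holder_ok = store.write("report", "from client2", token=34)
      stale_holder_ok = store.write("report", "from client1", token=33)
      ```

      The write from the old lock holder (token 33) is refused because token 34 has already been accepted, so the paused client cannot overwrite newer data.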

  • @theJeet8
    @theJeet8 8 months ago +2

    Your intros are always weirdly funny :) Question: in your final design, is the queue also written to followers? i.e. If Leader were to go down, would followers know that B is waiting for A? How would the websocket be restored by A?

    • @jordanhasnolife5163
      @jordanhasnolife5163  8 months ago

      Yep, the queue is written to followers. If the leader goes down, the clients will notice it, and reach out to the other links of nodes in the zookeeper cluster to get the address of the new leader and connect there.

  • @traveling_cruiser
    @traveling_cruiser 6 months ago +1

    Awesome video. One question:
    what are the leader/follower nodes here: application server, Redis cache, or database?

  • @ashutoshshukla6242
    @ashutoshshukla6242 6 months ago +1

    How can anyone get the notes you have made during these videos? Are they in a GitHub repository or somewhere else?

  • @RS7-123
    @RS7-123 26 days ago +1

    in minute 2 of the problem, so dunno what to expect next, but isn’t S3 object immutable? why wouldn’t get corrupted? i get what you are trying to convey though

  • @hazardousharmonies
    @hazardousharmonies 8 months ago +2

    Excellent job Sir

  • @stormShadow64
    @stormShadow64 8 months ago +1

    You are doing awesome work

  • @ananth11
    @ananth11 6 months ago +2

    Well when you feel you are better you don’t have to prove it to anyone, just relax and see ahead and I am sure you will find a much better girl than her !!

  • @shobhitarya1637
    @shobhitarya1637 6 months ago +1

    Nice video, but I have one query. In the case of distributed consensus, how are reads done for the lock? If it is read from a replica which is not up to date, it can lead to a problem. I have also watched your Raft videos, which give me the impression that distributed consensus provides linearizability but not strong consistency in reading.

    • @jordanhasnolife5163
      @jordanhasnolife5163  6 months ago

      You read from the leader. This is slow, but it's still fault tolerant, because we have the ability to perform a fail over if the leader goes down.

  • @nontechnicalbaba
    @nontechnicalbaba 14 days ago +1

    If writing to S3 happens inside a synchronized block, what's wrong with that? Garbage collection will have no role to play in this scenario.

    • @jordanhasnolife5163
      @jordanhasnolife5163  13 days ago

      Can you elaborate? I don't know what you mean here.
      Garbage collection just happens when we write stuff to S3 and then it turns out we don't need to use it

  • @divijsharma5610
    @divijsharma5610 5 months ago +2

    Why this channel name, man? You are giving life to so many. Rename the channel to Jordan gives life.

  • @mcee311
    @mcee311 8 months ago +1

    what if you have a large amount of connections on the leader node? How do you deal with that situation?

    • @jordanhasnolife5163
      @jordanhasnolife5163  8 months ago +1

      I'm assuming this is for many different locks, you basically have to partition them across many zookeeper clusters.

    • @潘雪松-f4g
      @潘雪松-f4g 7 months ago +1

      @@jordanhasnolife5163 Does that mean this leader + several followers just act like one ZooKeeper node? And to scale horizontally we need more ZooKeeper nodes with sharding (and every node has its own leader and followers)?

    • @jordanhasnolife5163
      @jordanhasnolife5163  7 months ago

      @@潘雪松-f4g You are correct
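
      The partitioning idea in this thread could look roughly like the sketch below: hash the lock name and pick one of several independent clusters. The cluster names and the `cluster_for_lock` helper are invented for illustration, and simple modulo hashing is an assumption rather than something the video prescribes.

      ```python
      import hashlib

      # Hypothetical cluster identifiers; each one is a full leader + followers group.
      CLUSTERS = ["zk-cluster-0", "zk-cluster-1", "zk-cluster-2"]

      def cluster_for_lock(lock_name: str, clusters=CLUSTERS) -> str:
          """Map a lock name to one cluster using a stable hash of the name."""
          digest = hashlib.sha256(lock_name.encode("utf-8")).digest()
          index = int.from_bytes(digest[:8], "big") % len(clusters)
          return clusters[index]

      # Every client computes the same owner for the same lock name.
      owner = cluster_for_lock("s3://bucket/file-a")
      ```

      Modulo hashing remaps most locks whenever the number of clusters changes; consistent hashing is a common way to limit that, at the cost of more bookkeeping.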

  • @michaelv2555
    @michaelv2555 8 months ago +2

    No Flink and CDC used? Jordan, are you ok?

    • @jordanhasnolife5163
      @jordanhasnolife5163  8 months ago +2

      No someone said my videos were too unrealistic for interviews on reddit and now I'm in a deep state of depression

    • @Ynno2
      @Ynno2 7 months ago +1

      @@jordanhasnolife5163 I think that's true, but I don't necessarily think it's bad. Your videos aren't really structured in an interview style. You go into a lot of depth in your designs. It would be unrealistic to draw the sometimes huge designs you show at the end of your videos in the space of an interview, especially somewhere like Meta where you realistically only have 35 minutes of design time, but it's beneficial to see everything as inspiration for how you might deep dive in different areas. In a real interview you may only deep dive into a couple of the areas you show.

  • @fallencheeto4762
    @fallencheeto4762 5 months ago +1

    Is linearizable similar to causal consistency?

    • @jordanhasnolife5163
      @jordanhasnolife5163  5 months ago

      Causal consistency just implies that if a write B happened because a user first saw write A, we should never be able to read B without also having access to the A write.
      Linearizable databases are causally consistent, but not all causally consistent databases are linearizable.

    • @fallencheeto4762
      @fallencheeto4762 5 months ago

      @@jordanhasnolife5163 interesting, we learn something new everyday! Great video man

  • @vipulspartacus7771
    @vipulspartacus7771 8 months ago +1

    Hi Jordan, really appreciate the content, is it possible for you to share your ipad notes. It is difficult to follow and revise your content without the notes and making the entire notes while following the video is time consuming.
    It would be really helpful if you could share your hand written notes from ipad (maybe it is not perfect but still a better reference than nothing) which we could keep as reference to follow your content. As we go through the video, we could add our own comments or notes on it to make it more clear. Please consider.

    • @jordanhasnolife5163
      @jordanhasnolife5163  8 months ago +1

      Planning on doing this in bulk after finishing my current series, this will be in the next 1-3 months.

  • @codr0514
    @codr0514 8 months ago +1

    What editor are you using for drawing? Do you also use any pen based device?

    • @jordanhasnolife5163
      @jordanhasnolife5163  8 months ago +1

      Apple Pencil + OneNote

    • @codr0514
      @codr0514 8 months ago

      @@jordanhasnolife5163 thanks for your response 🫡

  • @NoName-lz6bc
    @NoName-lz6bc 3 months ago +1

    Maybe an irrelevant question, but how will things work when there are multiple resources to contend for? E.g. one S3 file, a second S3 file, maybe a customer's SMS channel?
    How will distributed consensus work then?

    • @jordanhasnolife5163
      @jordanhasnolife5163  3 months ago

      You use a separate lock for the other file.

    • @NoName-lz6bc
      @NoName-lz6bc 3 months ago

      @@jordanhasnolife5163 When the master node fails, backup node1 has the latest version of lock 1 but node2 has the latest version of lock 2. Then who will be the leader?

  • @RolopIsHere
    @RolopIsHere 3 months ago +1

    I had to debate with myself if I left the video with 69 comments or if I added a comment to help your video with the algorithm...

  • @Algorithmswithsubham
    @Algorithmswithsubham 8 months ago +1

    Congrats, what an intro!

  • @jhonsen9842
    @jhonsen9842 5 months ago +1

    I got rejected in the system design round, lol. I took it lightly and didn't prepare.

  • @MallardDuck77
    @MallardDuck77 8 months ago +2

    LET'S GO KNICKS!

  • @rafaelarantes4804
    @rafaelarantes4804 8 months ago +3

    The content is great, but I always come for the golden nugget in the description

    • @jordanhasnolife5163
      @jordanhasnolife5163  8 months ago +1

      I churn out nuggets in the description and on the toilet

  • @helperclass
    @helperclass 7 months ago +1

    Thanks for the great video. I need your help with one of my tasks. I will support you on Patreon if you help. At my current company I have received a task in which I have to execute queries in the order they were originally executed. I have a list of queries and their original start and end times. To execute them again in the same order we need to build a dependency graph. How can we build this dependency graph?
    Query 1: start time 1 end time 3
    Query 2 start time 2 and end time 5
    Query 3 start time 4 end time.
    Qry 2 can start after qry 1 has started. Query 3 can be started after 1 finished and 2 started

    • @helperclass
      @helperclass 7 months ago +1

      My implementation is not efficient as I am checking for every query all the query started before it and storing the dependencies in list

    • @jordanhasnolife5163
      @jordanhasnolife5163  7 months ago +2

      Doesn't really make much sense to me considering the start times and end times. But look up topological sorting. Make a graph of the dependency relationships, and run a topological sort. This will tell you when you can schedule a given task, at which point you can run a second job that looks at currently scheduled tasks and whether their start time has passed.
      I don't have a patreon, send it to charity.
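
      A minimal sketch of the topological sort suggestion, using `graphlib` from the Python standard library (Python 3.9+). The `dependencies` mapping is one reading of the question's example; it flattens "after it started" and "after it finished" into a plain "must be scheduled after" edge, which is a simplification.

      ```python
      from graphlib import TopologicalSorter

      # query -> set of queries that must be scheduled before it
      # (query2 waits on query1; query3 waits on query1 and query2)
      dependencies = {
          "query1": set(),
          "query2": {"query1"},
          "query3": {"query1", "query2"},
      }

      # static_order() yields every node after all of its predecessors.
      order = list(TopologicalSorter(dependencies).static_order())
      ```

      The resulting `order` is a valid schedule; a second job could then, as the reply suggests, check whether each scheduled query's original start time has passed before launching it.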

  • @sid4579
    @sid4579 8 months ago +1

    What is the source of truth for all this?

    • @jordanhasnolife5163
      @jordanhasnolife5163  8 months ago

      I'm not sure what you're asking - do you mean my sources?

    • @sid4579
      @sid4579 8 months ago +1

      @@jordanhasnolife5163 Thanks for replying! I meant where did you learn about all this? Is there a comprehensive resource or is this just result of your years of experience in tech?

    • @jordanhasnolife5163
      @jordanhasnolife5163  8 months ago +2

      @@sid4579 Well considering that I don't have many years of experience in tech, I'm going to say that I did not learn anything that way. I'm simply just aggregating any information that I can find across anywhere on the internet. If there was a comprehensive resource for it, I don't think I'd be making these videos in the first place, as I myself am attempting to be a comprehensive resource for it.

    • @Ynno2
      @Ynno2 7 months ago

      @@sid4579 Martin Kleppmann's book and YouTube videos cover a lot of this.