System Design Interview - Distributed Message Queue

  • Published: 28 Sep 2024

Comments • 349

  • @jhnmn
    @jhnmn 5 years ago +36

    I almost never comment on RUclips but this undoubtedly deserves an exception. Thank you for the superb quality content you’ve put together. I wouldn’t be surprised if this series becomes the de facto video resource for systems design and architecture interviews. Hope you keep uploading!

  • @jamesyin3220
    @jamesyin3220 2 years ago +2

    This is the best content on system design I've ever seen. Please consider resuming the journey! We'd love to ride along!

  • @stillmattwest
    @stillmattwest 2 years ago +79

    This was the first video I'd seen from this channel. This is some next-level system design content. Way more in-depth than other videos I've seen. Unfortunately, it doesn't look like there have been any recent uploads, which is really too bad.

  • @suzi3245
    @suzi3245 1 year ago

    Each word in this video is a golden word. Make sure you don't skip or neglect it. Thank you so much

  • @jayendrasingh6580
    @jayendrasingh6580 2 years ago +1

    These videos are the best resource among all I have gone through. I am surprised this channel is not posting any more videos. Good work and thank you!

  • @jasonmeyer495
    @jasonmeyer495 7 months ago

    I wish this guy was still making these videos. By far the best of the system design interview content out there (I've watched them all, lol).

  • @akhtarmnnit
    @akhtarmnnit 4 years ago +1

    I have seen a bunch of YouTubers for system design interviews... this is one of the better ones... good use of graphics while talking instead of the mundane approach of heavy talking at a whiteboard... Great job, buddy... I am gonna explore all your videos now

  • @rishabhjain2404
    @rishabhjain2404 4 years ago +9

    thank you for working on the subtitles, makes it easier to consume your good content

    • @SystemDesignInterview
      @SystemDesignInterview  4 years ago +7

      Sure, Rishabh! Thanks for letting me know that subtitles help. Every next video will have subtitles as well.

    • @rishabhjain2404
      @rishabhjain2404 4 years ago +4

      Thanks Mikhail. You have excellent English fluency. I am just used to different pronunciations of certain words.

  • @victormartins-software3912
    @victormartins-software3912 3 years ago +11

    I can’t thank you enough, I was really struggling to grasp these topics and your explanations really helped me put it all together 🙏 excellent work!

  • @bhaveshssharma8826
    @bhaveshssharma8826 3 years ago +3

    Best course available on the Internet.

  • @hyunminkim3315
    @hyunminkim3315 5 years ago +5

    Very thorough! Really appreciate your hard work. I can tell your channel will become huge for engineering resources.

    • @SystemDesignInterview
      @SystemDesignInterview  5 years ago

      Thank you, Hyunmin. Appreciate your feedback and words of encouragement!

  • @TyzFix
    @TyzFix 2 years ago +1

    I am expecting you to write an SD book that gives us the same amount of useful information as here. Outstanding job!

  • @jmitesh01
    @jmitesh01 5 years ago +7

    Summary(notes):
    1. Problem statement: Producer sends data and exactly one of the consumers gets the data
    2. Resolving ambiguity in the problem statement by asking questions such as scale, priority, and so on...
    3. Just focus on the core set of requirements - sendMessage(messageBody), receiveMessage()
    4. SLA numbers for the non-functional requirements
    5. Components: LB, Control Plane(Metadata-Service), Data Plane-1(Frontend), Data Plane-2(Back-end)
    6. FE: Required cross-cutting concerns such as billing and throttling and, most importantly, routing to the Backend, since the Backend is stateful.
    7. Metadata Service: Caching layer for routing information and metadata (high consistency required in case of very few writes, R/W ratio)
    8. Backend Service: API handling layer, storage and so on. Since the Backend has to be HA and fault tolerant, it requires a consensus service like ZooKeeper, or an in-cluster or out-cluster management strategy.
    ----
    Extend the above design of queue creation with queue deletion, message deletion, message replication, delivery semantics (exactly-once delivery not supported because it requires 2PC), pull vs. push messages, security and monitoring.
    ---
    Scalability bottlenecks, use-case extensibility, and supported use cases/limitations?
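
A minimal single-process sketch of the two core calls from point 3 of the notes above, sendMessage(messageBody) and receiveMessage(). The class name `InMemoryQueue` and the snake_case method names are assumptions made for illustration; the video's distributed design (frontend, metadata service, replicated backend) is not modeled here.

```python
from collections import deque
from typing import Deque, Optional


class InMemoryQueue:
    """Toy stand-in for the queue API; not distributed, not durable."""

    def __init__(self) -> None:
        self._messages: Deque[str] = deque()

    def send_message(self, message_body: str) -> None:
        # Producer side: append to the tail of the queue.
        self._messages.append(message_body)

    def receive_message(self) -> Optional[str]:
        # Consumer side: a message is removed when it is handed out,
        # so only one of the competing consumers ever gets it.
        return self._messages.popleft() if self._messages else None


queue = InMemoryQueue()
queue.send_message("order-created")
first = queue.receive_message()   # the message sent above
second = queue.receive_message()  # None: the queue is empty again
```

Removing the message on receive is the simplest way to get "exactly one consumer" semantics; it ignores consumer failures, which is what replication and delivery semantics in the notes are about.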

  • @YashRaithatha1989
    @YashRaithatha1989 5 years ago +2

    Just awesome! Your approach to problem solving is very generic. I really liked it; keep posting such fantastic system design interview questions. This is the best material I have seen to date on the topic.
    Thanks a lot.

  • @deathbombs
    @deathbombs 2 years ago +2

    17:08 To clarify option 2: queues are grouped by cluster in this case (each cluster contains a set of queues). Instances are the replicas.
    Interestingly, replication for in-memory hosts/instances is handled similarly to NoSQL nodes.

  • @chiragr1336
    @chiragr1336 4 years ago +2

    Thanks a lot, @Mikhail. Your videos are so fun and easy to watch. I feel it's one of the best channels specifically for system design, and you sound like some Russian pro coder to top it ;) I request you to make a video about all the possible components (load balancers, CDN, etc.) that will ever be used in a system design interview, because you keep using a few different components for different problems. If we get to know all the components, then we too can arrive at a better solution. Thanks for the great content and keep creating new videos! 👍

    • @SystemDesignInterview
      @SystemDesignInterview  4 years ago

      Thank you for the kind words, Chirag! I have been thinking about the same for a while. And there are ideas how to address this. Just need to find more spare time to realize all these ideas ((

  • @BestURLShortenerBioPageQRCode
    @BestURLShortenerBioPageQRCode 1 year ago +2

    On the system design topic, this video is a very in-depth and concrete explanation of how to navigate successfully through system design interviews. Thank you for this tutorial! ❤❤

  • @culsumu
    @culsumu 2 years ago +1

    Your videos are Superb !
    Most useful videos on System design.
    Please start making more videos like this !
    More on each component details which helps in System design :) !

  • @babadun36
    @babadun36 5 years ago +20

    This is more than just a distributed MQ. The video covers the fundamental approaches in modern data-intensive distributed systems.

  • @riteshpatel16
    @riteshpatel16 3 years ago +1

    Thank you for all the hard work and such a great explanation of complicated topics. This is way superior to other paid content. I would love to see a video on API design that covers the how and the what. For example, for a question that says "design an API to upload photos from iOS", how do you go about it?
    What are the good characteristics of an API? What are the key components you need to think about while designing an API, and so on?

  • @madhukarm1319
    @madhukarm1319 2 years ago +1

    Thank you! This covers a lot of background. One thing I feel should have been covered is how strict message ordering is achieved across partitions.

  • @jamess5330
    @jamess5330 2 years ago

    Very helpful! Another super effective way to prepare system design interviews: Do mock interviews with FAANG engineers at Meetapro.

  • @swapniljain3459
    @swapniljain3459 4 years ago +2

    Great work and Explanation . Thanks a lot. This is the best explanation and walk through to prepare for a System design interview.

  • @austinkim7804
    @austinkim7804 1 year ago

    Finally finished going through your videos. Thanks so much!

  • @leoxiaoyanqu
    @leoxiaoyanqu 3 years ago +1

    Thanks a lot for your videos! Very helpful!
    I wonder if it's possible for you to have a mock interview video (e.g. you're on the interviewee side), covering things like what tools/apps would you use for real-world SDIs for better productivity, etc.

  • @shreyasns1
    @shreyasns1 2 years ago

    Thanks for the video. I had one suggestion on rate limiting: token bucket is the widely used algorithm, not leaky bucket.
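
For readers unfamiliar with the token bucket algorithm mentioned in this comment, here is a minimal sketch. The class name, parameter names and the injectable `clock` are assumptions for illustration; it says nothing about which algorithm the video's frontend service actually uses.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_sec` tokens per second."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._clock = clock           # injectable so the bucket can be tested without sleeping
        self._tokens = capacity       # start with a full bucket
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Add the tokens earned since the last call, capped at the bucket capacity.
        self._tokens = min(self.capacity,
                           self._tokens + (now - self._last) * self.refill_per_sec)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False                  # caller should throttle (e.g. reject the request)


limiter = TokenBucket(capacity=5, refill_per_sec=1.0)
accepted = limiter.allow()            # True: the bucket starts full
```

Unlike a leaky bucket, which drains at a fixed rate, this variant lets a client spend saved-up tokens in a burst.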

  • @SchartzRehan
    @SchartzRehan 5 years ago +1

    This is crisp and clear. Many thanks.

  • @sudharshannd3497
    @sudharshannd3497 10 months ago

    Great video, but I think it should cover even more low-level details on how messages are stored in memory and retrieved using offset/invisible flag.

  • @bhavyabansal1143
    @bhavyabansal1143 2 years ago +1

    any reason that we don't have more content uploaded here? is the author busy?

  • @akankshamahajan9709
    @akankshamahajan9709 5 months ago

    Wowww!!! These videos really helped me to prepare for my SD interview. Is there anything similar for ML System Design interviews?

  • @onePunch95
    @onePunch95 3 years ago +8

    Great content! I have some confusion regarding the queue identification.
    1. In the API definition, we are only sending the message, so when the first-ever message comes, how does that message get mapped to a queue number? For example, the slides say a sendMessage(msg) comes for queue id = 1; how does the sender know the queue id? Similarly, when the receiveMessage() API is called, how does the receiver know which queue to get the message from? Also, there are several messages in the queue, so how do we decide which message it receives?
    Let's say that when the first message comes in, the backend stores the data, takes care of replication, and then writes the mapping to the DB. But how is this information propagated to the receiver that wants the message? How does it get to know the queue id?
    2. In the table shown for in-cluster management, for qid 1, the leader is A and the followers are C and B. But if the queue is distributed over nodes, how can we have just one leader node, A? Doesn't that mean we are storing the entire queue 1 in A, and the copies in the followers?

    • @AshishNegi1618
      @AshishNegi1618 2 years ago +2

      1a. The message should contain the QueueId.
      1b. The API should be queue.ReceiveMessage(); the Queue object knows its queue_id and either sends it in every poll request or has it tied to the TCP/gRPC/WebSocket connection.
      1c. Messages are received from the queue in roughly FIFO order. So, the client sends the last message id or sequence number, and the server sends the message with sequence number + 1.
      1d. The client knows the queue name, and that should be enough to get the queue id. It can either be a hash of "queue_name", or the client can ask the Frontend service for the QueueId of a QueueName.
      2. A distributed queue does not mean partial data on different nodes. It means a full copy of the data on all nodes. Only one of the nodes can accept writes at a time; this node is called the primary node. This is done so that even if one machine goes down forever, a full copy of the data is available on other machines. This gives high availability/durability in case of failures.
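
A small sketch of points 1c and 1d from this reply: deriving a queue id from the queue name by hashing, and serving "the message after the last sequence number the client saw". The function `queue_id_for`, the `num_buckets` parameter and the class `SequencedQueue` are hypothetical names; the reply only describes the idea, not an implementation.

```python
import hashlib
from typing import List, Optional, Tuple


def queue_id_for(queue_name: str, num_buckets: int = 1024) -> int:
    """Stable queue id derived from the name (assumed scheme: SHA-256 mod bucket count)."""
    digest = hashlib.sha256(queue_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


class SequencedQueue:
    """Messages get consecutive sequence numbers; readers ask for 'last seen + 1'."""

    def __init__(self) -> None:
        self._log: List[str] = []

    def append(self, body: str) -> int:
        self._log.append(body)
        return len(self._log) - 1          # sequence number of the new message

    def next_after(self, last_seq: int) -> Optional[Tuple[int, str]]:
        seq = last_seq + 1
        if 0 <= seq < len(self._log):
            return seq, self._log[seq]
        return None                        # nothing newer than what the client has


orders = SequencedQueue()
orders.append("first")
orders.append("second")
reply = orders.next_after(-1)              # a client that has seen nothing yet passes -1
```

A built-in `hash()` would not work for this purpose in Python, because string hashes are randomized per process; a cryptographic digest gives the same id on every host.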

  • @engineerv3248
    @engineerv3248 2 years ago +1

    Thank you for the great quality content. Excellent explanation, I couldn't miss a single word.
    Why has the series stopped? Are there any other resources you would recommend?

  • @ruslanda7690
    @ruslanda7690 5 years ago +6

    This is awesome!! Thank you!

  • @ameyjain3462
    @ameyjain3462 4 years ago

    Great content:
    1. How will the backend store files in the local file system? Can you shed some light on how they will maintain order? How will replicas sync on the order of the messages they sent?
    2. How will Delete calls sync on a file system?
    3. Even using a database, if we store queueName -> messageId as the key, retrieving records in order might be slow and hard. If each queue is served by a single host, it can record which messages were sent to a client by writing some field in the database record (maybe a batch update). Retrieving more messages from the table might require a scan query, and I am not sure which NoSQL store can serve that kind of query quickly. DynamoDB can have queueUrl as the partition key and messageId as the sort key, and it can return sorted message ids. But the throughput might be limited, as a single host is serving all requests. How can multiple hosts serving messages from the queue handle the ordering?

    • @SystemDesignInterview
      @SystemDesignInterview  4 years ago

      Hi Amey,
      Many good questions. A good answer will require a video on its own. Let me share only some ideas.
      1. Please check this thread: ruclips.net/video/iJLL-KPqBpM/видео.html&lc=UgxvWvctwx7ZvfFisMR4AaABAg.93RPUNm8kUx93p9DlWifqg
      And this one, for message ordering: ruclips.net/video/iJLL-KPqBpM/видео.html&lc=UgxxrP7-xc81hs98frV4AaABAg.8x222vilo4H8xENn5UVXaR
      2. It depends on the storage mechanism we use. E.g. in Kafka, deletes do not happen immediately. Only when the whole segment (file) of messages is processed is it deleted. Many embedded databases use a similar approach: a record is first marked as deleted (a tombstone record), and the actual deletion happens later, e.g. during a log compaction process.
      3. You are correct, we should not expect high throughput from a messaging system that relies on an external database. As mentioned before, we should store messages locally, on disk, using, for example, our own solution based on append-only log files, or an embedded database (e.g. RocksDB).
      A database can be used at a small scale, though. In such cases, we can maintain order using some simple technique, e.g. timestamps.
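
The tombstone idea from point 2 of this reply can be sketched as follows. This is a toy in-memory model with assumed names (`TombstoneLog`, `compact`); it does not reflect how Kafka or any particular embedded database lays out data on disk.

```python
from typing import List, Set, Tuple


class TombstoneLog:
    """Toy log where a delete only marks a record; compaction removes it later."""

    def __init__(self) -> None:
        self._records: List[Tuple[int, str]] = []  # (message id, body) in append order
        self._tombstones: Set[int] = set()         # ids marked as deleted
        self._next_id = 0

    def append(self, body: str) -> int:
        msg_id = self._next_id
        self._next_id += 1
        self._records.append((msg_id, body))
        return msg_id

    def delete(self, msg_id: int) -> None:
        # Cheap logical delete: nothing is rewritten yet.
        self._tombstones.add(msg_id)

    def live_messages(self) -> List[str]:
        return [body for msg_id, body in self._records
                if msg_id not in self._tombstones]

    def physical_size(self) -> int:
        return len(self._records)

    def compact(self) -> int:
        """Rewrite the log without tombstoned records; return how many were dropped."""
        before = len(self._records)
        self._records = [r for r in self._records if r[0] not in self._tombstones]
        self._tombstones.clear()
        return before - len(self._records)
```

Readers see the delete immediately through `live_messages()`, while the space is only reclaimed when `compact()` runs, which is the trade-off the reply describes.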

  • @wangyipeng123
    @wangyipeng123 3 years ago +4

    16:21 Question: can FE talk to the in-cluster manager instead of MS to figure out which leader to talk to?

  • @atabhatti6010
    @atabhatti6010 9 months ago

    Thanks for the great content. Can you please talk about which choice of architecture (1, 2 or 3) you would make for the Metadata Service? And why? And please explain your comment that the datastore for MS does not need to be strongly consistent?

  • @ChandraSekhar-zu9nw
    @ChandraSekhar-zu9nw 4 years ago +2

    Thanks for the great video. Just a question below:
    If we have multiple consumers, say, application deployed in a cluster and assuming consumers poll for new messages, how do we ensure that only one instance of them gets the message? Do we need to have some distributed lock on the message so that only one consumer would get it?

    • @SystemDesignInterview
      @SystemDesignInterview  4 years ago +8

      Hi Chandra,
      In case of a pull-based consumer (like the one described in the video) we indeed need a mechanism to "lock" a message. So that it is not available for other consumers. One option is to do something similar to what AWS SQS does: docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-visibility-timeout.html
      We could also implement a push-based mechanism, where the queue service itself is responsible for sending a message to one of the subscribed consumers (please check the video about the Notification Service; a similar idea is described there, but for a pub/sub use case, not a queue). In this scenario the queue ensures that only one consumer gets the message by sending it to a single consumer from the list (e.g. using a round-robin algorithm).
      Please let me know if more details are needed.
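
The "lock a message" idea for pull-based consumers can be sketched with a visibility timeout, in the spirit of the SQS feature linked in this reply. The class below is an in-memory illustration with assumed names (`VisibilityQueue`, `visibility_timeout`, `clock`); it is not the SQS API and ignores replication.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class VisibilityQueue:
    """A received message is hidden for `visibility_timeout` seconds unless deleted."""

    def __init__(self, visibility_timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout = visibility_timeout
        self._clock = clock
        self._messages: Dict[int, str] = {}           # insertion-ordered: id -> body
        self._invisible_until: Dict[int, float] = {}  # id -> time it becomes visible again
        self._next_id = 0

    def send(self, body: str) -> int:
        msg_id = self._next_id
        self._next_id += 1
        self._messages[msg_id] = body
        return msg_id

    def receive(self) -> Optional[Tuple[int, str]]:
        now = self._clock()
        for msg_id, body in self._messages.items():
            if self._invisible_until.get(msg_id, float("-inf")) <= now:
                # Hide the message so no other consumer receives it meanwhile.
                self._invisible_until[msg_id] = now + self._timeout
                return msg_id, body
        return None

    def delete(self, msg_id: int) -> None:
        # The consumer acknowledges successful processing.
        self._messages.pop(msg_id, None)
        self._invisible_until.pop(msg_id, None)
```

If the consumer crashes before calling `delete`, the message becomes visible again after the timeout and another consumer can pick it up, so this gives at-least-once rather than exactly-once delivery.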

  • @suchismitagoswami5609
    @suchismitagoswami5609 3 years ago +1

    Really great content. I have one doubt. In the out-cluster management option, let's say we split each queue into multiple partitions across multiple clusters, and each partition is handled by a separate cluster, with the data replicated to all the nodes inside that cluster. What if an entire cluster goes down? How will we ensure durability of the messages that belong to the partitions managed by that cluster? Please help me resolve this doubt.

  • @chickentikkasauce1301
    @chickentikkasauce1301 5 years ago +1

    Overall great video. Speaker is very knowledgeable.

    • @SystemDesignInterview
      @SystemDesignInterview  5 years ago

      Thank you, Chicken Tikka Sauce, for all your comments. To this and other videos. Much appreciated!

  • @kavehshirgir6107
    @kavehshirgir6107 2 years ago

    Excellent descriptions and problem solving. Please share more.
    One thing: about TLS termination, I don't think it's common practice to do it on the service instance. It's usually done on the LB. For example, AWS LBs can be configured to be TLS terminators.
    Would you be able to elaborate more?
    Thanks

  • @shubhamgoyal3100
    @shubhamgoyal3100 2 years ago +1

    Awesome Content, Thanks a ton :)

  • @abhishekaggarwal9774
    @abhishekaggarwal9774 4 years ago

    Thank you so much. I have an upcoming interview in a couple of weeks and I was totally confused about where to learn system design. I came across your videos and now I'm planning to listen to them multiple times so that this content sticks in my brain. I have one question: shouldn't the Metadata Service and Metadata DB be connected to the Backend rather than the Frontend? Also, I'd appreciate it if you could upload a few more top design interview videos like Design WhatsApp/Netflix, etc. I also really like the idea of you doing mock interviews and uploading them so that we can learn from the mistakes.

    • @SystemDesignInterview
      @SystemDesignInterview  4 years ago +1

      Hi Abhishek. Thank you for the feedback! Much appreciated!
      First of all, let me start by wishing you luck at your interviews.
      Second, please read the following thread, it should provide more understanding about Metadata service purpose: ruclips.net/video/iJLL-KPqBpM/видео.html&lc=UgwNZ5mE3o8fFV_yY214AaABAg.8zdMxmBN3of9-4ktitHhKv
      Third, I have topics you mentioned in my TODO list. And yes, mock interviews sounds like a great idea. My only problem is to find time for all this ))

  • @ErwinDSouza
    @ErwinDSouza 1 year ago

    Thanks for the video! I wish there was more of a deep-dive into where to store the data - you said memory + file system, but how exactly? is it an async process? cheers!

  • @evgeniystepankevich7964
    @evgeniystepankevich7964 2 years ago

    Big thanks for the videos, great material !

  • @RR8401674
    @RR8401674 5 years ago +4

    Thanks a lot. That is awesome!

  • @chickentikkasauce1301
    @chickentikkasauce1301 5 years ago

    Another direction this could go in is if you’re designing a distributed queue for N queues for single customer vs multi tenant.

  • @GaneshManika
    @GaneshManika 3 years ago +1

    Golden quality.

  • @amitpaliwal3544
    @amitpaliwal3544 4 years ago +1

    Can you explain: when you said experienced folks should not go for a database but for hard disk and memory... how will a hard disk provide high throughput? Can't we use NoSQL for that to avoid ACID guarantees?

    • @SystemDesignInterview
      @SystemDesignInterview  4 years ago +2

      Hi Amit,
      "Database" is a valid answer. But be ready for the interviewer to start digging deeper. A proper storage design for a messaging system is a big topic of its own. Let me share some ideas with you.
      - A shared database (either SQL or NoSQL) may work well at a small scale. But for highly loaded systems a database solution will become too expensive. Be prepared for a discussion of how distributed databases work.
      - An embedded (local to the machine) database is a better option: no remote calls, data isolation. Be prepared for a discussion of what principles embedded databases are built upon.
      - But do we really need a database for storing messages temporarily? Some of these messages may be consumed within seconds after they were created. Even an embedded database looks like overkill. What if we store messages in a log file? New messages just keep getting appended to the log.
      - If we go with the log solution, there are many details to think about: how we store (e.g. buffer in memory and flush the batches), how messages are identified within a log (e.g. offset + index), log compaction, data compression, etc.
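
The "offset + index" idea from the last bullet can be sketched like this: each record is written with a length prefix, and an index maps a message offset to its byte position. The names (`AppendOnlyLog`, `_positions`) are assumptions, and an `io.BytesIO` buffer stands in for a real file so the example stays self-contained; batching, compaction and compression are left out.

```python
import io
import struct
from typing import BinaryIO, List, Optional


class AppendOnlyLog:
    """Length-prefixed records plus an in-memory index: message offset -> byte position."""

    _HEADER = struct.Struct(">I")  # 4-byte big-endian payload length

    def __init__(self, storage: Optional[BinaryIO] = None) -> None:
        # A real system would open a file in binary append mode instead.
        self._storage = storage if storage is not None else io.BytesIO()
        self._positions: List[int] = []

    def append(self, payload: bytes) -> int:
        self._storage.seek(0, io.SEEK_END)           # always write at the end
        self._positions.append(self._storage.tell())
        self._storage.write(self._HEADER.pack(len(payload)))
        self._storage.write(payload)
        return len(self._positions) - 1              # the message's offset in the log

    def read(self, offset: int) -> bytes:
        self._storage.seek(self._positions[offset])  # jump straight to the record
        (length,) = self._HEADER.unpack(self._storage.read(self._HEADER.size))
        return self._storage.read(length)


log = AppendOnlyLog()
offset = log.append(b"hello")
body = log.read(offset)
```

Writes are sequential and reads need a single seek, which is the main reason a plain log can outperform a general-purpose database for this workload.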

  • @kumarchandan9685
    @kumarchandan9685 2 years ago

    Please do more if you can :) excellent content

  • @dipenpatel5226
    @dipenpatel5226 4 years ago +2

    Great video! Thanks for uploading it. I have a question about the replication piece. Once an item is popped off the queue in the leaderless setup, how do we ensure consistency? Is there some sort of quorum that has to happen to avoid another pop request coming to another cluster?

  • @himangshuroy3571
    @himangshuroy3571 2 years ago

    Hi Mikhail, others,
    Not sure whether you will see this or not, but I wanted to ask one thing. If I try to design Gmail using an architecture similar to the one outlined above, then:
    1. Will the Metadata server store time, sender, size, etc., while the backend store holds the actual mail?
    2. Can the MD server (NoSQL with a cache/façade in front) be organized as a consistent hash ring using the user name (through a hash) as the primary key?
    3. If 2 is correct, how do I display the most recent mails? It seems I need to sort the data stored in a node; when should that be done? Where and how should the result be stored?
    4. If I sort by time and store the result in a distributed cache, and then I want to sort by size, how can I do it? Will the Frontend Service help with this? Does NoSQL allow this kind of query?
    5. How will I know which backend storage holds the mail? Does a mapping exist between the MD server and the backend cluster?
    Many thanks in advance.

  • @jmka1222
    @jmka1222 5 years ago +1

    A few questions:
    - At 10:12 you say "read when mesg arrives, write only when queue created". Why? Wouldn't you write every arriving message to the cache?
    - Is the metadata service responsible for persisting data to the DB, besides being used as a cache? If so, why is the backend service also doing the same?
    - When you say distributed queue, do you mean a queue for communication between a single producer and a single consumer that resides on several machines, or several "single produce-consumer" connections on several machines? If the former, wouldn't one machine be sufficient?

    • @jmitesh01
      @jmitesh01 5 years ago +1

      Hi Jm Ka,
      1. The Metadata service is used for storing meta information such as the mapping of queues to backend service hosts, so we only need to add that info to the Metadata Service's persistent storage and cache when we create a new queue.
      2. The Backend Service stores the actual messages and, based on the requirements, we may cache frequently accessed queues in the Metadata Service.

    • @SystemDesignInterview
      @SystemDesignInterview  5 years ago

      Hi Jm Ka,
      Hopefully Mitesh's answer helped to clarify what is stored where: the Metadata service is for metadata only and the Backend service is for messages.
      As for your last question, can you please clarify what you mean by "several single produce-consumer connections"?
      Each message lives on several machines, to achieve high availability. Simply speaking, we want to make sure that messages are not lost if a single machine crashes. Every time we store a published message, we replicate it. So, we always have several copies of the same data.

    • @jmka1222
      @jmka1222 5 years ago

      @@SystemDesignInterview Hi, Mitesh's post doesn't answer anything; it simply reiterates stuff he heard you say.
      1) My question was about how, at 10:12, you say of the metadata service "read when mesg arrives, write only when queue created". Is the metadata service (or cache) storing the whole queue including messages? Or is it storing only which queues go to which consumers? Either way, when a message arrives, it too would need to be cached in the metadata service, so there has to be a write. In that case, saying "read when mesg arrives, write only when queue created" would be wrong.
      2) Is the metadata service only a cache, or can the front-end service persist messages without going through the metadata service?
      The backend service persists messages, but it looks like in your description the metadata service is doing the same too?
      3) The term "distributed queue" could mean several things. A) A distributed queue that's conceptually a single queue for only 1 producer generating messages for 10 consumers, but the queue is replicated and sharded on several machines for availability. B) A distributed queue that's multiple queues for 5 different producers, each one catering to 10 consumers (total of 50 consumers). This can also be called distributed. C) A distributed queue that's multiple queues for 5 different producers, each one catering to only 1 consumer (total of 5 consumers).
      By "several single produce-consumer connections" I meant case (C).
      Which one of A, B or C did you mean by "distributed queue"?

    • @SystemDesignInterview
      @SystemDesignInterview  5 years ago +1

      Hi Jm ka,
      1. We write to the Metadata service only when we create/update queue metadata. We do not store messages in the Metadata service. When a message arrives, we make a call to the Metadata service to get details about the queue. E.g. we may retrieve the message size limit and check whether the just-arrived message exceeds the max size. Messages are only stored on back-end service machines.
      Please let me know what part of the video confused you and made you think we store messages in the Metadata store.
      2. When a message arrives, the front-end service needs to pick a back-end machine for storing the message. The same is true for retrieving a message: the front-end needs to forward the receive message request to a machine that stores messages for the requested queue. So, the front-end needs to get this information from somewhere, from some persistent storage. Calling the database directly is not great in this case, as there may be too many calls. This may be both slow and expensive. The Metadata service helps to avoid direct calls to the database by storing queue metadata in memory.
      3. Distributed simply means messages for the same queue are replicated and stored across several machines. There may be multiple producers and multiple consumers. Only one consumer gets each message.
      Let me know if you have other questions.
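
The role of the Metadata service described in points 1 and 2 of this reply (an in-memory layer in front of the database, read on every message, written only on queue create/update) can be sketched as a read-through cache. Everything here is a hypothetical illustration: the names `MetadataCache` and `accepts`, and the metadata keys `max_message_bytes` and `leader`, do not come from the video.

```python
from typing import Callable, Dict


class MetadataCache:
    """Read-through cache: hit the database only the first time a queue is looked up."""

    def __init__(self, load_from_db: Callable[[str], dict]) -> None:
        self._load_from_db = load_from_db      # stand-in for the persistent metadata store
        self._cache: Dict[str, dict] = {}

    def get(self, queue_name: str) -> dict:
        if queue_name not in self._cache:
            self._cache[queue_name] = self._load_from_db(queue_name)
        return self._cache[queue_name]

    def update(self, queue_name: str, metadata: dict) -> None:
        # Called on queue create/update, the rare write path.
        self._cache[queue_name] = metadata


def accepts(cache: MetadataCache, queue_name: str, message: bytes) -> bool:
    """Frontend-side check against the queue's size limit, using cached metadata."""
    return len(message) <= cache.get(queue_name)["max_message_bytes"]
```

Messages themselves never pass through this cache; it only answers questions such as "what is the size limit" or "which host leads this queue".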

    • @xipan5344
      @xipan5344 4 years ago

      @@SystemDesignInterview Hi, does that mean the in-cluster (ZooKeeper) / out-cluster mapping information is retrieved from the metadata service?

  • @dhruvjainiitkgpcse
    @dhruvjainiitkgpcse 1 year ago

    Isn't the out-cluster manager the same as the metadata service, since both help to find which cluster the messages for a given queue id go to?

  • @ImranAliyev
    @ImranAliyev 5 years ago +1

    really good explanation. Thanks !!!

  • @twistedlog24
    @twistedlog24 4 years ago

    This is exceptional content, thanks. One thing I'd like to think about: if the messages are going to be stored on a file system, how are they organized so that some ordering is guaranteed or achievable? I wonder how AWS SQS does it!

    • @SystemDesignInterview
      @SystemDesignInterview  4 years ago +1

      Hi Sahil,
      I cannot speak for AWS SQS, there is no publicly available information that describes how data is stored internally. But we can take a look at how other, open-source messaging systems, store the data.
      I have shared some details here: ruclips.net/video/iJLL-KPqBpM/видео.html&lc=UgxvWvctwx7ZvfFisMR4AaABAg.93RPUNm8kUx93p9DlWifqg and here: ruclips.net/video/iJLL-KPqBpM/видео.html&lc=UgxG4wRwhQBsfn_C82V4AaABAg.9-tTttN28Qk91Hi2umb-Yd

    • @Labandusette
      @Labandusette 4 years ago +1

      @@SystemDesignInterview thanks for your fast and helpful response

  • @nishant07kumar
    @nishant07kumar 4 years ago +1

    Best Video. So detailed

  • @rockyraj12
    @rockyraj12 4 years ago +1

    Great channel for system design. Highly appreciate your efforts. One question though, at around 15:55, under option1(single leader replication) you mention that for a receiver the FE needs to query the metadata service to find the leader. Since data is replicated across all nodes, read requests can be serviced by any followers right ? I agree for writes, leader needs to be sent the data.

    • @kvv6452
      @kvv6452 2 years ago

      Yes, that can be done. Followers can be used to honor reads.

  • @агатакристи-г3ы
    @агатакристи-г3ы 3 years ago

    Brushed up on my English along the way too, thanks to the author

  • @jackyd1917
    @jackyd1917 2 years ago

    How does DNS know which load balancer to direct the request to? Doesn't that make DNS take on the responsibility of an LB itself?

  • @amyyadav0
    @amyyadav0 4 years ago +1

    Appreciate your efforts

  • @SameerSrinivas
    @SameerSrinivas 4 years ago +1

    Great post! Thanks a lot for your effort.
    I have seen recommendations from people to make network and storage capacity estimations right after listing requirements. Could you please tell me what your opinion on that is? Doing so would have brought up topics such as the max message length allowed, which could have played a role when deciding what to choose for backend storage too.

    • @SystemDesignInterview
      @SystemDesignInterview  4 years ago

      Hi Sameer. Good question! Please find my answer on this topic here: ruclips.net/video/bBTPZ9NdSk8/видео.html&lc=UgzOwtTRUT70DA0PoR54AaABAg.8xfY2LIJwXK8xpWB0AcmmX

  • @andyyuan97
    @andyyuan97 3 years ago

    Don't stop!!! Come on, My dear !

  • @pragya7746
    @pragya7746 3 years ago

    I have a question... why do we need a backend service and storage to store the queue's data? Shouldn't a queue be something like a queue data structure, where the producer produces into it and the consumer consumes from it?

  • @drdzdd
    @drdzdd 4 years ago +1

    Great video!

  • @HoudiniSouth
    @HoudiniSouth 3 years ago

    Would you be able to name one of the example services that acts like the metadata service from the lecture? Just to help my understanding! Thank you.

  • @MultiMach7
    @MultiMach7 4 years ago

    How do chat apps ensure the order of messages from a sender? Let's assume each message has a "send timestamp". Do we just dump messages into the database and sort them on querying; or do we set up some service before messages enter the database (similar to the aggregator in the other video) and sort a small batch on the fly; or do we insert messages into the database in an order-maintaining fashion?

  • @durgaprasana5531
    @durgaprasana5531 4 years ago +1

    Great video. 👍
    Towards the end, the summarization felt rushed while answering whether the service is fault-tolerant and scalable. While answering such questions, it might be helpful to highlight how the system would cope with, say, a network partition: how would the backend hosts receiving the save/receive message requests respond to them?

  • @samchen_123
    @samchen_123 12 days ago

    Thank you!

  • @bokistotel
    @bokistotel 1 year ago

    This frontend service looks more like a backend service to me. And at 15:55, does the FrontEnd receive a message from the queue and then communicate with the Backend / Message Service (blue)?

    • @spirridd
      @spirridd 1 year ago

      MS - Metadata Service

  • @TheDEMMX
    @TheDEMMX 2 years ago

    why don't we just consult configuration service ((in/out) cluster manager) to find the leader, why do we need metadata service, doesn't that add more complexity?

  • @asakhala
    @asakhala 3 years ago

    So what exactly are we using metadata persistence for?

  • @hechen236
    @hechen236 4 years ago +4

    13:36 Ping on the video for building distributed database :D

  • @nitinneo7
    @nitinneo7 3 years ago

    To clarify, frontend service here is similar to the WebServer and the MetadataService layer is similar to the App Server with in-memory cache which interacts with the DB, right?

  • @deepaksoundappan3244
    @deepaksoundappan3244 3 years ago

    Awesome work. I like this.

  • @VS-SEA
    @VS-SEA 7 months ago

    How is ordering maintained?

  • @mohamedelsharkawy9166
    @mohamedelsharkawy9166 2 years ago

    Amazing content

  • @manivannanrajah
    @manivannanrajah 4 years ago

    One of the important aspects is how the list of messages is stored in persistent storage, and the next is how each message is given to exactly one of the several consumers. Essentially, how does get message work within a host that contains the data for a specific queue? Could you help me understand this?

    • @SystemDesignInterview
      @SystemDesignInterview  4 years ago

      Hi Mani. Sure.
      Please take a look at my comments here
      ruclips.net/video/iJLL-KPqBpM/видео.html&lc=UgxvWvctwx7ZvfFisMR4AaABAg.93RPUNm8kUx93p9DlWifqg
      and here
      ruclips.net/video/iJLL-KPqBpM/видео.html&lc=Ugywi1VzEkG71lUYicR4AaABAg.92yd_GDCHa-930a7BWIx4K
      and let me know if more details are needed.

  • @LovyGupta007
    @LovyGupta007 5 years ago +1

    Hi, Can you explain how FIFO works? I really want to know how ordering is maintained in a distributed queue.

    • @SystemDesignInterview
      @SystemDesignInterview  5 years ago +7

      Hi.
      The first thing to mention is that FIFO is only really meaningful when you have one single-threaded sender and one single-threaded receiver. This article explains why: sookocheff.com/post/messaging/dissecting-sqs-fifo-queues/
      This is one of the reasons why FIFO queues have limited throughput.
      Second, we need a mechanism to maintain order. Usually, this is achieved by maintaining a log. When messages arrive, they are appended to the log. We can think of this log as a distributed service of its own; it must be highly available and performant. Implementing such a service is not a trivial task. You can find inspiration in how Zookeeper guarantees the total ordering of all changes.
      The problem of FIFO ordering is closely related to the problem of Atomic Broadcast ( en.wikipedia.org/wiki/Atomic_broadcast ), which is equivalent to the problem of Consensus ( en.wikipedia.org/wiki/Consensus_(computer_science) ).
      Sorry for dumping all this complexity on you. But this problem is indeed hard. Hopefully, you now have a better understanding of where to dig further.
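
      To make the log idea above concrete, here is a minimal single-process sketch. It is an illustration only, not how any real queue service is implemented: the class name AppendOnlyLog and its methods are hypothetical, and a lock on one machine stands in for the distributed, replicated log (and the consensus behind it) described above.

```python
import threading
from typing import List, Optional, Tuple


class AppendOnlyLog:
    """Hypothetical single-node log: each message gets the next sequence number."""

    def __init__(self) -> None:
        self._entries: List[str] = []
        self._lock = threading.Lock()

    def append(self, message: str) -> int:
        # The lock serializes concurrent senders, so the order of arrival at
        # the log (not the send time at the client) defines the queue order.
        with self._lock:
            self._entries.append(message)
            return len(self._entries) - 1  # sequence number of this message

    def read(self, offset: int) -> Optional[Tuple[int, str]]:
        # Return (offset, message), or None if nothing is stored at that offset.
        with self._lock:
            if 0 <= offset < len(self._entries):
                return offset, self._entries[offset]
            return None


log = AppendOnlyLog()
for text in ["a", "b", "c"]:
    log.append(text)

# A single receiver walks the log by offset, so it sees messages in log order.
offset = 0
received = []
while (entry := log.read(offset)) is not None:
    received.append(entry[1])
    offset += 1
```

      With several receivers reading the same log, the strict order is only observable if they coordinate on the offset, which is one way to see why FIFO limits throughput.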

    • @wnwca
      @wnwca 5 years ago +1

      @@SystemDesignInterview Thank you for your structured and in-depth presentation on this topic. One follow-up point: a distributed message queue like Kafka can provide some level of transactional consistency with logs, which Martin Kleppmann has frequently discussed in his tech talks (e.g., ruclips.net/video/BuE6JvQE_CY/видео.html). I guess the interviewer may drill down this path to challenge the interviewee.

    • @LovyGupta007
      @LovyGupta007 5 years ago +1

      @@SystemDesignInterview Thanks a lot for the links, I will go through them.

    • @SystemDesignInterview
      @SystemDesignInterview  5 years ago

      Thank you, Wei, for the link. Always great to listen to Martin Kleppmann.

    • @wnwca
      @wnwca 5 years ago

      @@SystemDesignInterview One additional suggestion: distributed systems cover a broad range of terminology, technologies, and algorithms. I wonder if you are able to build and share a list of reading materials for the important components of a distributed system. Reynold Xin of Databricks shared his db-reading list here: github.com/rxin/db-readings Thanks again!

  • @catherineyin1303
    @catherineyin1303 2 years ago

    At 16:10, how do we know what queue ID to assign to each message?

  • @ashishaggarwal415
    @ashishaggarwal415 4 years ago

    Thanks for sharing this great video. I have one question about distributing messages. It is possible that the messages for one queue can't all reside on a single host (if the subscriber is down and messages are accumulating), or that a single host can hold the messages of 5-6 queues (if the load on those queues is low). What I can think of is to shard the messages based on the combination of Message ID and Queue ID. It would help the system scale and better utilise resources. But I am kind of confused about fetching messages from a host that is serving message data for multiple queues. Can you help me understand that?

    • @SystemDesignInterview
      @SystemDesignInterview  4 years ago +6

      Hi Ashish. May I ask you to clarify the question? I feel I am missing something, but I cannot figure out what specifically it is.
      Yes, each host stores many different queues. For simplicity, think of a queue as a file on disk. New messages are appended at the end of the file. Older messages are constantly fetched from the beginning of the file. Each file has its unique name - queue Id. When fetching a message, the queue Id is specified by the consumer. The system identifies which host stores the file and reads the next message from the file on that host. Once again, this is a very simplistic description, but it might help.
      If one queue has many messages, we can split this queue (file) into multiple files and store these files across multiple hosts.
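
      The simplified "queue as a file" picture above could be sketched like this. Everything here is an assumption made for illustration: Host and Router are hypothetical names, an in-memory deque stands in for the file on disk, and round-robin placement stands in for whatever the system really uses to decide which host stores a queue.

```python
from collections import deque
from typing import Deque, Dict, List, Optional


class Host:
    """Hypothetical backend host; each queue is modeled as one in-memory 'file'."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.queues: Dict[str, Deque[str]] = {}

    def append(self, queue_id: str, message: str) -> None:
        # New messages go to the end of the queue's file.
        self.queues.setdefault(queue_id, deque()).append(message)

    def fetch(self, queue_id: str) -> Optional[str]:
        # Older messages are taken from the beginning of the file.
        messages = self.queues.get(queue_id)
        return messages.popleft() if messages else None


class Router:
    """Maps a queue Id to the host that stores it (assumed round-robin placement)."""

    def __init__(self, hosts: List[Host]) -> None:
        self.hosts = hosts
        self.placement: Dict[str, Host] = {}

    def host_for(self, queue_id: str) -> Host:
        if queue_id not in self.placement:
            index = len(self.placement) % len(self.hosts)
            self.placement[queue_id] = self.hosts[index]
        return self.placement[queue_id]

    def send(self, queue_id: str, message: str) -> None:
        self.host_for(queue_id).append(queue_id, message)

    def receive(self, queue_id: str) -> Optional[str]:
        return self.host_for(queue_id).fetch(queue_id)


router = Router([Host("host-a"), Host("host-b")])
router.send("orders", "o1")
router.send("emails", "e1")
router.send("orders", "o2")
first = router.receive("orders")  # oldest message in the "orders" queue
```

      Because the consumer always names the queue Id, a host can hold many queues without mixing up their messages. Splitting one large queue across hosts is not shown here.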

  • @nitinkhulbe6234
    @nitinkhulbe6234 1 year ago

    nice content

  • @AmitKumar-we8dm
    @AmitKumar-we8dm 3 years ago

    How to build distributed database?

  • @vikasgupta1828
    @vikasgupta1828 1 year ago

    Thanks

  • @AbhijeetNayak-connect
    @AbhijeetNayak-connect 4 years ago +2

    The moment he said VIP - I knew he is from Amazon

    • @SystemDesignInterview
      @SystemDesignInterview  4 years ago +3

      Hi Abhijeet. VIP is a commonly used concept in the industry: for load balancing, for active/passive database clusters, and pretty much everywhere you want a single public IP that can be reached from the internet, with traffic then routed to private IPs.
      I believe I first learned about this concept from the "Release It!" book by Michael T. Nygard, first published in 2007. There is a second edition now. I recommend reading it, especially for engineers less familiar with distributed systems.

    • @AbhijeetNayak-connect
      @AbhijeetNayak-connect 4 years ago +1

      @@SystemDesignInterview Hi Mikhail, thanks for the suggestion. Looking forward to more of your informative videos.

  • @gauravagarwal8592
    @gauravagarwal8592 3 years ago

    Sir, will there be any more new videos from you?

  • @raketabombapetarrda
    @raketabombapetarrda 3 years ago +1

    Thanks, bro

  • @lolista
    @lolista 3 years ago +1

    wow

  • @adityabahuguna6815
    @adityabahuguna6815 5 years ago +187

    Appreciate your efforts on aggregating and delivering such quality content in such a lucid manner. I don't think there is better content than this anywhere on youtube especially for system design topics. Wow ! (Y)

  • @vasilyvlasov3255
    @vasilyvlasov3255 2 years ago +28

    This is absolutely the best content on RUclips on the system design topic! No "scratching the surface" bullshit, but rather a very in-depth and concrete explanation of how to navigate successfully through system design interviews. Thank you, Mikhail, for your great efforts! Thank you very much, Mikhail! One thing bothers me though: there haven't been any recent updates on the channel. I'm pretty sure all the people here will appreciate and support it if Mikhail decides to continue his endeavours and upload new videos! Anyway, thanks a lot!

  • @debasishnayak5576
    @debasishnayak5576 5 years ago +18

    Normally I play videos in 1.5x or 2x. Your videos have so much information that I am afraid of losing some fundamentals if I play in 2x. Outstanding quality. Please keep making such videos.

  • @Dragoon77
    @Dragoon77 5 years ago +22

    I've watched the whole series already, thanks for the great quality content! Looking forward to more.

  • @omfromam
    @omfromam 5 years ago +10

    Great content! Thank you for making this. One thing that seems unclear (at least to me): at around 15:00-17:00 you show the flow of the message from the FE to the backend node and to the receiver. It is unclear how you separate persisted information between the MS (database) and the in-cluster manager (ZooKeeper). Both seem to store the mapping between queue name and leader host. Do you really need to store this information in two places? How are they synchronized? Why would you need to keep this info in the MS in the first place? Isn't ZooKeeper enough for queue-to-node mapping?

    • @SystemDesignInterview
      @SystemDesignInterview  4 years ago +4

      Both options are fine:
      1. When we store mapping in the Metadata service. Zookeeper is used for leader election and leader monitoring. And if leader changes, this information is propagated to the Metadata service.
      2. When we store mapping in Zookeeper. Zookeeper is highly optimized for reads.
      Anyway, this information is stored in one place. So that we avoid any synchronization between configuration storages and have a single source of truth.
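
      As a rough sketch of option 1, the flow could look like the code below. This is not a real ZooKeeper client: Coordinator is a hypothetical stand-in for the election service, and MetadataService simply records whatever leader it is told about, so that the mapping lives in one place only.

```python
from typing import Callable, Dict, List


class Coordinator:
    """Hypothetical stand-in for a ZooKeeper-style leader election service."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[str, str], None]] = []

    def subscribe(self, listener: Callable[[str, str], None]) -> None:
        self._listeners.append(listener)

    def elect(self, queue_id: str, host: str) -> None:
        # A real coordinator would run an election; here the winner is given,
        # and all subscribers are simply notified of the new leader.
        for listener in self._listeners:
            listener(queue_id, host)


class MetadataService:
    """Single source of truth for the queue-to-leader mapping (option 1)."""

    def __init__(self, coordinator: Coordinator) -> None:
        self._leaders: Dict[str, str] = {}
        coordinator.subscribe(self._on_leader_change)

    def _on_leader_change(self, queue_id: str, host: str) -> None:
        # Leader changes are propagated here, overwriting the old entry.
        self._leaders[queue_id] = host

    def leader_for(self, queue_id: str) -> str:
        return self._leaders[queue_id]


coordinator = Coordinator()
metadata = MetadataService(coordinator)
coordinator.elect("orders", "host-a")
before = metadata.leader_for("orders")
coordinator.elect("orders", "host-b")  # simulated failover to a new leader
after = metadata.leader_for("orders")
```

      In option 2 the FrontEnd would read the mapping from the coordinator directly and the MetadataService would not hold a copy.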

    • @BitsnBytes8
      @BitsnBytes8 2 years ago

      @@SystemDesignInterview Thank you for answering this question. I had the same doubt when going through the material.

  • @reyazahmed9320
    @reyazahmed9320 5 years ago +11

    Great content. Thanks a lot. Just one piece of feedback: it would have been great to have subtitles, as I find it a bit difficult to catch the words.

    • @SystemDesignInterview
      @SystemDesignInterview  5 years ago +9

      Hi Reyaz. Thank you for the feedback! Point taken, I will try to add subtitles relatively soon.

    • @harshdubey9951
      @harshdubey9951 4 years ago +3

      Hi Reyaz, you can use the built-in subtitles feature provided by the RUclips player.
      Click on the icon labelled CC while playing the video.
      Hope it helps until Mikhail provides the subtitles.
      Thanks

  • @saumittrasaxena2877
    @saumittrasaxena2877 4 years ago +9

    This is quality content. Really appreciate your efforts.

  • @dharmendrabhojwani
    @dharmendrabhojwani 5 years ago +4

    Why is this guy not making more videos? We should in fact put in some money from our own pockets and pay him to make videos like these, rather than each spending money on some course.

    • @SystemDesignInterview
      @SystemDesignInterview  5 years ago

      Hi Dharmendra. I am working on a new video right now. I took a big topic this time. Plus, I was on vacation for a couple of weeks. So, I am a little bit behind. Hopefully, you will like the upcoming video. It covers many concepts required for a successful system design interview and system design in general.

  • @SudeeptaSood
    @SudeeptaSood 1 year ago +3

    can't thank you enough for this video. All of these components are building blocks and the interviewer can dig deep as to how the requests are handled from client to server. Awesome video

  • @jackieh2195
    @jackieh2195 5 years ago +7

    Please keep uploading! This is great! Thanks

  • @ky5069
    @ky5069 4 years ago +3

    Regarding how the front end service finds the leader backend nodes, you mention that this discovery would be done via metadata service. But in the in-cluster method, we actually have that information in the coordinator service (zookeeper). In this case, would the metadata service just be a thin wrapper for the coordinator service (in case of backend node discovery)?
    Thank you so much for sharing these videos Mikhail. (Also, I love that you mention several times that the interviewer is there to help us, I find it delightful to have that perspective, and definitely helps during the interview)

    • @SystemDesignInterview
      @SystemDesignInterview  4 years ago

      Hi Suharto,
      Thank you very much for the feedback!
      Regarding your question, we can use either Metadata service or Zookeeper itself for storing and retrieving information about leaders. Please take a look at my answer here: ruclips.net/video/iJLL-KPqBpM/видео.html&lc=UgxAE6YfMUj95phbLid4AaABAg.90QChp-3ylO93AokcEr3Bu

  • @deephorse6110
    @deephorse6110 3 years ago +3

    Thank you for summarizing precisely about what can be covered in a 40 minute time limit. Knowledge is one part which is built over learning and experience. Your video really helps to focus on structuring and expressing the knowledge in a coherent manner. Thank you.

  • @ashwinkumar2126
    @ashwinkumar2126 3 years ago +3

    Really comprehensive coverage of the topic! I would've loved to see more discussion of asynchronous out-cluster replication, though. It's tricky to design, e.g.: What happens when you receive a get request while the data hasn't fully replicated across hosts in a cluster? Can we hit all hosts in a cluster? What happens if we receive a get request while message deletion replication is in progress?