Hi Gaurav, can we have server id as one of the columns in the table, and use it while querying the DB, so that we pick only the unfinished orders of the dead server? The rest can be taken care of by the load balancer distributing the load.
We could, but wouldn't that complicate things? For example: a job is initially assigned to S0, then reassigned to S1 after S0 crashes. Now S1 crashes. To keep track of S1's jobs we'd have to update the server id in the DB: an extra update operation. Now say a new requirement comes in for evenly distributed load. When a new server comes up, it will take jobs from the others, and the DB will again have to be updated. Consistent hashing seems like a cleaner solution :)
Yes, maybe it requires querying the DB more often to update the server ids. It's better to follow the load-balancing principle, which guarantees uniqueness. Thanks. :)
But if we are not maintaining a server id along with each order, then whenever a server dies we have to find the nearest server for all the orders again. If we have server ids along with them, we would only have to reassign the orders that were being handled by the server that died. If a new server is added we have to check all the entries anyway, so that's fine, but when a server dies, wouldn't maintaining server ids reduce our work? Anyway, great video..!! Thanks.
@@shiwanggupta8608 I agree. Also, say you are a very large chain with thousands of orders in process at any given time; going through all of them seems a waste of time when you can filter by just the server id.
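The consistent-hashing argument in this thread can be sketched in code. This is a toy ring (the server names, the hash choice, and the lack of virtual nodes are all simplifications for illustration): when a server is removed, only the jobs that hashed to it get new owners, so live servers keep the jobs they are already processing.

```python
import bisect
import hashlib

def stable_hash(key: str) -> int:
    # md5 keeps placement reproducible across runs (unlike built-in hash()).
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    """Minimal consistent-hash ring (no virtual nodes, for clarity)."""

    def __init__(self, servers):
        self._ring = sorted((stable_hash(s), s) for s in servers)

    def server_for(self, job_id: str) -> str:
        points = [p for p, _ in self._ring]
        i = bisect.bisect(points, stable_hash(job_id)) % len(self._ring)
        return self._ring[i][1]

    def remove(self, server: str):
        self._ring = [(p, s) for p, s in self._ring if s != server]

ring = HashRing(["s0", "s1", "s2", "s3"])
jobs = ["job_3", "job_8", "job_9", "job_11", "job_20"]
before = {j: ring.server_for(j) for j in jobs}

ring.remove("s3")  # s3 crashes
after = {j: ring.server_for(j) for j in jobs}

# Only jobs that lived on s3 change owners, so redistribution cannot
# duplicate work already in progress on a live server.
moved = {j for j in jobs if before[j] != after[j]}
assert all(before[j] == "s3" for j in moved)
assert "s3" not in after.values()
```

This is why the video's answer avoids the server-id column: the ring itself remembers which jobs belonged to the dead server.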
If you could cover the concepts of real-time operating systems, including tasks, task states, message queues, mailboxes, etc., it would definitely be helpful for me and for those interested in embedded computing and RTOS...
I'm self-taught, and system design is not an area I seem to be learning from documentation (and that makes sense; it isn't about the syntax or features of most technologies). Thank you for providing this material! You're decent to good at teaching, but mostly you're providing resources that are difficult to acquire outside a classroom or without a mentor.
Hey Gaurav, I am curious. If S3 (server 3) went down, couldn't traffic 9 and traffic 11 be identified as the ones needing rerouting if the database table had a column specifying which server each was routed to? That way we can query the DB for the requests lost with S3 instead of querying for all requests that have not finished yet.
IMHO, the main reason is that we don't need to store the server-to-task mapping permanently; the info is not useful compared to the task state/description, etc., or in short, it's unrelated to the task itself. If we really wanted to identify a server's tasks, we could introduce another DB to store a map from server to in-progress tasks, but that's overkill and more complicated than using a timeout on each task in the queue.
@@bozhang8910 Don't you want a report of which shops fail frequently, and to track who's serving which item? That can help identify poor-performing outlets. In use cases where the customer does not need to know which server delivers the request, your justification is good, but in the case of pizza delivery I don't think so.
@@bozhang8910 You don't need another database, just another column. And there are other reasons to do this. Consider three pizza shops, p1, p2, and p3 where p1 and p2 are very close to each other and p3 is on another continent. If p1 goes down, sending its jobs to p3 would be a disaster, so if the table has a column for the server and a column for its nearest neighbor, the queue can dispatch in a way that doesn't involve transoceanic pizza delivery. A real life example would be CDNs for which the entire point is to find a replica of an asset which is closest to the client for performance reasons. Others would be geographic (like maps) or internationalized/localized services where the location can be inferred from the language.
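The nearest-neighbor failover idea above can be sketched roughly as follows. The shop names and coordinates are invented, and a real system would use haversine distance or a geo index rather than plain Euclidean distance on raw lat/lon.

```python
import math

# Hypothetical shop coordinates (lat, lon); purely illustrative.
SHOPS = {"p1": (40.0, -74.0), "p2": (40.1, -74.1), "p3": (-33.9, 151.2)}

def nearest_alive(failed: str, alive: set) -> str:
    """Pick the surviving shop closest to the failed one."""
    origin = SHOPS[failed]
    return min(alive, key=lambda s: math.dist(origin, SHOPS[s]))

# p1 and p2 are neighbors; p3 is on another continent. If p1 dies, its
# orders should fail over to p2, never to p3.
assert nearest_alive("p1", {"p2", "p3"}) == "p2"
```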
If a detailed explanation is there for each system design, it will be helpful for everyone, because I think I can learn system design from you. Thank you for your extraordinary work.
Banks. That's a fairly common use case that uses messaging services/queues extensively: batch processing of transactions. After the card scheme authorises your tx and the request is sent to the issuing bank, the entire XML message is placed on a message queue that is constantly listened to by the issuing bank's messaging service.
Great video! One minor tweak I'd make, rather than pushing tasks to the server why not have the servers pull tasks from the queue. No need for a load balancer and you can spin up new server instances without any bookkeeping or overhead.
Thanks Kyle! Having subscribers pull from the queue would make it (more or less) stateless, which is great. However, when you want your subscribers to immediately be notified on a new task, I'd have the queue pushing messages. I think a hybrid model is best, where most servers poll for messages while some expect them to be pushed to them.
Very well explained 👍. If possible, please make a video or two on Kafka queues (involving producer and consumer). This would serve as an example for a few of your previous videos on MQs, load balancing, and event-driven systems, and would help put them all together.
Hello Gaurav, nice and informative video. I have a couple of questions. Q1: If sysA publishes a message to a queue, and sysB and sysC are clients, and sysC restarted and missed a few messages, how can it claim the missed messages when it recovers, without sysB receiving duplicates? Q2: SysA publishes messages, but sometimes not in order; sysB consumes them. How do we guarantee sysB gets them in order? Looking forward to hearing your thoughts. Cheers
Hey Gaurav, you explained the concepts behind the services provided by a message/task queue neatly. Your system design series is extremely useful for beginners like me. Keep up the good work. I have a small suggestion regarding the pizza example. In the scenario of a pizza shop node failure, the load balancer may consider other factors, such as the geographic proximity of a particular shop to the client's location. Hence the dynamic reallocation of requests by the load balancer also includes some business logic. This makes the example a little complex for explaining the purpose of a message queue. Please consider finding simpler examples that serve the purpose without the additional complexity in future videos.
Hello, really grateful for your videos, really. But may I give you a little suggestion? I'm a backend dev, new to system design. Sometimes I can't understand what you're trying to explain: you speak, then stop, then speak, then stop. Maybe things would get better if you explained things more fluently? Again, really grateful for your videos. I just really want to understand every video you share with us 😬 Thanks!
Please confirm if my understanding of the queues in the video is correct. When S3 crashed, we wanted to redistribute orders 9 and 11 to the other pizza stores (S0 to S2), but the server had no record of which order numbers were assigned to pizza store S3. Initially we thought of using a load balancer for this purpose (this is clear to me, so moving on to queues). We then improved the solution with message queues. If we use message queues, we need one queue per pizza store (in both the client- and server-side applications), and each pizza store listens to its own queue. Whenever a new pizza order is added to a queue (belonging to S3), the message is received and acknowledged by pizza store S3, which then starts processing it. If S3 successfully completes it, the message is removed from the queue on the server side. If S3 crashes, heartbeats stop arriving and all messages stored in S3's queue are added to the other stores' queues. If S3 throws an error/exception during processing (not enough cheese available for the pizza), the message stays on the server queue (not a valid use case for the current discussion).
Hi Gaurav, I'm glad you discussed RabbitMQ. I came over to explore this topic some more; I want to know how to monitor microservices with RabbitMQ. I understand the application architecture and implementation much better after watching your videos. Can we have an elaborate discussion on this? I'm stuck somewhere in RabbitMQ.
Is the notifier or load balancer a single point of failure in this case? What would happen if the notifier/load balancer itself dies? Does there have to be a redundant notifier/load balancer that pings the original to make sure it's alive and takes over whenever it goes down?
InterviewReady has a more detailed course, for SDE-II and above. This playlist is great for undergraduates and young engineers (up to SDE-1). You can view the website course contents here: interviewready.io/learn/system-design-course
A message queue needs more channels to handle a vast number of messages; if you use Kafka, you need more partitions or topics to handle huge message throughput per second. In stock broking we get more and more messages per second from lots of dealers and customers.
Hi Gaurav, great explanatory video. We used RabbitMQ to build a sort of cron scheduler. The issue previously was: with multiple instances of the same app server, how do you ensure a cron job is executed only once, irrespective of which instance executes it? Do you think we made the right choice?
@@gkcs Yes, it provides acknowledgement and also an option to send message to only one or all subscribers to a queue. We had to choose between this or using redis and implementing some sort of semaphore. Thanks for responding :)
Great explanation!! But I have a doubt. To avoid the problem of the same pizza order being delivered by multiple servers to the same location (6:21), we can just add a server id column to the database. The notifier will then pick only those orders that belong to S3 (which crashed), using its server id, and which are not done. That solves the duplication problem, right?
Hi Gaurav, love all the videos and the easily contextualized examples. When you mention load balancing as a solution for deduplication, it wasn't clear how load balancing solves this issue. You mention there are 5 items in the list (3, 8, 20, 9, 11) and that 9 and 11 need to be rerouted. The notifier identifies that a server is down, and the database is queried to identify which items have not been finished. 1) Instead of using a load balancer, couldn't we just have an additional field in the database for pending? Then only items with pending = 0 would be rerouted. (My guess is you'll say it's not necessary given your load-balancing solution, or that we need load balancing regardless.) 2) I'm not sure how the load balancing prevents items already pending on the still-functioning servers from being duplicated (one pizza order made two times). Could you go into a little more detail on how load balancing prevents that? Thanks again!
To better phrase my question: when balancing the load, how does it know which requests don't need to be redistributed (e.g., the items from servers 0, 1, 2)? Is it querying all the servers first to know where the remaining load needs to be distributed?
5:58 Why is order 3 being sent to another server even though S2 is working fine? And how does the load-balancing mechanism prevent order 3 from being duplicated between two healthy servers? In your examples, are those orders sent to the servers in the first place, before they appear in the DB table? On one hand you're saying the notifier distributes NOT-DONE tasks to servers, but on the other hand you're using arrows to point orders directly at each server. You said load balancing ensures the SAME request goes to the SAME server, so it does not prevent order 3 from going to S2 as well as S1, and that's exactly what we want to avoid.
Once an order is picked up by a server, can't we mark its status as in progress and attach a timer to it, so that orders that are in progress won't be picked up by other servers? Statuses could be: New, In Progress/Pending, and Completed.
Hi Gaurav, I have the queries below; please answer if possible. 1. How do I set up a campaign of 10,000 mail triggers using RabbitMQ? 2. We don't know our server's capacity; how do we find out? 3. Which tools are used for RabbitMQ load and stress testing?
@Gaurav Sen I'm not able to follow how RabbitMQ balances load. Also, as per my understanding, it's a persistent system that can store messages, and subscribers can read from the queue; if a message is not read, it can be read by another server, so a database is not required. Could you please explain in detail?
Notes to self:
* Servers are processing jobs in parallel.
* A server can crash. The jobs running on the crashed server still need to be processed.
* A notifier constantly polls the status of each server; if a server crashes, it takes ALL unfinished jobs (listed in some database) and distributes them to the rest of the servers. Because distribution uses a load balancer (with consistent hashing), duplicate processing will not occur: job_1, which might be processing on server_3 (alive), will land again on server_3, and so on.
* This "notifier with load balancing" is a "Message Queue".
Very good notes 👍
I don't understand why there might be a duplicate?
The notifier will just query the tasks that were handled by the crashed server and distribute them!
@@mahmoudelrabee2456 That is true if we also store the server id. This is explained as the first approach, at 4:52.
Consistent hashing and load balancing are two different things, not the same.
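The notifier flow in the notes above can be sketched as a toy. The routing function here is a plain modulo placeholder rather than real consistent hashing (modulo would actually remap most jobs on a membership change, which is exactly what consistent hashing avoids), so only the "query unfinished jobs and re-route" step is illustrated; the job ids mirror the video's table.

```python
# job_id -> status, standing in for the database table from the notes.
jobs = {
    3: "done", 8: "not done", 9: "not done", 11: "not done", 20: "done",
}
servers = ["s0", "s1", "s2", "s3"]

def route(job_id: int, alive: list) -> str:
    # Placeholder for the balancer: any deterministic job -> server map.
    # Real systems would use consistent hashing here, because plain modulo
    # remaps most jobs whenever the server set changes.
    return alive[job_id % len(alive)]

def on_server_crash(dead: str) -> dict:
    """What the notifier does: query ALL unfinished jobs and re-route
    them through the balancer, skipping finished ones."""
    alive = [s for s in servers if s != dead]
    return {j: route(j, alive) for j, status in jobs.items() if status == "not done"}

assignments = on_server_crash("s3")
assert set(assignments) == {8, 9, 11}        # finished jobs are untouched
assert all(s != "s3" for s in assignments.values())
```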
An alternative approach: instead of assigning individual tasks to servers, you can let the servers poll from the queue. In this case, your message queue is decoupled from the application servers, since the message queue doesn't need to know anything about the servers.
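The pull model described in this comment can be sketched with a thread-safe in-memory queue: workers poll for orders, and the queue holds no state about who the workers are.

```python
import queue
import threading

# Workers pull from a shared queue; the queue knows nothing about them.
orders = queue.Queue()
for order_id in range(1, 6):
    orders.put(order_id)

processed = []
lock = threading.Lock()

def worker():
    while True:
        try:
            order = orders.get_nowait()   # poll; no assignment step needed
        except queue.Empty:
            return
        with lock:
            processed.append(order)       # stand-in for making the pizza
        orders.task_done()

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert sorted(processed) == [1, 2, 3, 4, 5]  # each order handled exactly once
```

Adding a new worker is just starting another thread; no bookkeeping changes are required, which is the decoupling the comment describes.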
Great job explaining everything in a way anyone can understand. You are a natural teacher! Please continue teaching and sharing your knowledge!
Thank you!
This guy reminds me of that friend that tries to explain and wrap up the whole semester for you 30 minutes right before the exam, because you didn't attend any lecture since the beginning.
great comment.... the way he is explaining its really wonderful
Lmao
dipak sonawane r
Exactly :)
those people are saints
Former teacher turned Linux engineer here. Very well done explanation of this concept. Easy to follow with great usage of visuals and ongoing metaphor!
Nowadays, people like the bhaiyas and didis on LinkedIn who give lectures on system design and other blah blah, and don't even know how to code, have more subscribers than this genuine talent. Hats off, bro. 😃
👍
Explained the whole thing in literally the first 40 seconds. Truly amazing work!
I am a Java dev. For learning purposes I am planning to build a queue system, and from your video I got the idea that JMS is something I should learn now. Thanks Gaurav.
EXCELLENT job in all areas: Simplifying the use cases for explaining easily to non-experts, very VERY close examples to real-world instead of using hypothetical cases, starting from simple (in memory) to more complex approach (with database), and avoiding super technical jargon; yet not shying away from technical details (i.e. load balancing, hashing, etc.)
Well done Gaurav!! I enjoyed this video.
well said.
"Now I know everything about how to run a restaurant; I will be starting my own restaurant very soon."
Thanks to the legend Gaurav.
As I am watching the videos, I am able to predict what you are going to say or do next, like a movie. Awesome explanation!! More power to you, chap!!
Brother, I've been researching message queues and I was so confused until I saw your video. Thanks a lot!
What an explanation, Gaurav; loved it. Such a precise explanation is so rare in online tutorials.
Hi Gaurav, This is the one of the best real life example (with pizza shop) showing the need of asynchronous request/response system. Thanks for the great video. Really loved it.
Thanks!
One of the best explanations. Thank you so much
Thank you!
This was the best video so far. The way you explained the entire stuff without saying Message Queue the whole time awakened my grey cells. Thanks a lot!
I like the sheer excitement with which the topic is delivered!! Kudos!!
There are very less resource available for system design in youtube so please complete series . I like the way you teach. Thanks for making videos for us. God bless you
Thanks Sameer :)
"Very less", Very few. Indian English is hilarious though.
Thank you! Your tutorials are great! My college did not have System Design and Analysis class, and your videos helped to learn a lot.
When to use a Topic vs a Queue would have been a nice addition.
Message queues are getting a lot of spotlight in the industry; please keep making videos on this topic. I like your way of teaching. Thanks again.
Great video! This video explains the system design of a pizza shop very well. However, it spends a lot of time explaining load balancers and the notifier and very few minutes are dedicated to discussing message queues. I am more interested in the actual use of message queues in real-world systems like pizza shop here.
The concept has been explained very clearly. It would be great if you came up with a practical implementation using NodeJS with any MQ.
Not trying anything, but a lot of Indians are really smart. Keep it up, bro.
This is the first time I've found the explanation wrong.
1. Once the notifier finds that server 3 is dead, it can give those specific tasks to an "assigner" node, which divides the tasks and shares them among the other servers nearest to that location (this is important).
2. Each shop has to maintain its own task queue, and the "assigner" node would add those tasks to that queue based on priority (but this will be the lowest priority among the shop's own tasks, as the shop needs to complete its own tasks first).
3. There is no use for a centralized queue, unless you provide a feature where, based on the user's location, the pizza shop is assigned automatically as the nearest shop (the user did not select the shop, which is always the case with Domino's at least).
4. A load balancer (if you were talking about an actual load balancer between servers) is of no use for the assignment, as its responsibility is just to divide tasks equally across the multiple servers within the region it is responsible for.
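Point 2 of this critique (each shop's own queue, with reassigned work entering at lower priority than the shop's own orders) could be sketched like this; the priority values, shop names, and order ids are all invented for illustration.

```python
import heapq

# Each shop keeps its own priority queue; reassigned tasks from a dead
# shop enter at LOWER priority than the shop's own orders.
OWN, REASSIGNED = 0, 1   # lower number = served first

shop_queues = {"p1": [], "p2": []}
seq = 0  # insertion counter: tie-breaker so equal priorities stay FIFO

def enqueue(shop, task, priority):
    global seq
    heapq.heappush(shop_queues[shop], (priority, seq, task))
    seq += 1

enqueue("p1", "order_1", OWN)
enqueue("p1", "order_2", OWN)
# p3 died; the assigner pushes its unfinished order to the nearest shop,
# behind that shop's own work:
enqueue("p1", "order_9", REASSIGNED)

served = [heapq.heappop(shop_queues["p1"])[2] for _ in range(3)]
assert served == ["order_1", "order_2", "order_9"]
```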
Great, clear presentation!
One tip: use higher-contrast ink and board.
That means more lighting to make the board whiter,
and/or dark ink like black or brown, so we can see it more clearly.
Great content. Subscribed. Keep up the good work!
Are we just going to ignore the fact that 9 / 11 were pointing at the same server and THAT was the server that crashed! :P Your lectures are fun man!
Illuminati confirmed! Well done.
nahh bro!
no way u noticed this lol
I didn't even think of this 😂
Amazing explanation Gaurav. The best way to explain any concept is with practical examples, and you did the same.
Your channel is amazing. Your explanations are some of the best I’ve ever heard/seen. Good job man.
Hey Gaurav, appreciate your awesome work. My point is, in your example where a pizza shop is down: there would be no server on the shop side, and a shop would have only a client id. The servers must be at a remote location, and the order objects must carry a client id. There must be a process on the server side that checks the heartbeat of each client by its client id (clients spawn a heartbeat thread to the server, and the server checks heartbeats from clients). If that client goes down, the server does the rest of what you explained your notifier component doing, assigning the orders (to the client nearest to the one that is down).
Thanks Vishal! Have a look at the full playlist. I do speak of it as you mentioned
Hi Gaurav, loved your explanation. But I think you've mixed up related but different concepts here.
You are actually creating a Storm topology with input from a Kafka queue. The order queue is a message queue (say Kafka). The Storm topology is polling messages from the order queue, assigning the tasks to one of the worker nodes (pizza shops), internally keeping track of task status using a task queue, and using heartbeats (via ZooKeeper) to check whether the nodes (pizza shops) are alive; if not, it assigns the unfinished tasks on that worker node to a different worker node.
Yes. What about the reallocation of tasks though?
You explained it so well, thank you
As a European developer who has seen too many low-quality programming disasters from projects outsourced to developers in low-wage Asian countries, I must say it's refreshing to see an Indian engineer who has actually studied this stuff and knows what he's talking about, using the correct terminology... I'm subscribing; you're creating great content.
Low-quality programmers are everywhere not just in Asia.
The only youtuber that keeps his video at 9:59
Hahaha
HAHAHAHAHAHAHA
haha :?
back in the golden days of youtube everybody made videos to make videos, not to make money
Gaurav, this is a really nice and knowledgeable tutorial; in a very short time you have explained this topic very clearly. Can you make a practical version of it (how to implement RabbitMQ)?
Everyday you learn something new!
I always thought PS3 stood for PlayStation 3, but here I learn that it stands for PizzaShop 3.
Awesome 👌👌 Keep it up. Thank you 🙏
Thank you!
Dude, you should get a Ph.D. for this... made it easy to understand. Thanks for sharing.
Gaurav, fantastic job explaining the key tenets of high-level system design; how about bringing in some tooling and a technology stack to support each of the architectures?
I've learnt a heck of a lot about how pizza shops work
Your charisma and your way of teaching have made me a subscriber :)
Nice narration..!! My pizza just got delivered in under 30 min.
It's an amazing talk. Here's a follow-up question: what if the message queue itself goes down, becoming a single point of failure? Should the message queue have a hot backup, such as an active-passive setup?
That is where the persistence of data comes into play. In order to keep reliability of the messages high, most message queues offer the ability to persist all messages to disk until they have been received and completed by the consumer(s). Even if the applications or the message queue itself happens to crash, the messages are safe and will be accessible to consumers as soon as the system is operational.
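The persistence described in that reply can be sketched as an append-only log with acknowledgements. The log format and class names here are invented for illustration: publish appends a record, ack appends another, and recovery replays the log to find the unacknowledged backlog.

```python
import json
import os
import tempfile

class DurableQueue:
    """Toy persistent queue: every event is appended to disk, so a crash
    between publish and ack loses nothing."""

    def __init__(self, path):
        self.path = path

    def publish(self, msg_id, body):
        with open(self.path, "a") as f:
            f.write(json.dumps({"op": "pub", "id": msg_id, "body": body}) + "\n")

    def ack(self, msg_id):
        with open(self.path, "a") as f:
            f.write(json.dumps({"op": "ack", "id": msg_id}) + "\n")

    def pending(self):
        """Replay the log: everything published but never acked."""
        msgs, acked = {}, set()
        if os.path.exists(self.path):
            with open(self.path) as f:
                for line in f:
                    rec = json.loads(line)
                    if rec["op"] == "pub":
                        msgs[rec["id"]] = rec["body"]
                    else:
                        acked.add(rec["id"])
        return {i: b for i, b in msgs.items() if i not in acked}

path = os.path.join(tempfile.mkdtemp(), "queue.log")
q = DurableQueue(path)
q.publish(1, "margherita")
q.publish(2, "pepperoni")
q.ack(1)
# Even if the process dies here, a fresh instance recovers the backlog:
assert DurableQueue(path).pending() == {2: "pepperoni"}
```

Real brokers batch and fsync writes and compact their logs, but the recovery principle is the same.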
You don't necessarily need to assign a task to a specific server. A server that is ready can go and pull the pizza order from the queue and process it.
It is much faster
But what if a task was pulled and in progress, and then the server crashes? How do we make sure that this task is pulled again by an up-and-running server?
@@kanishkamakhija9046 The notifier (which does the heartbeat check) could detect that a server has crashed and mark the corresponding tasks as unassigned.
Alternatively, a timeout could be added to each assigned task; if an in-process task times out, we mark it unassigned again and pick a live server to handle it.
I also saw another approach where you can maintain a separate queue that holds in-progress tasks. If the server crashes, any other server can pick up tasks from the corresponding queue.
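The timeout idea from the replies above, together with the separate in-progress bookkeeping, can be sketched as a visibility timeout. The timeout value, class, and method names here are invented; a pulled task becomes "in progress" with a deadline, and if no ack arrives before the deadline, a sweep makes it pullable again.

```python
class TaskQueue:
    """Toy queue with a visibility timeout for crash recovery."""

    def __init__(self, visibility_timeout=30.0):
        self.ready = []            # tasks waiting to be pulled
        self.in_progress = {}      # task -> ack deadline
        self.timeout = visibility_timeout

    def push(self, task):
        self.ready.append(task)

    def pull(self, now):
        if not self.ready:
            return None
        task = self.ready.pop(0)
        self.in_progress[task] = now + self.timeout
        return task

    def ack(self, task):
        self.in_progress.pop(task, None)

    def sweep(self, now):
        """Requeue tasks whose worker went silent (e.g. server crashed)."""
        expired = [t for t, dl in self.in_progress.items() if dl <= now]
        for t in expired:
            del self.in_progress[t]
            self.ready.append(t)

q = TaskQueue(visibility_timeout=30)
q.push("order_9")
assert q.pull(now=0) == "order_9"     # a server takes the task...
q.sweep(now=31)                        # ...but never acks: simulate a crash
assert q.pull(now=31) == "order_9"    # another server can now take it
```

Passing `now` explicitly keeps the sketch deterministic; a real implementation would read the clock itself.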
4:43 is my most favourite part of the video
Very good job. I had so much confusion and you just cleared it up.
Nice explanation. You are actually doing excellent work. Thank you.
@Gaurav Sen Very informative video! I request you to make a video or videos about every component used in system design. Different problems employ different components; if we first study all the different components and their properties, then at least we can keep thinking in the right direction. Thanks for your video series!
Haha, all this talk of food is giving me pizza cravings!
4:42 hmmmmmm interesting choice of numbers and words
Again, thanks so much for taking the time to make this video. I've learned a lot from you. Keep it up!
😁
Dear Gaurav Sir, please clarify 2 things:
(1) can you give a practical implementation of RabbitMQ or JMS or any one of the Message Queues?
(2) Also, please explain more on Load Balancer with consistent hashing and notifier, with a practical example please?
Please kindly reply to this.
ruclips.net/p/PLMCXHnjXnTnvo6alSjVkgxV-VH6EPyvoX
Hi Gaurav, can we have the server id as one of the columns in the table and use it while querying the DB, so that it picks only the undone orders from that (dead) server, and the rest can be taken care of by the load balancer distributing the load?
We could, but wouldn't that complicate things? For example: A job is initially assigned to S0, then reassigned to S1 after S0 crashes. Now S1 crashes. The way to keep track of S1's jobs would be to update the server id on the DB. It's an extra update operation.
Now let's say a new requirement comes in for evenly distributed load. In this case, when a new server comes up, it will take jobs from the others. The DB will again have to be updated.
Consistent Hashing seems like a cleaner solution :)
Nice explanation Gaurav, I was stuck on this only.
Yes, maybe it requires querying the DB more often to update the server ids. It's better to follow the load balancing principle, which guarantees uniqueness.
Thanks. :)
But if we are not maintaining a server id along with each order, then whenever a server dies, we have to recompute the nearest server for all the orders. If we have server ids along with them, we would only have to reassign the orders that were initially handled by the server that died.
If a new server is added, we have to check all the entries anyway, so that's fine. But when a server dies, wouldn't maintaining server ids reduce our work? Anyway, great video! Thanks.
@@shiwanggupta8608 I agree. Also, let's say you are a very large chain and have thousands of orders in process at any given time; going through all of them seems a waste of time when you can differentiate by just the server id.
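The consistent-hashing behaviour the video leans on, and that Gaurav prefers over the server-id column in this thread, can be sketched roughly like this (simplified, no virtual nodes; all names are made up): each order hashes to a point on a ring and is handled by the next server clockwise, so removing a server only moves that server's orders.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hash ring: an order always maps to the first
    server at or after its hash position (wrapping around)."""

    def __init__(self, servers):
        self.ring = sorted((self._hash(s), s) for s in servers)

    @staticmethod
    def _hash(key):
        # md5 just to get a stable, well-spread integer for any key
        return int(hashlib.md5(str(key).encode()).hexdigest(), 16)

    def server_for(self, order_id):
        h = self._hash(order_id)
        keys = [k for k, _ in self.ring]
        idx = bisect.bisect(keys, h) % len(self.ring)
        return self.ring[idx][1]

    def remove(self, server):
        self.ring = [(k, s) for k, s in self.ring if s != server]
```

The key property: after removing a crashed server, every order that was NOT on it still maps to the same server as before, so no duplicate processing occurs.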
If you could cover the concepts of real-time operating systems, including tasks, task states, message queues, mailboxes, etc., it would definitely be helpful for me and for those who are interested in embedded computing and RTOS.
great explanation!
I'm self-taught, and system design is not an area that I seem to be learning from documentation (and that makes sense, since it isn't about the syntax or features of most technologies). Thank you for providing this material! You're decent to good at teaching, but mostly you're providing resources that are difficult to acquire outside a classroom or without a mentor.
Hey Gaurav, I am curious. If S3 (server 3) went down, couldn't orders 9 and 11 be identified as the ones that need rerouting if the database table had a column specifying which server each order was routed to? That way we could query the DB for requests that were lost by losing S3, instead of querying for all requests that have not finished yet.
IMHO, the main reason is that we don't need to store the server-to-task mapping permanently; the info is not useful compared to the task state/description etc. In short, it's unrelated to the task itself.
If we really want to identify server tasks, we could introduce another DB to store a map from server to in progress tasks, but it's an overkill and more complicated than using a timeout on each task in the queue.
@@bozhang8910 Don't you want to see a report of which servers fail frequently, and track who's serving which item? That can help identify the poor-performing outlets. In other use cases, where the customer doesn't need to know which server handled the request, your justification is good, but in the case of pizza delivery I don't think so.
@@bozhang8910 You don't need another database, just another column. And there are other reasons to do this. Consider three pizza shops, p1, p2, and p3 where p1 and p2 are very close to each other and p3 is on another continent. If p1 goes down, sending its jobs to p3 would be a disaster, so if the table has a column for the server and a column for its nearest neighbor, the queue can dispatch in a way that doesn't involve transoceanic pizza delivery. A real life example would be CDNs for which the entire point is to find a replica of an asset which is closest to the client for performance reasons. Others would be geographic (like maps) or internationalized/localized services where the location can be inferred from the language.
Nicely explained
Excellent and light tutorial on messaging and load balancing
so smart the way he is explaining. keep it up man
If there is a detailed explanation for each system design, it will be helpful for everyone, because I think I can learn system design from you.
thank you for your extraordinary work
Thanks David :)
Very good analogies and articulation of topics. Great work.
Thanks for the video. How does RabbitMQ handle messages in a Kubernetes cluster? Please create a video on this topic.
In a typical system design interview: "Draw boxes, just draw boxes.." :D
Thanks for the valuable video! It really helps me to understand what the message queue is :)
thanks for the informative presentation
Glad it was helpful!
Short.. To the point.. Brilliant 👍
Banks: that's a fairly common use case that uses messaging services/queues extensively.
For batch processing of transactions, after the card scheme authorises your tx and the request is sent to the issuing bank, the entire XML message is stacked up on a messaging queue that is constantly listened to by the issuing bank's messaging service.
Nicely and simply explained
Explained the concept really well , easy to understand
Great video! One minor tweak I'd make: rather than pushing tasks to the servers, why not have the servers pull tasks from the queue? No need for a load balancer, and you can spin up new server instances without any bookkeeping or overhead.
Thanks Kyle!
Having subscribers pull from the queue would make it (more or less) stateless, which is great. However, when you want your subscribers to be notified of a new task immediately, I'd have the queue push messages.
I think a hybrid model is best, where most servers poll for messages while some expect them to be pushed to them.
@@gkcs That makes sense
Very well explained 👍 If possible, please make a video or two on Kafka queues (involving producers and consumers). It would serve as an example for a few of your previous videos on MQs, load balancing, and event-driven systems, and would help put them all together.
I have this in my task list. It'll take time to get to it though :)
Cool!!!!
Hello Gaurav, nice and informative video. I have couple of questions:
Q1: If sysA publishes a message to a queue, and sysB and sysC are clients: if sysC restarted and missed a few messages, how can it claim the missed messages when it recovers, without sysB receiving duplicate messages?
Q2: SysA publishes messages but sometimes not in order.
SysB consumes messages.
How do we guarantee sysB gets them in order?
looking forward to hearing your thoughts.
Cheers
Hey Gaurav, you explained the concepts behind services provided by a message/task queue neatly. Your system design series is extremely useful for beginners like me. Keep up the good work.
I have a small suggestion regarding the pizza example. In the scenario of a pizza shop node failure, the load balancer may consider other factors, such as the geographic proximity of a particular shop to the client's location. Hence the dynamic reallocation of requests by the load balancer also includes some business logic. This makes the example a little complex for explaining the purpose of a message queue. Please consider finding simpler examples that serve the purpose without additional complexity in future videos.
That's a good point.
I took up the example since I wanted to show a simple way of going about designing a system. 😁
Very clear explanation. Thank you
Bro, keep bringing up more videos like this. We are all such a big fan of yours. ALL the best! 👍
PS: Amazon has SQS ;)
Please make one video on Kafka. Appreciate your work.
Check out the Learning Journal channel for a Kafka video.
Hello, really grateful for your videos, really. But may I give you a little suggestion? I'm a backend dev, and new to system design. Sometimes I can't understand what you are trying to explain. You speak, then you stop, then speak, then stop. I think maybe things would get better if you tried to explain things more fluently? Again, really grateful for your videos. I just really want to understand every video you share with us 😬 Thanks!
Thanks! I'll try and keep that in mind :)
Please confirm if my understanding of the video's treatment of queues is correct:
When S3 crashed, we wanted to redistribute orders 9 and 11 to the other pizza stores (S0 to S2), but the server did not have a record of which order numbers were assigned to pizza store S3.
Initially we thought of using 'Load Balancer' for this purpose.
(This is clear to me, so moving to queue now)
After that, we improved our solution by using message queues. If we use message queues, then we need one queue for each pizza store (on both the client- and server-side applications). Each pizza store listens to its own message queue. Whenever a new pizza is added to a queue (say, the one belonging to S3), the message is received and acknowledged by pizza store S3, and then S3 starts processing it. If S3 successfully completes it, the message is removed from the queue on the server side.
If S3 crashes, heartbeats are no longer received, and all messages stored in the queue belonging to store S3 will be added to the other stores' queues.
If S3 throws an error/exception during processing (e.g. not enough cheese available for the pizza), the message stays on the server queue. [Not a valid use case for the current discussion.]
Hi Gaurav,
I'm glad you discussed RabbitMQ. I just came over to explore this topic more... I want to know more about how to monitor microservices with RabbitMQ. I was able to understand the application architecture and implementation better after watching your videos. Can we have a more elaborate discussion on this? I'm stuck somewhere in RabbitMQ.
Really good video. I really enjoyed watching it.
Is the notifier or load balancer a single point of failure in this case? What would happen if the notifier/load balancer itself dies? Does there have to be a redundant notifier/load balancer that pings the original one to make sure it's alive and takes over whenever it goes down?
Hi Gaurav, great content. Does this playlist contain all the topics in the course you sell on InterviewBit? @
InterviewReady has a more detailed course, for SDE-II and above. This playlist is great for undergraduates and young engineers (upto SDE-1).
You can view the website course contents here: interviewready.io/learn/system-design-course
4:42 I'd never thought I would hear the number 9, 11, and the word crash in the same sentence
Really nice explanation, impressive.
A message queue needs more channels to process large volumes of messages. If you use Kafka, you need more partitions or topics to handle a huge rate of message processing per second. In stock broking we get a vast number of messages per second from lots of dealers and customers.
Amazing explanation! Thanks!
You're welcome!
Hi Gaurav, great explanatory video.
We used rabbitMQ to build sort of a cron scheduler.
The issue previously was: with multiple instances of the same app server, how do you ensure a cron job is executed only once, irrespective of which instance executes it? Do you think we made the right choice?
I think it's a good choice. Does RabbitMQ provide an exactly-once delivery feature?
@@gkcs Yes, it provides acknowledgement and also an option to send message to only one or all subscribers to a queue.
We had to choose between this or using redis and implementing some sort of semaphore.
Thanks for responding :)
Great Explanation!!
But I have a doubt..
To avoid the problem of the same pizza order being delivered by multiple servers to the same location (6:21), we can just add a server id column to the database. Then the notifier will pick only those orders that belong to S3 (which crashed), using its server id, and which are not done. That would solve the duplication problem, right?
Nicely done Gaurav!! Once again... :)
Hi Gaurav,
Love all the videos and the easily contextualized examples. When you mention load balancing as a solution for deduplication, it wasn't clear how load balancing actually solves this issue. You mention that there are 5 items in the list (3, 8, 20, 9, 11) and that 9 and 11 need to be rerouted. The notifier identifies that a server is down, and the database is queried to identify which items have not been finished.
1) Instead of using a load balancer, couldn't we just have an additional field in the database for pending? Then only items with pending = 0 would be rerouted. (My guess is you'll say that it's not necessary given your load balancing solution, or that we'll need load balancing regardless.)
2) I'm not sure how the load balancing prevents items that are already pending on the still-functioning servers from being duplicated (one pizza order made two times). Could you go into a little more detail on how load balancing prevents that?
Thanks again!
To better phrase my question: when balancing the load, how does it know which requests don't need to be redistributed (e.g., the items from servers 0, 1, 2)? Is it querying all the servers first to know where the remaining load needs to be distributed?
"Filling up the coke can" 2:20 haha. Love your videos bro.
his expressions are hilarious !
Good Teaching!!! Keep it up with great work
We would like to have videos on object-oriented design patterns too like observer pattern etc.
Please keep making videos. :D
OOP is overrated and people are finally moving towards functional programming with NodeJS and React as some common examples
5:58 Why is order 3 being sent to another server even though S2 is working fine? And how does the load balancing mechanism prevent order 3 from being duplicated between two healthy servers? In your examples, are those orders sent to the servers in the first place before they appear in the DB table? On one hand you're saying the notifier distributes NOT-DONE tasks to servers, but on the other hand you're using arrows to point orders directly at each server. You said load balancing ensures the SAME request goes to the SAME server, but that does not prevent order 3 from going to S2 as well as S1, and that's exactly what we want to avoid.
Please create a video about HermesJMS. This video was really helpful for getting the basics of MQ.
Once an order is picked up by a server, can't we mark its status as "in progress" and attach a timer to it, so that orders that are in progress won't be picked up by any other servers?
Status can be: New, In Progress/Pending, and Completed.
Gaurav! Its good conceptually. Thanks
Hi Gaurav - I have the queries below; please answer if possible:
1. How do I set up a campaign that triggers 10,000 mails using RabbitMQ?
2. We don't know our server's capacity; how do we find out?
3. Which tools are used for RabbitMQ load and stress testing?
@Gaurav Sen - I'm not able to follow how RabbitMQ balances load. Also, as per my understanding, it's a persistent system that can store messages, and subscribers read from the queue; if a message is not read by one server, it can be read by another, so a database is not required. Could you please explain in detail?