Few things that felt like hand waving:
1. How is the driver location updated in the data structure? Uber has a few million drivers (say 3M) and if they send location every 3 seconds, there's 1M updates coming in every second on average.
2. It seems disparate events are put on Kafka (ride request, payment, driver location), so instead of showing one Kafka box, perhaps having more than one would've helped. Currently the diagram looks like a spider web, and that's mostly because all the services on the right subscribe to one Kafka box.
3. It's not clear how the global indices are implemented in the trip DB. Is the data duplicated?
4. What's in Redis that is used for pricing calculation?
5. Can we hope that good content will be supported by a diagram not drawn by a 3-year-old? Why, there are probably a dozen tools that can be used to draw boxes and stuff.
An actual system design interview is close to 45 minutes, so three times the duration of this video. All systems consist of load balancers, event bus (Kafka), distributed cache (Redis), so any candidate can draw those boxes. The details are what make for a more real interview experience.
1. As explained, the driver locations are held in a sharded data store using H3. This means we can efficiently query for only data in a certain area, and it means the writes from our 3M drivers are distributed.
2. Yes, there is one Kafka cluster with multiple topics; each service can subscribe to whatever topics it needs.
3. The implementation of global indices would depend on the database used (we try not to limit ourselves to one platform in these videos), but a commonly used approach is to create a secondary table with the indexed value as the shard key and another column pointing back to the primary table based on the primary key.
4. The pricing data is ephemeral in nature, so we're storing the results from our streaming pipeline in Redis for efficiency and simplicity.
5. Thanks.
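For anyone who wants to see point 3 in code, here is a minimal sketch of the secondary-table approach. It is not the video's implementation: the table layout, the names `TripStore` and `shard_for`, the field names, and the shard count are all assumptions made for illustration, and plain dicts stand in for database nodes.

```python
from collections import defaultdict
import hashlib

NUM_SHARDS = 4  # assumed shard count, purely illustrative


def shard_for(key: str) -> int:
    """Map a shard key to a shard number with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


class TripStore:
    """Primary table sharded by trip_id plus a global index sharded by rider_id."""

    def __init__(self) -> None:
        # Each list entry is a dict standing in for one database node.
        self.primary = [dict() for _ in range(NUM_SHARDS)]
        self.by_rider = [defaultdict(set) for _ in range(NUM_SHARDS)]

    def insert_trip(self, trip_id: str, rider_id: str, fare_cents: int) -> None:
        row = {"trip_id": trip_id, "rider_id": rider_id, "fare_cents": fare_cents}
        self.primary[shard_for(trip_id)][trip_id] = row
        # Index row: the indexed value (rider_id) is the shard key,
        # and the stored trip_id points back to the primary row.
        self.by_rider[shard_for(rider_id)][rider_id].add(trip_id)

    def trips_for_rider(self, rider_id: str) -> list[dict]:
        # One index shard yields the trip ids; each id resolves to one primary shard.
        trip_ids = self.by_rider[shard_for(rider_id)].get(rider_id, set())
        return [self.primary[shard_for(t)][t] for t in sorted(trip_ids)]
```

The trade-off this makes visible: every insert becomes two writes (possibly on two different shards), in exchange for a rider lookup that never has to scan every shard.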
These types of videos imo are the best as these can get one out of “tutorial hell”. Although challenging - I think it forces me out of my comfort zone and actually makes me think on how to program and shows me gaps in my knowledge. New subscriber - I would like to see more of your videos!
One critical part that is missing in this SD interview is a discussion of the different trade-offs, e.g. in the matching service: why use Uber's H3? What about quadtree, geohash and Google's S2? What are the pros and cons of these different methods?
Great video. The one thing I didn't hear him mention is what type of Database or databases he would use (other than Redis for caching). What do you think the driver, rider, trips database should be and why?
Hey @interviewpen, great video, thanks! One question I have is why did you go for the server side API app with load balancing rather than an API gateway to access Rides (and maybe other services) directly? I know load balancers and gateways are not mutually exclusive, just interested why you went with one app routing all the client requests? Another question is about payments: will the user get payment confirmation on the client side as well? You show in the little diagram that the user will receive the confirmation after the webhook sends the message to Kafka, so the server side of the app will be subscribed to the payments topic, filtering by user_id (for example), but how would the client side receive the confirmation afterwards?
Thanks for watching! Usually when people talk about API gateways, it's a nebulous term that probably means a load balancer. If we want to scale our API, we need some way to route our clients to one of several nodes, so a load balancer is critical. About payment confirmations, this is something we could set up by sending a message to the client over WebSockets after the server receives the message. Hope that helps!
tbh I am very impressed by the content the author produced. Despite his young age, he has a lot of project knowledge, and that's great! I found a lot of interesting things. Thanks a lot!
This is a fantastic video - lots of content in a short space of time delivered clearly - thanks! One question I had was regarding storing of driver locations and loading them into H3 - how would that be accomplished using the DB design here? Or would the H3 index constantly be updated separately? Thanks.
We showed the two databases separately to show that they can be decoupled, but they certainly could be done in the same one. However, the index would still need to be updated separately either way. Thanks!
Major pro of WebSockets is that we don't have to keep making network round-trips for polling when there's no new data. This can decrease latency and load on the API servers. The con is that it's a bit harder implementation-wise--we have to do some special stuff on the load balancer, handle dropped connections, etc. Hope that helps!
Hey, first of all great video, great content, exceptional delivery - seriously wow. Around minute 6:20 of the video you correctly say that the DB will have to be able to scale horizontally as it is expected to be very large with high traffic coming in. You said, the easiest way to do it is by sharding, which left me wondering, did you consider a noSQL option? Clearly it is easier for horizontal scaling. If so, why did you decide to stay with the relational approach?
I don’t think sharding implies a relational database, in fact you’re absolutely right that NoSQL databases are far easier to shard. Good thoughts and thanks for watching!
I am not even a software engineer, but this was interesting to watch. The seemingly simple applications that we use in day to day have complicated backends. Hats off to the engineers.
The main challenge for Uber or Lyft is the massive number of updates that they have to do in real time and also persist. I don't see this tackled here? For this, Uber uses a variant of a quadtree.
Yep, this large influx of updates is why we introduced a sharded database, and the realtime location updates you're referring to are tackled by the rides database using H3. Of course in practice there's a ton of optimizations to be made! Thanks for watching
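As a rough illustration of how cell-based sharding spreads the write load, here is a minimal sketch. It does not use the real H3 library: `cell_id` is a crude square-grid stand-in for an H3 cell index, and `NUM_SHARDS`, `CELL_DEG`, and the function names are assumptions chosen for the example.

```python
import math
import zlib
from collections import Counter

NUM_SHARDS = 64   # assumed shard count
CELL_DEG = 0.01   # assumed cell size in degrees; a crude stand-in for an H3 resolution


def cell_id(lat: float, lng: float) -> str:
    """Bucket a coordinate into a square grid cell (stand-in for an H3 cell index)."""
    return f"{math.floor(lat / CELL_DEG)}:{math.floor(lng / CELL_DEG)}"


def shard_for_cell(cell: str) -> int:
    """Stable hash of the cell id, so every writer agrees on the owning shard."""
    return zlib.crc32(cell.encode("utf-8")) % NUM_SHARDS


def route_updates(updates):
    """Count how many location writes each shard would receive."""
    load = Counter()
    for lat, lng in updates:
        load[shard_for_cell(cell_id(lat, lng))] += 1
    return load
```

Drivers in the same cell always land on the same shard, so a "who is near this rider" query touches only the shards that own the nearby cells, while writes from drivers in different areas fan out across the cluster. A dense downtown cell can still become a hot shard, which is one of the practical optimizations a real system would have to handle.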
It looks like a routing component is missing. Pricing clearly depends on the route length (the mentioned surge just scales this price up) and, sometimes, on traffic jams. You can't calculate ETA without a route. And most client UIs draw the route on the map.
Routing algorithms are hard, and we abstracted a lot of this logic away. The ETA service will of course calculate a route to get the ETA, the pricing algorithm must take a route as an input, and the route must be sent to the client to display it on a map. Thanks for watching!
This is great, now how do I explain requirements to someone who has no idea about uber/lyft or applications in general? You have good intuition here used to synthesize requirements and identify possible issues immediately. That intuition comes from knowing what you're trying to build, for new product development where the final product is not clear, how could we handle that?
You're right, we should've gone through the requirements first before diving into the solution. Our newer videos are much better about this! Thanks for watching.
It's tough to cover every part of systems like this in detail in a video (Uber has spent years building their systems), but we try to give you the core foundations you need to get started and to approach similar systems. Thanks for watching!
I think this is kind of an intern/junior level software design, which is good, but certainly lacking in most details that these systems require. For example, when you talk about sharding the database, it's not so easy to say just shard it. Some of the problems with sharding a db using such a key in this way would be referencing and constraint enforcement as well as update request stream consumers and their ordering. Also, your "rider and driver APIs" discussion is...flawed. Most of the system design for those APIs should be about actual data modeling and api design. You also can't just say to use "websockets" because it has some very bad tradeoffs. For example, sockets are not stateless and require a consistent connection, which you will definitely not have on a driver's mobile phone. Another problem is data consistency. If your data producer sends a message to the socket but the receiver was in a temporarily disconnected state, what happens to that request? Also, since it's websockets, who is doing the broadcast? And since cars can "come online" at any point, does the requester keep resending the request? If so, then isn't polling necessary anyways? If the concern is concurrent request scale, it's much easier to shard the api using the same H3 algorithm, isn't it? There are a lot more problems with this design, but for a student and assuming an ideal world, this is a good first draft. I would say though that it would be much better to assume those ideal conditions (perfect and constant connectivity, some scale target, some location target, etc), and then to better simplify your solution. I feel like a big problem with this presentation is the use of buzzwords without knowing how they work or what they mean.
I'd rather you use fewer techy sounding words and be more accurate with your solution because your interviewer will certainly see right through these buzzwords and you will get drowned with real world problems that you can't explain (like the Kafka network limitations that make this just not possible as described). Also, for this channel in general: this is not a good interview question for system design. Usually, system design interviews are more for mid level engineers so doing it at this scope is not good enough. When I interview engineers for system design, it's okay to start with a broad system question, but it's much more revealing to dive into one specific area because the complexity of a problem is when you have to design in the details. I think even just the payment processing is sufficiently complex to deep dive, let alone trying to design every system in Uber. edit: I'm a senior software engineer at a FAANG and have conducted 100+ mid level engineer interviews.
This feels like a very genuine comment. If I may, can you please provide your insights on how one should prepare for mid level interviews while avoiding all the things you mentioned above, like using fancy words etc.? Thanks in advance
@@jagrit07 Thank you for the question. Unfortunately, in my experience, the most effective way of "learning system design" is to do it. It's easy to draw charts and read tutorials and watch yt videos about systems; but, doing a complex system from scratch by hand will teach you at least the surface level of how your tools work. Whether you want to target being a senior at cloud infrastructure or data warehousing and reporting or payment processing or even graphics rendering or whatever it is, there is a very specific set of industry grade tools that real world engineers at senior levels are operating on. And it's not just about touching and operating on the actual tools and the programming languages, it's also about designing the system to solve a problem. And the tricky part is that I always ask a real-world problem and not a "leetcode" system design problem. A real world problem is extremely messy and often runs into organizational issues. And you really can't understand or learn about those until you experience it. So my advice: focus on your junior level job and gain experience, offer solutions and offer to implement others' solutions and offer to fix live production problems. It will become obvious to you when you've reached senior level and you won't even have to worry about how to learn system design at a senior level. On a more personal note: I really admire a lot of engineers' passion for the industry; however, I often see them become frustrated very quickly and try to "grind" too hard by overworking and missing out on their young lives. My experience has been that solving problems and being responsible for operating complex systems is a burden that requires a lot of experience to be comfortable with. And experience takes time. So I would advise young people: don't be so hard on yourselves, be patient, you're going to be just fine.
@@chihchang1139 Thank you so much for the detailed answer. I have 3 yrs of experience and I very much agree with what you said. I asked the above question after reading books/notes, watching videos, etc., and then revising them too. But I still felt I was missing something, and I believe that is the practical gap. Thanks once again. I do have a follow-up along the same lines, lol. I didn't want to ask because you already put a lot of effort into explaining, but as they say, if you never ask you never know, so I will ask. When it comes to practical knowledge, sometimes the project you work on is basic CRUD in terms of functionality. And then you read system design problems like design Uber, design a chat app, design Netflix, design Twitter, design Amazon. I specifically mentioned these 5 because each is unique in some aspect: Uber finding the cabs, chat doing the bidirectional thing using WebSockets, Netflix onboarding tons of videos and streaming them using adaptive bitrate, Twitter generating the news feed and handling followers/following, and Amazon having the whole selecting and buying flow. All these systems might be 60-70% the same, where they leverage common tools like Kafka, DBs, etc., but the USP of each is different. The piece of the puzzle I am missing is how one would get practical insight into systems like the above while working day to day on CRUD apps. I would have the theoretical knowledge, but on the practical side it would mean building projects like the above alongside a job, DSA, and, more importantly, life. If we say okay, get a new job, then it's the same loop of HLD, LLD, etc. That is where the need for watching videos, reading, and drawing charts comes from. I know my problem/question might be very boring and you don't have to answer, but yeah. Still, thanks in advance.
@@jagrit07 Firstly, I want to apologize for assuming your level and years of experience. Your concern is generally right on the money: how can you gain the experience needed if you are not given the opportunity to work on complex systems? And unfortunately, companies generally do not care about your personal journey, they can only provide the level of complexity required for their business. Good thing is that pretty much every business is complex. Even working on just CRUD APIs, it gives you insight into many fundamental systems design tools like networking, data modeling and storage, caching, microservices, frontend/bff/backend patterns, REST API modeling, permissions, SDLC, monitoring, reporting, etc. And it's also about learning about the details of the trade offs of your tools and when and how to apply them to each problem. Sometimes you're already good at working with your tools to solve complex problems or sometimes you're not familiar enough with the tools to know just how powerful they really are. A great CRUD API practitioner can solve almost every problem in system design. But yes, you won't have experience with real time data clusters, but that's okay. You're not going to master every domain. You're not going to also know the systems of complex machine learning clusters. You're not going to learn the systems of satellite communicators or systems of military encryption. The "CRUD API" systems domain is large enough that it needs senior engineers. Be good at what you do because when I interview mid level or senior engineers, I'm diving deep into your own solution to try and break it apart. I'm not looking to test you on things you just are never going to have touched unless it's essential to the job, in which case, you're just not a great fit. Work towards the job domain that you want to be senior in. If you need more complexity than just CRUD, then seek out those jobs, but stay within your level so you can learn.
@@chihchang1139 Wow, thank you so much. I will take your inputs and focus. Will come back here in 2-3 years to let you know how it went, lol. Thanks once again, you are awesome!
This is amazing! As a junior developer, this video is inspiring me with various concepts of highly efficient real-time architecture and showing me an ultimate goal to aim for! Thanks for sharing this great insight!
I think more detail could have been given on why you used Kafka. I understand how it makes sense, but maybe walking through a couple data flows would have made it more clear.
During the process of notifying the driver to either accept or deny, why not send the request to, let's say, the 5-10 drivers closest to the rider? The first to accept gets it, all the responses are sent to a queue, and once a driver has been assigned, the rest get rejected for that trip. This would improve user experience, as users would spend less time searching for a driver instead of waiting for one driver's response and then switching to the next, which can take some time.
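The "offer to several drivers, first acceptance wins" idea can be sketched as below. This is only an in-process illustration of the commenter's proposal, not how the video or any real dispatcher does it: the names `nearest_drivers` and `TripOffer` are made up, straight-line distance stands in for real proximity, and a lock stands in for what would more likely be a conditional write in a database.

```python
import math
import threading


def nearest_drivers(rider, drivers, k=5):
    """Return the ids of the k drivers closest to the rider (straight-line distance)."""
    ranked = sorted(drivers, key=lambda d: math.dist(rider, drivers[d]))
    return ranked[:k]


class TripOffer:
    """One trip offered to several drivers; only the first acceptance succeeds."""

    def __init__(self, trip_id, offered_to):
        self.trip_id = trip_id
        self.offered_to = set(offered_to)
        self.assigned_driver = None
        self._lock = threading.Lock()

    def accept(self, driver_id):
        """Atomically claim the trip; True for the winner, False for everyone else."""
        with self._lock:
            if driver_id not in self.offered_to or self.assigned_driver is not None:
                return False
            self.assigned_driver = driver_id
            return True


# Usage: offer the trip to the two closest drivers, then let them race to accept.
drivers = {"d1": (0.0, 1.0), "d2": (5.0, 5.0), "d3": (0.5, 0.5)}
offer = TripOffer("trip-1", nearest_drivers((0.0, 0.0), drivers, k=2))
```

The cost of this design is that several drivers see a request that most of them will lose, so a larger fan-out shortens the rider's wait at the expense of more rejected drivers.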
Most of the microservices in this design do in fact own their own data! The core database is only being used by the driver and rider APIs, which of course need to see the same data. Thanks for watching!
Thanks, very interesting video, but I feel like the combination of technologies is overwhelming and unnecessarily overcomplicated. There might be a reason for it, but it feels fragile to have so many moving parts.
Very good point. Stuff does get complicated at this kind of scale, but it’s always best to start with a simple (maintainable) solution and add complexity once it fails to meet load. Thanks for watching!
Sure--a global index is essentially a copy of a database table, but organized differently. This allows us to query the data in different ways efficiently. Hope that helps, thanks for watching!
I’ve watched quite a few of your videos and I see that you mentioned websocket, but you don’t elaborate about the impact of that in terms of how to scale it. Thank you 🙏
Thanks for watching! Visit interviewpen.com/? for more great Data Structures & Algorithms + System Design content 🧎
Need basic design system for PayPal and Braintree 😮
I've seen lots of system design prep videos on YouTube, especially because I'm being interviewed right now for a Senior BE position, and this guy really shows valid ideas at the required level of abstraction. For everyone who has doubts: "yes", this is the proper level for a senior position, and his concepts are valid for everyday usage at this level of abstraction.
Thank you!
THIS IS PURE GOLD!
It just somehow magically landed in my recommendations and damn, I'm so grateful it did. Just 2 minutes into the video and was convinced to subscribe!
So thankful to Bobby, he has a great future ahead!
Thanks for the kind words! Have a great day
same for me too
Same thing happened to me, I'm grateful for that!
The way he explains everything is just pure gold
Thank you!
pretty decent take - I've seen some comments on specific improvements, but narrowing the scope of a given problem in SD interviews is A MUST, otherwise you won't make it inside the given time frame. functional/business prioritization in this case was very well done!
Super good point. This was very broad; usually a system design interview will drill into a specific part of this much larger system to see how deep your understanding goes. Thanks for watching!
I can't believe what I just saw. This man is literally what I want to become in the long term. For me, this video has set up my end goal as a senior. What an amazing video. Thank you so much. New subscriber here.
Thanks for the kind words - and thanks for watching!
This is not "senior" by any stretch of the imagination. Any junior / medior engineer can understand and explain these concepts.
@@DavidBcc Welp, I guess I'm way more junior than I thought.
@@DavidBcc no they can’t lol
I swear there’s always gotta be someone like you
@@gewdvibes100% 😂
I'm not a software engineer or developer, but you just had my eyes glued to the screen for 16 minutes. While I might not understand the concepts in this system design, the way the information was presented was extremely well done and approachable for a wide range of viewers.
Thanks for the kind words!
Totally agree. I love how he put things together and it all makes sense to me and makes me want to learn more.
no offense, but the reason you like this is exactly because you are not an engineer or developer.
Almost everything seen in this video is obsolete and just a nice visualization of what the process of creating a real app looks like (which is done without all the painting)
@@Snprwlf This is a very necessary step. If engineers didn't have architects, their buildings would be a mess.
Great content. Please make more videos. Content is king. Even though there are tons of videos on the Internet about System Design, the kind of clarity that one gets from these videos is unparalleled. It's to the point.
Worth watching again and again.
Really glad you enjoyed it, we've got more content on the way. Thanks!
I wish I could lecture the interviewer like this. Very well articulated.
Thanks!
This channel is easily gonna get a lot of subscribers, great content.
Thanks! We'll be posting more!
These system design videos are awesome! Really interesting topic with good visual explaining, keep up!
Thanks! We have a lot more coming
YouTube recommended this video to me and 2 minutes into the video, I subscribed to the channel. I can't let this kind of content pass me by, please do more.
Thanks! Glad you liked it.
I cannot believe you've placed many valuable topics in a 16-minute video! This was worth more than a 12-hour course! Thank you!
Glad it was helpful!
This is so good. I have searched for this type of video, and this one is very very good. Clean, fast-paced explanation with much detail. Thank you!
Appreciate it, thanks for watching.
This is one of the most educative videos I have seen in a long time
thanks for watching!
This channel just came at the right time, when I am doing system architecture. Thank you and keep them coming, you are indeed contributing to my career growth
ok! we'll post more and more! building a team rn
One of the best systems design videos I've seen! Great job!
Thanks!
Exactly what I needed. I don’t have time to learn syntax, so I generate all my snippets, but I need some form of macro assembly training to manage larger models. Thanks a million
Beware that the second you mention a pub/sub system in an interview you may have to touch on the topic of message ordering. The immediate scalability of Kafka event streams seems great on paper, but if you have to process things like chats in order you may have to make Kafka store all the relevant events in one partition, which isn't scalable.
Great point. In this system, message ordering isn't a huge concern, but there are certainly situations where this is an important consideration. Thanks for watching!
We can use the partition key to route all the messages for one person to 1 partition.
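A minimal sketch of what key-based routing means in practice, assuming a hypothetical chat topic with 12 partitions. The function name `partition_for` and the partition count are made up, and SHA-256 is used only because it ships with Python; Kafka's Java client hashes keyed messages with murmur2, but the property being shown is the same: equal keys always map to the same partition.

```python
import hashlib

NUM_PARTITIONS = 12  # assumed partition count for a hypothetical "chat-messages" topic


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Pick a partition from the message key.

    SHA-256 is a stand-in for the hash a real Kafka client would use.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


# Every message keyed by the same user lands in one partition, so those
# messages keep their relative order; different users spread across partitions.
same_user = {partition_for("user-42") for _ in range(3)}
many_users = {partition_for(f"user-{i}") for i in range(100)}
```

Ordering is therefore guaranteed per key rather than across the whole topic, which is usually enough for per-user or per-conversation streams and keeps the topic scalable.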
Mate, I'm just speechless. You earned my subscription, and this is one of the best videos I've ever watched
Thank you!
I was asked to design Uber in an interview about a month ago, if only I had come across this before
This guy is really good at teaching, very informative and concise, thanks.
I appreciate that! Thanks for watching!
Haven't started watching the video yet but I know I have to subscribe!
Thank you!
This is absolutely perfect! Thank you so much for sharing!
Appreciate it, thanks!
Sir, this is golden. Thank you!
Thanks for watching!
Good analytical skills man. Congrats!
Thanks!
This is super cool man, thanks so much. Just a small bit of feedback: I clicked on the video because I saw a really nice diagram. I'm a highly visual person, probably like a million others, and when I saw your drawings I was a bit disappointed, not because they are bad but because studying with other people's handwriting definitely slows you down, like "what does it say here?" Don't take this the wrong way, you are a genius, keep up the good work.
Glad you liked the video! Yeah, I know my handwriting isn't great, and I'm striving to improve that always. Thanks for the feedback.
@@interviewpen keep up the good work, you gained a new subscriber
I did not know that spark could be used to solve problems in this kind of project, very informative and detailed explanation thank you.
Thanks for watching!
After seeing this I am definitely buying a subscription on your website
Thanks!
Mind blown... this was amazing to watch...
Thanks!
Thank you for this content... What a pleasure!
Thanks for watching!
one organization improvement I'd make on the diagram, not only for readability, but also for a better notion of resources management, would be delimiting some domain/namespace boundaries, organizing the services appropriately:
- Client APIs
- External APIs / Services
- Events Bus
- Internal APIs / Services
- Data / Storage
etc
Good point, thanks!
Cool stuff! Anyways, I think it is very important to get the requirements as clear as possible before jumping into the design. Ask questions, make sure you are not assuming too much. This solution could be way over-engineered for some cases and it might raise some red flags among the interviewers.
Yes, you're absolutely right. This is a super important first step (and our more recent videos try to be better about this). Thanks for watching!
FYI (for those interested in project and program management): systems design and project management work hand in hand. Technical project management is breaking these down one by one, little by little, then coming up with a solution. More often than not, technical project managers help the engineers by solving what they can't: if issue X is more of a devops issue than an engineering issue, the TPM looks at the problem from an engineering standpoint, asks the devops team how to solve it, creates a plan, budget, etc. If you're into firefighting all day, get into TPM; we need lots of people who want to fight chaos.
I feel like you can make a whole series on this one problem man, would love to see an extended version of this where you go a little slower and into more detail. Either way, this video is super useful!
Glad you found it useful!
I agree! Typically during interviews, we'd take an hour to cover one of the many key concepts he addressed here: efficient data storage and retrieval at scale; distributed systems and comms; buy vs build tradeoffs; algorithms for efficient search at scale; etc. I was very impressed at how good he was at packing all that info in under 20 mins, and conveying it in a way that's easily digestible!
Excellent architecture work there bud.
Thanks!
You're very good at explaining! Thanks for this
Glad you enjoyed it!
Def gonna buy interview pen!! Awesome stuff!!
Thanks for supporting
🎯 Key Takeaways for quick navigation:
00:00 🚗 Key requirements for ride-sharing: map point selection, ETA, payment, matching, real-time updates.
02:03 📡 Using an event bus (e.g., Kafka) for system communication.
04:21 📊 Structuring the database with sharding for scalability.
06:00 🌐 Efficiently indexing drivers using H3 hexagonal cells.
09:44 💰 Leveraging services like Stripe, Mapbox, or Google Maps for payments and mapping.
10:12 🚀 Implementing a Spark streaming pipeline for demand-based pricing.
13:25 🚖 Matching riders with drivers through proximity-based services.
15:43 🛠️ Opportunities for optimization, data analytics, and advanced ETA algorithms.
Made with HARPA AI
Loved it. Well structured and comprehensive!
thanks for watching!
These videos are so good
Hope you post more
We will! Stay tuned - we will be posting weekly (we'll try our best).
I've heard Uber uses a structure called a Quadtree to split their map area into grids for easier rider - driver matching.
Yep, quadtree is another data structure that can be used instead of the H3-based approach discussed here. Both have pros and cons. Thanks for watching!
Uber uses H3
This is a great video for large distributed systems.
Thank you!
awesome effort guys please keep up this momentum!
Thanks!
instant subscribe over here. very clear information, amazing content. Keep it up! Thanks
Thanks!
thank u for making these videos. love to see how u would implement these in code
sure! we can experiment with that - will be releasing more content
This is a really good explanation. Thank you for sharing your knowledge.
Sure - thanks for watching!
Few things that felt like hand waving:
1. How is the driver location updated in the data structure? Uber has a few million drivers (say 3M) and if they send their location every 3 seconds, there are 1M updates coming in every second on average.
2. It seems disparate events are put on Kafka (ride request, payment, driver location), so instead of showing one Kafka box, perhaps having more than one would've helped. Currently the diagram looks like a spider web, and that's mostly because all the services on the right subscribe to one Kafka box.
3. It's not clear how the global indices are implemented in the trip DB? Is the data duplicated?
4. What's in Redis that is used for pricing calculation?
5. Can we hope that good content will be supported by a diagram not drawn by a 3-year old? Why, there are probably a dozen tools that can be used to draw boxes and stuff.
An actual system design interview is close to 45 minutes, so, three times the duration of this video. All systems consist of load balancers, event bus (Kafka), distributed cache (Redis), so, any candidate can draw those boxes. The details are what make for a more real interview experience.
1. As explained, the driver locations are held in a sharded data store using H3. This means we can efficiently query for only data in a certain area, and it means the writes from our 3M drivers are distributed.
2. Yes, there is one Kafka cluster with multiple topics; each service can subscribe to whatever topics it needs.
3. The implementation of global indices would depend on the database used (we try not to limit ourselves to one platform in these videos), but a commonly used approach is to create a secondary table with the indexed value as the shard key and another column pointing back to the primary table based on the primary key.
4. The pricing data is ephemeral in nature, so we're storing the results from our streaming pipeline in Redis for efficiency and simplicity.
5. Thanks.
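To make point 3 a bit more concrete, here is a minimal in-memory sketch of the "secondary table" approach described above. Everything in it is an assumption for illustration: the `TripStore` class, the field names, the shard count, and the CRC-based shard function are hypothetical and not taken from the video or from any particular database.

```python
import zlib

NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_for(key: str) -> int:
    """Map a key to a shard with a stable hash (hypothetical scheme)."""
    return zlib.crc32(key.encode()) % NUM_SHARDS


class TripStore:
    """Toy sharded store with one global secondary index on rider_id."""

    def __init__(self):
        # Primary table: sharded by trip_id (the primary key).
        self.trips = [dict() for _ in range(NUM_SHARDS)]
        # Secondary table: sharded by rider_id, pointing back to trip_ids.
        self.trips_by_rider = [dict() for _ in range(NUM_SHARDS)]

    def insert_trip(self, trip_id: str, rider_id: str, fare: float) -> None:
        row = {"trip_id": trip_id, "rider_id": rider_id, "fare": fare}
        self.trips[shard_for(trip_id)][trip_id] = row
        # Second write: only the key is duplicated, not the whole row.
        index_shard = self.trips_by_rider[shard_for(rider_id)]
        index_shard.setdefault(rider_id, []).append(trip_id)

    def get_trip(self, trip_id: str):
        return self.trips[shard_for(trip_id)].get(trip_id)

    def trips_for_rider(self, rider_id: str):
        # One lookup in the index shard, then one primary lookup per trip.
        ids = self.trips_by_rider[shard_for(rider_id)].get(rider_id, [])
        return [self.get_trip(trip_id) for trip_id in ids]
```

Note that the two writes in `insert_trip` are not atomic in this sketch; a real database has to keep the index consistent with the primary table, which is part of why the details depend on the platform.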
Amazing except I needed this 5-6 years ago 😂
Thanks for watching!
These types of videos imo are the best as these can get one out of “tutorial hell”. Although challenging - I think it forces me out of my comfort zone and actually makes me think on how to program and shows me gaps in my knowledge. New subscriber - I would like to see more of your videos!
Thanks for the comment and thanks for watching 👍
This video is amazing and was recommended at just the right time for me. Subscribed within 2 minutes of watching!
Thanks for watching 👍
One critical part that is missing in this SD interview is a discussion of the different trade-offs. E.g., in the matching service, why use Uber's H3? What about quadtree, geohash, and Google's S2? What are the pros and cons of these different methods?
Good point, there’s tons of options for geospatial indexing. Thanks for watching!
Great video. The one thing I didn't hear him mention is what type of Database or databases he would use (other than Redis for caching).
What do you think the driver, rider, trips database should be and why?
Any shardable database would work; NoSQL is generally better in that regard. Cassandra or Mongo are good options. Thanks for watching!
@@interviewpen hmm.. But why NoSQL vs Postgres or MySQL? Also, if NoSQL, why Cassandra (columnar) vs Mongo (Document)?
Got me subscribed! Nice.
👏
Insane! Thank you
Sure - thanks for watching!
Hey @interviewpen, great video, thanks! One question I have is why did you go for the server side API app with load balancing rather than an API gateway to access Rides (and maybe other services) directly? I know load balancers and gateways are not mutually exclusive, just interested why you went with one app routing all the client requests?
Another question is about payments: will the user get payment confirmation on the client side as well? You show in the little diagram that the user will receive the confirmation after the webhook sends the message to Kafka, so the server side of the app will be subscribed to the payments topic, filtering by user_id (for example), but how would the client side receive the confirmation afterwards?
Thanks for watching! Usually when people talk about API gateways, it's a nebulous term that probably means a load balancer. If we want to scale our API, we need some way to route our clients to one of several nodes, so a load balancer is critical. About payment confirmations, this is something we could set up by sending a message to the client over WebSockets after the server receives the message. Hope that helps!
Thanks for the video! which platform are you using to note?
We use GoodNotes on an iPad.
im so blown away
Thanks for watching!
Great content here. Subscribed
Thank you! Glad you liked it.
You got 1 more subscriber. Please keep posting with such a great explanation
ok, thanks!
tbh I am very impressed by the content the author produced. Despite his young age he has a lot of knowledge about projects, and that's great! I found a lot of interesting things. Thanks a lot!
Thanks, glad you liked it!
Visual explanation was comprehensive
Thanks for watching!
This is a fantastic video - lots of content in a short space of time delivered clearly - thanks!
One question I had was regarding storing of driver locations and loading them into H3 - how would that be accomplished using the DB design here? Or would the H3 index constantly be updated separately? Thanks.
We showed the two databases separately to show that they can be decoupled, but they certainly could be done in the same one. However, the index would still need to be updated separately either way. Thanks!
What are the pros and cons, in terms of cost, availability, and maintenance, of choosing WebSockets over polling?
Major pro of WebSockets is that we don't have to keep making network round-trips for polling when there's no new data. This can decrease latency and load on the API servers. The con is that it's a bit harder implementation-wise--we have to do some special stuff on the load balancer, handle dropped connections, etc. Hope that helps!
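As one illustration of the "handle dropped connections" part, here is a minimal sketch of a server-side outbox that replays missed messages when a client reconnects. It is an in-memory simulation, not a real WebSocket API: the `ClientChannel` class, the sequence-number scheme, and the buffer size are all assumptions made for this example.

```python
from collections import deque


class ClientChannel:
    """Server-side state for one client (hypothetical helper, no real socket)."""

    def __init__(self, max_buffered: int = 100):
        self.next_seq = 1
        # Recent (seq, message) pairs kept for replay; older ones are dropped.
        self.buffer = deque(maxlen=max_buffered)
        self.connected = False
        # Stands in for frames actually written to the socket.
        self.delivered = []

    def send(self, message: str) -> int:
        seq = self.next_seq
        self.next_seq += 1
        self.buffer.append((seq, message))
        if self.connected:
            self.delivered.append((seq, message))
        return seq

    def disconnect(self) -> None:
        self.connected = False

    def reconnect(self, last_seen_seq: int):
        """Client reports the last sequence number it saw; replay the rest."""
        self.connected = True
        missed = [(s, m) for s, m in self.buffer if s > last_seen_seq]
        self.delivered.extend(missed)
        return missed
```

With polling, the client would get the same catch-up behaviour for free on its next request; the trade-off is the extra round-trips when nothing has changed.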
Hey, first of all great video, great content, exceptional delivery - seriously wow.
Around minute 6:20 of the video you correctly say that the DB will have to be able to scale horizontally as it is expected to be very large with high traffic coming in. You said the easiest way to do it is by sharding, which left me wondering, did you consider a NoSQL option? Clearly it is easier for horizontal scaling. If so, why did you decide to stay with the relational approach?
I don’t think sharding implies a relational database, in fact you’re absolutely right that NoSQL databases are far easier to shard. Good thoughts and thanks for watching!
Amazing content. Instant sub.
Thanks!
lisan al gaib, he is the chosen one
I am not even a software engineer, but this was interesting to watch. The seemingly simple applications that we use in day to day have complicated backends. Hats off to the engineers.
Thanks!
Thank you. This is amazing
Thanks!
Wow👏👏👏👏👏👏👏👏 nailed it 💯👌
Thanks!
Love this!
thanks for watching - more videos coming soon!
You know ball dude, keep it up
Thanks!
Very well done. Cool.
Thank you!
The main challenge of Uber or Lyft is the massive number of updates that they have to do in realtime and also persist. I don't see this tackled here?
For this, Uber uses a variant of a quadtree.
Yep, this large influx of updates is why we introduced a sharded database, and the realtime location updates you're referring to are tackled by the rides database using H3. Of course in practice there's a ton of optimizations to be made! Thanks for watching
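A minimal sketch of how cell-based sharding spreads location writes and keeps "who is nearby" queries local. To stay self-contained it uses a plain square lat/lng grid as a stand-in for H3 hexagons, and the cell size, shard count, and `DriverLocationIndex` class are all assumptions for illustration only.

```python
import math
import zlib
from collections import defaultdict

NUM_SHARDS = 8        # assumed shard count
CELL_SIZE_DEG = 0.01  # assumed cell size; square cells stand in for H3 hexagons


def cell_for(lat: float, lng: float):
    """Bucket a coordinate into a grid cell (hypothetical substitute for an H3 index)."""
    return (math.floor(lat / CELL_SIZE_DEG), math.floor(lng / CELL_SIZE_DEG))


def shard_for(cell) -> int:
    return zlib.crc32(f"{cell[0]}:{cell[1]}".encode()) % NUM_SHARDS


class DriverLocationIndex:
    def __init__(self):
        # Each shard maps cell -> set of driver ids currently in that cell.
        self.shards = [defaultdict(set) for _ in range(NUM_SHARDS)]
        # driver_id -> current cell; a single dict here only for simplicity.
        self.driver_cell = {}

    def update(self, driver_id: str, lat: float, lng: float) -> None:
        new_cell = cell_for(lat, lng)
        old_cell = self.driver_cell.get(driver_id)
        if old_cell == new_cell:
            return  # a ping that stays in the same cell needs no index write
        if old_cell is not None:
            self.shards[shard_for(old_cell)][old_cell].discard(driver_id)
        self.shards[shard_for(new_cell)][new_cell].add(driver_id)
        self.driver_cell[driver_id] = new_cell

    def nearby(self, lat: float, lng: float, rings: int = 1):
        """Collect drivers in the rider's cell and the surrounding cells."""
        cx, cy = cell_for(lat, lng)
        found = set()
        for dx in range(-rings, rings + 1):
            for dy in range(-rings, rings + 1):
                cell = (cx + dx, cy + dy)
                found |= self.shards[shard_for(cell)].get(cell, set())
        return found
```

Each update touches at most two shards (the old cell's and the new cell's), and a nearby query only reads the handful of shards that own the surrounding cells rather than the whole driver set.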
Excellent vid! Thanks!
Question: What is that drawing tool being used..?
GoodNotes - thanks for watching!
thanks for the information.
sure
Fantastic video.
Thanks!
It looks like a routing component is missing. Pricing clearly depends on the route length (the mentioned surge just scales this price up) and, sometimes, on traffic jams. You can't calculate ETA without a route. And most client UIs draw the route on the map.
Routing algorithms are hard, and we abstracted a lot of this logic away. The ETA service will of course calculate a route to get the ETA, the pricing algorithm must take a route as an input, and the route must be sent to the client to display it on a map. Thanks for watching!
It was just amazing, I really wish there was some course.
Thanks for watching! We have a full system design course on interviewpen.com :)
This is great, now how do I explain requirements to someone who has no idea about uber/lyft or applications in general? You have good intuition here used to synthesize requirements and identify possible issues immediately. That intuition comes from knowing what you're trying to build, for new product development where the final product is not clear, how could we handle that?
You're right, we should've gone through the requirements first before diving into the solution. Our newer videos are much better about this! Thanks for watching.
This is awesome! Could you kindly do a video on how to work around the same, like making this a complete project maybe. Thank you
It's tough to cover every part of systems like this in detail in a video (Uber has spent years building their systems), but we try to give you the core foundations you need to get started and to approach similar systems. Thanks for watching!
@@interviewpen Alright, it is still nice, let me try from here, thank you
I think this is kind of an intern/junior level software design, which is good, but certainly lacking in most details that these systems require. For example, when you talk about sharding the database, it's not so easy to say just shard it. Some of the problems with sharding a db using such a key in this way would be referencing and constraint enforcement as well as update request stream consumers and their ordering.
Also, your "rider and driver APIs" discussion is...flawed. Most of the system design for those APIs should be about actual data modeling and API design. You also can't just say to use "websockets" because they have some very bad tradeoffs. For example, sockets are not stateless and require a consistent connection, which you will definitely not have on a driver's mobile phone. Another problem is data consistency. If your data producer sends a message to the socket but the receiver was in a temporarily disconnected state, what happens to that request? Also, since it's websockets, who is doing the broadcast? And since cars can "come online" at any point, does the requester keep resending the request? If so, then isn't polling necessary anyways? If the concern is concurrent request scale, it's much easier to shard the API using the same H3 algorithm, isn't it?
There are a lot more problems with this design, but for a student and assuming an ideal world, this is a good first draft. I would say though that it would be much better to assume those ideal conditions (perfect and constant connectivity, some scale target, some location target, etc), and then to better simplify your solution. I feel like a big problem with this presentation is the use of buzzwords without knowing how they work or what they mean. I'd rather you use fewer techy sounding words and be more accurate with your solution because your interviewer will certainly see right through these buzzwords and you will get drowned with real world problems that you can't explain (like the Kafka network limitations that make this just not possible as described).
Also, for this channel in general: this is not a good interview question for system design. Usually, system design interviews are more for mid level engineers so doing it at this scope is not good enough. When I interview engineers for system design, it's okay to start with a broad system question, but it's much more revealing to dive into one specific area because the complexity of a problem is when you have to design in the details. I think even just the payment processing is sufficiently complex to deep dive let alone trying to design every system in Uber.
edit: I'm a senior software engineer at a FAANG and have conducted 100+ mid level engineer interviews.
This feels like a very genuine comment. If I may, Can you please provide your insights on how one should prepare for mid level interviews avoiding all the things you mentioned above like not using fancy words etc"? Thanks in advance
@@jagrit07 Thank you for the question. Unfortunately, in my experience, the most effective way of "learning system design" is to do it. It's easy to draw charts and read tutorials and watch yt videos about systems; but, doing a complex system from scratch by hand will teach you at least the surface level of how your tools work.
Whether it be if you want to target being a senior at cloud infrastructure or data warehousing and reporting or payment processing or even graphics rendering or whatever it is, there is a very specific set of industry grade tools that real world engineers at senior levels are operating on. And it's not just about touching and operating on the actual tools and the programming languages, it's also about designing the system to solve a problem.
And the tricky part is that I always ask a real-world problem and not a "leetcode" system design problem. A real-world problem is extremely messy and often runs into organizational issues. And you really can't understand or learn about those until you experience it.
So my advice: focus on your junior level job and gain experience, offer solutions and offer to implement others' solutions and offer to fix live production problems. It will become obvious to you when you've reached senior level and you won't even have to worry about how to learn system design at a senior level.
On a more personal note: I really admire a lot of engineers' passion for the industry; however, I often see them become frustrated very quickly and try to "grind" too hard by overworking and missing out on their young lives. My experience has been that solving problems and being responsible for operating complex systems is a burden that requires a lot of experience to be comfortable with. And experience takes time. So I would advise young people not to be so hard on themselves. Be patient, you're going to be just fine.
@@chihchang1139 Thank you so much for the detailed answer. I have 3 yrs of experience and I am very much inclined to what you said.
I asked the above question after reading books/notes and watching videos etc etc and then revising them too. But still I felt I was missing something and that I believe is the practical gap. Thanks once again.
Although I do have a follow up lol in the same lines but I don’t want to ask because you already put a lot of effort in explaining.
But as they say if you never ask you never know lol so I will ask
Now when it comes to practical knowledge right sometimes the project you work on is basic crud when it comes to functionality and all. And then you read system design problems like design uber design chatapp design netflix design twitter design amazon
Now I specifically mentioned these 5 above because they are unique in some aspects like uber finding the cabs, chat doing the bidirectional thing using ws, netflix onboarding tons of videos and streaming them using adaptive bitrating, twitter generating the news feed and giving the follower/following and amazon having the whole selecting and buying thing.
All these systems might be 60-70% same where they leverage tools like kafka, dbs etc etc which are common but usp of each is different.
Now the piece of the puzzle I am missing is how one would get practical insights into systems like the above if one works day to day on CRUD apps etc. I would have the theoretical knowledge, but on the practical side it means building a practical project like the above alongside the job, DSA and, more importantly, life.
If we say okay get a new job and then it’s the same loop of hld lld etc etc. Then due to this comes the need of watching videos reading drawing charts etc etc.
Now I know my problem/question might be very boring and you don’t have to answer but yeah. Still Thanks in advance
@@jagrit07 Firstly, I want to apologize for assuming your level and years of experience.
Your concern is generally right on the money: how can you gain the experience needed if you are not given the opportunity to work on complex systems? And unfortunately, companies generally do not care about your personal journey, they can only provide the level of complexity required for their business. Good thing is that pretty much every business is complex.
Even working on just CRUD APIs, it gives you insight into many fundamental systems design tools like networking, data modeling and storage, caching, microservices, frontend/bff/backend patterns, REST API modeling, permissions, SDLC, monitoring, reporting, etc. And it's also about learning about the details of the trade offs of your tools and when and how to apply them to each problem. Sometimes you're already good at working with your tools to solve complex problems or sometimes you're not familiar enough with the tools to know just how powerful they really are.
A great CRUD API practitioner can solve almost every problem in system design. But yes, you won't have experience with real time data clusters, but that's okay. You're not going to master every domain. You're not going to also know the systems of complex machine learning clusters. You're not going to learn the systems of satellite communicators or systems of military encryption.
The "CRUD API" systems domain is large enough that it needs senior engineers. Be good at what you do because when I interview mid level or senior engineers, I'm diving deep into your own solution to try and break it apart. I'm not looking to test you on things you just are never going to have touched unless it's essential to the job, in which case, you're just not a great fit. Work towards the job domain that you want to be senior in. If you need more complexity than just CRUD, then seek out those jobs, but stay within your level so you can learn.
@@chihchang1139 wow, Thank you so much. I will take your inputs and focus. Will come back here later after 2-3 years to let you know how did it go lol. Thanks once again, You are awesome!
This is amazing! As a junior developer, I find this video inspiring: it covers various concepts of highly efficient architecture in real-time situations and suggests an ultimate goal! Thanks for sharing this great insight!
Thanks, glad you liked it!
Very impressive
Thanks!
Thank you, it was interesting and informative
Thanks for watching
Love this video
Thanks!
I think more detail could have been given on why you used Kafka. I understand how it makes sense, but maybe walking through a couple data flows would have made it more clear.
Ok, noted. Thank you for watching.
Oh shit I actually understand this after taking 6 months of gov funded cloud bullshit training. I thought I wasted time but this actually makes sense.
Nice!
I like these architecture use-case solutions
Thanks
During the process of notifying the driver to either accept or deny, why not send the request to, let's say, the 5-10 drivers closest to the rider? The first to accept gets it, all the responses are sent to a queue, and once there's a driver for the trip the rest get rejected for that trip. This would improve the user experience, as users would spend less time searching for a driver instead of waiting for one driver's response and then switching to the next, which can take up some time.
Sure, we could do that within this system. Thanks for watching!
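A minimal sketch of that "offer to the k nearest drivers, first accept wins" idea. The `TripOffer` class and `nearest_drivers` helper are hypothetical names, squared Euclidean distance on raw coordinates is only a rough stand-in for real proximity, and the in-process lock stands in for whatever atomic claim the real datastore would provide.

```python
import threading


def nearest_drivers(rider_pos, driver_positions, k=5):
    """Return the ids of the k drivers closest to the rider (rough distance)."""
    def dist_sq(pos):
        return (pos[0] - rider_pos[0]) ** 2 + (pos[1] - rider_pos[1]) ** 2

    ranked = sorted(driver_positions, key=lambda d: dist_sq(driver_positions[d]))
    return ranked[:k]


class TripOffer:
    """One trip offered to several drivers at once; only one can claim it."""

    def __init__(self, trip_id, candidate_driver_ids):
        self.trip_id = trip_id
        self.candidates = set(candidate_driver_ids)
        self.winner = None
        self._lock = threading.Lock()

    def accept(self, driver_id) -> bool:
        """Return True only for the first candidate driver to accept."""
        with self._lock:
            if driver_id not in self.candidates or self.winner is not None:
                return False  # not offered this trip, or someone was faster
            self.winner = driver_id
            return True
```

Every driver whose `accept` call returns False would be told the trip is gone; in a multi-node deployment the lock would have to become a conditional write or similar atomic operation on shared state.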
@@interviewpen great video and I love your channel as I want to add the system design skill to my skill set as a software engineer currently
great video, why a monolithic db? should the microservices not own their own data?
Most of the microservices in this design do in fact own their own data! The core database is only being used by the driver and rider APIs, which of course need to see the same data. Thanks for watching!
We call these Activity Diagram in UML/SysML
Cool
Thanks, very interesting video, but I feel like the combination of technologies is overwhelming and unnecessarily overcomplicated. There might be a reason for it but it feels fragile to have so many moving parts.
Very good point. Stuff does get complicated at this kind of scale, but it’s always best to start with a simple (maintainable) solution and add complexity once it fails to meet load. Thanks for watching!
Thank God I found u
Thanks!
It is very important to me to learn how to develop a system like Uber at this level
Cool cool, thanks for watching
Great video
Thanks!
I feel like my Pokemon is about to evolve!
Thanks for the video, it's very educative.
I didn't fully get how global indexes are working. Would appreciate it if you can elaborate on it.
Sure--a global index is essentially a copy of a database table, but organized differently. This allows us to query the data in different ways efficiently. Hope that helps, thanks for watching!
I’ve watched quite a few of your videos and I see that you mentioned websocket, but you don’t elaborate about the impact of that in terms of how to scale it. Thank you 🙏
Thanks, we'll consider doing a video on that!
Bro calling this basic💀. Jokes apart the quality and explanation was very good
Glad you enjoyed it!