Didn't understand why you chose SQL for UserProfiles? 1. Querying on different attributes - Does NoSQL not provide this? Its all about indexing, right? 2. ACID - Why do we need ACID properties for UserProfiles? 3. Joins - Which table do you want the Users table to join with? All other tables are NoSQL or Graph, so how do we perform joins?
This is a totally different approach from the other systems interview questions... In the others we do a simplified diagram and then deep-dive to add complexity. Which is the better approach?
There is one little detail that bothered me. The system could populate the celebrity's feeds in the cache, but when the timeline service tries to get the final timeline, it has to know the celebrities list, so that it can retrieve it from the cache. But how to get the exact celebrities list that was ignored earlier?
How can you rate limit from stateless API gateway, are you suggesting a) using IP hash based load balancing and storing state on the gateway workers? b) using distributed state storage cache and reading from this upon every http request?
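A rough sketch of option (b), in case it helps picture it: every gateway worker stays stateless and bumps a counter in a shared store on each request. The names here (`SharedCounterStore`, `FixedWindowRateLimiter`) are made up for illustration, and the in-memory dict only stands in for a distributed cache; nothing in the video confirms this is how the gateway is meant to work.

```python
import time
from typing import Dict, Optional, Tuple


class SharedCounterStore:
    """Hypothetical stand-in for a distributed cache shared by all gateway workers."""

    def __init__(self) -> None:
        self._counts: Dict[Tuple[str, int], int] = {}

    def incr(self, key: Tuple[str, int]) -> int:
        # A real store would make this increment atomic; a dict is enough for a sketch.
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]


class FixedWindowRateLimiter:
    """Fixed-window limiter: at most `limit` requests per client per window."""

    def __init__(self, store: SharedCounterStore, limit: int, window_seconds: int) -> None:
        self.store = store
        self.limit = limit
        self.window_seconds = window_seconds

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        if now is None:
            now = time.time()
        # All requests in the same window share one counter key.
        window = int(now // self.window_seconds)
        count = self.store.incr((client_id, window))
        return count <= self.limit


if __name__ == "__main__":
    store = SharedCounterStore()
    worker_a = FixedWindowRateLimiter(store, limit=3, window_seconds=60)
    worker_b = FixedWindowRateLimiter(store, limit=3, window_seconds=60)
    # Requests land on different workers but count against the same budget.
    print([worker_a.allow("u1", now=0), worker_b.allow("u1", now=1),
           worker_a.allow("u1", now=2), worker_b.allow("u1", now=3)])
```

The trade-off is one extra round trip to the shared store per request, which is the cost of keeping the gateway workers interchangeable.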
For timeline calculation, instead of messaging queue writing for each tweet to all followers cache, could we use pub/sub? Publisher = tweeter, subscribers = followers Or could we instead use streaming followed by batching? I guess in this case since timelines are all unique to each user, batch processing is overkill.
One part I’m confused about: how do we efficiently tell whether a user follows celebrities, and thus know whether the cache is up to date or not? Do we maintain a service for that? Like when a user subscribes to an influencer, we add a tag to the user?
This is awesome content. Best I have seen so far on this problem when compared to others. It would have been great to go into the details of some topics, like how we are updating the Redis cache and what the data model of followers/followees looks like.
I know you said you changed approach and framework, is that why you use an application gateway (and just say it does LB/Rate Limiting/Auth) in the Uber video but here you break things out?
You talked about using a cache for popular tweets, but where is this cache being used? If it's in the timeline service, then how does the system know that one of the tweets in the cache is from someone the user follows and hence needs to be included in the timeline?
Where do we draw the line between handling of mega users vs normal users? To me this suggests an issue with the design, is this actually the way big social media companies handle this?
the last time i heard laptop button presses that loud i had a colleague (sre) that got a dell laptop where the mic and camera are just besides the keyboard. :D
This was not my fav in the series, I would recommend Jordan Has No Life's Twitter design to actually understand why some choices were made. Just bought your System Design tool, looking forward to more questions being added
Hey Evan, around the 15-minute mark, when you create a message queue for timeline fanout -- in this case, wouldn't it be better if we used something like CDC with Kafka/Kinesis to capture new tweets directly from the database? I'm just thinking there might be more failure scenarios to handle with pushing directly to the message queue from the tweet service -- i.e. how do we make sure we don't miss tweets for fanout if the message queue goes down? How do we handle it if the database write fails? Only push it to the queue when the write is successful? Also I guess this is a more general question because I've struggled with it in interviews -- how do you decide between using a message queue and a stream for a particular problem? Because I've found there is some overlap where both can work
I think in a very recent video, he mentioned that generally CDC should only be used for kicking off anything asynchronous, but should be avoided for synchronous low latency tasks. Since the task being kicked off here is asynchronous, I think you make a good point of using a CDC. Hope this helps as a vague guideline.
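For the "what if the queue is down or the DB write fails" part of the question, one common alternative to both direct publishing and CDC is a transactional outbox. To be clear, this is a swapped-in technique and not something the video proposes. The sketch below uses SQLite purely as a stand-in database, and the table and function names (`outbox`, `create_tweet`, `relay_outbox`) are hypothetical.

```python
import sqlite3
from typing import Callable


def init_db(conn: sqlite3.Connection) -> None:
    conn.execute("CREATE TABLE tweets (id INTEGER PRIMARY KEY, author_id TEXT, body TEXT)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, tweet_id INTEGER, published INTEGER DEFAULT 0)"
    )


def create_tweet(conn: sqlite3.Connection, tweet_id: int, author_id: str, body: str) -> None:
    # `with conn` commits both inserts together or rolls both back,
    # so a tweet never exists without its pending fanout event.
    with conn:
        conn.execute(
            "INSERT INTO tweets (id, author_id, body) VALUES (?, ?, ?)",
            (tweet_id, author_id, body),
        )
        conn.execute("INSERT INTO outbox (tweet_id) VALUES (?)", (tweet_id,))


def relay_outbox(conn: sqlite3.Connection, publish: Callable[[int], None]) -> int:
    """Publish pending events in order; returns how many were sent."""
    rows = conn.execute(
        "SELECT id, tweet_id FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    sent = 0
    for outbox_id, tweet_id in rows:
        # If the queue is down, publish raises and the row stays unpublished for a retry.
        publish(tweet_id)
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (outbox_id,))
        sent += 1
    return sent


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    create_tweet(conn, 1, "alice", "hello")
    queue = []
    print(relay_outbox(conn, queue.append), queue)
```

Delivery here is at-least-once: a crash between `publish` and the `UPDATE` would resend the event, so the fanout consumer would need to tolerate duplicates.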
Hm. Why Redis plus CDN? Doesn’t the CDN already cache? Isn’t Redis for all tweets really expensive? On cost, no back of the envelope math? No discussion of microservice infrastructure? Lambda/EC2/EKS?
@@hello_interview so it still should have been the other way around. Even if you didn't want to draw each load balancer before each service, the right abstraction here should have been API Gateway first, then Load Balancer. If you draw the load balancer first and then the API gateway, it's like saying you duplicate the entire set of components behind it (i.e. several API gateways, each with its services) so the load balancer will actually have somewhere to distribute the requests (note that this might be a valid solution as well for robust scale).
Not a concern imo. But you'll certainly find textbooks or microservice zealots who disagree. Managing more dbs than you need to is a pain and has higher downside.
Most DBs have some form of event capture nowadays. They expose connectors where you can configure a streaming technology where the DB places new events on the stream.
Heads up, you'll notice that the approach and framework in this video are different from our other ones. This video was made many months before the first from the recent series. While the approach here is by no means wrong, we recommend the framework in our other videos more. We considered deleting it, but have left this up because candidates continue to find it valuable and we don't want to take that away.
Can you outline what is different?
This video also seems very skinny compared to some of the more recent ones which drill into some of the details of the design decisions. Would you folks consider redoing this design with a more of an E6 focus again?
Hi, thank you very much for this video on system design, I stumbled on your channel yesterday and till now I've been consuming your video series on system design. Thanks again.
However, I'd like to take a course dedicated to teaching system design with a well curated learning pattern(comprehensive and beginner friendly), would you mind recommending any ?
@@laffs4084 It doesn't follow their recommended structure. For example, you'd define the core entities and API contracts before hitting the high level design. And the design itself should be as simple as possible to satisfy the requirements. Then you'd go into different areas of deep dives that were interesting, rather than deep dive as you go and risk running out of time to make a complete system. That's my understanding at least!
I've been watching all of your videos in preparation and this actually makes me curious. Is this video closer to what you'd expect to see from a candidate since it was more "fresh" to your experiences as an interviewer, and the months you've since spent refining the process for the channel is more theoretical than practical? It's probably a common trajectory for creating educational content vs the rough experience on the ground, so to speak. Either way, I'm taking notes and fully appreciate the clear and concise instruction your channel has provided.
Man, you covered everything. Thank you, please make such videos. Your videos are point-to-point without any distractions.
Dude, it's insane we can get this content for free. I'm just starting out to prepare for the system design interview and your content is a godsend. I cannot put my appreciation into words. Thank you ever so kindly.
I have been waiting for years for someone to create understandable, well thought out, structured explanations for hard concepts like the ones Alvin from Structy made for dynamic programming (that are now my single resource on the topic), but instead for system design. I think I finally found them. I immensely appreciate the work you put into this and the Hello Interview resources.
I have watched many hours of system design videos. This video is the best one that covers majority of the important topics. Thank you for making this!
Amazing video! One of the best system design videos I've seen. Clean, simple, yet goes over a lot of important topics and beyond. Keep it up!
These videos helped me a lot to see things I did not see before and did not know where to learn! Thanks a million! I always passed coding interviews but failed the system design, so my offers from Meta, Microsoft, Google and Amazon were always down-leveled... All these videos helped me a lot to realize what the problem was.
Thanks for making this video. One of the best system design for Twitter I have found that explains clearly. Probably, API reference and capacity metrics should have been discussed as well prior to designing the architecture diagram.
Honestly one of the best SD explanations I’ve seen
Great video as always, a few follow-up questions.
1> How does a user get the notification when someone likes or replies to their tweet?
2> Is fanout done for dormant users too? If yes, then it's a waste of expensive cache.
This is amazing, starting to prepare for system design interview. This video puts everything together so well. Practicing on the AI dashboard now. Great work 👏🏻
Warm greetings from Armenia. I searched for content like this for such a long time before I found it. Thanks for the really precious content🙏
Hey this was great! Would love to see more of this.
You store replies in a separate database, why not store replies and retweets in different databases as well ? their scale is massively different .
By the way, your website is probably one of the best I've seen in terms of learning system design - the way you present information with trade offs, questions and answers is really well done.
Your diagrams are somehow so clear despite having complex, lengthy requirements!
This video focuses on functionality that is also apart from the core Twitter design. Likes unfollow , reporting and spam detection are specific to Twitter. And also search can be done in depth. However load balancing aspects were covered. But API gateway could have been dealt more about its functionality specific to Twitter. However it’s a great effort for free content
I have read and looked into many articles about building SNS. This one is the best so far. Thank you so much
The pounding sound while dropping a new diagram object is UNBEARABLE 🔨🔨
Checkout the new videos. Totally different format and much improved. You won’t see it again :)
At first I thought the dude was slamming his hand on the table each time 🤣
@@griffsterb 💥🙉
Thanks man 👌
This is what the API Gateway server is for. It handles any non-functional requirements, like rate limiting, caching, security, etc.
But I think you should have mentioned that so as not to confuse beginners
Its main responsibility is to route API requests to the correct microservice, but it can also handle general responsibilities like auth, rate limiting, SSL termination, etc
One advantage I could see is that this video is more effective because it is much shorter. It shows an equally deep interview at half or a third of the length. Although I will not use this structure during the interview, I would assume that a more senior engineer could benefit from the video
finally a prepared system design rather than doing it live. simplified it well. thank you
dude, you rock for making hellointerview
Question about the fanout strategy. So based on how I understand it:
Fanout-on-write: Async (pre?) constructing timeline in a cache, where the most recent tweets of the followed accounts are pre-pended so that when a user makes a READ request for the timeline, it already has that timeline available.
Fanout-on-read: Helps solve the celebrity problem, where rather than writing to millions of timelines, wait until the READ request for a user's timeline is made. At that point, you can integrate the followed celebrity(ies)' most recent tweets and programmatically add them in.
How about when a user, although unlikely, follows thousands of celebrities? That adds latency if millions of users are following tens of thousands of celebrities. The way I considered answering this in a SD interview:
- That is unlikely, so don't solve (not a good answer)
- Fanout-on-write. Determine users that cross a specific threshold of "celebrity" accounts, and when a celebrity tweets, DO write to specific timelines - the timelines for users that crossed that threshold.
- Limit the number of followers.
What is your advice in this scenario?
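The hybrid described in this question can be sketched roughly as follows. Everything here is an assumption for illustration: the `HybridTimeline` class, the follower-count cutoff, and the in-memory dicts standing in for the follow graph, the timeline cache, and the tweet store. It is not the design from the video.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Tweet = Tuple[int, str]  # (timestamp, text)


class HybridTimeline:
    def __init__(self, celebrity_threshold: int) -> None:
        # Hypothetical cutoff: authors with at least this many followers skip fanout-on-write.
        self.celebrity_threshold = celebrity_threshold
        self.followers: Dict[str, Set[str]] = defaultdict(set)   # author -> followers
        self.following: Dict[str, Set[str]] = defaultdict(set)   # user -> followed authors
        self.cached: Dict[str, List[Tweet]] = defaultdict(list)  # user -> precomputed timeline
        self.by_author: Dict[str, List[Tweet]] = defaultdict(list)

    def follow(self, user: str, author: str) -> None:
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_celebrity(self, author: str) -> bool:
        return len(self.followers[author]) >= self.celebrity_threshold

    def post(self, author: str, ts: int, text: str) -> None:
        self.by_author[author].append((ts, text))
        if not self.is_celebrity(author):
            # Fanout-on-write: push into every follower's cached timeline.
            for follower in self.followers[author]:
                self.cached[follower].append((ts, text))

    def timeline(self, user: str, limit: int = 10) -> List[str]:
        # Fanout-on-read: merge celebrity tweets in at read time.
        # A set avoids duplicates if an author crossed the threshold after earlier fanout.
        merged = set(self.cached[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                merged.update(self.by_author[author])
        return [text for _, text in sorted(merged, reverse=True)[:limit]]


if __name__ == "__main__":
    t = HybridTimeline(celebrity_threshold=2)
    t.follow("u1", "friend")
    t.follow("u1", "star")
    t.follow("u2", "star")
    t.post("friend", 1, "f1")
    t.post("star", 2, "s1")
    print(t.timeline("u1"))
```

The read-time loop is exactly where the concern in the question shows up: its cost grows with the number of celebrity accounts a user follows, so a cap on followees or a per-user switch back to fanout-on-write would bound it.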
Why was Cassandra not considered for writing the tweets? It's a write-optimised DB and good for huge volume.
I think that in a microservice architecture, the API Gateway should be placed before the load balancer. The request goes to the gateway first, then each microservice has a load balancer.
Wonderful, I've watched all of the videos on your channel and I'm kind finishing up with this one. I guess this was a proof of concept perhaps? Either way, it's great. Love to see that you defined a framework for the other videos. Looking forward to more of those!
Yah this was the first one I made a while ago and honestly considered deleting
@@hello_interview for what it’s worth, I learned a ton from this one.
- You showcase here that it’s worth delineating read/write paths for specific* services, which is not something you folks have done in the other videos (it sounds obvious but it’s not for someone of my level).
- You spend a bunch of time towards the end talking about salient points we could make, whether related to security, monitoring and testing. Again, points you don’t quite bring up in the others.
Now if you’re ever going to make a new Twitter design video, perhaps? I don’t know, you guys are doing a tremendous job leading this anyway. Thought I’d share some user-feedback. Cheers.
It's interesting to notice how much your content has evolved between this walkthrough and your other system design videos. Lots of breadth, less depth, and the design feels a bit more complicated, likely due to the breadth covered.
We're trying to pull some of that depth into our deep dive series. We're going to try to have and eat the cake!
What a great video! Very insightful stuff.
On a lighter note, I wonder why companies ask for system designs of FB, Twitter, Netflix, etc. If I knew all of these, I would be a billionaire by now and wouldn't even bother to attend technical interviews. 😆😆 But nevertheless, this video is a very good one!
Awesome! Well explained. Elasticsearch should also be fed via CDC from the user profile store, in case someone wants to search user profiles or accounts.
Load balancer in front of API GW is redundant.
Amazing video. Quick followups:
1. I see you have stored Likes on tweets in the tweet document itself. Please throw some light on how would the flow be for someone liking a tweet. I don't think it should go through the tweet CRUD route.
2. Please share more about the CDC.
Found your content on reddit and this is great. tyty
Thanks for this video. You have definitely covered a lot of topics in breadth, but isn't the expectation in a design interview to at least deep dive into one of the components? Would you give hire feedback for an E5 candidate who presented exactly the same content as this video in their onsite interviews?
Great question! This solution would be adequate for an E4 candidate, but not for an E5+ candidate, for the reason you mentioned. As candidates get more senior, the expectation is less breadth and more depth. For an E5 candidate, I'd suggest you find 1-2 places to go deeper. For Twitter in particular, the search component could be one of these places.
@@hello_interview thanks for your response. Do you have some suggestions regarding the time distribution to finish a design (including a deep dive in one component) in a 40-minute Meta interview?
amazing, best on youtube! please post more
Great! Super common question. Instagram or link shortener next please!
Thank you so much. You really help me a lot.
Awesome man! Keep these valuable videos!!
hey, great video! i know you touch on this in more recent videos, but how would the mid-level/senior/staff expectations differ for this problem?
Maybe a noob question: why do you need a load balancer before the API gateway? Shouldn't it be after the API gateway, as the gateway itself will be resilient enough to balance the traffic? Maybe after the gateway we decide which service the request is going to, and then which server of that service it should go to?
Not a noob question, this is confusing. It is just an abstract here but i've dropped it since it confused a lot of people. In reality, each horizontally scaled service has its own load balancer. I just did not want to draw all of them.
Two issues: 1- In describing ELK you said something like "...we use Elasticsearch for storing the logs, Logstash to analyze them and Kibana to present them (dashboard)". This is not entirely correct: we use Logstash to capture logs from system components, Elasticsearch to store and analyze the logs, and Kibana as the dashboard. 2- The API Gateway comes before the LB, not the other way around as you presented, so the flow is something like "Client → API Gateway → Load Balancer → Backend Services (e.g., tweet service, user service)" ...
Thanks for the video, very informational.
Shouldn't API gateway go before Load balancer?
and each service has its own load balancer?
Hey, for viral tweets, we get a lot of likes and retweets. In that case, we will need to edit the MongoDB document for that tweet thousands of times. So, is noSQL a better choice in that case or shall we choose SQL with normalised tables?
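One way to picture the write-contention worry here, independent of the SQL vs NoSQL choice, is a sharded counter: likes are spread over several counter records instead of repeatedly editing one hot document, and reads sum the shards. This is a swapped-in technique rather than what the video does, and `ShardedLikeCounter` and its shard count are hypothetical.

```python
import random
from typing import Dict, List, Optional


class ShardedLikeCounter:
    """Spreads increments for one tweet across several counter shards."""

    def __init__(self, num_shards: int = 8, rng: Optional[random.Random] = None) -> None:
        self.num_shards = num_shards
        self._rng = rng if rng is not None else random.Random()
        # tweet_id -> one counter per shard (each would be its own row/document in a real store).
        self._shards: Dict[str, List[int]] = {}

    def like(self, tweet_id: str) -> None:
        shards = self._shards.setdefault(tweet_id, [0] * self.num_shards)
        # Picking a random shard spreads concurrent writers over different records.
        shards[self._rng.randrange(self.num_shards)] += 1

    def count(self, tweet_id: str) -> int:
        # Reads pay a small cost: they sum all shards.
        return sum(self._shards.get(tweet_id, []))


if __name__ == "__main__":
    counter = ShardedLikeCounter(num_shards=4)
    for _ in range(1000):
        counter.like("viral-tweet")
    print(counter.count("viral-tweet"))
```

The like count becomes slightly more expensive to read, which is usually an acceptable trade for a value that can be cached or shown approximately.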
It would have been great if you had deep dived into the datastore and how it's handled internally.
HelloInterview to the world! Great great content.
Thank you.
Great work. Thanks for sharing!
thanks for this video 😊
Thanks for the great video!
Amazing video as always! Would you consider re-doing this in the newer format of your videos, or doing Twitter search/newsfeed?
Eventually, yah :)
Great video! Can you please do a write up for system design of Twitter in your newer format on your site? I could only find Tweet search and Facebook feed that are related.
Checkout newsfeed, it’s super close!
Simple recommendation algo can be one word: popularity
Why do you choose NoSQL over SQL for Tweets? Is it because:
1. Faster writes because no table structure needs to be followed?
2. Asynchronous writing in NoSQL?
3. SQL would generally ask me to normalise the likes and retweets table and thus, slowness in joins while reading?
Load balancer should be placed after API gateway, right ?
Everything, including the gateway, needs to be balanced. It’s just an abstraction, don’t worry too much about it
This is how you do it. Whole Twitter in 20 minutes. OP, you are the best. I didn't get, though, why user profiles need to be stored in a SQL DB. Why would Mongo be a poor choice? Analytics can be its own separate DB. Will a SQL DB sustain analytics and logins with 300+M active users? If you shard the DB, is it then still possible to do analytics? Also, when a user follows another one, it should be recorded such that the timeline takes that into consideration (oh, you added it later :) ). What about rate limiters? Is it a separate service such that the throttling data per account is in a central place? I liked that you touched on security, logging and monitoring. Is this something that interviewers are looking for?
"Is this something that interviewers are looking for?" - Depends on the interview. I have had a few where they told me ahead of time they wanted to touch on security, monitoring, and maintainability. This is likely something you can ask the recruiter ahead of time for clarification.
In this video why did we skip the core entities and API steps of the framework? I really enjoyed the video. I never thought of the replies being in a separate database. I wished I had discovered your channel before I had my interviews.
Tbh, because this video is old and before we introduced the delivery framework. Left it up because people still find it valuable
Hi, thank you very much for this video on system design, I stumbled on your channel yesterday and till now I've been consuming your video series on system design. Thanks again.
However, I'd like to take a course dedicated to teaching system design with a well-curated learning pattern (comprehensive and beginner friendly). Would you mind recommending any?
This was really good. Can you do one on designing an online ticket system, e.g. Ticketmaster?
We have it :)
Feedback while watching the video - I'm not sure if it's from the diagramming app or added but the pounding sound effect when adding a new service is annoying and detracts from the viewing experience. The content is super helpful though.
FR it feels like Evan gets mad and hits the desk out of anger 😂😂😂
That’s crazy, it makes so easy to focus
Great video! Will definitely check out the site. Couple of questions from your design:
Why create a separate Auth service instead of handling it in API gateways? Also, how is Auth performed when creating Tweets if that service is under the Profile service?
When building a cached timeline, do you see this as a key-value store with user_id keys and prepend-only log of tweet_ids? {user_id: [...]}
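A minimal sketch of the structure this question describes, assuming an in-memory dict stands in for the key-value store and that timelines are capped at a fixed length. The `TimelineCache` class and the `MAX_TIMELINE_LEN` value are hypothetical and not taken from the video.

```python
from collections import deque

MAX_TIMELINE_LEN = 800  # assumed cap per user, purely illustrative


class TimelineCache:
    """In-memory stand-in for a key-value store: user_id -> newest-first tweet_ids."""

    def __init__(self, max_len: int = MAX_TIMELINE_LEN) -> None:
        self._max_len = max_len
        self._timelines: dict[str, deque[str]] = {}

    def prepend(self, user_id: str, tweet_id: str) -> None:
        # A bounded deque drops the oldest entry once the cap is reached.
        timeline = self._timelines.setdefault(user_id, deque(maxlen=self._max_len))
        timeline.appendleft(tweet_id)

    def read(self, user_id: str, limit: int = 20) -> list[str]:
        # Users with no cached timeline get an empty list.
        return list(self._timelines.get(user_id, ()))[:limit]


cache = TimelineCache(max_len=3)
for tweet_id in ["t1", "t2", "t3", "t4"]:
    cache.prepend("alice", tweet_id)
print(cache.read("alice"))  # newest first, oldest entry trimmed
```

In a Redis-backed version the same shape is commonly built from a list per user with `LPUSH` followed by `LTRIM`, but that mapping is an assumption about deployment, not something the video specifies.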
Didn't understand why you chose SQL for UserProfiles?
1. Querying on different attributes - Does NoSQL not provide this? It's all about indexing, right?
2. ACID - Why do we need ACID properties for UserProfiles?
3. Joins - Which table do you want the Users table to join with? All other tables are NoSQL or Graph, so how do we perform joins?
This is a totally different approach from the other systems interview questions... In the others we do a simplified diagram and then deep-dive to add complexity. Which is the better approach?
The other ones. This was made way before the others. We considered deleting it, but people still find value in it, so we left it up.
Amazing content! Keep up the great work! 🙌
thanks for the video.. video is really amazing but the background music is a bit distracting..
This was an early attempt at making these. All the superfluous noises are long gone. Check the more recent videos
There is one little detail that bothered me. The system could populate the celebrity's feeds in the cache, but when the timeline service tries to get the final timeline, it has to know the celebrities list, so that it can retrieve it from the cache. But how to get the exact celebrities list that was ignored earlier?
How can you rate limit from a stateless API gateway? Are you suggesting (a) using IP-hash-based load balancing and storing state on the gateway workers, or (b) using a distributed state store (cache) and reading from it on every HTTP request?
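A rough sketch of option (b), assuming a fixed-window counter and a plain dict standing in for the shared cache. `FixedWindowRateLimiter` and its parameters are hypothetical names used for illustration only.

```python
import time
from typing import Callable


class FixedWindowRateLimiter:
    """Gateway-side limiter whose counters live in a store shared by all workers."""

    def __init__(self, limit: int, window_seconds: int, store: dict,
                 clock: Callable[[], float] = time.time) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.store = store  # stand-in for a distributed cache
        self.clock = clock  # injectable so the window can be controlled in tests

    def allow(self, account_id: str) -> bool:
        # All requests in the same time window share one counter key.
        window = int(self.clock() // self.window_seconds)
        key = f"rate:{account_id}:{window}"
        count = self.store.get(key, 0) + 1
        self.store[key] = count  # not atomic here; a real store needs an atomic increment
        return count <= self.limit


shared_store: dict = {}
gateway_a = FixedWindowRateLimiter(limit=2, window_seconds=60, store=shared_store)
gateway_b = FixedWindowRateLimiter(limit=2, window_seconds=60, store=shared_store)
print(gateway_a.allow("user-1"), gateway_b.allow("user-1"))
```

Because both gateway workers read and write the same store, neither needs sticky routing. The cost is one cache round trip per request, and in a real store the read-then-write above would need to be a single atomic increment with an expiry on the key.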
Excellent👌
Excelente video, thanks🎉
For timeline calculation, instead of a message queue that writes each tweet to every follower's cache, could we use pub/sub? Publisher = tweeter, subscribers = followers
Or could we instead use streaming followed by batching? I guess in this case since timelines are all unique to each user, batch processing is overkill.
One part I’m confused about: how do we efficiently tell whether a user follows celebrities, and thus know whether the cache is up to date or not? Do we maintain a service for that? Like, when a user subscribes to an influencer, we add a tag to the user?
What's the purpose of using a Load Balancer if you decided to use an API Gateway?
This is awesome content. The best I have seen so far on this problem.
It would have been great to go into the details of some topics, like: how are we updating the Redis cache? What does the data model of followers/followees look like?
Yah tbh this video is pretty high level, targeting more mid level (ic4) candidates. For much more depth check out the other two videos on the channel!
@@hello_interview will do . thx!
Thank you for the great content. Really appreciate it!
I know you said you changed approach and framework, is that why you use an application gateway (and just say it does LB/Rate Limiting/Auth) in the Uber video but here you break things out?
Each follower has a timeline cache. Since data is stored for each follower, what kind of data store is it?
Can we rely on the API gateway to handle rate limiting and authentication without adding separate services for those?
You talked about using a cache for popular tweets, but where is this cache being used? If it's in the timeline service, then how does the system know that one of the tweets in the cache is from someone the user follows and hence needs to be included in the timeline?
too good!
Where do we draw the line between handling of mega users vs normal users? To me this suggests an issue with the design, is this actually the way big social media companies handle this?
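Several questions in this thread ask how the read path knows which celebrities to merge in. One possible answer, sketched below under stated assumptions: tweets from ordinary accounts are pushed to follower caches at write time, while the read path walks the user's follow list, checks each followee against a follower-count threshold, and pulls celebrity tweets on demand. `HybridTimeline` and the threshold value are hypothetical, and this is not a claim about how any real social network does it.

```python
class HybridTimeline:
    """Push fanout for ordinary authors, pull-at-read for celebrity authors."""

    def __init__(self, celebrity_threshold: int) -> None:
        self.threshold = celebrity_threshold
        self.followers: dict[str, set[str]] = {}  # author -> users following them
        self.following: dict[str, set[str]] = {}  # user -> authors they follow
        self.pushed: dict[str, list[tuple[int, str]]] = {}  # user -> (ts, tweet_id)
        self.tweets_by_author: dict[str, list[tuple[int, str]]] = {}

    def follow(self, user: str, author: str) -> None:
        self.followers.setdefault(author, set()).add(user)
        self.following.setdefault(user, set()).add(author)

    def is_celebrity(self, author: str) -> bool:
        return len(self.followers.get(author, ())) >= self.threshold

    def post(self, author: str, tweet_id: str, ts: int) -> None:
        self.tweets_by_author.setdefault(author, []).append((ts, tweet_id))
        if self.is_celebrity(author):
            return  # skip write-time fanout for large accounts
        for follower in self.followers.get(author, ()):
            self.pushed.setdefault(follower, []).append((ts, tweet_id))

    def timeline(self, user: str, limit: int = 20) -> list[str]:
        # A set removes duplicates if an author crossed the threshold mid-stream.
        entries = set(self.pushed.get(user, []))
        for author in self.following.get(user, ()):
            if self.is_celebrity(author):
                entries.update(self.tweets_by_author.get(author, []))
        return [tweet_id for _, tweet_id in sorted(entries, reverse=True)[:limit]]


svc = HybridTimeline(celebrity_threshold=3)
for fan in ["a", "b", "c"]:
    svc.follow(fan, "celeb")
svc.follow("a", "friend")
svc.post("friend", "f1", ts=1)
svc.post("celeb", "c1", ts=2)
print(svc.timeline("a"))
```

The threshold itself is a tuning knob rather than a principled line; the sketch only shows that the follow graph plus a follower count is enough for the read path to decide what to merge.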
the last time i heard laptop button presses that loud i had a colleague (sre) that got a dell laptop where the mic and camera are just besides the keyboard. :D
Can a load balancer also act as an API gateway?
no background music please
Big fan of your explanation since I found you in a Reddit comment! Does this one have a written version? I couldn't find it on the website
Nope, this is the only one that doesn’t. But the newsfeed write-up should be similar.
This was not my fav in the series. I would recommend Jordan Has No Life's Twitter design to actually understand why some choices were made. Just bought your System Design tool, looking forward to more questions being added
Regarding Twitter DB sharding, do you suggest sharding by user ID or by tweet ID, and why?
userId, more queries to see a users profile than to look up a specific tweet id
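A small sketch of what sharding by userId could look like, assuming simple hash-modulo placement. `NUM_SHARDS` and `shard_for_user` are hypothetical; a production system would more likely use consistent hashing or a lookup table so shards can be added without moving most keys.

```python
import hashlib

NUM_SHARDS = 8  # hypothetical shard count


def shard_for_user(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard using a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_for_tweet(tweet: dict, num_shards: int = NUM_SHARDS) -> int:
    # Tweets are placed by author, so one user's tweets live on a single shard.
    return shard_for_user(tweet["author_id"], num_shards)


tweet = {"id": "t-123", "author_id": "evan", "body": "hello"}
print(shard_for_user("evan") == shard_for_tweet(tweet))  # True: same shard as the author
```

Loading a profile page then touches one shard, while looking up a tweet by ID alone needs the author ID as well (or a secondary index). Very popular authors can also turn their shard into a hot spot.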
Hey Evan,
Around the 15-minute mark, when you create a message queue for timeline fanout, wouldn't it be better if we used something like CDC with Kafka/Kinesis to capture new tweets directly from the database?
I'm just thinking there might be more failure scenarios to handle when pushing directly to the message queue from the tweet service, i.e. how do we make sure we don't miss tweets for fanout if the message queue goes down? How do we handle it if the database write fails? Only push to the queue when the write is successful?
Also, I guess this is a more general question because I've struggled with it in interviews: how do you decide between using a message queue and a stream for a particular problem? I've found there is some overlap where both can work.
I think in a very recent video, he mentioned that generally CDC should only be used for kicking off anything asynchronous, but should be avoided for synchronous low latency tasks.
Since the task being kicked off here is asynchronous, I think you make a good point of using a CDC.
Hope this helps as a vague guideline.
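One way to address the "what if the queue is down" concern without full CDC is a transactional outbox, which is a related but different technique from what the video draws. The sketch below assumes SQLite as a stand-in database and a plain callable as the queue publisher; the table names and the `create_tweet` / `relay_once` helpers are hypothetical.

```python
import sqlite3
from typing import Callable


def create_schema(conn: sqlite3.Connection) -> None:
    conn.executescript("""
        CREATE TABLE tweets (id INTEGER PRIMARY KEY, author_id TEXT NOT NULL, body TEXT NOT NULL);
        CREATE TABLE outbox (id INTEGER PRIMARY KEY, tweet_id INTEGER NOT NULL,
                             published INTEGER NOT NULL DEFAULT 0);
    """)


def create_tweet(conn: sqlite3.Connection, author_id: str, body: str) -> int:
    # The tweet and its outbox row commit together or not at all.
    with conn:
        cur = conn.execute(
            "INSERT INTO tweets (author_id, body) VALUES (?, ?)", (author_id, body))
        tweet_id = cur.lastrowid
        conn.execute("INSERT INTO outbox (tweet_id) VALUES (?)", (tweet_id,))
    return tweet_id


def relay_once(conn: sqlite3.Connection, publish: Callable[[int], None]) -> int:
    """Publish pending outbox rows; a row is marked done only after publish succeeds."""
    rows = conn.execute(
        "SELECT id, tweet_id FROM outbox WHERE published = 0 ORDER BY id").fetchall()
    sent = 0
    for outbox_id, tweet_id in rows:
        publish(tweet_id)  # if this raises, the row stays pending and is retried later
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (outbox_id,))
        sent += 1
    return sent


conn = sqlite3.connect(":memory:")
create_schema(conn)
create_tweet(conn, "evan", "hello world")
fanout_queue: list[int] = []
print(relay_once(conn, fanout_queue.append), fanout_queue)
```

If the publisher fails, nothing is lost because the pending row is still in the database. The trade-off is at-least-once delivery, so the fanout consumer should tolerate seeing the same tweet twice.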
Hm. Why Redis plus CDN? Doesn’t the CDN already cache? Isn’t Redis for all tweets really expensive? On cost, no back of the envelope math? No discussion of microservice infrastructure? Lambda/EC2/EKS?
Can you please share HLD of something like mixpanel?
Awesome content. What is the whiteboarding tool that you use?
Excalidraw
Silly question really: what's the name of the tool you used to take notes and sketch diagrams?
Excalidraw
Which draw tool is this?
Why have a load balancer in front of the API gateway?
This is great, but I just didn't get one thing: why put the load balancer before the API gateway?
Just an abstraction. Each service (gateway, services) will have a load balancer
@@hello_interview So it still should have been the other way around. Even if you didn't want to draw a load balancer before each service, the right abstraction here should have been API Gateway first, then Load Balancer. If you draw the load balancer first and then the API gateway, it's like saying you duplicate the entire set of components behind it (i.e. several API gateways, each with its own services) so that the load balancer actually has somewhere to distribute the requests (note that this might be a valid solution as well for robust scale).
Hi, I have seen there are two microservices consuming the same database. Isn't that going to couple the two of them and make them hard to scale?
Not a concern imo. But you'll certainly find textbooks or microservice zealots who disagree. Managing more dbs than you need to is a pain and has higher downside.
Which software do you use to make the video drawings?
Excalidraw
How do you guys concentrate for so long on coding and know exactly what to do? I want to do a career change, but how should I get in?
Like most things, it gets easier after you do it for a while!
Is CDC some endpoint that the database provides out of the box, or is it some tool, or a stand-alone microservice?
Most DBs have some form of event capture nowadays. They expose connectors where you can configure a streaming technology where the DB places new events on the stream.
This is a great resource, but I don't see the interactive AI-powered whiteboard on your website. Is that still in development?
Click on mock interviews in the nav then “AI mocks” on the left nav
Why do you sometimes design the actual API and HTTP methods in detail, and other times draw requests to microservices without mentioning APIs?
Does Twitter mainly use MySQL?