System Design: How to design Twitter? Interview question at Facebook, Google, Microsoft
- Published: Jun 12, 2024
- Designing the architecture of Twitter and similar social networks is a popular engineering interview question asked at companies like LinkedIn, Microsoft, Google, Snapchat, NVidia and others. This interview question is extremely broad but gives you the opportunity to talk about technologies like in-memory databases, replication, sharding etc. It's important to give a clear high-level overview of the problem, ask clarifying questions and talk confidently about the strengths and weaknesses of the proposed solution. Every architecture has trade-offs and interviewers want to hear you talk about them.
Follow SuccessInTech on Facebook: / successintech
Follow SuccessInTech on Twitter: / _sh4dy_
Details taken from a presentation of the VP of Engineering at Twitter: www.infoq.com/presentations/T...
Music: www.bensound.com - Science
I'm doing a little experiment over on IGTV, the SiT VLOG. Check it out! instagram.com/tv/BkYf4GphfQz/
Nice effort and really explain the things. Please mention the sources you have referred. This would allow us to go in-depth of a topic.
Can you please also add low level design?
In some orgs they ask for HLD and LLD. How would you classify this? I think the boxes you drew were HLD, but the tables and ER diagrams would be part of LLD, right?
I wanna get into Amazon, how can I get your referral?
Great job!
I know html, css, js, reactjs for frontend, python for backend, and mysql. Now I want to understand how to glue these together, build an application and deploy on server.
What do I need to learn next?
Is this the right video?
Man, I can't express my happiness. You are the only one on YouTube (in fact, on the internet) concentrating on high-level system design. Many companies are shifting their focus from algorithms to system design nowadays. It was so hard to figure out how to come up with answers to these. Your videos are a life saver, sir. You people are literally changing lives. The minimum I can do is say a big thank you for making these vids.
+Biswajit Singh I'm really stoked to hear that, man! Happy I can help you out. You would do me a huge favor if you could share my videos on your social networks. There will be more interesting videos to come! 👍
The minimum you can do is pay your first month salary to his patreon account, if he has one.
For that I’ll make a Patreon 😄
companies arent shifting focus, as u r becoming senior, u r facing more architect lvl questions.
Check "Tech Dummies"... he is much better than him
I used the caching strategy described in this video in my system design interview. You are part of the reason why I received an offer from one of my dream companies. Thank you!
hi, what caching strategy is described in this video? The Redis fan out part? Could you elaborate more? Thanks!
@@ethanlyu4839 Yes, the Redis fan out for active users, but I wasn't asked exactly to design Twitter. I borrowed this part of the design in my answer.
What design interview question u got ?
Can you share ? It would be of great help for me.
@@jeevithatd9221 just look up github repo for: system-design-primer . It covers a lot of the most common system design questions as well as giving you the fundamentals before giving you the problems.
NEW: Check out my brand new website www.successintech.com
I'm halfway there and just overwhelmed with the kind of explanation this guy has put into the videos. Probably will complete this and come back again for more such videos. Thank you!
Hi Ramon. Thanks for making these SDI videos. There are quite a few important things missing from this video for it to be considered a complete and correct answer in a real interview: the list of different (micro)services that make the platform run. Full database design/schemas. API commands from client to server and between important microservices. And most importantly, "back of the envelope" estimations, i.e. number of users (DAU), QPS, storage requirements, throughput requirements etc.
I hope you’ll continue making SDI videos that contain this info too in the future. Many thanks and best of luck
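The "back of the envelope" estimations asked for above usually come down to a few multiplications. A minimal sketch in Python; every input number in the sample call is a hypothetical placeholder chosen for round arithmetic, not a real Twitter figure, and the function name `estimate` is made up for this illustration:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate(dau, tweets_per_user_per_day, reads_per_user_per_day, bytes_per_tweet):
    """Return (average write QPS, average read QPS, new storage per day in bytes)."""
    tweets_per_day = dau * tweets_per_user_per_day
    write_qps = tweets_per_day / SECONDS_PER_DAY
    read_qps = dau * reads_per_user_per_day / SECONDS_PER_DAY
    storage_per_day = tweets_per_day * bytes_per_tweet
    return write_qps, read_qps, storage_per_day


# Hypothetical inputs: replace them with the numbers agreed with the interviewer.
write_qps, read_qps, storage_per_day = estimate(
    dau=100_000_000,
    tweets_per_user_per_day=2,
    reads_per_user_per_day=50,
    bytes_per_tweet=300,
)
print(f"writes/s ~ {write_qps:,.0f}")
print(f"reads/s  ~ {read_qps:,.0f}")
print(f"storage  ~ {storage_per_day / 1e9:,.0f} GB/day")
```

These are averages; peak load is usually estimated as a multiple of the average, and that multiplier is one more assumption to state out loud.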
Very few people explain as well as you do and cover these topics. As a software engineer, I am very interested in these topics, and the community needs more videos like this! Keep up the good work!
Thanks for doing this! One suggestion: You should have separate playlist for system design and algo related questions.
+Venkat Raman That’s a good point, will do! Thanks
This was amazing! Thank you so much, as a beginner on System design, you explained it beautifully.
I'm committed to watching your design video once a day till i finish them... then repeat. Thank you.
Love it :D All the best!
Time flew, amazing stuff man. Crazy ideas are being implemented when it comes to huge systems.
Awesome video. Thank you. It gives me a basic idea of how to approach system design questions. This design covers a lot of things which are used in real-world huge systems. It includes relational databases, in-memory databases, hashing, load balancers and, most importantly, how to design a system based on actual requirements, like eventual read consistency in the case of Twitter.
Thanks for this insight with great simplicity. Hope to see detailed videos on followup topics as well.
Thanks!
It was a very good head start into how I can approach a problem. Thanks a lot.
Great video :) really happy to see someone explaining overall system design in depth. Waiting for more exciting videos on system design.
Hello Mr. Lopez! I loved this video. But it is always very likely that you will face a system design question totally outside of what you prepared for an interview. So a video on all possible system design components and how they are used for specific use cases in real-life products could be very useful. Once the building blocks are available, it's easier from there. For example, the Redis database with its in-memory function is a good takeaway from this video which I can use in different scenarios.
Thanks for your feedback! I'm planning something along those lines. Don't forget to subscribe ;)
Thank you so much for posting this video. This is great! Many regards.
I have an upcoming test about distributed databases and WDM, this has been such a help in considering how to answer these problems. Thank you!
8:06 great breakdown by system traits to design improvement. Network access availability > consistency
9:27 I like how you go into higher lvl overview with actual scenario/api for when tweets made
I actually shared this with my newsletter for acing coding interviews. Great way of identifying problem areas and solving them
The best design interview videos in your channel.
Your videos have been amazing. They are a great complement to other videos that are more algorithm focused.
Wonderfully explained. Excellent stuff. Thank you much.
This is great! Your approach, time management and advice on solving the problems are spot on. Thank you and keep up the good work!
+Ravi M Hey Ravi, thank you for the kind words! Stay tuned for more videos :)
What I like about your explanation is that you are not rushing it by preparing the content beforehand. Many videos do that, cramming so much information into very little time. You are carefully walking through the solution, giving us ample time to make sense of each point you make. Can you suggest some resources (books, articles, lectures, seminars, YouTube channels like yours) to read/watch to get good at system design?
I have looked into a couple of other Twitter system design videos, but I felt your videos are way more explanatory. Your video answered my questions like "how is the Redis node chosen out of many?" and "for users with thousands of followers who are uncertain about their next login, is constructing home timelines worth it?". I believe your design is not complete w.r.t. analytics and search functionality, but it is still very informative and nicely explained. Thank you.
Thank you!
excellent job!!! Thank you very much for sharing.
Amazing video on System design. Your way of explaining things is simple and on high level. Many thanks.
Thank you!
Awesome explanation and Thanks
Extremely grateful for your videos!
Glad you like them! If you want to support the channel and future content please share my videos and spread the word on your social media =)
Thank you a lot for taking your time and sharing this awesome video.
Fantastic tutorial. It certainly helped me to get a perspective of the system design. Keep rocking and thanks for helping the world !
Thanks for the kind words!
Thank you for this video. I guess it’s the best one on the topic
Thanks a lot for the video. it helps us to think the system design in a broader perspective.
I have two questions here. You said a conventional relational database would be a bottleneck in this kind of system. Would NoSQL be the ideal choice for storage here? Also, during the entire video you talked about an in-memory database. At what point does this data get persisted into the database?
He mentioned there should be a machine between the Load balancer and the redis clusters. I would guess that machine would take care of persisting the tweet into the database (preferably in an async manner)
Yes, it will get persisted in NoSQL for sure, as @cats3xxx mentioned. The initial POST and GET will always happen on Redis, and I see his design shows Redis as a kind of persistent cache for faster tweet flow.
awesome video. Thank you for sharing!
Fantastic video. You can also optimise ram needed and computational load by having a redis cluster per region and by tracking where reads come from per user to only rebuild their timeline in regional clusters they are likely to read it from. (Dont worry about rebuilding my timeline in the UK if I only ever read from Australia). Of course you can divide the computation that way too with at least a worker per region. Also you can optimise the read requests themselves by only loading the most recent slice of the timeline and loading in the next slice when you scroll to the very bottom.
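The last suggestion above, loading only the most recent slice and fetching the next one on scroll, can be sketched as cursor-based paging. This is a minimal illustration in Python; the function name `timeline_page`, the integer cursor and the page size of 20 are assumptions made for this example and do not come from the video:

```python
def timeline_page(timeline, cursor=0, page_size=20):
    """Return (items, next_cursor) for a timeline stored newest-first.

    next_cursor is None once the end of the timeline has been reached.
    """
    end = cursor + page_size
    items = timeline[cursor:end]
    next_cursor = end if end < len(timeline) else None
    return items, next_cursor


# The client asks for the first slice, then passes next_cursor back
# when the user scrolls to the bottom.
timeline = list(range(45))  # stand-in for 45 tweet IDs, newest first
first, cursor = timeline_page(timeline)
second, cursor = timeline_page(timeline, cursor)
```

An index cursor like this shifts when new tweets are pushed to the front between requests; a cursor based on the last seen tweet ID avoids that, at the cost of a lookup.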
I learned interesting things from this video but it was also pretty historical. I mean, I'm not sure how much signal I got about his design skills and tech leadership capabilities; I can tell he knows how Twitter works.
Great video. Thank you
your every word is useful and informative!!!
Great video. Thanks. You make system design interesting. Tnks
You save my life! Thanks!
It's a wonderful explanation of the Twitter timeline and user followers, in detail, with respect to system design. It is rare to find such depth on the advanced topics that most top-level organizations ask about to gauge a candidate's confidence. Thank you so much for sharing this perfect video, which I was eagerly searching for. I would like to request one more topic: Google Maps and Gmail system design in detail. Thanks in advance. Best of luck.
thank you for these videos. they are very nice and well explained.
+Copacel You're very welcome!
Your content is amazing. You should create Udemy course on System Design Interview Questions.
This guy is awesome! Subscribed
Thats very helpful, thank you
I liked the video before i finish the first 5 mins! thanks a lot for the great video =)
Thanks! Glad to hear :)
Excellent Vdo for beginners like me.... Thanks a lot man.... :)
Today I was asked this question during an interview, which makes me wonder what is gained by asking something that can pretty much be memorized from videos, just like typical common algorithm questions. I really don't see much gain in companies expecting you to play 'design the internet'. I came up with something similar to this, but replacing Redis with temp tables 🌬️🔥 Thanks for the info.
Thank you so much for this!
+Fahran Kamili You are very welcome!
This is a great video. I have a quick question about using lists in Redis. The video only mentioned storing the tweet_id and sender_id in Bob's list. What about the actual tweet? Is the actual tweet stored in Redis, and do we need to do a lookup by each tweet_id to get the actual text?
I believe tweet gets also stored in redis, considering its only text+links. It wouldn't be much useful if we still have to fetch tweets from DB.
Saving all the tweets for entire duration could be memory and computation intensive.
Hence, I believe twitter uses time expiration mechanism in Redis. redis.io/commands/ttl
I can say this because it takes only a few seconds when you're looking at your feed. On the other hand, a query takes longer when you search for it on Twitter.
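The time-expiration idea in this thread can be illustrated without a Redis server. The class below is a stand-in simulation written for this example, not the Redis API; in Redis itself a timeout is set on a key with EXPIRE and the remaining time is read with TTL. Whether Twitter actually expires timelines this way is the commenter's guess, not something the source confirms.

```python
import time


class ExpiringCache:
    """Tiny in-memory key/value store whose entries vanish after a TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so expiry can be tested without sleeping
        self._data = {}       # key -> (value, expires_at)

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Lazily drop the entry the first time it is read after expiring.
            del self._data[key]
            return None
        return value


cache = ExpiringCache()
cache.set("timeline:bob", [103, 102, 101], ttl_seconds=3600)
```

On a miss, the caller would rebuild the timeline from the persistent store and cache it again, which is the slower path the comment compares to a search query.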
Why do we store 3 times in redis?
Sriram Subramanian To handle failures of cache nodes utilizing a number of replicas
really great video!!!!!!!Keep it up!!!!!!!!!
Love ur tutorials. Please do a system design for a ecommerce website
good video helpfull i will show to my team .thank you
Thank you a lot sir for the amazing content please upload more
Awesome video and stuff.. I was trying hard to get hold of Design Solutions but could not find good content.... Keep it up and continue making great videos... :)
+ankita gupta Thanks, Ankita! I'll do my best :D
Here are few others design / architecture which i am curious to know ... would be great if you could create them in the near future:
1. YouTube architecture and design or similar video streaming websites
2. Amazon or any E-commerce website
3. Instagram
4. Google Search Engine
This is great! Can you talk about how to design a recommendation system like people purchased this product also bought these other products?
Thank you :D Yeah I'll take a note of this
thank you very much for valuable post
This is genius thank you!
Hi, Can you please do a video on designing a service like google docs and how to keep everything in sync, concurrent writes by multiple users etc
The only thing I didn't like about this video is that I can only like it once. What a great Video!!!
really good video. Thanks a lot
Thank you for sharing this descriptive video. This is definitely the cleaner strike, as you were aware of some of the solutions and the tech stack that Twitter has already incorporated. I would, however, be more interested to know how tweets with visual content would be handled. Maybe some exploration of CDN and CMS related solutions? I understand that covering all aspects in one video is not possible for anyone, and I look forward to more content from you. Great going!!
amazing tutorial just subscribed
Great video, thanks a lot. Shouldn't the load balancer connect to servers, and the servers access external persistent memory like Redis?
Redis is in-memory, which is very fast compared with an external database. Moreover, fetching data from an external DB is much more costly.
Redis boxes are servers by themselves. You can decide to put your in-memory caching either in the same machines that are serving the initial HTTPs requests or have a dedicated fleet (most used).
Thank you so much!! :)
It would be great if the architecture for maintaining hashtags in Twitter could also be explained: search, top trending hashtags etc.
Why is it being replicated 3 times on the redis machine though? Why isn't one redis machine enough
To avoid a single point of failure. If there would have been only one Redis machine and it failed before persisting data into the database then data will be lost.
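The single-point-of-failure argument can be made concrete with a small simulation. This is only a sketch in Python under simplifying assumptions: `CacheNode` and `ReplicatedTimelineStore` are hypothetical names, writes go to every live replica synchronously, and a node that was down is not repaired when it comes back.

```python
class CacheNode:
    """Stand-in for one in-memory cache machine."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.timelines = {}  # user_id -> list of tweet IDs, newest first


class ReplicatedTimelineStore:
    """Writes each timeline update to all live replicas, reads from the first live one."""

    def __init__(self, nodes):
        self.nodes = nodes

    def append(self, user_id, tweet_id):
        written = 0
        for node in self.nodes:
            if node.alive:
                node.timelines.setdefault(user_id, []).insert(0, tweet_id)
                written += 1
        return written  # number of replicas that accepted the write

    def read(self, user_id):
        for node in self.nodes:
            if node.alive:
                return list(node.timelines.get(user_id, []))
        raise RuntimeError("all replicas are down")


nodes = [CacheNode("redis-1"), CacheNode("redis-2"), CacheNode("redis-3")]
store = ReplicatedTimelineStore(nodes)
store.append("bob", 101)
nodes[0].alive = False        # one machine fails...
timeline = store.read("bob")  # ...and the timeline is still served
```

With a single node, the same failure would lose Bob's timeline until it was rebuilt from the database.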
Thanks for the great video! I have a question regarding the timeline: does the system store the user's whole timeline from the beginning (in Redis), or only some portion of it? For example, since the last login?
Depends on what your goals/constraints are, right?
According to the source video, they store about the last 800 tweets from a user's home timeline.
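A capped timeline like the one described in this reply is easy to sketch. The example below uses a Python `deque` with `maxlen` as a stand-in; the 800 figure is the one quoted in the comment above, and the helper names are made up for this illustration. In Redis the same effect is commonly achieved with LPUSH followed by LTRIM on the list.

```python
from collections import deque

MAX_TIMELINE_LENGTH = 800  # figure quoted in the comment above


def new_timeline():
    """Create an empty home timeline that keeps only the newest entries."""
    return deque(maxlen=MAX_TIMELINE_LENGTH)


def push_tweet(timeline, tweet_id):
    """Add a tweet at the front; the oldest entry falls off the back when full."""
    timeline.appendleft(tweet_id)


bob = new_timeline()
for tweet_id in range(1000):
    push_tweet(bob, tweet_id)
# bob now holds the 800 most recent IDs, newest first
```

Older tweets are not lost in this scheme; they just have to be fetched from the persistent store instead of the cache.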
Great video. Thanks for taking effort to make it available for us all. A quick question if you can help me understand, wondering when is the tweet stored and where? Who can be responsible for such cases? Thanks!
Thanks Ramon, such good explanations.
1. What is the purpose of the 3 clusters? Is it only so that the fastest response is taken as the result?
2. Are the user (Bob) table and the follower table created only in the Redis cache, not as physical DB tables?
Thanks! 1. speed and replication 2. yes they are stored in a conventional DB too, as a backup so to say.
Thanks a million!
Thank you for the video, Sir! May I ask why you chose to mix the implementation details with the design? Is it standard practice? For instance, you mentioned Redis as an in-memory DB in the diagram. Why not just leave it at "in-memory DB" (the design) and leave out Redis (the details)? Much thanks!
can u please make a video for designing online payment system
I love your system design videos! They are so helpful for me to prepare for interviews. Would you be able to make a video for system design of gmail?
Thank you!! Really loved the way you explained everything: crisp and clear!
Can you please explain:
Your use case: Alice posted tweet. Bob follows Alice. Bob's timeline updated with Alice's tweet in say Redis 1,2,3
Assume one more use case: Kate posted tweet. Bob follows Kate. Bob's timeline updated with Kate tweet in say Redis 4,5,6
When Bob is viewing the timeline and we do the HashMap lookup to find the 3 Redis machines which of the above 3 machines will be returned to display Bob's timeline?
Suppose Alice and Kate stay far away, will Bob's timeline be always updated in Redis 1,2,3 only or can it change?
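One way to reason about this question: if the machines holding a timeline are chosen from the timeline owner's ID alone, then it does not matter who tweeted or from where. The sketch below assumes exactly that; the node names, the SHA-256 based placement and the function names are hypothetical, not taken from the video or from Twitter.

```python
import hashlib

REDIS_NODES = ["redis-1", "redis-2", "redis-3", "redis-4", "redis-5", "redis-6"]
REPLICAS = 3


def replicas_for(user_id):
    """Pick REPLICAS consecutive nodes, based only on the timeline owner's ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(REDIS_NODES)
    return [REDIS_NODES[(start + i) % len(REDIS_NODES)] for i in range(REPLICAS)]


def fan_out(author, tweet_id, followers_of, store):
    """Push a new tweet ID onto every follower's timeline on all of that follower's replicas."""
    for follower in followers_of.get(author, []):
        for node in replicas_for(follower):
            store.setdefault(node, {}).setdefault(follower, []).insert(0, tweet_id)


followers_of = {"alice": ["bob"], "kate": ["bob"]}
store = {}  # node name -> {user_id -> timeline}, simulating the cache fleet
fan_out("alice", 1, followers_of, store)
fan_out("kate", 2, followers_of, store)
# Both tweets land on the same three nodes, the ones returned by replicas_for("bob").
```

Under this assumption, Alice's and Kate's tweets both end up on Bob's three machines, so a read for Bob can go to any one of them.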
Miles to go before you sleep.
Could you please prepare system design and LLD for the following:
1. Simulation of a cricket match, football match etc.
2. Implementation of Queue like Kafka
3. Ecommerce price drop notification system for 50M products
4. Amazon like website and order management system i.e. everything that happens after clicking checkout
5. Elevator system
6. Scrabble
7. Chess game
8. A library for evaluation of expression
Hi, please do a system design for broker/worker architecture, like how would you build a system like Kafka or rabbitmq
amazing Video!Thanks.Can you also add how to design LinkedIn?
What's the rationale for having 3 Redis databases instead of 1? Optimize recovery time based on user location/server load? Thank you!
Do you have any book recommendations for these sort of high level design that we can read and get better?
System design can be understood by reading articles and video blogs. There is no complete books to best of my knowledge.
Designing Data Intensive Applications is an absolutely amazing book!
@@GabrieleCimato Hi! I just wonder if this book is friendly for beginners? Thanks!
@@xiaoshengliu5860 that's a good question, it starts from very basic stuff and then it gets more intricate. I wouldn't say it's for beginners but if you're willing to put in the time it'll give you a deep understanding of modern data management.
@@GabrieleCimato Thanks!!!
Thanks for the nice video, it is informative. I have two questions. 1) You mentioned that data will be duplicated on three Redis servers. How are these three servers selected? Do they intentionally choose three Redis servers in three different locations? For example, one local (US), another in Asia, and another in the EU? Then the question is: what if the user travels to Australia? 2) I may have missed it, but in this design it sounds like one tweet gets duplicated on the home page of every follower. That means a lot of duplication, which will end up using much more memory/storage. Is there any way to relieve this?
Hello, is RabbitMQ is a good choice for user notifications feature in webapps like twitter/fb ?
I think you wanted to say 'everyone that follows you' in the video at 11:13.
Thank you.
Hi, Can you please do a video on designing a service like Uber/Lyft? Including services like location based look-ups for cabs, computing route, fare etc. It seems to be a common interview question. Great job by the way.
+Eldo Joseph yes! Thats exactly what I have planned for the next system design video. Thank you!
Also designing a recommendation system please. Thank you so much for taking the time to make these videos. They are very helpful and resourceful. Glad and lucky to have come across your channel.
+Akshatha Thank you, thats always great to hear! I’ll do my best to make some more of these asap :D
Nice one!
Thank you so much for sharing this. But I think it covers just some parts of the system design. We still have a lot of things to introduce. Redis is an in-memory database, but what if all the replicas are down? We should store the data on disk, with Redis itself or with other NoSQL databases. Should we consider how the servers work and what servers we should have, with separate read and post servers? How do we handle security issues? Should we use a gateway, and introduce SOA tools like service registry and discovery?
+Yasen Zhang Yeah, nobody expects you to cover the ins and outs of such a system in a 45min interview. Architectures like this grow over years. If the interviewer wants you to cover a specific topic then you should dive deeper into it.
Got it. I've never had a system design interview before. But I gonna take one next week. It really helps me . Thank you.
I love the video, thank you for doing this. To design a system like Twitter within the time constraint of an interview, I would probably start with no caching layer whatsoever, just a bunch of distributed databases. In any case, Redis is typically in-memory and has to have a traditional database as its source. Another point I would cover is, once the user is logged in and receives the initial snapshot of their timeline, how Twitter merges live updates into user timelines. Then an interesting question is what happens if half of your friends are local and half are on another continent: how does Twitter merge tweets with different latency profiles? In any case, thanks for doing it!!!
I loved it 😍
can we simulate a basic scalable setup like this in cloud, like AWS, and test the performance ?
Great video!, just wondering why would redis update 3 times if a single request came in?
Thank you for this amazing video.
I have one question:
Are we storing the tweet ID or the actual tweet in the Redis list?
Coz if we are storing the tweet ID, then don't we have to run a SQL query on the tweet table while building the home timeline?
you'd store a map from ID => tweet (redis is a key/val store in memory)
@@howardwang2821 Redis has many complex data structures, we can have lists
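The two replies fit together: a list of tweet IDs per user plus a key/value map from ID to tweet body, both in memory. A minimal sketch in Python using plain dicts and lists as stand-ins for the Redis structures; the names `tweets`, `timelines`, `post` and `home_timeline` are hypothetical.

```python
tweets = {}     # tweet_id -> tweet body (the key/value part)
timelines = {}  # user_id -> list of tweet IDs, newest first (the list part)


def post(tweet_id, author, text, followers):
    """Store the tweet body once, then push only its ID to each follower's list."""
    tweets[tweet_id] = {"author": author, "text": text}
    for follower in followers:
        timelines.setdefault(follower, []).insert(0, tweet_id)


def home_timeline(user_id):
    """Resolve the stored IDs to tweet bodies with in-memory lookups, no SQL query."""
    return [tweets[t] for t in timelines.get(user_id, []) if t in tweets]


post(1, "alice", "hello", followers=["bob", "carol"])
post(2, "kate", "hi there", followers=["bob"])
bobs_feed = home_timeline("bob")
```

Storing only IDs in the lists keeps the per-follower copies small, since the tweet text exists once no matter how many followers the author has.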
Great video. Thanks for the upload. Some thoughts or questions I had was about how the data is being stored in redis. You mentioned a hash table that stores the ip address of the redis machine, but isn't that too over simplistic for an interview? A little more dive deep on data sharding and access will be helpful.
Yeah, you could dive deep into the exact routing mechanisms or even talk about tools doing that. But in the end it all boils down to some sort of map which maps a user to its stored timelines. That map is probably also replicated, under strict latency limits, frequently updated, thread safe etc. I would always start with the basic idea, which is: "You have a primary key (user ID) and you need to know FAST where the timeline is stored". And then build on top of that if time allows. Regarding sharding: I'm currently working on a system design video on the topic of storage mechanisms and I'll try to cover sharding there as well. Stay tuned :D
Thanks that would help. More than once i have been asked to get into details of data sharding, consistent hashing, data replication and handling failures when a node goes down, and what happens to the sharded data, etc etc.
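Consistent hashing, mentioned in this thread, is one common way to build the "user ID to machine" map so that losing a node only moves the keys that node owned. The sketch below is a generic textbook-style ring in Python, not the routing used in the video or by Twitter; the class name `HashRing`, the node names and the choice of 100 virtual points per node are assumptions for this example.

```python
import bisect
import hashlib


def _point(key):
    """Map a string to a position on the ring (a 64-bit integer)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes  # virtual points per node, to even out the load
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self._vnodes):
            p = _point(f"{node}#{i}")
            self._owner[p] = node
            bisect.insort(self._points, p)

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def node_for(self, key):
        """Return the node owning the first ring position clockwise from the key."""
        if not self._points:
            raise LookupError("ring is empty")
        i = bisect.bisect_right(self._points, _point(key)) % len(self._points)
        return self._owner[self._points[i]]


ring = HashRing(["redis-1", "redis-2", "redis-3"])
owner = ring.node_for("user:bob")
ring.remove("redis-2")  # only keys that lived on redis-2 get a new owner
```

With a plain `hash(user_id) % number_of_nodes` scheme, by contrast, removing one node would remap most keys, which is why the ring is the usual answer to the "what happens when a node goes down" follow-up.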
Seems like it's a big omission to just draw a line from the client to the load balancer to redis. Is the load balancer connecting directly to redis and storing that information there? Or is it connecting to a webserver which is storing the information in redis? Seems like the webserver would be worth a box on the diagram if the load balancer is.
Great video! A few things: this is more architecture than system design. Also in an interview the interviewers probably want you to focus on _your_ design rather than what twitter is already doing. And why does twitter create identical copies of the same tweet in each user list, seems redundant? Why not have each tweet only store the tweet id or something instead? Just curious ... :)
The following was not mentioned in the video, I wanted to know if this is an acceptable idea.
Since twitter like most social media is read heavy, we can maintain different servers for read/write operations.
This makes sense cause, we can scale our servers accordingly Ex: if read server is 50 TB, then write server could be 10 TB or similar.
This way, we can also make efficient use of in-memory cache since table reads will be different for read and write and thus mixing up them in same cache doesn't make much sense.
Thank you so much :)...plz make design of Ecommerce like flipkart or amazon!!!
Hi, Can you please do a video on designing a service like BookMyShow ? It seems to be a common interview question
Thanks for such wonderful videos. Is it possible to share shopping cart design question