I got hired by Facebook London due to your videos. This is hands down the best system design channel. Your templates and tips are so simple and elegant that it makes system design far less daunting. Thanks a lot Sandeep!
That's fantastic to hear! Congratulations!
@codeKarle I would like to connect with you on LinkedIn, please accept my invite
@shubhankar915 What's your LinkedIn ID?
Excellent job buddy .. Watched it for the 2nd time today after looking at all 3rd party tools and it made a lot more sense.
This is a masterpiece. It covers pretty much all aspects of the Social Media Platform. And it has a lot of insights into various system aspects.
Keep making great content.
Hello, I randomly came across this channel while preparing for my next interview. The content is amazing and really well structured and presented. On top of that, the use cases covered here while going over the design are also fabulous.
big THANK YOU!
I have learnt a lot from your videos.
Thank you so much.
If possible please continue posting videos on System Design or Low Level Design.
You can help a lot of people to become great Software Engineers
the best system design channel on youtube.
Thank you sir. My friend Anshu Kumar Choudhary, ex-Qualcomm employee is definitely your biggest fan.
Thank you for the explanation!
I have a few questions:
1. What is the purpose of the user & group service? It seems that the Graph Service is responsible for the same thing, because it knows who a user is connected to and how closely.
2. At 28:27 you said that if a user is a celebrity then their posts will not be in Redis, but will be returned from the DB. However, given the number of people connected to a celebrity, there will be a lot of requests to the DB. I thought about putting them in Redis with an eviction priority (normal users' posts are deleted first, then famous users' posts) combined with the timestamp of the post.
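A minimal sketch of the trade-off in question 2, assuming a hybrid model: posts from normal users are precomputed into a cached feed, while posts from followed celebrities are pulled at read time and merged in. The names `Post`, `build_timeline`, and `celebrity_recent` are hypothetical, and the plain dict stands in for whatever store (DB or a small cache) serves the celebrity posts; the video does not confirm this exact shape.

```python
import heapq
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: int
    author: str
    created_at: int  # epoch seconds


def build_timeline(cached_feed, celebrity_recent, followed_celebrities, limit=10):
    """Merge a precomputed feed with celebrity posts pulled at read time.

    cached_feed: list of Post, newest first (the per-user feed kept in a cache).
    celebrity_recent: dict of author -> list of Post, newest first.
    """
    sources = [cached_feed]
    for celeb in followed_celebrities:
        sources.append(celebrity_recent.get(celeb, []))
    # Every source is already sorted newest-first, so a k-way merge is enough.
    merged = heapq.merge(*sources, key=lambda p: p.created_at, reverse=True)
    return list(itertools.islice(merged, limit))
```

Because each celebrity's recent posts are one shared list rather than a copy per follower, caching that single list is cheap compared with fanning the post out to every follower's feed.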
Sandeep, bro you're a legend. Such simplified Uber app design.
Classic coverage of concepts in detail. Very much loved it.
These are very detailed microservices system design - awesome.
Really good explanation.
Would love to see some calculations on data size estimations, servers per service, load estimations on kafka and how to scale that
These videos are amazing! I am so glad I found your playlist. I also saw that you recently started as a Tech Lead at Facebook, congratulations man!!
congrats, are you working in facebook BLR ?
Hey... I usually don't comment, but this System Design playlist is just great! It covers a lot of ground on the design/arch, unlike others who get sidetracked and are not 'to the point' like yours! Great work! Saying this after going through many of them here on YT. BTW, why have you stopped? Haven't seen anything posted in the last 3 months.
Thanks buddy!! We'll get started again soon, not sure how soon :)
Some of us got a bit too busy with work, and because of the whole COVID thing we moved to different cities and it got a bit difficult to record.
Wow, this video is amazing, you have explained everything in so much depth. Thank you and keep sharing :)
It would be good to detail out the schema while talking about data storage in both persistent and in-memory storage.
Also, for likes: I guess every feature-backed action (e.g. post, like, comment) is stored in a corresponding table, so the likes count is simply a select count(*) where id = ... type of query on those tables. But given the volume of records (and possibly data spread across shards), such queries may be inefficient and would put the system under load for what are side tasks. A workaround could be to keep the FR (Functional Requirement) aggregates in a separate table (primarily id and number_clicks, updated on every action or in batches) so that whenever the number of likes or comments is needed, a simple select query is sufficient. For older posts (even those from a few hours ago), this record can be cached in Redis. This option favours performant "availability" over "consistency" ("exact data"). Keeping these things only in Redis may result in loss of the feature (durability-wise) and hurt user experience (as most content creators are attached to these metrics).
Overall a good presentation!
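The counter-table idea in this comment can be sketched roughly as follows. This is an illustration under assumptions, not the design from the video: `LikeCounterStore`, `flush_every`, and the two dictionaries are hypothetical stand-ins for a durable aggregate table and an in-memory buffer of pending increments.

```python
from collections import Counter


class LikeCounterStore:
    """Keeps a per-post like count instead of running count(*) on every read."""

    def __init__(self, flush_every=3):
        self.persisted = {}       # stands in for the durable table: post_id -> like_count
        self.pending = Counter()  # in-memory increments not yet written to the table
        self.flush_every = flush_every
        self._since_flush = 0

    def like(self, post_id):
        self.pending[post_id] += 1
        self._since_flush += 1
        if self._since_flush >= self.flush_every:
            self.flush()

    def flush(self):
        # One batched write per post instead of one write per like.
        for post_id, delta in self.pending.items():
            self.persisted[post_id] = self.persisted.get(post_id, 0) + delta
        self.pending.clear()
        self._since_flush = 0

    def count(self, post_id, exact=False):
        base = self.persisted.get(post_id, 0)
        # exact=False serves the slightly stale persisted value (cheap read).
        return base + self.pending[post_id] if exact else base
```

The default read returns the last flushed value, which is the "availability over exact data" choice the comment describes; anything still in `pending` is lost if the process dies before a flush, which is the durability concern it raises.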
Thanks brother :) I got offers from multiple places! And apart from that got a basic idea of how to go about designing a system.
Thank you so much for taking time to record and share your knowledge 🙏
It is very disheartening to see so few subscribers to this channel. I request all viewers who are gearing up for system design interviews to please at least subscribe or like the video. There are not many good channels covering such complex designs end to end. I see some copy-paste channels that just take content from a webpage and present it in a lucid manner with 100k+ subs, which is fine, but the originality and in-depth info this channel offers are completely unmatched. So much effort goes into architecting these things, with lots of trade-offs before finalizing the approach. Hats off to the Codekarle team for presenting this content. Wishing you 1M+ subs.
I read this channel name as Code Karl. Now I heard it as Karle. 😋
Good job buddy 👍. Subscribed.
Excellent explanation with great depth on each service and functional requirement. Good work !!
It's amazing, I had been looking for this for a long time... and finally found it
Thanks!! Glad to hear that :)
Thanks dude! Everytime I watch your video, it makes me more intelligent 😀🤗
Loved this video. thanks for making it.
Your videos are simply amazing. You cover every aspect of a system design in a very simple manner which not only helps to understand from an interview perspective but also helps in remembering these concepts while actually designing scalable systems in our day-to-day jobs. Thanks for creating such awesome videos. Keep doing the good work and helping the community :)
I think this would be quite close to the complete architecture at Twitter. It's beyond the FRs mentioned in the beginning. Out of curiosity, how long did it take you to come up with this design?
Very good content!!!! No wonder he works at FB now!!
Your videos are amazing. I am a developer at Microsoft and I get different ideas watching your videos.
Are you up for an interview-style system design video? In that video I would ask you questions while you are designing something. This would help the audience understand better.
Prof Amazing videos.
AWESOME Content Sandeep. Thanks a ton!!!
But why did you guys stop?
@codeKarle It will be great if you can also add database schema/data models for all databases. Those help understanding how to shard the database or how to build indexes. Since I am new to System design, I am getting all info from here but need to go to other websites to understand data models for these kind of distributed systems.
Really good video series. Would recommend to friends.
I would advise working on improving the sound quality of the videos.
Really nice! Thanks! :)
Thanks a lot for this video! It's amazing
Good overview of the complete system but in a real interview you should be only working on few prominent features. There is no way anyone can design a complete system in 45 minutes.
Thanks for this detailed video covering all the aspects of this design. Could you also share the calculations for storage, network bandwidth, latency, and so on?
Good job on the explanation. I appreciate it. It was very helpful!!
Thanks for the video. I have one doubt regarding the Archival Service. I didn't get the need for it when we have all the post-related data stored in the Posts Cassandra. From what I see, Redis here contains info about a user's news feed, which holds the posts of their friends in order. Why would we need to persist this timeline information?
This is really good content. Please make a video on Dropbox/GDrive arch
awesome..great explanation
Your videos are really nice... Please add videos for LLD also
Awesome video
It's awesome content that mainly focuses on the modules. Please write a blog for this as well, like the one for Twitter.
Why do these big companies use relational DBs to store the graph relationship? I mean, why not use graph DBs like neo4j? One of the reasons I've read about is that these companies are much older, from even before graph DBs came into existence, and moving to a new stack so late would not be easy. Another reason I've read is that these graph DBs are NoSQL, hence don't provide most of the guarantees that SQL DBs do. Any thoughts and suggestions on this?
Here are my thoughts: 1. Graph DBs are new and have yet to mature enough to handle this kind of scale. 2. NoSQL has its own pros and cons; one should not use NoSQL for all use cases. Also, a lot of effort has been put into tuning the existing data stores, which work pretty well at this scale.
Amazing video! Thanks...
I don't see much difference between the Twitter system design and the Facebook design. What extra are we doing here, given that the scale of Facebook will be much higher than Twitter's?
Why 2nd mysql db cluster at 13:00 for user graph service?
Can you please do this - "Design a logging system where many servers are writing to same log. Alert on some keywords."
Great explanation, can you please make a system design for payment gateway like PayPal
Really great explanation. If possible spend some time on Data model, Data/DB size, capacity set of details also
Sandeep, one thing is not clear: can we use a NoSQL DB for user onboarding also? The reason being that there doesn't seem to be any strong ACID guarantee requirement. What's your view on that, @codeKarle?
great work sir!
Please make video on online games like Chess/Ludo/Poker/Snooker.
Great video...just curious would it be possible to annotate your design with different level of experience along the way? I get that the entire picture would be your recommendation but how about for someone who is interviewing for engineering manager I vs director :)
Best explanation, though I have some doubts. Which partition key should we use for Cassandra here?
You mentioned in the video that we can use "userId" but definitely not "date-range".
My question is: won't "userId" also create "read hotspots" when fetching posts of "famous users"? Can't we partition by "postId"? In that case we would have to aggregate data when fetching the posts of a particular "userId", since they will be spread across the Cassandra cluster; there is no inbuilt aggregation feature in Cassandra, so we would have to do it manually, which adds some complexity to the system.
Also, in that case, is it a good idea to add a secondary index on "userId" for the Cassandra clusters?
Can you please share your thoughts, or can someone help with this?
First point: don't you think Cassandra won't handle frequent updates like likes and comments? If you have ever worked with Cassandra, this is the worst use case for it. For anything that updates this frequently, Cassandra is not the correct DB candidate. The partition key is secondary to this question. Cassandra requires almost equal partitioning, and that logic is subjective. A graph DB should have been used in this use case, because Cassandra will go down in production with such frequent updates.
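One middle ground between partitioning by "userId" and by "postId" is a composite partition key that adds a time bucket to the user ID. This is a different technique from the ones named in the comment or the video, shown here only as a sketch; the monthly bucket and the helper names `partition_key` and `partitions_to_read` are assumptions.

```python
from datetime import datetime, timezone


def partition_key(user_id, created_at):
    """Composite key (user_id, 'YYYY-MM') for a post created at an epoch timestamp."""
    ts = datetime.fromtimestamp(created_at, tz=timezone.utc)
    return (user_id, ts.strftime("%Y-%m"))


def partitions_to_read(user_id, start, end):
    """List the monthly partitions to query for one user between two epoch timestamps."""
    s = datetime.fromtimestamp(start, tz=timezone.utc)
    e = datetime.fromtimestamp(end, tz=timezone.utc)
    keys = []
    year, month = s.year, s.month
    while (year, month) <= (e.year, e.month):
        keys.append((user_id, f"{year:04d}-{month:02d}"))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return keys
```

This bounds the size of any one partition and lets a user's history be read bucket by bucket, but the newest bucket of a famous user is still read-hot, so it would likely still need a cache in front of it.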
Thank you!
Why did you stop making new videos ?
Awesome video! Could you pls help me to understand why is spark streaming required for likes and comment feature?
Can you provide more details about the structure for storing the timeline in the Cassandra DB and how to choose the partition key to avoid hotspots?
Thank you for the great detailed video. How would you design Facebook newsfeed. Is timeline considered newsfeed? If so, what pieces would be needed to design newsfeed.
Very well explained! I am confused in 1 part. How do we manage the DB clusters or service clusters. Don't you think k8s or zookeeper would be needed? Can you please elaborate?
Great video, covered almost everything. However, I feel the video could have been a bit shorter.
it's awesome !!
May I know why we may not prioritize a Graph DB like Neo4j to hold friends/followers data?
Better than paid content
Posts are saved in Cassandra. We need to
1. Get post by id
2. Get all posts by user_id
3. Get all posts made by user_id where createdAt > X
If we have chosen the partition key to be post_id, then we can support only Query1. If we have chosen partition key as user_id (not very good as it can lead to imbalanced servers) then we can support Query 2 and 3 but not 1.
One way to support all the queries is by having multiple copies of data but that will take huge storage.
What are your thoughts on this?
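The "multiple copies" option mentioned above can be sketched as two query-specific tables written together. This is an in-memory illustration under assumptions, not Cassandra code: `PostStore` and its dictionaries are hypothetical stand-ins for one table keyed by post_id and another partitioned by user_id and ordered by createdAt, with integer post IDs and timestamps assumed.

```python
import bisect
from collections import defaultdict


class PostStore:
    def __init__(self):
        self.posts_by_id = {}                   # "table" 1, keyed by post_id
        self.posts_by_user = defaultdict(list)  # "table" 2, keyed by user_id, sorted by created_at

    def save(self, post_id, user_id, created_at, body):
        # Every write goes to both tables; that is the storage cost of this approach.
        self.posts_by_id[post_id] = {"user_id": user_id, "created_at": created_at, "body": body}
        # The second table stores only (created_at, post_id), not the full body.
        bisect.insort(self.posts_by_user[user_id], (created_at, post_id))

    def get_post(self, post_id):
        """Query 1: get post by id."""
        return self.posts_by_id.get(post_id)

    def posts_for_user(self, user_id, created_after=None):
        """Queries 2 and 3: all posts by user, optionally only those with createdAt > X."""
        rows = self.posts_by_user.get(user_id, [])
        start = 0
        if created_after is not None:
            start = bisect.bisect_right(rows, (created_after, float("inf")))
        return [post_id for _, post_id in rows[start:]]
```

Storing only the ID and timestamp in the per-user table keeps the duplicated data small; the full post is then fetched by ID, at the price of a second lookup.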
please make system design video on type ahead suggestion service
First of all, great video 👌 Keep doing the good work. I have seen your Twitter video as well; you mentioned that for large data you prefer Cassandra. But why not MongoDB? Their website claims it can handle huge data. Then why Cassandra?
Mongo is decent enough, but the choice of Database ideally depends upon the data structure and the query pattern, and in this case, the query pattern would be of the form that Cassandra is good at since all the queries are on message Id.
Have a look at this Video: ruclips.net/video/cODCpXtPHbQ/видео.html
It'll give you good insight into which DB you should choose when.
Shouldn't We be using neo4j for graph service db
Neo4j at this scale starts giving a lot of problems, so it's probably not a good idea. But if we have a much smaller scale, then it's a much better choice because of its powerful feature set.
This can be a tradeoff we make, based on the Non Functional requirements.
How shall we use the load balancer? Is it predesigned in the service?
Regarding trends. Spark streaming gets us the tags with their count, how exactly this data is used. I believe users would be more interested in posts corresponding to trending tags right? where exactly that information comes from?
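One plausible reading of this question, not confirmed by the video: alongside the per-tag counts, the pipeline could keep a tag-to-posts index, so that a trending tag can be resolved to the posts that carry it. The sketch below does this for a single window of posts in plain Python rather than Spark; `trending` and its input shape are hypothetical.

```python
from collections import Counter, defaultdict


def trending(posts, top_k=2):
    """posts: iterable of (post_id, [tags]) seen in the current window.

    Returns [(tag, count, [post_ids])] for the top_k most frequent tags.
    """
    counts = Counter()
    posts_by_tag = defaultdict(list)
    for post_id, tags in posts:
        # dict.fromkeys de-duplicates tags within a post while keeping their order.
        for tag in dict.fromkeys(tags):
            counts[tag] += 1
            posts_by_tag[tag].append(post_id)
    return [(tag, n, posts_by_tag[tag]) for tag, n in counts.most_common(top_k)]
```

For example, `trending([(1, ["a", "b"]), (2, ["a"])], top_k=1)` returns the tag `"a"` with a count of 2 and the post IDs `[1, 2]`.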
Can you please make HLD of Book my Show ?
I think it would have been great if you had added the DB schema and API definitions.
Hi I have few queries.
1. What if we have to query posts by userId? Since we are using postId as the key in Cassandra, getting the posts for a userId would need a full scan of the DB.
2. Can't we use NoSQL for the User table? There is no particular advantage in using MySQL, as the User DB cluster cannot join with other tables.
Please comment on my queries.
Thank you for your amazing videos. Wondering how the Post Ingestion Service would handle such high throughput. Is the idea to maintain replicas? Also, can we put a queue after the Post Ingestion Service, before sending to the Post Service, and save the data in the database from the Post Service?
Basically Post => Kafka => Post Service (Kafka Consumer) => Save to Database (Cassandra). This is what I was thinking. Is anything wrong with this approach? Also, the Post Service would talk to the Asset svc and shortUrl instead.
I believe to handle more posts, we'll have to add more hardware, both on the service and the DB. The DB would take care of maintaining the replicas. We'll not have to worry about it explicitly, unless you are looking from a DBA's perspective.
The only problem with adding a Kafka before persisting a post is that if we somehow lose the data before persisting it, let's say if the Kafka consumer dies, it is lost forever. You can build a proper acknowledgement/retry mechanism to sort this out, but that complicates the implementation. And data loss is BAD!
One more potential drawback is that if the lag in Kafka is high, the user might not see their post immediately, which is again bad user experience.
So I would rather use a Kafka to spread information about that post to others, but not before persisting it. In this approach, in the worst case, it does not show up for some people, or maybe it's not tagged correctly, but it's still in the system and will show up sometime for sure, maybe a bit later.
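The ordering described in this reply, persist first and publish afterwards, can be sketched as below. It is a minimal illustration under assumptions: `PostIngestion` is a hypothetical class, the dict stands in for the posts database, and `publish` is any callable standing in for a Kafka producer.

```python
class PostIngestion:
    def __init__(self, db, publish):
        self.db = db            # dict standing in for the durable posts store
        self.publish = publish  # callable standing in for a Kafka producer

    def create_post(self, post_id, body):
        # 1. Durable write first: once this succeeds, the post cannot be lost.
        self.db[post_id] = body
        # 2. Best-effort event so other services can fan the post out.
        try:
            self.publish({"post_id": post_id})
            return "published"
        except Exception:
            # The post is saved; only its distribution is delayed and can be retried.
            return "saved_pending_fanout"
```

If publishing fails, the author's post still exists and can be re-announced later, which matches the "worst case, it shows up a bit later" outcome described above.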
@codeKarle Totally makes sense. Can you please add a more detailed System Design video on Trending Topics?
Also while adding new connection/Friend, was thinking FB uses Graph Database like neptune or neo4j. Is it not the case ?
Why are we pushing data from Spark to Hadoop? Spark can do the analytics work, or we can push data directly from Kafka to Hadoop for batch processing.
We could do it either way :)
Just that the idea was to keep the data in hadoop as well for further processing and also do some counting while reading from kafka.
Great explanation. 👍. Only direct message requirement is missing.
I didn't see the need of short url service
Great content but audio is pretty bad to be honest
What is the timeline service?
It's the service that helps you to generate the timeline for a given user. Have a look at this article, hopefully it should get clear after this.
www.codekarle.com/system-design/facebook-system-design.html
why there is no comment here?
Really they expect to make this entire architecture and explain in 30 min ?
It would depend on the role, ideally. For a usual software engineer position, or an engineer with a few years of experience, the expectation would never be to build this whole thing, but if you are applying for an architect role, I believe the person is expected to build such a system with ease.
I think the whole approach to this problem is wrong in starting from FB stats. FB never started with 1.7B users, nor did engineers initially design it to handle that many users. Also, in a real interview no one is going to ask you to keep the localization issue in mind. There is no way one can draw so many components of the system. Probably a few, with a deep dive into one of those.
You should always consider the stats. We are trying to solve the problem of how to build a system that can scale up to these numbers. You're right, this is not how FB was built, but if we were to build it now, how could we do it?
Another common question many interviewers ask is "If you were to build the system you built in your current company again, how would you build it to make it better?"
And yes, we could skip certain Functional requirements, but if you have covered most of the things and have time in the interview, people do talk about not just localization, but things beyond what is covered in this video as well :)
@codeKarle Absolutely correct
This is long winded and boring