This is the best Instagram system design video without any fluff. One correction: CDNs typically cache content based on its popularity and request frequency (so, not just for celebrities). So a pic posted by a regular user that goes viral is also cached by the CDN.
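A minimal sketch of the point in this comment, assuming a toy edge cache that admits an object once it has been requested a threshold number of times, regardless of who posted it. `PopularityCache`, `admit_after`, and the `origin` dict are hypothetical names for illustration; real CDNs use their own admission and eviction policies.

```python
from collections import Counter, OrderedDict


class PopularityCache:
    """Toy edge cache: admits an object after `admit_after` requests, evicts LRU."""

    def __init__(self, origin, capacity=2, admit_after=2):
        self.origin = origin            # hypothetical origin store: key -> bytes
        self.capacity = capacity
        self.admit_after = admit_after
        self.hits = Counter()           # request count per key
        self.cache = OrderedDict()      # cached objects in LRU order

    def get(self, key):
        self.hits[key] += 1
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key], "HIT"
        value = self.origin[key]         # cache miss: fetch from origin
        if self.hits[key] >= self.admit_after:
            self.cache[key] = value
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)  # evict least recently used
        return value, "MISS"


origin = {"viral_pic_by_regular_user": b"...", "rarely_viewed_pic": b"..."}
cdn = PopularityCache(origin)
statuses = [cdn.get("viral_pic_by_regular_user")[1] for _ in range(3)]
```

Here the picture becomes cacheable purely because it is requested repeatedly; the poster's follower count never enters the decision.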
Great content, Pratiksha, and I loved your structured approach to the overall design. May I point out a small mistake in the storage estimation? 100 million posts per day with an average post size of 10 MB would be 100 * 10^6 * 10 * 10^6 B (taking 1 MB = 10^6 B) = 1000 * 10^12 B = 1000 TB.
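The arithmetic in this comment can be checked with a few lines, assuming decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes, 1 PB = 10^15 bytes). The constant names are only for illustration.

```python
POSTS_PER_DAY = 100 * 10**6      # 100 million posts per day
BYTES_PER_POST = 10 * 10**6      # 10 MB per post, with 1 MB = 10^6 bytes

bytes_per_day = POSTS_PER_DAY * BYTES_PER_POST
terabytes_per_day = bytes_per_day / 10**12
petabytes_per_day = bytes_per_day / 10**15

print(f"{terabytes_per_day:.0f} TB/day, i.e. {petabytes_per_day:.0f} PB/day")
```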
This is by far one of the most comprehensive and concise system design videos for Instagram I've ever seen. Well done!
Great to hear!
You literally made system design so simple, thank you so much.
Thank you so much! This made my day :)
Very simple and easy to understand! Looking forward to more design videos. Thanks!
Finally a properly structured sys design video with all the pieces in the right order.
One piece of feedback: you went a little fast on a few parts without proper reasoning for the choices you are making.
Like why the fanout service is needed to update users' feeds and how it will work. How will multiple containers be able to talk to multiple containers of the upload media service before you put in the message queue?
How will the CDN work only for celebrities? Why not for all?
One of the most important questions in these system designs is how to speed up feed generation by pre-generating the feed instead of generating it at runtime and putting the load on the DB. That's the part you should have spent some time on.
Overall, a great structured video. 10x better than the videos already available on YouTube. Subscribed.
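A minimal sketch of the pre-generated feed idea raised in this comment (often called fanout on write), assuming in-memory dicts in place of a feed cache. `follow`, `fan_out`, `get_feed`, and `FEED_LIMIT` are hypothetical names, not the design from the video.

```python
from collections import defaultdict, deque

FEED_LIMIT = 100  # assumed cap on the precomputed feed length

followers = defaultdict(set)                            # author -> follower ids
feeds = defaultdict(lambda: deque(maxlen=FEED_LIMIT))   # user -> newest-first post ids


def follow(follower, author):
    followers[author].add(follower)


def fan_out(author, post_id):
    """Write path: push the new post id into every follower's precomputed feed."""
    for user in followers[author]:
        feeds[user].appendleft(post_id)


def get_feed(user, limit=10):
    """Read path: return the precomputed feed without querying the post DB."""
    return list(feeds[user])[:limit]


follow("alice", "bob")
follow("alice", "carol")
fan_out("bob", "post-1")
fan_out("carol", "post-2")
alice_feed = get_feed("alice")
```

The cost moves from read time to write time: each post triggers one write per follower, which is why accounts with very large follower counts are often handled differently, for example by merging their posts in at read time.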
My question exactly
Awesome 😮 information, ma'am. It's my first step in learning system design. I think it's a great start ❤
Good one yet again. One suggestion: can you please make a dedicated video on fault tolerance? Most videos just contain a template of server and database replication as the default fault tolerance strategy, so diving deep into how the system recovers and rolls back when a distributed transaction spans microservices would help viewers a lot. Also, the naming of the microservices could be a little more intuitive; if it was intentionally kept simple, kindly ignore this. Also, please justify the tradeoff of choosing NoSQL for storing the Posts data, as NoSQL is not transactional in nature.
This is excellent and to the point! BTW, which drawing tool do you use/suggest?
Amazing Video! What's the tool that you're using?
The best system design interview I have seen and this gives me confidence for the interviews
Thank you so much for sharing that! Gives me encouragement to do more of these :)
Very useful thank you
Clear😊 thanks
Excellent, you made the system design so simple. Thank you so much. Keep posting good content.
Thank you so much :)
Clear, concise and structured explanation. Thank you so much.
Glad it was helpful!
Thank you for the video! Could you please share which libraries you use with Excalidraw for system design?
Hey. Your system design video is really helpful and much simpler to understand. The approach and sequence seem great. Just a suggestion: please give more details about using Elasticsearch, Kafka or any messaging queue, and also Spark or Hadoop if they are needed anywhere in the design.
You're simply the best in these system design tutorials
Thank you, iSaac! Appreciate the feedback! Will upload more videos soon!
Awesome content, the best of all I have watched so far. Simple and relatable.
Glad you liked it!
Thank you for this content. Hope to see more system design interview questions covered by you.
More to come soon!
The best system design videos around. I really like the method of starting small and dealing with high throughput and availability next.
What does "create post service simply adds media to the queue" mean? Just pushing binary data to the queue is a bad idea, so we need to save this data in temporary storage (disk, maybe) and push a link to that location as a message in the queue. Or what did you mean?
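A minimal sketch of the approach this comment describes, storing the media first and putting only a reference on the queue. It assumes an in-memory dict standing in for object storage and `queue.Queue` standing in for a real message broker; `create_post`, `process_next`, and the `uploads/` key prefix are hypothetical.

```python
import queue
import uuid

blob_store = {}              # stands in for object storage or a temp disk area
media_queue = queue.Queue()  # stands in for Kafka, SQS, or a similar broker


def create_post(user_id, media_bytes):
    """Store the media first, then enqueue only a small reference message."""
    media_key = f"uploads/{uuid.uuid4()}"
    blob_store[media_key] = media_bytes
    media_queue.put({"user_id": user_id, "media_key": media_key})
    return media_key


def process_next():
    """Worker: read the reference, then fetch the bytes from storage."""
    message = media_queue.get()
    media = blob_store[message["media_key"]]
    return message["user_id"], len(media)


key = create_post("alice", b"\x00" * 1024)
result = process_next()
```

The queue message stays a few dozen bytes no matter how large the upload is, which keeps the broker's storage and throughput independent of media size.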
Hands down the best Instagram system design video. Would you also be able to do a system design video on a trading system or a position keeping system?
Your videos are very helpful. Please continue doing more videos. Please post videos on microservices and Kubernetes.
Thank you so much! More videos to come :)
Very good
I find your system design videos to be very pragmatic. Can you do a video for technical retrospective as well. Would love to hear how you deep dive into a previous project
That's a great idea! Once I have enough system design videos, I will consider this a next topic. Thanks
Thanks, Pratiksha, for always delivering informative content.
Thank you so much @machinelearning6726 for sharing the feedback!
One question: why do we create a separate database for each feature (user, post, interactions, etc.)? I got the idea behind the relationships one (graph DB). But if we maintain separate databases for each type of object, we need to make 3 additional DB calls to fetch the data.
Very helpful
What tool do you use for designing systems in your videos?
4. We pick NoSQL because of the nested structure of comments, but then you provided a flat schema for the Interactions database. Am I missing something?
Thanks for making this video. Was the ending abrupt? Is there a part 2 of this?
I have covered all the content, there is no part 2. Thank you for pointing that out! It’s good feedback, I will add a proper closing in the next videos!
Thank you, Pratiksha, for the quick and informative content. Please make videos on different categories of system design questions.
Thank you 😊
You should have talked about the fanout service and how it will pre-create the user feed.
Why didn't you use NoSQL for posts, since we are OK with eventual consistency and it also scales well?
Your channel is underrated.
For storage, should it be 1000 TB or 1 PB per day? 100M * 10 MB
Please put more effort into researching and reading about existing social media applications before coming up with a design. Fetching everything via the get feed service, even media files, seems very wrong, as that data can be fetched independently (via the CDN or some other read post service) once the postIds are retrieved. The same goes for the interaction DB data, which can also be fetched independently. If all of this were returned by a single API, it would take forever.
Also, can posts and interaction data reside in a single database? Say we need to retrieve a post and all its interactions: a network call from the post service to the interaction service is really costly. Instead, if both are stored locally in the same database, retrieval is much more convenient. We can also add a caching layer on top of the posts and interactions data for frequently accessed celebrity posts that are updated least frequently.
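A minimal sketch of the alternative suggested in this comment: the feed endpoint returns only post IDs and media URLs, one page at a time, and the client fetches the media from the CDN separately. `CDN_BASE`, `get_feed_page`, and the offset-style cursor are assumptions for illustration; production APIs usually use an opaque cursor.

```python
CDN_BASE = "https://cdn.example.com"   # hypothetical CDN host

# Hypothetical precomputed feed for one user, newest first.
feed_post_ids = [f"post-{n}" for n in range(25, 0, -1)]


def get_feed_page(cursor=0, page_size=10):
    """Return one page of post ids plus media URLs; media bytes are never inlined."""
    page = feed_post_ids[cursor:cursor + page_size]
    has_more = cursor + page_size < len(feed_post_ids)
    next_cursor = cursor + page_size if has_more else None
    items = [{"post_id": pid, "media_url": f"{CDN_BASE}/media/{pid}.jpg"} for pid in page]
    return {"items": items, "next_cursor": next_cursor}


first = get_feed_page()
second = get_feed_page(first["next_cursor"])
last = get_feed_page(second["next_cursor"])
```

Interaction counts could be fetched the same way, as a separate call keyed by the post IDs on the current page.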
very helpful, thanks
Thank you!
The problem with graph storage is that it's going to be a big mess to solve distributed queries.
It seems you treat user as a DB and not a table. Why so? Why is every entity treated as a separate DB with its own table? If it is microservices, we need to talk about the overhead of the services talking to each other and add a gateway. Am I missing something?
I found the information you provided to be very helpful and informative, especially considering it only took 12 minutes. Would it be possible for you to share a PDF version of this material? That would be greatly appreciated.
Hello Ronish,
Thank you for sharing the feedback! I would be happy to create PDFs for the new videos I make. For past videos, I will check and see if it’s easy to make PDFs out of what I have.
Really helpful!
Glad you think so!
very nice SD videos you are doing :)
Thank you 🙏🏼
Zerodha grow and any fintech design please
How do I learn system design? Do you recommend any books?
Hello @vanvothe4817,
You will find all the resources in the "How to Ace a System Design Interview" video. At the beginning of the video, I have shared important concepts that are useful to learn, but you can skip over that and go directly to the resources mentioned later in the video.
Hope this helps :) All the best!
Correction: the storage requirement would be 1 petabyte.
I don't understand the Interaction database design part. You are storing the PostID and the UserID along with the comment, and you mention nesting. But how will you support nesting this way?
Also, if you're posting a comment, then the Post DB would get updated, right? So why would you need another interaction DB to begin with... hmmm
I would have made the Post DB NoSQL and added nesting there for comments. You could make the POST API call update the Post DB given the same parameters and the index of the new comment on the post.
Really informative and well-structured video! Also, which tool are you using for the high-level design?
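One common way to get nesting out of a flat schema, sketched here under the assumption that each comment row carries an extra `parent_id` column (not part of the schema shown in the video). The row layout and `build_thread` are hypothetical.

```python
from collections import defaultdict

# Flat rows: (comment_id, post_id, user_id, parent_id, text); parent_id is None for top-level.
rows = [
    (1, "post-1", "alice", None, "Nice photo!"),
    (2, "post-1", "bob", 1, "Agreed."),
    (3, "post-1", "carol", 2, "Same here."),
    (4, "post-1", "dave", None, "Where is this?"),
]


def build_thread(rows, post_id):
    """Rebuild the nested comment tree for one post from flat rows."""
    children = defaultdict(list)
    for comment_id, pid, user_id, parent_id, text in rows:
        if pid == post_id:
            children[parent_id].append({"id": comment_id, "user": user_id, "text": text})

    def attach(parent_id):
        nodes = children.get(parent_id, [])
        for node in nodes:
            node["replies"] = attach(node["id"])
        return nodes

    return attach(None)


thread = build_thread(rows, "post-1")
```

Compared with embedding comments inside the post document, this keeps each comment an independent write, so a busy post does not turn into one ever-growing record.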
Thank you, Divya! Are you asking for the editor I am using? It’s Excalidraw. If you want to know more about the interview preparation tools, then watch the “crack System design interview” video. Towards the end there are some great resources!
@pratikshabakrola Sure, will definitely watch it. Thanks for the suggestion, Pratiksha.
Where did you get your accent?
Ha ha! I am not sure! I think I pick up accents pretty quickly!
What is the tool that you are using to draw?
I am using Excalidraw. It's a great tool for practicing interviews or any real-time collaboration. It also has tons of built-in libraries of graphics.
If the data in the Post DB is archived (for example, every 6 months as you have mentioned), how can older data (> 6 months) be accessed if a user tries to access an older post?
We can have an on-demand retrieval mechanism that restores archived content when a user tries to access it. The user will experience a slight delay while the image is loading.
Ex: Amazon S3 Standard-Infrequent Access (S3 Standard-IA) could be an ideal candidate. Please read more here: aws.amazon.com/s3/storage-classes/
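A minimal sketch of the on-demand retrieval idea in this reply, assuming two in-memory dicts standing in for a hot store and an archive tier. `read_post_media` and both store names are hypothetical, and no real storage API is called.

```python
hot_store = {"post-new": b"recent image"}         # posts newer than the archive cutoff
archive_store = {"post-old": b"archived image"}   # posts moved out after the cutoff


def read_post_media(post_id):
    """Serve from the hot store; on a miss, restore from the archive and keep it hot."""
    if post_id in hot_store:
        return hot_store[post_id], "hot"
    media = archive_store[post_id]   # the slower path in a real system
    hot_store[post_id] = media       # subsequent reads skip the archive
    return media, "restored"


first_read = read_post_media("post-old")
second_read = read_post_media("post-old")
```

Only the first read of an old post pays the slower archive lookup; later reads are served from the hot store until the post is archived again.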
Why are all the microservices tightly coupled, talking to each other's databases? It's a very basic design.
Hi, which app/website are you using to create this diagram?
She has a video named "Ace the system design interview" (or something similar), where she shows at the end that she is using excalidraw. She also shows which shape libraries she is using.
I am using Excalidraw
@jelenamarusic3641
Thank you for helping others with these questions :)
This has a lot of problems:
1- This design is for a monolith; big systems use microservices, and system design for microservices is different.
2- In the database, the image is just another table, and it's a very bad example of system design.
3- The API requests must have pagination; there is no limit on the results.
And ...
Something wrong when calculating storage of posts per day.
Are you a human or a robot?
Do I sound like a robot? lol
You've lost one 10 in your calculation, it's actually 1000 TB/day
Thank you for the video! Could you please share which libraries you use in Excalidraw for system design?