Can somebody/@ByteMonk explain the calculations? According to me
Storage Capacity per month = Avg Image size * Images uploaded per month by 1 user * MAU (10% of 10^9 registered users) = 0.005 GB * 500 * 10^8 = 250 PB
Network Capacity = Total data transferred / Time taken = Concurrent Uploads * Avg Image size / avg image upload time = 100000 * 0.005 GB / 5 = 100 GBps (800 Gbps)
Also, 1 GBps == 8 Gbps
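For anyone who wants to re-run these numbers, here is a minimal sketch of the arithmetic in this comment. The inputs are the ones quoted above; treating the "5" as seconds and using decimal units (1 PB = 10^6 GB) are assumptions, and the function names are just illustrative. With those assumptions the storage figure comes out at 250 PB per month and the upload bandwidth at 100 GB/s, which is 800 Gbps rather than 100 Gbps.

```python
# Back-of-the-envelope check of the figures quoted in the comment above.
AVG_IMAGE_GB = 0.005               # 5 MB average image size
UPLOADS_PER_USER_PER_MONTH = 500   # images uploaded per user per month
MONTHLY_ACTIVE_USERS = 10**8       # 10% of 10^9 registered users
CONCURRENT_UPLOADS = 100_000
UPLOAD_TIME_S = 5                  # assumption: the "5" means 5 seconds
GB_PER_PB = 10**6                  # assumption: decimal units


def monthly_storage_pb() -> float:
    """Storage added per month, in petabytes."""
    total_gb = AVG_IMAGE_GB * UPLOADS_PER_USER_PER_MONTH * MONTHLY_ACTIVE_USERS
    return total_gb / GB_PER_PB


def upload_bandwidth_gbytes_per_s() -> float:
    """Peak upload bandwidth in gigabytes per second."""
    return CONCURRENT_UPLOADS * AVG_IMAGE_GB / UPLOAD_TIME_S


def to_gbits_per_s(gbytes_per_s: float) -> float:
    """Convert gigabytes per second to gigabits per second (1 byte = 8 bits)."""
    return gbytes_per_s * 8


print(f"storage per month: {monthly_storage_pb():.0f} PB")
print(f"upload bandwidth:  {upload_bandwidth_gbytes_per_s():.0f} GB/s")
print(f"upload bandwidth:  {to_gbits_per_s(upload_bandwidth_gbytes_per_s()):.0f} Gbps")
```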
Idk, the calculations in the video didn't make sense to me either.
Benefit to every techie. Kudos
That is a great design. How would you handle the use case of a user scrolling a feed? We have to track that the user has viewed a post, so that we don't show it again on their timeline. How would you handle this use case?
Instead of creating the feed on the fly, wouldn't it be less "stressful" for the DB and servers to use a NoSQL DB to store a "pre-built" feed? Meaning that when a post/comment/like is created, a worker updates a feed object in a NoSQL DB?
This way, fetching the feed would just mean accessing one object instead of "joining" massive tables. What do you think of this approach?
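To make the idea concrete, here is a rough sketch of what such a pre-built feed could look like. It is not how the video implements it: the `FeedStore` class, its method names, and the feed size cap are all hypothetical, and plain in-memory dictionaries stand in for the NoSQL store. The point is only that the write path does the work (one update per follower) so the read path is a single lookup.

```python
from collections import defaultdict, deque

FEED_LIMIT = 100  # hypothetical cap on stored feed entries per user


class FeedStore:
    """In-memory stand-in for a NoSQL store holding one feed object per user."""

    def __init__(self, feed_limit: int = FEED_LIMIT):
        self.followers = defaultdict(set)  # author_id -> ids of their followers
        # One bounded, newest-first feed per user; the oldest entries fall off.
        self.feeds = defaultdict(lambda: deque(maxlen=feed_limit))

    def follow(self, follower_id: str, author_id: str) -> None:
        self.followers[author_id].add(follower_id)

    def on_post_created(self, author_id: str, post_id: str) -> None:
        """What the worker would do: push the new post into every follower's feed."""
        for follower_id in self.followers[author_id]:
            self.feeds[follower_id].appendleft(post_id)

    def get_feed(self, user_id: str) -> list:
        """Reading the feed is one lookup, with no joins at request time."""
        return list(self.feeds.get(user_id, ()))


store = FeedStore(feed_limit=2)
store.follow("bob", "alice")
for post_id in ("p1", "p2", "p3"):
    store.on_post_created("alice", post_id)
print(store.get_feed("bob"))
```

The trade-off is that every post costs one write per follower, which gets expensive for accounts with very large follower counts.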
Nice animation and diagram. I could easily understand it. Really appreciate the effort.
Watched the video after checking out other videos and blogs. Thanks.
There is a problem with the user feed design. You mentioned that the service first fetches the followers list from the metadata service and then gets the posts corresponding to each follower. Imagine a social media addict following 100K users: do you fetch 100K users and all their posts? The feed service will take forever to load.
In the schema design represented as relational, the design is not normalised.
We already have a relation between user and post in the post table.
You could have used the id of this table as a foreign key in the comments and likes tables.
My question is: do we really need so many microservices for like, comment, feed, etc.? Or one service with multiple endpoints/APIs? What are the pros and cons?
For upload, I think the endpoint should go through the user, something like this: POST user/{userId}/Image
Excellent video with detailed information.
Very well explained the high level overview!
Really awesome video. Thank you so much for the great explanation 🔥🔥
Sorry, this may be an out-of-context question. Just curious to know what tools you use for the designs and diagrams.
Is it a beginner mistake, or does the maths in the requirement calculations not add up?
Thank you bro, it's very helpful for me. ❣
Thank you, you did a great job. It is beneficial.
For news feed generation, why is the request not using the news feed cache that was asynchronously generated by the fanout service for the user's followers? Or is it the hybrid approach that you mentioned here? That is, for users who have a lot of followers the feed service queries the metadata and image services to get the feed in real time, but for users who have few followers the fanout service creates a news feed cache?
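Here is one way to picture the hybrid approach this question describes. This is a sketch under assumptions, not the design from the video: the `HybridFeed` class, the follower threshold, and the integer timestamps are all made up for illustration. Posts from authors with few followers are pushed into each follower's cached feed at write time, while posts from authors with many followers are pulled and merged in at read time.

```python
import heapq
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000  # hypothetical follower count that disables fan-out


class HybridFeed:
    def __init__(self, threshold: int = CELEBRITY_THRESHOLD):
        self.threshold = threshold
        self.followers = defaultdict(set)         # author -> followers
        self.following = defaultdict(set)         # user -> authors they follow
        self.posts_by_author = defaultdict(list)  # author -> [(timestamp, post_id)]
        self.feed_cache = defaultdict(list)       # user -> [(timestamp, post_id)]

    def follow(self, follower: str, author: str) -> None:
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def is_celebrity(self, author: str) -> bool:
        return len(self.followers[author]) >= self.threshold

    def publish(self, author: str, post_id: str, timestamp: int) -> None:
        self.posts_by_author[author].append((timestamp, post_id))
        if not self.is_celebrity(author):
            # Push path: fan out to every follower's cached feed at write time.
            for follower in self.followers[author]:
                self.feed_cache[follower].append((timestamp, post_id))

    def get_feed(self, user: str, limit: int = 10) -> list:
        # A set removes duplicates if an author crossed the threshold after
        # some of their posts had already been pushed into the cache.
        candidates = set(self.feed_cache[user])
        # Pull path: fetch posts of high-follower authors at read time.
        for author in self.following[user]:
            if self.is_celebrity(author):
                candidates.update(self.posts_by_author[author])
        newest_first = heapq.nlargest(limit, candidates)
        return [post_id for _, post_id in newest_first]
```

A quick usage example: with `HybridFeed(threshold=2)`, an author followed by two users is treated as high-follower, so their posts never enter `feed_cache` and only appear when `get_feed` merges them in.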
Thanks for sharing this
Kindly explain your capacity planning and how you reached those numbers. You just state the numbers and move on. Please explain the formula and how you arrived at each number, as it is not clear; our capacity planning numbers do not match yours.
Quality content bro😊
Great video.
I’m delivering a mobile app product to the Austin, Texas market that’ll be supported by a similar framework, with an additional layer of ML to encourage in-person exploration based on relational account interests.
That is awesome!
@@ByteMonk where are you located?
Any startup experience? Funding, technical oversight?
Is the API box a web server?
Web servers are a common platform for hosting APIs, but there are other options available. For example, APIs can be hosted in cloud platforms, such as AWS or Microsoft Azure.
@@ByteMonk Does this mean the API gateway will forward the request to a web server (Apache running on EC2), which will then forward the request to a microservice running on Lambda?
@@mz7640 I think you would typically have the API gateway forward the request to the microservice directly. Each microservice will be running an HTTP server to handle incoming HTTP requests.
@@mz7640 Your assumptions fit a monolithic architecture, but the Instagram example here is built using a microservice architecture, which does not follow the traditional web server to app server interaction approach.
REST API endpoints seem wrong
9:25 - distributed queue
Tbh, I thought that other than Meta employees or big tech people who are into this stuff, no ordinary people knew these things, because this is legitimately complex. Obviously they know stuff we don't, hence they're the leaders of the world and we're the followers.
Every expert was once a beginner. Knowledge is accessible to anyone willing to learn and grow, so don't underestimate your potential. With dedication and effort, we all have the ability to be leaders of our own journey :)
@@ByteMonk damn, thanks for this ❤
Tbh, a lot of tech people treat things as far more complicated than they need to be and other things as far simpler than they are. This video is detailed, but it is also high level in some places. He is showing you a method of how things COULD be implemented. It’s a jumping off point, an inspiration.
The actual stuff here is simple, but it's not easy to complete. I think that's where the confusion comes from. Simple does not equal easy. E.g. push-ups are simple, but doing 200 push-ups is hard.
Take it a step at a time and as you go, you'll realize you know more than you thought.
@@ThatsJoshTyler Thank you for your brief insight and POV on the subject matter being discussed here. I gotcha.
👍
9:59 - Use cases
Damnnm
Loading the feed in real time? I think that is a bottleneck.
Excellent observation. This is going to be a long answer and a good conversation topic during the interview.
While real-time loading offers a dynamic and engaging user experience, it's important to carefully consider the trade-offs and design choices to ensure optimal performance and scalability.
Advantages of Real-time Loading:
Dynamic Content: Real-time loading allows users to see the latest updates and activities as they occur, creating a more engaging and interactive experience.
Instant Feedback: Users receive immediate feedback on their actions, such as likes and comments, enhancing their engagement with the platform.
User Engagement: Real-time feeds can increase user engagement and retention, as users are more likely to stay active and interact with the app.
Challenges and Bottlenecks:
Scalability: Real-time systems need to handle a high volume of concurrent users and activities, which can strain the infrastructure and impact performance.
Latency: Ensuring low-latency delivery of real-time updates can be challenging, especially as the user base grows.
Network Traffic: Frequent updates can generate significant network traffic, affecting the app's responsiveness and consuming bandwidth.
Mitigating Challenges:
Caching: Implement caching mechanisms to reduce the load on the backend and minimize redundant requests.
Distributed Architecture: Employ a distributed architecture to distribute the load and scale horizontally as the user base grows.
Batch Processing: Aggregate and process updates in batches to minimize the number of real-time requests.
Content Prioritization: Prioritize important updates and ensure that less critical updates are delivered in a less resource-intensive manner.
Ultimately, the decision to implement real-time loading depends on the specific requirements of your application, the user expectations, and the technical capabilities of your infrastructure. It's essential to conduct thorough performance testing, monitor system health, and continuously optimize the real-time feed delivery to strike the right balance between engagement and performance.
This looks like a ChatGPT reply. I’m not saying it’s wrong to use GPT to reply, but it’s also less genuine than hearing the author's original thoughts.
@@frankcallahan3631 In our experience ChatGPT and other tools hallucinate a lot and produce too much information. However, when needed, we prompt them with our thoughts to get feedback and use some of the text they generate, but we always proofread and refine it. Thanks for your feedback, we will strive to be as genuine and raw as possible in future.
@@ByteMonk I loved the video by the way.
And I subscribed
Tinder System Design
You read very fast. It is very difficult to follow your voice and take it all in. Please take some more time and try to teach more slowly.
Thanks for the feedback, I try to balance speed for each video. For this video, can you also check out the playback speed option YouTube provides and try playing it at 0.75x?
It’s fine for me.
These are very very "Interview"-heavy videos... In a way you cannot really clone it with something like Node.js + Postgres + AWS (any hosting)....
And HOW ON EARTH DO YOU IMPLEMENT THE BLOCKING SYSTEM?
Yeah, these videos are made from an interview perspective. I imagine implementing a blocking system will involve handling database operations and building the necessary API endpoints to enable users to block and unblock others.
@@ByteMonk I think blocking is more complicated than just blocking and unblocking.
Suppose you have blocked a user and that user commented on one of your friend's posts.
How do you prepare custom post metadata that doesn’t include the blocked user's activities?
Great point, this could be a follow up question/discussion during an interview or a separate system design question altogether. To prepare custom post metadata that doesn't include the activities of blocked users, you can follow these steps:
1. Retrieve Post Metadata: Fetch the metadata of the post that you want to display, including the comments and activities associated with it.
2. Filter Blocked Users: Iterate through the comments or activities and identify any users who have been blocked by the viewer. Check against the blocked user list to determine if the comment or activity was generated by a blocked user.
3. Exclude Blocked User Activities: Remove any comments or activities generated by blocked users from the post metadata. This can be done by either excluding them from the response or flagging them as blocked.
4. Prepare Custom Metadata: Create a modified version of the post metadata that excludes the activities of blocked users. This may involve creating a new object or modifying the existing one to remove the blocked user information.
5. Return Custom Metadata: Return the custom post metadata to the client application for display. Ensure that the modified metadata only contains the activities of non-blocked users, providing a customized view that respects the blocking preferences of the viewer.
By filtering out the activities of blocked users during the preparation of post metadata, you can create a custom view that doesn't include any content or actions from blocked users. This ensures that the viewer's experience is tailored to their preferences, providing a more personalized and relevant feed without unwanted interactions from blocked individuals.
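The five steps above can be sketched roughly as follows. This is only an illustration under assumptions: the `Comment` and `PostMetadata` shapes and the `metadata_for_viewer` function are hypothetical, and the viewer's block list is passed in as a plain collection of user ids. The stored metadata is left untouched and a filtered copy is returned for the viewer.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Comment:
    author_id: str
    text: str


@dataclass(frozen=True)
class PostMetadata:
    post_id: str
    author_id: str
    comments: tuple = ()  # tuple of Comment
    liked_by: tuple = ()  # tuple of user ids


def metadata_for_viewer(post: PostMetadata, blocked_ids) -> PostMetadata:
    """Return a copy of the post metadata without activity from blocked users."""
    blocked = set(blocked_ids)  # set gives O(1) membership checks
    return replace(
        post,
        comments=tuple(c for c in post.comments if c.author_id not in blocked),
        liked_by=tuple(u for u in post.liked_by if u not in blocked),
    )


post = PostMetadata(
    post_id="p1",
    author_id="friend",
    comments=(Comment("blocked_user", "hi"), Comment("carol", "nice")),
    liked_by=("blocked_user", "carol"),
)
view = metadata_for_viewer(post, {"blocked_user"})
print([c.author_id for c in view.comments], view.liked_by)
```

Filtering at read time like this keeps one shared copy of the post metadata, at the cost of a block-list lookup on every request.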
@@ByteMonk so helpful! Thank you!