Instagram System Design | Facebook Feed | Promise Based Cache | Feed Generation Design

  • Published: 21 Nov 2024

Comments • 80

  • @thekingraghuraman
    @thekingraghuraman 2 months ago +1

    Amazing explanation.
    I would just add that while a load balancer was added to separate read and write requests, there is a pattern called CQRS that does the same thing. It can involve setting up a separate read DB and write DB, each with its own service.

  • @prajyotlawande193
    @prajyotlawande193 9 months ago +1

    Man this is one of the best System design videos I’ve seen in a while. You deserve to become more famous 👌🏼

    • @TheTechGranth
      @TheTechGranth  9 months ago

      Do share, like and subscribe 😀

  • @shandubey1704
    @shandubey1704 3 years ago +1

    Finally got nice content for Instagram System Design. Covered most of the points. Really helpful. Thanks

    • @TheTechGranth
      @TheTechGranth  3 years ago

      Glad it was helpful. Do like and subscribe and share with others 🙂

  • @3dlove100
    @3dlove100 3 years ago +1

    Learned a few new things here: FAN OUT SERVICE, PROMISE BASED CACHE. Thanks for such a detailed explanation! Keep it up.

    • @TheTechGranth
      @TheTechGranth  3 years ago

      Glad it was helpful. Do like, share and subscribe 🙂

    • @kvv6452
      @kvv6452 2 years ago +1

      @TheTechGranth The promise-based cache was the icing on the cake. Thanks. Instagram recently mentioned it in one of its tech talks.
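
A minimal sketch of the promise-based cache idea praised in this thread, assuming the usual meaning of the term: the cache stores a promise (here an asyncio task) per key rather than the raw value, so concurrent requests for the same missing key share one backend read. `PromiseCache` and `slow_load` are hypothetical names for illustration; this is not taken from the video or from Instagram's code.

```python
import asyncio


class PromiseCache:
    """Caches one in-flight or completed task per key instead of the raw value."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical async function: key -> value
        self._tasks = {}       # key -> asyncio.Task

    async def get(self, key):
        task = self._tasks.get(key)
        if task is None:
            # The first caller starts the load; later callers await the same task.
            task = asyncio.ensure_future(self._loader(key))
            self._tasks[key] = task
        try:
            return await task
        except Exception:
            # Drop a failed promise so the next caller can retry.
            if self._tasks.get(key) is task:
                del self._tasks[key]
            raise


async def demo():
    calls = []

    async def slow_load(post_id):
        # Stand-in for a database read.
        calls.append(post_id)
        await asyncio.sleep(0.01)
        return {"post_id": post_id, "likes": 10}

    cache = PromiseCache(slow_load)
    # Five concurrent readers ask for the same post.
    results = await asyncio.gather(*(cache.get("p1") for _ in range(5)))
    return results, calls


results, calls = asyncio.run(demo())
```

In this sketch the five concurrent readers trigger a single call to `slow_load`, which is the property that protects the database when a hot post falls out of the cache.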

  • @yodaddy05
    @yodaddy05 2 years ago +2

    Hands down prob one of the best system design videos out there. Covered everything so elegantly. Highly underrated. I'm subscribed!

    • @TheTechGranth
      @TheTechGranth  2 years ago

      Hope it was helpful. Do share with others 😀

  • @iammjpops
    @iammjpops 2 years ago +1

    I am not sure why this playlist is not popular. The most crisp and to-the-point content. Thanks a lot for your efforts.
    Also, it would be great if at the end of the video we could get some real-world solutions that orgs are using. Like the promise cache you mentioned, which Instagram is using; similarly, I think there is a gossip protocol, SWIM, which Uber uses.
    Also, for comment and like counts we can use a fantastic data structure that Redis supports, HyperLogLog (it gives approximate distinct counts). You can have a look at it, and maybe show us one use case in your next video.

  • @arunpatil2041
    @arunpatil2041 3 years ago +3

    Liked the way you explained the feed generation and 'like' aggregation logic. Also, overall it was a very detailed design with lots of information. Thank you!

    • @TheTechGranth
      @TheTechGranth  3 years ago

      Glad it was helpful. Do like and subscribe and share with others

  • @BinayRay
    @BinayRay 3 years ago +5

    Also, instead of having all data in tables, I guess we can have a combination of RDBMS and NoSQL DBs. That will be much faster. But that's my opinion; I may be wrong. On the positive side, I really like your videos, they are really insightful.

    • @TheTechGranth
      @TheTechGranth  3 years ago

      Yes, it can be split between RDBMS and NoSQL based on the exact requirements we need to meet.
      Thanks a lot, hope these are helpful. Do like and subscribe and share with your friends

  • @theSilentPsycho
    @theSilentPsycho 2 years ago +4

    I think it is better to store the comments/likes for a post inside the post itself. Comments/likes cannot exist without the post. When the post is deleted, everything else needs to be deleted too. Moreover, comments/likes are not searchable on any platform (for a reason, of course).
    In the cache we may store only the number of likes and the top few comments. When the user hits "show more comments", we actually hit the post DB to find more comments on that post. To delete a comment/like, the user can pass us the [postID, commentID] from the UI. What do you suggest?
    post = {
    id:...
    ...
    comments:[ ],
    likes: [ ]
    }

  • @shreddedvarun
    @shreddedvarun 7 months ago

    I have seen many Instagram design videos; this one is better than the others

    • @gauravsingla6444
      @gauravsingla6444 6 months ago

      I see the same comment under every other video.😂

  • @ShekharKumar8034
    @ShekharKumar8034 5 months ago

    Hey man great video and thanks for explaining Instagram design in simple terms. Just one small suggestion: In the ending section of the video, it would be super helpful if you can also show the final architecture diagram to grasp the full picture once again, just like we do on the whiteboard when we are done providing the solution :)

  • @vikramsaurabh8240
    @vikramsaurabh8240 3 years ago +8

    I wonder why this video has so few views and likes... It is very well explained, apart from the estimations :) ... way better than those overhyped channels... way to go bro... this helped me a lot... thanks for your hard work!!

    • @TheTechGranth
      @TheTechGranth  3 years ago +1

      Glad it was helpful to you :) Do like and subscribe and share with others. It might help the views and likes 🙂

  • @Paradise-kv7fn
    @Paradise-kv7fn 3 years ago +3

    One of the most detailed system design videos for Instagram. I wasn't much aware of the fan-out concept before this video. Thank you.
    I have a question though. At 32:35, you said that we will only be reading a small number of columns, for which a columnar DB makes more sense. But let's say the Post table consists of post_id, user_id, caption, created_on, image_link. Wouldn't all this information be required? I mean, we should show the author, image, caption, created_on etc. along with the post in the user feed (the same happens in actual Instagram too). So why are we saying that we only need to read "some" columns in the majority of cases?
    I understand that it might be difficult to scale an RDBMS to such a large scale through sharding, but other than that, the only reason I can think of for not using an RDBMS is that we need partition tolerance and availability, for which Cassandra might be a better choice. Am I missing something else that might indicate why we shouldn't use an RDBMS?

    • @TheTechGranth
      @TheTechGranth  3 years ago +1

      5-6 columns are a small set of columns. The major problem arises when we have to aggregate things like the number of comments and likes for each post for each user. This is where a columnar store will do its magic 🙂

  • @sathishrajasekar1155
    @sathishrajasekar1155 3 years ago +1

    Got an overview of System Design, Capacity Planning and so on... Thank you.

    • @TheTechGranth
      @TheTechGranth  3 years ago

      Glad it was helpful, do like and subscribe and share with others 🙂

  • @RS-vu5um
    @RS-vu5um 2 years ago +1

    Very well explained. Your effort is greatly appreciated

  • @chisomedoka5651
    @chisomedoka5651 23 days ago +1

    This is Gold, thank you so much

  • @vcreations1110
    @vcreations1110 2 years ago +1

    Thanks for this wonderful session! I have one question regarding sharding. Can you explain how sharding by post ID is efficient? Data would be distributed evenly, but when we want to query all posts of a specific user, we may need to query multiple shards, right?
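
The trade-off raised in this question can be sketched as follows, assuming simple hash-based sharding on the post ID. `ShardedPosts`, `shard_for`, and the shard count are hypothetical and only for illustration; the video does not specify the routing scheme.

```python
import zlib

NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_for(key: str) -> int:
    """Map a key to a shard with a stable hash."""
    return zlib.crc32(key.encode()) % NUM_SHARDS


class ShardedPosts:
    def __init__(self):
        # Each dict stands in for one database shard: post_id -> row.
        self.shards = [dict() for _ in range(NUM_SHARDS)]

    def put(self, post_id, user_id, caption):
        row = {"user_id": user_id, "caption": caption}
        self.shards[shard_for(post_id)][post_id] = row

    def get(self, post_id):
        # Lookup by post ID touches exactly one shard.
        return self.shards[shard_for(post_id)].get(post_id)

    def posts_by_user(self, user_id):
        # Lookup by user has no routing key, so every shard is scanned
        # and the partial results are merged (scatter-gather).
        found = []
        for shard in self.shards:
            found.extend(pid for pid, row in shard.items()
                         if row["user_id"] == user_id)
        return sorted(found)


store = ShardedPosts()
for i in range(8):
    author = "alice" if i % 2 == 0 else "bob"
    store.put(f"post{i}", author, f"caption {i}")

alice_posts = store.posts_by_user("alice")
```

Reads by post ID stay cheap, while reads by user fan out to all shards. A separate lookup keyed by user ID (or a cached per-user post list) is one common way to avoid the fan-out, at the cost of maintaining a second structure.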

  • @mr.6889
    @mr.6889 7 months ago

    Two questions:
    1) Why not use a separate cache service for celebs?
    2) Why not store the like count in the post table and keep the individual likes in a separate table, so that when a user likes a post we increment the count in the post table and also insert a row in the like table?
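
The second suggestion can be sketched with an in-memory SQLite database. The table and column names (`post`, `post_like`, `like_count`) are assumptions for illustration and do not come from the video's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE post (
    post_id    INTEGER PRIMARY KEY,
    like_count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE post_like (
    post_id INTEGER NOT NULL,
    user_id INTEGER NOT NULL,
    PRIMARY KEY (post_id, user_id)  -- one like per user per post
);
""")


def like_post(conn, post_id, user_id):
    """Insert the like row and bump the counter in one transaction.

    Returns False if this user had already liked the post.
    """
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "INSERT OR IGNORE INTO post_like (post_id, user_id) VALUES (?, ?)",
            (post_id, user_id),
        )
        if cur.rowcount == 0:
            return False  # duplicate like, counter left unchanged
        conn.execute(
            "UPDATE post SET like_count = like_count + 1 WHERE post_id = ?",
            (post_id,),
        )
        return True


conn.execute("INSERT INTO post (post_id) VALUES (1)")
first = like_post(conn, 1, 100)
second = like_post(conn, 1, 101)
repeat = like_post(conn, 1, 100)  # same user again
like_count = conn.execute(
    "SELECT like_count FROM post WHERE post_id = 1"
).fetchone()[0]
```

Reading the count becomes a single-row lookup. The likely cost at scale is that a very popular post turns its counter row into a write hot spot, which may be why the design discussed in the video aggregates likes separately.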

  • @ravishekhawat5489
    @ravishekhawat5489 3 years ago +2

    Please make a separate video on how a columnar database works. In what use cases is it advisable to go for one?

    • @TheTechGranth
      @TheTechGranth  3 years ago

      I have already made a video on how to choose a database for your system. It covers the requirements you are asking about.
      So like, share and subscribe 🙂
      ruclips.net/video/leGv3PIaCn4/видео.html

  • @10_min_infra
    @10_min_infra 2 years ago

    Very practical and detailed. Thanks, man

  • @saravanasai2391
    @saravanasai2391 6 months ago

    That is a great explanation. But how will you handle posts the user has already viewed? If the user is scrolling the feed fast, we need to track that the user has viewed a post, so we don't show the same post again.

  • @aditigupta6870
    @aditigupta6870 8 months ago

    In the post table, you mentioned that photoURL will be the path to the photo in S3, but a single post can have multiple pics/videos, each of which will have its own unique photoURL in S3, right?

  • @omkarpatil7448
    @omkarpatil7448 3 years ago +1

    Still not sure why we need an RDBMS, as you mentioned in the earlier part of the video. Can you elaborate on that in detail please? I have a loop coming up haha

    • @TheTechGranth
      @TheTechGranth  3 years ago

      It is just to store the relational data. When it comes to photos and videos you have to store the metadata for these as well
      Check out this:
      ruclips.net/video/leGv3PIaCn4/видео.html

  • @PranitKothari
    @PranitKothari 2 years ago

    Where did you learn this in such detail?

  • @mehtavijapur
    @mehtavijapur 3 years ago

    Good to know about the promise-based cache! Thanks

    • @TheTechGranth
      @TheTechGranth  3 years ago

      Glad it was helpful. Do like, share and subscribe :)

  • @BinayRay
    @BinayRay 3 years ago +1

    The storage you calculated at the beginning is for images and videos. You also say that image storage will be in S3, which means the 973 GB will not be in the DB. The DB will have only metadata. The data in the database will be large because of the number of users, and we need to shard, I agree with you, but it will be less than 973 GB since image/video storage is separate.

    • @TheTechGranth
      @TheTechGranth  3 years ago

      The estimate I added here was just for image storage; metadata and users will be in the DB, and yes, the size will be more

    • @kvv6452
      @kvv6452 2 years ago +1

      @TheTechGranth The 970 GB estimate was for images, which are stored in S3.
      We need additional estimates for the DB, right?
      Also, can we use a graph DB to represent connections or follower relationships? Will there be CDNs for storing the images/reels?
      Can you make notes on the reconcile service for computing the likes? I need some clarity on the flow

    • @TheTechGranth
      @TheTechGranth  2 years ago

      @K V V Yes, you are correct regarding the estimate.
      A graph DB may not be required here, as the schema will be straightforward. You can check the Instagram reels system design video for the likes and DB part.

    • @TheTechGranth
      @TheTechGranth  2 years ago

      @kvv6452 ruclips.net/video/OPo_FB35E04/видео.html

  • @sunny0287
    @sunny0287 7 months ago

    How will the URL shortener service save space for photos in this case?

  • @saranyavivekanandan9044
    @saranyavivekanandan9044 1 year ago

    Why are the user and follower tables in MySQL while the other tables related to the post service are in Cassandra?

  • @mkalicharan
    @mkalicharan 3 years ago +1

    Very nice video boss.

    • @TheTechGranth
      @TheTechGranth  3 years ago

      Glad it was helpful. Do like and subscribe and share with your friends 🙂

  • @aditigupta6870
    @aditigupta6870 8 months ago

    How do we ensure in practice that a few instances of a service are for writing and a few are for reading?

  • @anjanagupta5614
    @anjanagupta5614 2 years ago +1

    Just woww

    • @TheTechGranth
      @TheTechGranth  2 years ago

      Do like, share and subscribe :) and check out the other videos on system design HLD and LLD

  • @AnilKumar-f4p1n
    @AnilKumar-f4p1n 2 months ago

    Is it good to use a graph DB here?

  • @sunilbansal8659
    @sunilbansal8659 2 years ago +1

    Is one table enough for the "follow" part? Some videos suggest two tables: one for followers (the user's followers) and another for followings (the people the user follows). It seems one table can serve both purposes. Not sure if there is any advantage to having two separate tables. Any thoughts?

    • @TheTechGranth
      @TheTechGranth  2 years ago +1

      The query to pick up the following part seemed simple and straightforward to me, plus the way we sharded the data, it would be able to handle the query load. Duplicating data makes sense where we get a significant performance improvement; here I do not see any such thing
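
A minimal sketch of how a single follow table can answer both questions from this thread, using an in-memory SQLite database. The table name, column names, and the extra index are assumptions for illustration rather than the schema shown in the video.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE follow (
    follower_id INTEGER NOT NULL,  -- the user who follows
    followee_id INTEGER NOT NULL,  -- the user being followed
    PRIMARY KEY (follower_id, followee_id)
);
-- The primary key serves "whom does X follow?";
-- this index serves the reverse lookup, "who follows X?".
CREATE INDEX idx_follow_followee ON follow (followee_id, follower_id);
""")

conn.executemany(
    "INSERT INTO follow (follower_id, followee_id) VALUES (?, ?)",
    [(1, 2), (1, 3), (2, 3), (4, 1)],
)


def following(conn, user_id):
    """Users that user_id follows."""
    rows = conn.execute(
        "SELECT followee_id FROM follow WHERE follower_id = ? "
        "ORDER BY followee_id",
        (user_id,),
    )
    return [r[0] for r in rows]


def followers(conn, user_id):
    """Users that follow user_id."""
    rows = conn.execute(
        "SELECT follower_id FROM follow WHERE followee_id = ? "
        "ORDER BY follower_id",
        (user_id,),
    )
    return [r[0] for r in rows]
```

For example, `following(conn, 1)` lists the accounts user 1 follows and `followers(conn, 3)` lists the accounts following user 3, both from the same rows. Once the table is sharded by one of the two columns, the other lookup may need to touch several shards, which is the usual argument for keeping a second, differently keyed copy.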

  • @jainso
    @jainso 3 years ago +2

    Can you explain why we need both Elasticsearch and a MySQL DB? Can't Elasticsearch handle all the operations?

    • @TheTechGranth
      @TheTechGranth  3 years ago +1

      Elasticsearch has its own capabilities and cons. When dealing with structured and relational data, I would always prefer an RDBMS over a NoSQL database.
      This is my take on choosing a database:
      ruclips.net/video/leGv3PIaCn4/видео.html

    • @jainso
      @jainso 3 years ago +1

      @TheTechGranth Hi, thanks for your reply. I am confused by the explanation of why we need both Elasticsearch and MySQL for storing the same data. Aren't we duplicating data? If we need Elasticsearch for string-based search queries, can't we also use it to look up a particular object by its ID? I am not very familiar with Elasticsearch, so please feel free to point me to a link if that would be helpful.

    • @TheTechGranth
      @TheTechGranth  3 years ago +1

      @jainso I thought of that to optimize both the search API and the user API. Yes, there is data duplication, but the trade-off is faster response time and consistent user data.
      This is my thought process

    • @jainso
      @jainso 3 years ago +1

      @TheTechGranth Thanks for your quick response.

    • @TheTechGranth
      @TheTechGranth  3 years ago +1

      @jainso You are welcome. Do like and subscribe and share with your friends 🙂

  • @subee128
    @subee128 3 months ago

    Thanks

  • @aakash1763
    @aakash1763 3 years ago +2

    One doubt: what is the use of like_id in the like table?

    • @TheTechGranth
      @TheTechGranth  3 years ago

      It is just the primary key for that table

  • @diboracle123
    @diboracle123 9 months ago

    1. The distributed cache is storing lots of data for 100M users. Is there any limit on the cache? I know it is storing only metadata.
    2. The distributed cache is a bottleneck: if it fails, the application will slow down.
    3. Can I store the latest 6 months of data (posts, comments, etc.) in MySQL (RDBMS), and move data older than 6 months to a NoSQL DB?
    Kindly help me understand the above points

  • @FWTteam
    @FWTteam 3 years ago +2

    If the user sees a post, how do we make sure we don't show that post to the user again? How will you store that post in the cache? Can you give a concrete design?

    • @TheTechGranth
      @TheTechGranth  3 years ago

      Prepend post in timeline, based on post time

    • @FWTteam
      @FWTteam 3 years ago +1

      @TheTechGranth Not always, as we sometimes give a post more priority based on user preferences and behaviour, prioritising posts based on what the user likes rather than just the timestamp.

    • @TheTechGranth
      @TheTechGranth  3 years ago

      @FWTteam Got your point. It won't be a simple post in that case; you need to run some analytics first to understand the likes and behaviour, which can then be fed to an ML model for assigning real-time priority. For example, in Insta reels you are shown reels according to your liking. This can also be done for recommended posts, but not for posts from a friend. Posts shown on the timeline that belong to a friend will always be in chronological order

  • @LearnByDoing7
    @LearnByDoing7 2 years ago

    Great video!

  • @rishirajtandon3849
    @rishirajtandon3849 3 years ago +1

    Hi Tech Granth,
    The hybrid approach for sending News Feed content to users: we can move all the users who have a high number of follows to a pull-based model and only push data to those users who have a few hundred (or thousand) follows.
    Please update the video accordingly.

    • @TheTechGranth
      @TheTechGranth  3 years ago +2

      That is what I explained at 34:30

    • @rishirajtandon3849
      @rishirajtandon3849 3 years ago

      @TheTechGranth OK, thanks

    • @TheTechGranth
      @TheTechGranth  3 years ago +1

      @rishirajtandon3849 Hope it was helpful. Do like and subscribe and share with your friends :)
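
The hybrid approach discussed in this thread can be sketched as below, under one common reading of it: posts from ordinary authors are pushed into followers' timelines at write time, while posts from authors with many followers are pulled and merged at read time. `HybridFeed` and `CELEBRITY_THRESHOLD` are hypothetical, and the tiny threshold is only there to keep the example small.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # assumed cutoff; a real system would use a far larger one


class HybridFeed:
    def __init__(self):
        self.followers = defaultdict(set)        # author -> users following them
        self.following = defaultdict(set)        # user -> authors they follow
        self.timelines = defaultdict(list)       # user -> pushed (ts, post_id)
        self.posts_by_author = defaultdict(list)  # author -> (ts, post_id)

    def follow(self, follower, author):
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def is_celebrity(self, author):
        return len(self.followers[author]) >= CELEBRITY_THRESHOLD

    def publish(self, author, post_id, ts):
        self.posts_by_author[author].append((ts, post_id))
        if not self.is_celebrity(author):
            # Push model: fan out on write to every follower's timeline.
            for user in self.followers[author]:
                self.timelines[user].append((ts, post_id))

    def read_feed(self, user, limit=10):
        # Start from what was pushed, then pull celebrity posts on read.
        items = set(self.timelines[user])
        for author in self.following[user]:
            if self.is_celebrity(author):
                items.update(self.posts_by_author[author])
        newest_first = sorted(items, reverse=True)
        return [post_id for _, post_id in newest_first[:limit]]


feed = HybridFeed()
for fan in ("u1", "u2", "u3"):
    feed.follow(fan, "celeb")
feed.follow("u1", "bob")

feed.publish("bob", "p1", ts=1)    # pushed to u1's timeline
feed.publish("celeb", "p2", ts=2)  # stored only, pulled at read time
u1_feed = feed.read_feed("u1")
```

Using a set when merging avoids showing a post twice if an author crosses the threshold after some of their posts were already pushed.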

  • @sriramganesh5982
    @sriramganesh5982 2 years ago

    What is the use of like_id in the like table? If we want to find out who liked a post, shouldn't we have the poster's userId and likedUserID? That way we can query which users liked the post. Kindly correct me if I am wrong.

  • @rupasajan6588
    @rupasajan6588 7 months ago

    🎉
    0:23

  • @sushmitagoswami7320
    @sushmitagoswami7320 2 years ago

    I have a few questions:
    1. When we duplicate the storage to make the system fault tolerant, shouldn't there be multiple copies of the DB instances?
    2. Usually in this problem, we search the system by username first and then dig into that user's posts, so if we shard on post ID, will the queries be faster?

  • @GauravGoswami-y8s
    @GauravGoswami-y8s 4 months ago

    Not able to understand the Comment and Like Design

  • @harshitgarg8008
    @harshitgarg8008 7 months ago

    30:29-36:00