System Design Mock Interview: Design Instagram

  • Published: 26 Sep 2024

Comments • 470

  • @BoNyKiD
    @BoNyKiD 3 years ago +145

    "And then we can talk about the API a little bit later"
    Never talks about API

    • @josephmorales652
      @josephmorales652 7 months ago +4

      I know my ass would not get hired if I did that.

  • @JamesMortenson-fz7ch
    @JamesMortenson-fz7ch 1 year ago +13

    This is a great example of what you should strive to have completed at the halfway point of the interview.

    • @benisrood
      @benisrood 16 days ago

      Yes, maybe closer to the 1/3 point.

  • @NianLi
    @NianLi 2 years ago +94

    A good addition would be a message queue. When a user uploads a big image, for example, they shouldn't have to wait there for a long time. Also, if we write the huge chunk of data directly to the image storage, it will explode (storage usually won't have that much RAM)! Hence, a good practice would be to push it to a queue with a publisher and have multiple listeners write the image data (binary format) into the storage.

    • @lagneslagnes
      @lagneslagnes 1 year ago +2

      pushing large objects to S3 won't explode it

    • @coolY2k
      @coolY2k 11 months ago +2

      @lagneslagnes and even if it did, where would you store a 5 MB image? In a queue? Azure Blob/S3 are by design much more scalable than any queue that can handle 5 MB binaries within a message.
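
One way to reconcile the two positions in this thread is to keep the binary in object storage and put only a small reference message on the queue. The sketch below is a minimal in-memory illustration of that idea; `blob_store`, `metadata_db`, `handle_upload`, and `worker` are hypothetical names, and the dictionaries merely stand in for a service like S3 and a metadata database.

```python
import queue
import threading

# Hypothetical in-memory stand-ins for an object store (e.g. S3) and a metadata DB.
blob_store = {}
metadata_db = {}
jobs = queue.Queue()


def handle_upload(photo_id, image):
    """Write the binary to object storage, then enqueue a small reference message."""
    key = f"photos/{photo_id}.jpg"
    blob_store[key] = image                       # the large payload goes to storage
    jobs.put({"photo_id": photo_id, "key": key})  # the queue message stays tiny


def worker():
    """Consume reference messages and do the slow work off the request path."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: stop the worker
            jobs.task_done()
            break
        size = len(blob_store[job["key"]])
        metadata_db[job["photo_id"]] = {
            "key": job["key"],
            "bytes": size,
            "status": "processed",
        }
        jobs.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()
handle_upload("p1", b"\x00" * 5_000_000)  # a dummy 5 MB image
jobs.put(None)
jobs.join()  # wait until the worker has drained the queue
```

The queue here carries only the object key, so message size stays constant no matter how large the image is.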

  • @babybear-hq9yd
    @babybear-hq9yd 3 years ago +89

    Unreal video! I'm relatively new to Systems Design but this was easy to follow and aligned with a lot of other good information I've been finding on the web. Thank you!

    • @tryexponent
      @tryexponent 3 years ago +1

      Glad you liked it! Be sure to subscribe for more like this :)

    • @shivamdashore6864
      @shivamdashore6864 5 months ago

      @tryexponent How can I use this type of whiteboard? Is it free, or do I have to pay to use it?

  • @udb_music
    @udb_music 1 year ago +4

    After scrolling through 100s of SD videos finally I found a way to approach the questions in the interviews. Your step by step approach from start to finish was really helpful. Thanks

  • @meqdaddev4341
    @meqdaddev4341 2 years ago +126

    Great job.. Thanks for the video.
    I think we can add the following:
    - Elaborating the DB sharding and partitioning in the system.
    - DB Replication, by adding redundant blocks for database based on master/slave relation, where the master is for write, and slaves for read.
    - Talking about data search, for example by considering Elasticsearch.
    - Adding a block for the CDN to clarify where it should be in the system diagram.
    - Adding a message/uploading queue
    - Making an estimation for bandwidth

    • @sourabh258
      @sourabh258 1 year ago +1

      I think DB replication is understood by default.
      Rest all points are great 👏

    • @igorbragaia4895
      @igorbragaia4895 9 months ago +1

      This video is a great introductory resource, but it is a little bit superficial. Anyway, it was great to watch, and I'm looking forward to the other videos on this channel!

  • @markp9827
    @markp9827 4 years ago +109

    I think this video is really very generic, so much so that this design could be applied to any problem. I would have loved to see you deep dive into the details of any one component in this 30 min video. E.g. is database sharding required? Do we need NoSQL databases? How would the fan-out process work? How do we handle influencers who have 5 million followers? Disappointed to see those details missing here.

    • @nextlevelgamestudios
      @nextlevelgamestudios 3 years ago +1

      I would think that for, say, N million followers, that work is broken down asynchronously and spread across a number of shards in a region.

    • @Damouse007
      @Damouse007 3 years ago +3

      He spent the whole time on the easier part of the design. He didn't model the feed, or describe when and how it's built.

    • @nextlevelgamestudios
      @nextlevelgamestudios 3 years ago +3

      @Damouse007 I recently saw an architecture mock interview for Twitter in which they go over designing the feed, which was genuinely helpful. It's a very interesting approach to modeling. I would look that up if you're interested.

    • @namanmishra08
      @namanmishra08 1 year ago +2

      Agreed, this much detail will not clear a system design interview

  • @thatsamorais584
    @thatsamorais584 4 years ago +7

    This was an enormous help in my interview today! I adapted the flow to talk about the specific requirements. Calculating scale and talking about the data model with a segue into the overview diagram was very coherent. Most of all it produced every possible opportunity to highlight my experience with the components I presented, which is the overall goal.

    • @tryexponent
      @tryexponent 4 years ago +1

      I'm so glad this was helpful Alex! Definitely let us know how the interview went!

    • @thatsamorais584
      @thatsamorais584 4 years ago +1

      @tryexponent I moved forward in the process, and the SDI was a large part of it! Through the interview process I learned enough about their product and the responsibilities of the role to realize it was not what I would like to pursue, but that's a different story.

    • @tryexponent
      @tryexponent 4 years ago +1

      @thatsamorais584 Congratulations!!

  • @西兑
    @西兑 3 years ago +117

    Nice pace for the mock interview. However, the diagram could actually fit most systems. Shouldn't we focus more on how to fan out photos, generate the news feed, etc., which are more specific to "designing Instagram"? I wonder whether talking only about the high-level things and teaching the interviewer those basic architectural concepts (read-write, load balancer, S3, CDN), as this video does, can really ace the interview.

    • @denebgarza
      @denebgarza 3 years ago +43

      I kept waiting for the actual system design to start. Then the video ended. What this video covered is basically the skeleton of any system where you upload/view single files at a time.

    • @BrijeshBolar79
      @BrijeshBolar79 3 years ago +10

      Yes, I found it to be really basic. I was also looking forward to fanning out photos, as well as some replication for the DB, since only one DB instance is used for both read and write operations for 10 million users.

    • @kyletham9914
      @kyletham9914 3 years ago +5

      @BrijeshBolar79 What does fanning out photos mean in this context? And any good resource on DB replication / scaling? I thought that part was definitely lacking.

    • @tsaregrad
      @tsaregrad 3 years ago +2

      I see that in so many videos on YouTube. 0 specifics about the actual system, just generic LB, DB schema, object storage stuff. I really wonder if interviewers are buying this….

    • @denebgarza
      @denebgarza 3 years ago +3

      @tsaregrad they're not. You won't get any reputable offer with the level of detail YouTube videos provide. These are just for the views.

  • @vennyroxz
    @vennyroxz 4 years ago +57

    Loved the video, great explanation. What is the UI/ tool used here to draw for the system design?

  • @basselabuelhija7366
    @basselabuelhija7366 3 years ago +18

    Thanks for the video! Really enjoyed it as it gives a great exploration of the problem.
    However, I think the candidate needs to further explore the scaling of the data and provide estimates of read/write requests per second, traffic, and the storage we would need over, for instance, 10 years, and what to do when the DB is full and we need to scale vertically (as you have gone for MySQL).
    That information will help in understanding the overall scale of our system and makes it clear to the interviewer that we are prepared for the expected traffic and can handle the requests.

  • @sourabh258
    @sourabh258 1 year ago +1

    It's nice, thank you!
    1. CDN could have been added
    2. Encoding of videos and photos

  • @LaVengeanceInc
    @LaVengeanceInc 1 year ago +4

    The feed generation part needs a bit more detail IMO. That's the most challenging part of this. Would you generate it on read or on write? Or a combination of both? Also, what if it's not a linear timeline of events but needs to be ranked? Also, details around comments and replies?
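
The "on read, on write, or a combination" question can be made concrete with a small hybrid sketch: ordinary authors push posts into followers' precomputed feeds at write time, while authors above a follower threshold are pulled at read time. This is only an illustration under stated assumptions, not how the video or Instagram does it; `CELEBRITY_THRESHOLD`, `publish`, and `read_feed` are hypothetical names, and the ordering is purely chronological (no ranking).

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 3  # assumed cutoff; a real system would use a far larger number

followers = defaultdict(set)          # author -> users who follow the author
following = defaultdict(set)          # user -> authors the user follows
posts_by_author = defaultdict(list)   # author -> [(timestamp, post_id)]
precomputed_feed = defaultdict(list)  # user -> [(timestamp, post_id)] pushed on write


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def is_celebrity(author):
    return len(followers[author]) >= CELEBRITY_THRESHOLD


def publish(author, post_id, timestamp):
    """Fan out on write for ordinary authors; celebrities are only stored once."""
    posts_by_author[author].append((timestamp, post_id))
    if not is_celebrity(author):
        for user in followers[author]:
            precomputed_feed[user].append((timestamp, post_id))


def read_feed(user, limit=10):
    """Merge the precomputed feed with celebrity posts pulled at read time."""
    items = set(precomputed_feed[user])  # a set avoids duplicates across both paths
    for author in following[user]:
        if is_celebrity(author):
            items.update(posts_by_author[author])
    newest_first = sorted(items, reverse=True)[:limit]
    return [post_id for _, post_id in newest_first]


follow("alice", "bob")
for fan in ("alice", "u1", "u2"):
    follow(fan, "star")
publish("bob", "b1", timestamp=1)    # pushed into alice's precomputed feed
publish("star", "s1", timestamp=2)   # not pushed: "star" is over the threshold
print(read_feed("alice"))            # ['s1', 'b1']
```

The trade-off is that writes from heavily followed accounts stay cheap, at the cost of a little extra merge work on every read.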

  • @ryanm6528
    @ryanm6528 3 years ago +3

    Very good video for very novice engineers looking to learn about systems. However, there are big gaps here when it comes to scaling, ie. using a relational database (which also decreases performance) and having a synchronous system.

    • @rodoherty1
      @rodoherty1 2 years ago

      I had some questions about this, too. I can see the rationale for an RDBMS, but from a scaling perspective it would surely run into trouble. I was hoping the video might discuss how to fail over to another region. The topic of failover would surely come up in an interview.

  • @jamiepearcey9335
    @jamiepearcey9335 3 years ago +24

    I am not sure that a relational database is the best choice here. There are often relationships in data, but sometimes you need to determine the right trade-offs in terms of scalability. Though I appreciate that a relational model can be designed to work at such scale (read replicas, sharding etc.), I would consider using a NoSQL solution for the media metadata API, and a graph database for the activity feed.

  • @AnhNguyen-vu7mc
    @AnhNguyen-vu7mc 8 months ago +1

    A deep dive into the feed service would be great, because obviously that's Instagram's core feature.

  • @黄海天-h9l
    @黄海天-h9l 3 years ago +15

    Actually, I think a write-through cache, or cache-aside on the write service, would make it easier to generate the feed, because the cache server will know what content was pushed recently.

    • @ConernicusRex
      @ConernicusRex 2 years ago

      So will the objects themselves, which are stored with time and location data. Any component dealing with those objects will have that data, not just the cache.
      You want the use case to be “new to the user” not “new to the system” anyway.

  • @andrej7838
    @andrej7838 4 years ago +7

    Hi, great video. One thing to take into account is that Instagram uses Cassandra as their primary DB, because of the partitioning and scaling issues relational databases have. While SQL databases are scaled vertically, NoSQL databases are scaled horizontally (scale-out across commodity servers). Also, NoSQL databases can store relationship data; they just store it differently than relational databases. So I wouldn't agree that a SQL database is the best choice here: at scale, NoSQL is much faster and cheaper than a relational database.

    • @mannydsz
      @mannydsz 4 years ago

      You can scale SQL databases horizontally using sharding and a decent shard key. Cassandra does that automatically for you if you organize your column families well. If you mess up your data model, you will have a bunch of hotspots on your cluster and will face the same problems as with vertical scaling. The technology alone won't solve your problem if you don't know how to use it well. By the way, before Facebook bought Instagram, it already scaled horizontally and used Postgres, which is a SQL database.
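
To illustrate what "sharding with a decent shard key" can look like, here is a minimal sketch that routes rows by a hash of the user ID. The shard count and the `shard_for` helper are assumptions for illustration only, and simple modulo hashing makes resharding expensive, which is one reason consistent hashing is often considered instead.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 8  # assumed shard count for illustration


def shard_for(user_id, num_shards=NUM_SHARDS):
    """Map a user ID to a shard using a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# A high-cardinality key such as user_id spreads rows across all shards.
counts = Counter(shard_for(uid) for uid in range(100_000))
for shard, rows in sorted(counts.items()):
    print(shard, rows)
```

A low-cardinality or skewed key (country, for example) would pile most rows onto a few shards, which is the hotspot problem described above.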

  • @deathbombs
    @deathbombs 3 years ago +2

    Should the client be calling a web server before the app server? The app server handles business logic; the web server handles the user's requests.

  • @tonyhuang9959
    @tonyhuang9959 3 years ago +8

    Very helpful, can you make more mock system design interviews? Thanks!

  • @kumarashutosh229
    @kumarashutosh229 2 years ago +1

    It's all about how not to answer your design question!
    I mean, is this really how you design Instagram, or in a more generic sense any feed system?
    Note: These are my personal opinions.
    I would rather say these are the points that I was expecting from the perspective of a dev with 5+ years of experience.
    He never talked about "post" and related stuff. Is Instagram all about uploading your profile pic? Everything you upload would be a post, and you may add an image/file/feeling/re-share etc. Where is your post table and ID?
    Are you going to use integers for storing your user IDs? Think again!
    He also did not mention the space taken up by text data; the 1 PB of space would be just for images. I guess we can use geo-location based storage.
    Notifications are one of the most important components of any social media platform. They should have been covered in the requirements.
    I was expecting him to cover the latency in uploading images and the wireframe for image upload and render.
    Last but not least, the re-share functionality! How do we deal with re-posting/sharing a post?
    I'm not being a pessimist, just thought I'd put my points across. Apologies. Nice work btw

  • @_johnathan
    @_johnathan 3 years ago +4

    Really wish I had seen this before my second round TPM interview at a major FAANG company. LOL Great video!

  • @vlad7780
    @vlad7780 3 years ago +17

    Great video, thank you. Good example of CQRS in action. For further improvement it would be a good idea to use another DB for feed generation, because access patterns will be slightly different: OLTP for user actions and OLAP for feed (maybe some recommendation system under the hood).

  • @meow-mi333
    @meow-mi333 2 years ago

    SQL for dynamic complex queries and ACID properties (some NoSQL databases also support this). Whether the data is relational or not is really not a strong reason. You can do the same in NoSQL with different tables using separate requests, or with the same table using one request (Adjacency List Pattern). NoSQL for easy read/write scalability and high availability.

    • @devkashyap9049
      @devkashyap9049 2 years ago

      I came here to post the same comment and found your comment. Totally agree. It does not make sense to use SQL DB in this use case.

  • @swang7291
    @swang7291 2 years ago

    Nice solution, definitely deserves an Amazon SDE II.

  • @taheerahmed1120
    @taheerahmed1120 2 years ago

    Exponent provides best mock interview videos

  • @kafychannel
    @kafychannel 1 year ago +2

    Cool really liked it thanks a lot

  • @ivantrofymenko1308
    @ivantrofymenko1308 3 years ago +2

    I'm far from an expert, but I disagree with your reasoning for choosing a SQL (relational) database. All data that's worth representing is relational in some way. That doesn't mean that SQL dbs are always the way to go. The types of queries you described can be performed efficiently on NoSQL dbs with smart use of indexes and would offer better scalability for a social platform like Instagram. No mention of Graph representation either, which is ideal for representing many to many relationships.

  • @Ultrawega1
    @Ultrawega1 1 year ago

    If I were to rate this mock interview from 1 to 10, I would go with 3. Every topic was covered only at its most basic level. If FAANG interviews are like this, I may as well be the director of tech at one of those companies!

  • @mrarun8007
    @mrarun8007 2 years ago +1

    Great stuff. Please post some mock interviews where the candidates failed. Thank you.

  • @shambhavishindems4255
    @shambhavishindems4255 3 years ago +7

    Can we use graphs to record follow relation between users?

    • @PradeepSingh-vm1gl
      @PradeepSingh-vm1gl 3 years ago

      Yes. Graphs. A graph database is best for storing all the users' information and the relationships between them. The response time for getting the relationships as well as information about a user is very fast with a graph database. Graph databases are better suited to this type of many-to-many relationship than traditional relational databases. I do not agree with this guy saving users' details in the relational database. But basically, you need all different kinds of databases for different purposes in such a large-scale application. The architecture he designed is a common architecture for a simple mobile app. Have a look at how complex the architecture can be for Instagram/Facebook.
      github.com/codekarle/system-design/blob/master/system-design-prep-material/architecture-diagrams/Facebook%20System%20Design.png
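
Whichever store is chosen, the follow relation is a directed graph that can be held as an edge list. The sketch below shows that structure in a relational table using Python's built-in `sqlite3`; the `follows` table, its column names, and the helper functions are hypothetical and are not the schema from the video. It does not settle the graph-versus-relational debate, it only shows that both directions of the relation are simple indexed lookups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE follows (
        follower_id INTEGER NOT NULL,
        followee_id INTEGER NOT NULL,
        PRIMARY KEY (follower_id, followee_id)  -- serves "who do I follow?"
    );
    -- Second index serves the reverse lookup "who follows me?"
    CREATE INDEX idx_follows_followee ON follows (followee_id);
    """
)
conn.executemany(
    "INSERT INTO follows (follower_id, followee_id) VALUES (?, ?)",
    [(1, 2), (1, 3), (2, 3), (3, 1)],
)


def following(user_id):
    """Users that user_id follows (outgoing edges)."""
    rows = conn.execute(
        "SELECT followee_id FROM follows WHERE follower_id = ? ORDER BY followee_id",
        (user_id,),
    )
    return [row[0] for row in rows]


def followers(user_id):
    """Users who follow user_id (incoming edges)."""
    rows = conn.execute(
        "SELECT follower_id FROM follows WHERE followee_id = ? ORDER BY follower_id",
        (user_id,),
    )
    return [row[0] for row in rows]


print(following(1))  # [2, 3]
print(followers(3))  # [1, 2]
```

Multi-hop traversals such as "friends of friends" are where a dedicated graph store may become more convenient than repeated self-joins.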

  • @prashantsingh1096
    @prashantsingh1096 3 years ago +1

    Thank you. This video shows that even a hard topic like system design can be taught in a simple way. I like this.

  • @yobbei
    @yobbei 2 years ago +5

    Great video. Have you considered talking more about the APIs?

  • @richann6637
    @richann6637 2 years ago +3

    Nice video! Which software do you use for the diagram?

  • @smalex
    @smalex 2 years ago +1

    Most Instagram photos were compressed to JPEGs of between 150 KB and 190 KB,
    not 5 MB per photo.

    • @shreynaygandhi8044
      @shreynaygandhi8044 2 years ago

      Yeah, that's true. This tutorial should be taken as the design of a social media app in general. FB does not compress, so they need the storage, whereas Instagram needs to keep large files in memory for the compression operation, so it's a different challenge. That's what I think.

  • @javacodeexercises3996
    @javacodeexercises3996 3 years ago +1

    Great video, excellent explanation. There are two aspects that I would do differently, wondering what everyone else thinks:
    1 - Database is shared between multiple services, which introduces coupling between them.
    2 - I don't see a massive benefit in having separate services for read and write paths, I'd just have one service with a cache behind the reads.

    • @rafaelpierre4263
      @rafaelpierre4263 2 years ago

      Instagram has multiple databases for read and write. They explain the reasoning for this in this keynote ruclips.net/video/hnpzNAPiC0E/видео.html

  • @stepanmanko5733
    @stepanmanko5733 1 year ago

    "There are relations between data, so we will use a relational DB system." This is a very debatable statement, since there are always relations between data.

    • @tryexponent
      @tryexponent 1 year ago

      Hi Stepan! While relationships between entities do not necessarily mean that a relational DB system should be used, it is the best choice in this case. Hope this helps!

  • @Dan-tg3gs
    @Dan-tg3gs 3 years ago +1

    100TB should be 0.1PB not 1PB

  • @AmanVerma-lt7px
    @AmanVerma-lt7px 3 years ago

    This is a great video, and I haven't even finished watching it. I really like how the requirements are collected and quantified.

  • @tannerbarcelos6880
    @tannerbarcelos6880 3 years ago +3

    Learned a ton from this. I have never had a systems interview, and I am a new grad. My potential upcoming onsite has a design portion, so I'm trying to grind through these and learn a ton!

    • @tarunstv796
      @tarunstv796 3 years ago +1

      Ton?!

    • @arobot4398
      @arobot4398 3 years ago +5

      doesn't make sense asking a fresh grad system design questions.

    • @SumedhSen97
      @SumedhSen97 3 years ago

      I am a new grad as well and have a sys design interview coming up. How deep did you go in the system? what all did you discuss and what all did you leave abstracted?

    • @ernestmummey6446
      @ernestmummey6446 2 years ago

      @arobot4398 I have an interview coming up that is a system design interview. They are pretty common, to be honest.

    • @ConernicusRex
      @ConernicusRex 2 years ago

      @arobot4398 most F500s hiring for anything above entry level have at least one systems design and architecture interview in the process.

  • @rahul10anand1
    @rahul10anand1 1 month ago

    This is a very basic attempt at solving this problem. Though I appreciate you putting out this video for us to view, it doesn't cover many topics. For cases where you need to optimize your database, you might need to go into sharding. Since sharding is a beast of a topic in itself, you should have further explained the shard key and the reasoning behind it. The meat of the problem statement for designing either Twitter or Instagram is the optimisation of the news feed. If you draw a block around it and never explain it in depth, then it doesn't serve any purpose. Further, there is the fan-out process or polling process to get new feeds, and special handling when a celebrity uploads a post.

  • @AlikElzin
    @AlikElzin 4 years ago +1

    Love the number crunching around 6:00.

    • @_SoundByte_
      @_SoundByte_ 4 years ago

      Alik Elzin how is 100 TB equal to 1.2 PB?

    • @AlikElzin
      @AlikElzin 4 years ago

      @_SoundByte_ 100 TB is the monthly figure; he then multiplied it by the 12 months in a year.

    • @tryexponent
      @tryexponent 3 years ago

      We love it too :)
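
The arithmetic behind this thread can be checked in a few lines. The 10 million users and 5 MB per photo figures appear elsewhere in these comments; the two photos per user per month is an assumption chosen here so that the monthly total comes out to the 100 TB discussed above.

```python
MB = 10**6
TB = 10**12
PB = 10**15

monthly_users = 10_000_000      # "10 million users" from the comments
photos_per_user_per_month = 2   # assumption, chosen to reproduce the 100 TB figure
photo_size_bytes = 5 * MB       # "5 MB per photo" from the comments

monthly_bytes = monthly_users * photos_per_user_per_month * photo_size_bytes
yearly_bytes = monthly_bytes * 12

print(monthly_bytes / TB)  # 100.0 (TB per month)
print(monthly_bytes / PB)  # 0.1   (PB per month)
print(yearly_bytes / PB)   # 1.2   (PB per year)
```

So 100 TB is 0.1 PB per month, and 1.2 PB is the yearly total rather than a unit conversion of 100 TB.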

  • @mannydsz
    @mannydsz 4 years ago +4

    Good for a basic understanding of a systems design interview. However, it is a quite incomplete example. How do you make 1.2PB of data fit into a single "Metadata DB"? How do you shard it? Based on what? You didn't use much of the data from your back-of-the-envelope estimates in your design. Didn't specify much on API interfaces. How do you update your cache generated by the feed generation service based on new posts from users you follow assuming you always want to see newer posts first? Lots of very basic assumptions and half-baked solutions.

    • @IsraelLazoPlus
      @IsraelLazoPlus 3 years ago

      Are you sure it's 1.2 PB of database data? Aren't you referring to the image files, which don't go into the database?

    • @mannydsz
      @mannydsz 3 years ago

      @IsraelLazoPlus in terms of images, that is probably what Instagram handles in hours.

  • @didoma73
    @didoma73 3 years ago +1

    17:13 one of the most well explained breakdown of monolithic arch vs microservices I've ever heard

  • @robertkozik4845
    @robertkozik4845 3 years ago

    Overall, this is a good demonstration of what a systems design interview could be like for a less experienced candidate, but Google or Facebook would likely want to see more than this if you're more senior, for a variety of reasons: your implementation language can greatly affect a design (e.g. stateless vs stateful), traffic is not evenly distributed throughout the day, if they shipped a service you'd hit 10M users in hours or days, etc. All of which would require you to handle writes (and to a lesser extent reads) differently from what you've shown in this design.
    Not shitting on the video, I think it's quite accurate, and a good introductory video.

    • @gsb22
      @gsb22 3 years ago

      Yes, if you have more than 4-5 YOE, this video won't cut it. This is basically a generic distributed system.

    • @SumedhSen97
      @SumedhSen97 3 years ago

      so, there's no need to go into more depth for a new grad sys design interview right? I have one coming up and am super confused as to how deep i should go, cuz i don't have knowledge about most of the details of a system

    • @gsb22
      @gsb22 3 years ago +2

      I'm surprised they're asking SD to a new grad. Yeah, you would be fine as long as you can put correct components and explain them a bit..

    • @SumedhSen97
      @SumedhSen97 3 years ago

      @gsb22 Any more resources you would suggest for an abstracted overview of SD? Something not too complex, but detailed enough for a new grad interview?

    • @gsb22
      @gsb22 3 years ago +1

      @SumedhSen97 Google "System Design Primer" on GitHub, but this would be too deep. I would suggest watching generic videos to get the hang of it. You seriously aren't expected to defend single-leader vs leaderless replication. Check out Gaurav Sen and the other videos on YT.

  • @MykolaDolgalov
    @MykolaDolgalov 8 months ago

    This is a very high level answer

  • @SilentTremor
    @SilentTremor 2 years ago

    This is functional design, or more precisely, uploading/retrieving images.

  • @ksuhdilla
    @ksuhdilla 2 years ago

    I just have to comment the sound of your keyboard is so nice

  • @afraz-khan
    @afraz-khan 1 year ago

    It's simple and compact, loved it, thanks.

  • @hewhocanfly
    @hewhocanfly 2 years ago

    Good explanation. You made a complicated problem easy to follow and understand!

  • @ADiLetJoldoshbekov
    @ADiLetJoldoshbekov 3 years ago +1

    100TB = 0.1 PB

  • @tavoleyva8235
    @tavoleyva8235 4 years ago +3

    Hi! Why not explore a hybrid database implementation using RDBMS and NoSQL databases? How would this system handle the bottlenecks when it starts scaling? What would the response time be? Also, implementing queues between the cache and the database could help. It's a clean implementation, great quality, and a possible approach.

    • @tryexponent
      @tryexponent 4 years ago +3

      Hey Gustavo! Good questions. In a sense, we are using a hybrid approach with a relational database plus a key-value store like Redis, but you could use other types of RDBMS and NoSQL systems, too. In reality, Instagram uses Postgres + Cassandra, with some advanced indexing and sharding strategies which you can read about on their blog! Let us know if that helps! instagram-engineering.com/handling-growth-with-postgres-5-tips-from-instagram-d5d7e7ffdfcb

    • @thatsamorais584
      @thatsamorais584 4 years ago

      @tryexponent I think I came across the same article in my own research. Totally agree.

  • @nguyenhoanvule5755
    @nguyenhoanvule5755 2 years ago

    Your content is good enough for an overview but not deep enough for a system design interview.

  • @BABEENGINEER
    @BABEENGINEER 2 years ago

    What if you're asked to build a Logging API? Then it would just be client-side right? Would you go into details about servers and databases if it's mostly client-side?

  • @shehbazjaffer
    @shehbazjaffer 3 years ago +1

    Excellent Video! I was wondering what rationale should an interviewee give if asked about the scalability (specifically, write scalability) after choosing a relational database model?

  • @akashjain2990
    @akashjain2990 4 years ago +5

    Great video! Thanks for creating this content! I thoroughly liked the half-hour video. If I could suggest one thing, it would be a little more of a deep dive into the design: API, sharding, replication etc. Also, in the read case you are connecting the mobile device directly to the object store, so why not in the write case as well? The device could write directly to the object store and just push the metadata through the app server (write). Are there any downsides you were thinking of?

    • @stanislausaprankou3495
      @stanislausaprankou3495 3 years ago +2

      I might be wrong, since I'm also not an expert in system design, but my reasoning would be as follows: We can introduce a CDN between S3 and clients, making serving static files much faster. But I'm not sure if we can speed up the upload process (from the client to S3) in a similar way.
      Also writing directly to the S3 bucket just doesn't seem right to me from the security perspective

    • @Arrygoo
      @Arrygoo 2 years ago +2

      @stanislausaprankou3495 you can request a signed URL from the server and use that to upload directly to S3. Passing all those large files through the server is going to add so much unnecessary cost.
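
The signed-URL idea can be sketched without any cloud SDK: the app server signs the object key and an expiry time with a secret, and the storage side accepts the upload only if the signature checks out and the URL has not expired. This is a simplified HMAC scheme for illustration, not S3's actual signing protocol; `SECRET`, the `storage.example.com` host, and both function names are hypothetical.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical key shared by the app server and storage


def sign_upload_url(object_key, ttl_seconds=300, now=None):
    """App server side: issue a short-lived URL the client can upload to."""
    issued_at = time.time() if now is None else now
    expires = int(issued_at + ttl_seconds)
    message = f"PUT\n{object_key}\n{expires}".encode()
    signature = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    query = urlencode({"expires": expires, "signature": signature})
    return f"https://storage.example.com/{object_key}?{query}"


def verify_upload_url(url, now=None):
    """Storage side: accept only unexpired URLs with a valid signature."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    object_key = parsed.path.lstrip("/")
    message = f"PUT\n{object_key}\n{expires}".encode()
    expected = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(signature, expected)


url = sign_upload_url("photos/p1.jpg", ttl_seconds=60, now=1000)
print(verify_upload_url(url, now=1030))                      # True: still valid
print(verify_upload_url(url, now=2000))                      # False: expired
print(verify_upload_url(url.replace("p1", "p2"), now=1030))  # False: key tampered
```

With this flow the image bytes never pass through the app server, which only handles the small signing request and the metadata write.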

  • @JovaCoder
    @JovaCoder 7 months ago

    The title is a bit misleading. Although this video is a great explanation of how to design a system such as Instagram, it is far from a mock interview. It does not capture the essence of an on-site interview, where an interviewer asks you to design a system and you are responsible for getting the interviewer's buy-in on your use cases and requirements. In this demonstration the requirements are perfect, since they are made up by the only person attending the interview. Again, this is great for learning how to design a system and how to talk through your design out loud, but it is not a mock interview. There is a big difference.

  • @nicoschwab545
    @nicoschwab545 3 years ago +1

    The one thing that I would argue with here is the DB choice. You are creating 1.2 PB of information per year; storing that in a SQL DB does not scale. You will rapidly find yourself sharding and fighting with key hashing. You will also need a master-slave architecture to handle the read-heavy load. So basically you will have a sharded DB, with each shard using an M-S architecture.
    That suggests the best choice is probably a NoSQL DB, in this case something like Cassandra (a column-family DB). These kinds of DBs were created to scale much better than SQL and will provide you with high availability and fault tolerance, sacrificing consistency.

  • @FirefoxGuy18
    @FirefoxGuy18 3 years ago +1

    Great work, I learnt a lot today and the video was well paced and informative. Thanks

  • @anandjain8668
    @anandjain8668 3 years ago

    I loved this video. You explained it in such an easy way. I was able to connect everything. Thanks man :)

  • @sugurulovestokyo1260
    @sugurulovestokyo1260 2 years ago

    Thank you for posting this! Really helpful!

  • @rohitparthasarathy6671
    @rohitparthasarathy6671 3 years ago

    Wow - You just earned a subscriber - Excellent pace and detail. Thanks!!

  • @OffbeatTravelVK
    @OffbeatTravelVK 2 years ago

    Really helpful video Jacob, super detailed in explaining the architecture. Thank you

  • @MrNate858
    @MrNate858 3 years ago +4

    What program do you use to make these designs in your videos?

  • @aditya8404
    @aditya8404 3 years ago +1

    Why was the choice an RDBMS and not something like a graph database, which makes more sense for querying at larger scale?

  • @xiangweichen
    @xiangweichen 2 years ago +1

    What’s the drawing tool you are using? Looks really nice! Share a link?

  • @noelomondi4849
    @noelomondi4849 3 years ago +5

    I have a system design interview coming up next week, what tool are you using for sketching?

    • @furkan2640
      @furkan2640 2 years ago

      Can you please mention which tool you used in the interview?

  • @mariamcduff4394
    @mariamcduff4394 8 months ago

    Very helpful - Thank you.

  • @mysterio7385
    @mysterio7385 2 years ago

    The database model wouldn't scale, since we can potentially have n^2 rows in the follower table, where n is the number of users.

  • @YousefSh
    @YousefSh 2 years ago

    Might need a pub/sub service, a CDN (which was talked about but not shown), and a notification service

  • @BeHappyAndNice
    @BeHappyAndNice 1 year ago

    Thanks for sharing this amazing knowledge.
    I could not understand whether the generated feeds are stored in the cache per user or not, because I think storing feeds per user would cause the cache to face lots of hits.

  • @nicholaslorio2985
    @nicholaslorio2985 2 years ago

    Great video. But why is no one else talking about how great your keyboard sounds? What's the keyboard build?

  • @rajeshdansena
    @rajeshdansena 4 years ago +2

    This is one of the shortest and best system design videos I have come across, although you could have talked about the CAP theorem while discussing the trade-offs of SQL and NoSQL. No worries though. I would really appreciate it if you could further discuss class diagrams and the relations among them, and the APIs which should be exposed (API design/classes), in your upcoming system design video.
    Thanks again for the video. Keep up the good work buddy!!

    • @tryexponent
      @tryexponent 4 years ago +1

      Thanks Rajesh, you got it!

  • @sudhaganesh6419
    @sudhaganesh6419 4 years ago +2

    Great presentation. Thank you :)

  • @seemadubey7247
    @seemadubey7247 1 year ago +1

    Great video. I knew all the components, but you did a great job putting it all together. One small note: the caching layer is generally implemented at the CDN or load-balancing layer, so that it is closer to where the users are and prevents backhauling of traffic to where the servers are.

  • @anuragsengupta3725
    @anuragsengupta3725 9 months ago

    What is the difference between the App server read and the feed generation service?

  • @riteshthombre2846
    @riteshthombre2846 4 years ago +2

    Once the endpoint URL for the image is found by searching the metadata DB, would the client fire off a new request to get that image from S3 or the CDN? Or would this happen in the same request, i.e. would an additional microservice call S3 or the CDN and return the image in the same request?

    • @tryexponent
      @tryexponent  4 years ago

      Hey @Ritesh Thombre! Typically the client will receive the first response from the web server, then make separate requests to the CDN for the image/video assets. This allows the client to optimize when to load these images for the best performance (e.g. when scrolling)

    • @riteshthombre2846
      @riteshthombre2846 4 years ago

      @@tryexponent Agreed. I guess it's just a design choice. Ours is a mini banking application: we search for a customer's eStatement through its metadata, get the DocID, and generate the URL. Then we hit the cloud storage and return the document in the same request.
      What you are proposing looks like it will add another network call, but that's fine. I think it's a matter of design choice.

  • @AnasMughal
    @AnasMughal 2 years ago

    Good video... Thanks.

  • @markdavis1358
    @markdavis1358 2 years ago

    Great video, really helpful.

  • @shashanktadaiya7027
    @shashanktadaiya7027 4 years ago +5

    What is the name of the whiteboard software you use for teaching?

    • @tryexponent
      @tryexponent  4 years ago +4

      Hey Shashank! It's called whimsical.com/ - what did you think of the tool?

    • @jaydeeppatwardhan9120
      @jaydeeppatwardhan9120 4 years ago +2

      @@tryexponent Whimsical is an excellent tool which I plan to use in my upcoming TPM interview. Thanks for the excellent video, it's a good high level.. wish you could elaborate little more on the "pre-compute" logic for the news feed generation and how it should differ for the "normal" users vs "Celebrity" users because I guess that's an important design consideration as well

    • @tryexponent
      @tryexponent  4 years ago

      Thanks Jaydeep! How would you have thought about the celebrity users?

    • @shashanktadaiya7027
      @shashanktadaiya7027 4 years ago +1

      Thank you team!

    • @shashanktadaiya7027
      @shashanktadaiya7027 4 years ago +1

      @@tryexponent Just explored it. It's really cool.

  • @truemanrt
    @truemanrt 3 years ago +1

    Good video, it would help if you can include rest interfaces (not in great detail, but something) as well in the future. A bit more diving into depth would help.

  • @benisrood
    @benisrood 16 days ago

    The feed is most likely going to be the *crucial* aspect of the design requirement for this scenario, and you didn't even do it.

  • @cloud5887
    @cloud5887 4 years ago +12

    Awesome video, thanks for that. Question: you estimated a storage capacity of 1.2PB but still decided on a SQL DB. How does that align with our goal of building a scalable system? I was expecting a NoSQL choice here. Please correct me if I'm wrong.

    • @tryexponent
      @tryexponent  4 years ago +9

      Hey Ash! Thanks for the question. We should have clarified that this estimate is for the size of the images. The database would just store references (URLs) to the images, which can be stored in a storage system like Amazon S3 and served by a CDN.

    • @cloud5887
      @cloud5887 4 years ago +1

      @@tryexponent Thanks for the clarification, that makes sense.

    • @liraneli
      @liraneli 4 years ago +6

      It still makes more sense to go with a NoSQL DB for scalability.

    • @devendraagarwal9510
      @devendraagarwal9510 4 years ago

      I have a similar doubt too. Even if we ignore the size, is relational db scalable enough for this kind of problem?

    • @wulymammoth
      @wulymammoth 4 years ago +7

      Liran elisha this would not be sufficient - scalability is not one dimensional. You can read the actual story about how Instagram originally did it at High Scalability. There is a cost to using NoSQL - schema on load versus schema on read. If you care about data integrity and operational familiarity, it’s hard to beat SQL. These are considerable trade-offs worth sharing and discussing during an interview. It’s not blatantly obvious why some hand-wavy NoSQL solution is more scalable. Which part? I don’t doubt that there is a NoSQL solution that makes some bottleneck in the system much more performant, but rather we should go with a choice and be able to defend it and also discuss alternatives and their trade-offs

  • @Sam-yt8nc
    @Sam-yt8nc 2 years ago

    You just designed a monolithic solution for Instagram. Did they really like that? Where is the domain model? Where are the microservices? The API gateway? Where is event sourcing? Eventual consistency versus strong consistency? There is no way you can scale this to 10 million users on one single database shared between multiple services.

  • @srki80
    @srki80 3 years ago +1

    What is the tool you are using for drawing/diagraming?

  • @hunterxg
    @hunterxg 2 years ago

    Always funny to me when people in the US always forget about i18n

  • @itsmechicpalak
    @itsmechicpalak 1 month ago +1

    Isn't 100TB=0.1PB(in decimal units)?

    • @tryexponent
      @tryexponent  1 month ago +1

      Sorry for the confusion due to the misuse of the "=" sign! It was 100TB (0.1PB) per month, but we wanted to find the amount per year which was 0.1PB * 12 months = 1.2PB.
      A clearer presentation of the workings would be:
      Per month: 100TB = 0.1PB
      Per year: 100TB * 12 = 1.2PB
      Hope this clarifies it for you!
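
The monthly-to-yearly conversion in the reply above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the constant names are illustrative, and decimal units (1 PB = 1000 TB) are assumed, as in the question.

```python
# Assumption: decimal units, i.e. 1 PB = 1000 TB.
TB_PER_PB = 1000
MONTHS_PER_YEAR = 12

monthly_tb = 100  # monthly image storage figure quoted in the reply

monthly_pb = monthly_tb / TB_PER_PB          # 100 TB expressed in PB
yearly_tb = monthly_tb * MONTHS_PER_YEAR     # storage added over 12 months, in TB
yearly_pb = yearly_tb / TB_PER_PB            # the same yearly figure in PB

print(f"Per month: {monthly_tb} TB = {monthly_pb} PB")
print(f"Per year:  {yearly_tb} TB = {yearly_pb} PB")
```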

  • @tanmoybanerjee
    @tanmoybanerjee 2 years ago

    nice explanation.

  • @karanbhatia2834
    @karanbhatia2834 2 years ago +1

    It would be awesome if someone could tell me what whiteboard/notepad application is used in this video. Could come in super handy during a system design interview!

    • @Tuyenrc
      @Tuyenrc 2 years ago

      Whimsical

  • @balrajvishnu
    @balrajvishnu 2 years ago

    Well done, nicely explained

  • @kapil8965
    @kapil8965 4 years ago +2

    Hey, great video Jacob, I learned a lot. I really liked the way you explored everything and gave fairly reasonable and convincing solutions. Thanks again for the video, I really appreciate the effort.

    • @tryexponent
      @tryexponent  4 years ago

      Thanks Kapil! If we had to make another video next, what should the topic be? :)

    • @kapil8965
      @kapil8965 4 years ago +2

      @@tryexponent Typeahead Suggestion or Twitter Search would be interesting topics.

  • @дигро-у3с
    @дигро-у3с 3 years ago +1

    Hello! What do you use to draw the diagrams in this video?

  • @raghavsaxena96
    @raghavsaxena96 2 years ago

    Feed computation might be complicated; I wanted to know more about it.

  • @vladimirmokeev2856
    @vladimirmokeev2856 3 years ago

    1. Hourly feed generation seems impractical: feedback time is very important for social networks. Maybe it should be triggered per user by events (follow, unfollow, upload, ...)?
    2. Some very popular users should maybe be processed separately, because they generate a ton of events for their followers.
    3. You didn't estimate any sort of traffic.
    4. You didn't mention any non-functional requirements.
    5. You didn't say how you would scale the metadata database horizontally.
    And so so much more.
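
Points 1 and 2 above describe what is often called a hybrid fan-out: ordinary users' posts are pushed into followers' feeds when the upload event happens, while very popular users' posts are pulled in when a feed is read. The following is a minimal in-memory sketch of that idea, not the design from the video. All names (`follow`, `publish`, `read_feed`, `CELEBRITY_THRESHOLD`) are hypothetical, and the dictionaries stand in for a real database and cache.

```python
from collections import defaultdict

# Hypothetical cutoff; kept tiny so the demo below is easy to follow.
CELEBRITY_THRESHOLD = 3

followers = defaultdict(set)         # author -> users who follow the author
following = defaultdict(set)         # user -> authors the user follows
posts_by_author = defaultdict(list)  # author -> [(timestamp, post_id)]
feed_cache = defaultdict(list)       # user -> [(timestamp, post_id)] pushed at write time


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def is_celebrity(author):
    return len(followers[author]) >= CELEBRITY_THRESHOLD


def publish(author, post_id, timestamp):
    """Triggered by an upload event instead of an hourly batch job."""
    posts_by_author[author].append((timestamp, post_id))
    if not is_celebrity(author):
        # Fan-out on write: push the post into each follower's cached feed.
        for user in followers[author]:
            feed_cache[user].append((timestamp, post_id))
    # Celebrity posts are not pushed; they are merged in at read time.


def read_feed(user, limit=10):
    # A set removes duplicates if an author crossed the threshold
    # after some of their posts had already been pushed.
    items = set(feed_cache[user])
    for author in following[user]:
        if is_celebrity(author):
            # Fan-out on read: pull the celebrity's posts now.
            items.update(posts_by_author[author])
    newest_first = sorted(items, reverse=True)
    return [post_id for _, post_id in newest_first[:limit]]


# Demo: "star" has three followers and is treated as a celebrity.
follow("alice", "bob")
for fan in ("alice", "carol", "dave"):
    follow(fan, "star")

publish("bob", "b1", timestamp=1)
publish("star", "s1", timestamp=2)

print(read_feed("alice"))
```

The trade-off is that reads for users who follow celebrities do a little more work, in exchange for avoiding a write to every follower's feed each time a celebrity posts.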

  • @82andyphillips
    @82andyphillips 3 years ago

    Potentially use a CDN in front of the LB? A CDN that is programmatically invalidated when a write on the app server succeeds.

  • @batelluzon6931
    @batelluzon6931 2 years ago

    Amazing!! thanks:)

  • @MehdiG-gj4bx
    @MehdiG-gj4bx 1 year ago

    Great video content, wishing you the best.

  • @abhi77kumar
    @abhi77kumar 2 years ago

    Is it a good idea to cache the feed, given that it changes frequently?

  • @prashantsalgaocar
    @prashantsalgaocar 3 months ago

    The APIs were not constructed in this design. I think we should create the entities after confirming the APIs and doing the high-level design and some level of deep dive.

  • @mfe_
    @mfe_ 2 years ago +1

    Starting with DB schema, and not with the user flow is a huge "noob" mistake. It gives away, that you just prepared this scenario in advance by just looking at the end result without understanding how to get there.
    Not to mention that the solution itself is very "junior" -ish. For example, passing a file through "app service" and not uploading it directly to the object storage - is a big mistake, you basically multiplied the amount of traffic that runs through your system by 2 for no reason. Populating cache synchronously on reading - is also a big mistake. And pretty much nothing was said about how to prepare a feed, which is the most interesting and loaded part of the design.
    And by the way, this is a classic example of a distributed monolith - you tied everything to one single database.
    2/10.
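
One common alternative to routing file bytes through the app service is for the app server to hand the client a short-lived signed URL and let the client upload straight to object storage. The sketch below illustrates only the signing idea using the Python standard library. It is not the signing scheme of S3 or any real provider, and `SECRET_KEY`, `issue_upload_url`, `storage_accepts`, and the `storage.example.com` host are all hypothetical.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlparse

# Hypothetical secret shared by the app server and the storage service.
SECRET_KEY = b"demo-secret"


def _sign(object_key, expires_at):
    message = f"{object_key}:{expires_at}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_upload_url(object_key, ttl_seconds=300, now=None):
    """App server: return a URL the client can upload to directly."""
    now = time.time() if now is None else now
    expires_at = int(now) + ttl_seconds
    signature = _sign(object_key, expires_at)
    return (
        f"https://storage.example.com/{object_key}"
        f"?expires={expires_at}&sig={signature}"
    )


def storage_accepts(object_key, expires_at, signature, now=None):
    """Storage side: accept the upload only if the URL is unexpired and untampered."""
    now = time.time() if now is None else now
    if now > expires_at:
        return False
    return hmac.compare_digest(signature, _sign(object_key, expires_at))


# Demo with a fixed clock so the result is reproducible.
url = issue_upload_url("photos/123.jpg", now=1000)
query = parse_qs(urlparse(url).query)
expires_at = int(query["expires"][0])
sig = query["sig"][0]

print(storage_accepts("photos/123.jpg", expires_at, sig, now=1100))  # within the TTL
print(storage_accepts("photos/123.jpg", expires_at, sig, now=2000))  # after expiry
```

With this flow the app server only handles a small metadata request, and the image bytes cross the network once, from the client to storage.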

    • @soyaro
      @soyaro 2 years ago

      Please give alternatives to the weaknesses you spot rather than pointing them out and walking away. You're not helping anyone that way, Mark.