Design Twitter - System Design Interview

  • Published: Dec 21, 2024

Comments • 345

  • @NeetCode
    @NeetCode  2 years ago +84

    So are you guys interested in working at Twitter? 😅 Btw, don't forget to "Batch" click the like & subscribe buttons.
    🚀 neetcode.io/ - Get lifetime access to every course I ever create!

    • @criostasis
      @criostasis 2 years ago +7

      You should leave Google for Twitter

    • @BhargavSushant
      @BhargavSushant 2 years ago +13

      Tweet this video to Elon, he might make you CEO; he is weird like that.

    • @NeetCode
      @NeetCode  2 years ago +6

      @@BhargavSushant lol maybe i should

    • @cofing_challenge_1
      @cofing_challenge_1 2 years ago +4

      Yes, hired, then fired the next day

    • @wondays654
      @wondays654 2 years ago

      @@NeetCode you should look at your website. I tried to go pro, but for some reason the Google API won't let me sign up. I don't know if I'm the only one with the problem or if it is general.

  • @BangMaster96
    @BangMaster96 1 month ago +27

    Man, finding a job as a Software Engineer is just crazy.
    You need to go through at least 4 to 6 rounds of interviews, starting with a technical take home challenge, then a follow up discussion about that challenge, and then another live technical coding interview, and then a live behavioral interview, and then a live system design interview, and then maybe a product delivery interview, and probably a chat with CTO or VP at the end of that.
    And then, once you're hired, you're just gonna be focused on fixing bugs and building features, it is very rare that you are creating a fresh system from scratch, unless you're working at a start-up, and even then, you're going to be working with other Engineers to design that system.
    In most other industries, you typically learn what you need for the job over time, through hands on experience. Only in Software Engineering do all Companies just expect you to be a data structure and algorithm wiz, have previous experience so you can answer those behavioral questions, and then design some abstract system from scratch within 1 hour, just to get hired.

  • @Lintlikr1
    @Lintlikr1 1 year ago +222

    Being an SWE these days is just insane. Any other job, you'd get hired then learn the system over time and by working with people at the company. As an SWE you have to already know how Twitter works just to get through one of the six or so interviews to get a job fixing bugs or writing new features. Does every other SWE know this shit just from going to school or working in the field for a few years? I've been an SWE for 10 years and these are all semi-new concepts to me. I've never once had to design a system like this, but I guess now companies want you to be an expert on day one. I thought I could avoid cramming algorithms and system design stuff if I didn't try to get a job at FAANG, but now every little startup expects you to be a senior-level engineer just to make 140k. I feel like my 10 years of experience count for literally nothing.

    • @garlicpress6121
      @garlicpress6121 8 months ago +35

      10 Years and you barely did system design? Typically getting up in seniority means having to take a higher level approach to problems and leaving the implementation to juniors

    • @Chubbywubbysandwich
      @Chubbywubbysandwich 7 months ago

      @@garlicpress6121 I feel like web-based software has skewed everyone's perception, and it makes people think that this is the only kind of software dev in the world. There are so many other domains where you would never need to know this sort of stuff for interviews or even for the work itself. For example, someone working on low-level programming for drivers, OS-level software, or desktop applications.

    • @daddashikamani
      @daddashikamani 5 months ago +12

      I hear you. I have 15 years of experience and I am finding this very strange. I functioned without knowing Leetcode algos and this insane System Design stuff. And I did pretty well! I don't know what value these things are adding, TBH.

    • @vhchoang
      @vhchoang 5 months ago +8

      @@garlicpress6121 I think for a typical SWE, doing system design is common, but not to this degree. Normally it's working on top of or improving existing systems to add features or improve performance/scale/reliability, etc.

    • @r2ravi2008
      @r2ravi2008 4 months ago

      @@vhchoang if you work at a startup and a new product is created from scratch, you get the opportunity to design these kinds of things.

  • @damaroro
    @damaroro 2 years ago +221

    I would love to see more System Design content !! nice video man

    • @NeetCode
      @NeetCode  2 years ago +15

      Thank you, more to come!

    • @sanskarkaazi3830
      @sanskarkaazi3830 2 years ago +2

      @@NeetCode I think he meant on your youtube channel haha..

    • @indiging8330
      @indiging8330 2 years ago +1

      @@sanskarkaazi3830 obviously, what else could he mean?

    • @sanskarkaazi3830
      @sanskarkaazi3830 2 years ago +2

      @@indiging8330 neetcode has premium courses on his website as well so not there but here.. you get what i mean?

    • @xoladlamini3675
      @xoladlamini3675 1 year ago

      Can you talk about Pinterest, or can someone link some available content?

  • @cannabisanomaly
    @cannabisanomaly 5 months ago +7

    Starting from 4:26 to 7:38, that's pretty much superfluous arithmetic you're going to be doing during a systems design interview. The time you spend mentally calculating those numbers is going to be wasted, just to arrive at a conclusion of "it's a lot", which is almost a given in any systems design interview. Your time will be better spent calculating those numbers while you're doing your high-level design portion, if needed. One example of needing to calculate those numbers is a TopK system for trending topics in a social media feed (which doesn't pertain to a basic Twitter implementation).
    Ask your interviewer for DAUs and if it's anything over 100M, move on to the core components section (Tweet, User, Feed), rather than calculate capacity estimations.

  • @nettemsarath3663
    @nettemsarath3663 2 years ago +67

    wow !!! from algorithms to system design, love to see more on system design videos

  • @dukekong2412
    @dukekong2412 2 years ago +66

    I can't help but find it slightly hilarious that you released this video during the ongoing controversies happening at Twitter.
    But in all seriousness, amazing content!

    • @fauxz3782
      @fauxz3782 2 years ago +2

      Musk will hire him

    • @SunilPatil-hs8wd
      @SunilPatil-hs8wd 2 years ago

      It's because Musk tweeted the HLD of Twitter on Twitter. You can see that in the thumbnail of this video too.

  • @tiskahar9738
    @tiskahar9738 2 years ago +50

    This is a fantastic example of a realistic architecture screen. I would note for viewers that you will almost certainly not be able to think of and describe everything that was covered here and as someone who conducts 3 or 4 of these every week, I don't expect candidates to cover everything here in the 20-30 minutes I have with them. But as you go through this video, the issues presented scale really well with the expectations that go along with the seniority of the candidate and position. We actually skip a lot of the preliminary setup so that we can delve into the more complex issues for more senior candidates. If you're a mid level, I'm not expecting you to come at me talking about batching out feeds and dynamically updating them based on high popularity tweets.

    • @alexd7466
      @alexd7466 2 years ago +1

      No, with such a test you already filter for ex-Twitter employees. That would be fine if you're building a social network, but you'd miss out on all the brilliant devs who, for example, designed large e-commerce or data-pipeline architectures, because that requires a very different approach.

  • @jti107
    @jti107 2 years ago +275

    I guess twitter will be a case study in “does talent matter” and “how interchangeable/disposable are sw engineers”.

    • @KennethBoneth
      @KennethBoneth 2 years ago +65

      It will also be a case study on if these software companies are truly over staffed or not. If Twitter survives after laying off so many people it may inspire other companies to consider down staffing

    • @Mattarii
      @Mattarii 2 years ago +51

      @@KennethBoneth I think the main issue with scaling down on employees is that the remaining employees will essentially have to monitor and handle the same amount of work as before scaling down, which will cause additional stress and probably a less than healthy work life balance.

    • @bryanyang7626
      @bryanyang7626 2 years ago +38

      Not really, Tesla and SpaceX both are well known for the horrendous work environment. So it depends on the management and the owner of the company in this case.

    • @Mattarii
      @Mattarii 2 years ago +17

      @@bryanyang7626 that's true, might not work well with other companies once people start realizing their lives are worth more than slaving away

    • @KennethBoneth
      @KennethBoneth 2 years ago +20

      @@Mattarii That is true if you were properly staffed to begin with. If twitter is as overstaffed as many people believe, then a large chunk of employees are effectively doing nothing. IF twitter goes from properly staffed to understaffed, you are correct. If twitter is going from overstaffed to properly staffed, then that won't happen.

  • @RandomShowerThoughts
    @RandomShowerThoughts 1 year ago +25

    the biggest thing about sharding is that we could potentially lose the joins, and it adds a huge layer of complexity on the application.

  • @randysong823
    @randysong823 9 months ago +3

    Great video! One question (or perhaps a mistake): at 18:20, you say all the people this guy follows should be on one shard, but I don't think that's possible. If person A follows B and C, then B and C should be on one shard. If person E follows C and D, then C and D should be on one shard, but C is already on a shard with B. Maybe B, C, and D are all on one shard, but as long as each person follows some other person, we will end up with only one shard.

    • @baran8452
      @baran8452 3 months ago

      Thanks for this comment, I really didn't get this sharding thing :) Sharding per user looks impossible. I thought that maybe I had misunderstood this point, but after your comment it's clear.

    • @Squigglybiggly
      @Squigglybiggly 3 months ago +2

      Sounds like he meant "each person the user follows will be located uniquely within a single shard" and not "all the people he follows will be in the same shard". The phrasing isn't great.
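The objection raised in this thread can be checked with a small sketch. It assumes, hypothetically, that "everyone a user follows must share a shard" is treated as a hard constraint. That constraint is transitive, so a union-find over the follow graph shows which users are forced together. The function name `colocation_groups` and the sample follow graph are illustrative and not taken from the video.

```python
from collections import defaultdict

def colocation_groups(follows):
    """Group users that must share a shard if all followees of a user are co-located.

    follows: dict mapping a follower to the list of users they follow.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for followees in follows.values():
        for u in followees:
            find(u)  # register the user even if the list has a single entry
        for first, other in zip(followees, followees[1:]):
            union(first, other)

    groups = defaultdict(set)
    for user in parent:
        groups[find(user)].add(user)
    return sorted(sorted(g) for g in groups.values())

# A follows B and C; E follows C and D (the example from the comment above).
follows = {"A": ["B", "C"], "E": ["C", "D"]}
print(colocation_groups(follows))
```

With the two follow lists from the comment, B, C, and D collapse into a single group, which is the behavior the commenters describe: on a connected follow graph the constraint pushes everything onto one shard.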

  • @GeorgeHFonseca
    @GeorgeHFonseca 1 year ago +4

    Your content is way, WAY better than the others on YouTube! Great work!

  • @kaixuanhu8332
    @kaixuanhu8332 2 years ago +37

    Love your content, your videos helped me land a position at Twitter one year ago. But I just got laid off from Twitter and will start checking your videos again 😅

    • @NeetCode
      @NeetCode  2 years ago +10

      I'm sorry to hear that, wish you the best - it's only a matter of time!!!

    • @tct1787
      @tct1787 2 years ago +3

      me too😂😂

  • @gmanonDominicana
    @gmanonDominicana 1 year ago +12

    Once I had an interview explaining how to design something. I totally missed the point. This definitely gives us a clear idea.
    It's not about writing a user story, and not even building the actual application, but identifying the most critical points and possible components and to come up with how to solve it.
    Thanks again.

  • @karanbhatia2834
    @karanbhatia2834 2 years ago +13

    This level of quality content is available for free, it blows my mind! Also, I am churning through your Blind 75 list of questions and I am loving your solution videos.

  • @eldavimost
    @eldavimost 1 year ago +25

    Loved it!
    The only issue I see is sharding having all the people who follow each other in the same shard. That's just not possible, as a friend of yours will follow someone in another shard group at some point.
    I haven't got a good answer for that yet, apart from saying we should use a GraphDB here that hopefully is optimised for sharding this kind of data...

    • @arwinvinnysardana3266
      @arwinvinnysardana3266 1 year ago +3

      Yes, that seems like a big oversight. Each shard will have a subset of a user's followees, so the proposed user id as a shard key really doesn't do anything for us.

    • @Socsob
      @Socsob 1 year ago +1

      Yeah I felt like I was missing something when he said sharding and scrolled down to the comments to confirm

    • @salient244
      @salient244 1 year ago

      Just paused at that part, seems incorrect. The best sharding I think may be tweet id (assuming using chronological IDs like snowflake) as people are generally accessing the latest tweets so can grab them in a single request if it misses cache

    • @eldavimost
      @eldavimost 1 year ago

      @@salient244 yeah, but you'd still need to store the friend relationships somehow, and you'd get into the sharding issue when it scales up

    • @TheSdl79
      @TheSdl79 9 months ago +1

      You've got it wrong. The idea is to have all the _followers_ of the user in one shard. This way, when the user posts a tweet, you would get all their followers' ids from one shard with one query. Then you'd use this list of ids to update their respective feeds with the tweet. When the user requests their feed, they get it pre-computed from the cache, not built on the fly.
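The write path described in the last reply (look up the author's followers once, then push the tweet into each follower's cached feed) can be sketched as below. This is a minimal in-memory illustration, not the design from the video: `followers_of`, `feed_cache`, `FEED_LIMIT`, and the function names are all hypothetical, and a real system would keep these structures in a database and a cache tier.

```python
from collections import defaultdict, deque

FEED_LIMIT = 3  # hypothetical cap on how many tweets a cached feed keeps

followers_of = defaultdict(set)  # author id -> ids of users following the author
feed_cache = defaultdict(lambda: deque(maxlen=FEED_LIMIT))  # user id -> tweet ids, newest first

def follow(follower, author):
    followers_of[author].add(follower)

def post_tweet(author, tweet_id):
    # One lookup for the author's followers, then a push into each cached feed.
    for follower in followers_of[author]:
        # appendleft on a full deque silently drops the oldest entry on the right.
        feed_cache[follower].appendleft(tweet_id)

def get_feed(user):
    # Reads are served straight from the pre-computed feed.
    return list(feed_cache[user])

follow("u1", "author")
follow("u2", "author")
for tweet_id in (1, 2, 3, 4):
    post_tweet("author", tweet_id)
print(get_feed("u1"))
```

The cost of a post grows with the author's follower count, which is why the thread keeps returning to the celebrity case.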

  • @respondo
    @respondo 1 year ago +3

    Very much enjoyed the video, the explanation, the simplicity and the clarity it brought out. Thank you

    • @NeetCode
      @NeetCode  1 year ago

      Glad it was helpful!

  • @ejun251
    @ejun251 2 years ago +10

    Extremely good discussion in this video, more of this please!

  • @akashbhardwaj7810
    @akashbhardwaj7810 2 years ago +18

    What books / sources did you refer to get a strong grip on system design?

    • @NeetCode
      @NeetCode  2 years ago +29

      DDIA is the most comprehensive resource (assuming you have at least some experience).
      Also, most companies (including twitter) release blog posts and white papers about technical challenges they faced and how they overcame them. I think many beginners miss these, but they are an extremely valuable and free resource, which is why they are commonly referenced by system design textbooks.

    • @akashbhardwaj7810
      @akashbhardwaj7810 2 years ago

      Thanks!!

    • @sezer6200
      @sezer6200 2 years ago +1

      @@NeetCode Is there a central url where you find those blog posts or do you just google them?

    • @meepk633
      @meepk633 2 years ago

      A good place to start is by learning the classic OOP design patterns. It's less about the OOP and more about the patterns.

  • @roycechua
    @roycechua 11 months ago +2

    I can't believe that I just found this channel now. Great content

  • @_romeopeter
    @_romeopeter 2 years ago +3

    Wow! That's a lot to take in maybe because I'm sleepy but sparked at the same time. Put out more of this please.

  • @PhillyHank
    @PhillyHank 1 year ago

    Thanks!

  • @rajatsaraf237
    @rajatsaraf237 1 year ago +1

    If we shard based on a user id, won't it become a hotspot (if the user is a celebrity or has a large number of tweets)?

  • @tekforge
    @tekforge 5 months ago

    Amazing! This is one of the best System Design videos I've watched :) Great job!

  • @Dayogg
    @Dayogg 2 years ago +2

    Thank you for explaining in such detail. I learned about sharding, definitely will use in my projects.

  • @arthur723
    @arthur723 8 months ago +2

    At 18:12, how do you manage to get all the users one follows in a single shard? It seems strange to me. For that to work, all the users would need to be in a single shard, which defeats the purpose of sharding.

    • @Squigglybiggly
      @Squigglybiggly 3 months ago

      Sounds to me like he meant "each person the user follows will be located uniquely within a single shard" and not "all the people he follows will be in the same shard"

    • @arthur723
      @arthur723 3 months ago +1

      @@Squigglybiggly the two sentences you said mean the same to me. Or maybe my English is bad.

    • @Squigglybiggly
      @Squigglybiggly 3 months ago

      @@arthur723 For a given user following users a, b, c, with shards 1, 2, 3:
      User a's posts -> shard 2
      User b's posts -> shard 3
      User c's posts -> shard 1
      But not:
      User a -> some posts in shard 1 and some posts in shard 2
      So for any given user, all of their data will be in a single shard. However, the people you follow will not all have their data in the same shard.

  • @MuhammadFahreza
    @MuhammadFahreza 4 months ago

    14:11, why put the index on follower? I think once we index by followee, the follower list would be grouped inside the DB.

  • @mayankkumargupta9601
    @mayankkumargupta9601 1 year ago +1

    Literally Amazing man. Take a bow🙇‍♂️

  • @angelsancheese
    @angelsancheese 2 years ago +4

    Looking forward to part 2!!! More in-depth

  • @tzadiko
    @tzadiko 1 year ago +5

    First time Kim Kardashian has come up in any tech video I've watched

  • @kSergio471
    @kSergio471 1 year ago +2

    If sharding by user id then, to retrieve a single tweet (e.g. by a direct link), you would need to request all shards. Is it something tolerable or how do you overcome it?
    And what about hot user problem? Sharding by user id does not work well in this case.

    • @TheSdl79
      @TheSdl79 9 months ago

      Yep, but there is no requirement in this case to be able to request tweet by id directly without knowing the author of the tweet.

  • @YawarMurtaza-z9k
    @YawarMurtaza-z9k 6 months ago

    Very good tutorial as always from NeetCode. Kudos.
    One confusion though: I am aware of the publisher/subscriber pattern and I am also aware of message queues. What is new is the "Pub/Sub message queue". Not sure what that is. It looks more like the author is describing message queue behaviour than pub/sub.
    The impact you are creating is far better and huge than anyone working for FAANG.

  • @josephp1263
    @josephp1263 2 years ago +1

    I almost spilled my coffee when I heard the words "How hard can it be?" LOL

  • @yiyao1522
    @yiyao1522 2 years ago +8

    Correction 9:01 We can also implement sharding in most nosql databases.

    • @NeetCode
      @NeetCode  2 years ago +5

      That's correct, I meant that while NoSQL is easier to scale (automatically or by specifying a shard key), we can still scale relational DBs via sharding.

    • @-_______________________.___
      @-_______________________.___ 2 years ago +2

      Nice catch boss

  • @jordanhasnolife5163
    @jordanhasnolife5163 2 years ago +6

    Nice video! Gotta love some systems design

  • @user-kb5vl8hj4e
    @user-kb5vl8hj4e 3 months ago

    So if we want to return only the cache, but the user follows a celebrity, then it will not be up to date. That means every time the user comes we still need to query the list of people that user is subscribed to, right? To check whether there is a celebrity

  • @rewrite__
    @rewrite__ 4 months ago

    I just got asked this question in an interview, but with the added feature to follow interests too, and I am surprised I answered pretty much the same thing that is stated here and I passed the interview!, one thing to mention is that some companies/interviewers want to see SQL queries written in order to see how you make joins to the tables, so be prepared on that I would say

  • @maximilian19931
    @maximilian19931 2 years ago

    Caching the Feed page in the CDN and purge it on update(feed is tagged with User_ids), the infrastructure is basically a multi layer data retrieval, uid->followee->tweets(sorted by timestamps) and then merge to get the final result.
    The uid->followee mapping can be compactly stored and updated if needed. (K/V or RDB)
    followee->tweets would be a sharded DB with all tweets posted. (K/V).
    it would just be a simple backend and most of the load would be handled by the CDN.

    • @marspark6351
      @marspark6351 2 years ago

      That more or less is I think what he described for his feed cache description.
      But it doesn't solve the problem he brings up where we don't want to update all the followers' feed cache whenever a popular user posts a tweet.
      Also, I don't know how to do it, but when you say "on update", I'm assuming that whenever a person posts a tweet, all the users following that person gets "updated". In that case, then only thing that needs to be changed is inserting that new tweet into the feed (and probably popping out whatever oldest or least important tweet that is in the feed that this new tweet will replace). In that case, I don't think retrieving and merging all the relevant tweets each time there is an "update" makes sense. I think that's why he brought up pub/sub. So it's just a queue where whenever a new one comes the least important one gets popped out.

    • @TheIntraFerox
      @TheIntraFerox 2 years ago

      ​@@marspark6351 Maybe it's possible to determine a "popular" user and when those users create a tweet, only cache that tweet instead of allowing a message to go through the pub/sub when they post a tweet.

  • @progdynamic3114
    @progdynamic3114 2 years ago +1

    I have a question. In most read-intensive applications, the design adds a cache layer like Redis to shield the DB from traffic. Can I skip the cache and instead add as many read-only MySQL replicas as needed to distribute the traffic? A cache also has to deal with the sync problem between Redis and MySQL, but read-only replicas can get rid of this hassle.

    • @marspark6351
      @marspark6351 2 years ago

      I would believe this has less to do with whether it's SQL or noSQL, but probably more to do with that Redis makes better use of RAM than mysql. Don't take my word tho. Just a possible assumption

  • @marspark6351
    @marspark6351 2 years ago

    12:48
    Can you clarify what you meant by "I shouldn't be able to pass in your uid"?
    Are you saying that function should actually not take uid as input?

    • @StockDC2
      @StockDC2 2 years ago

      Pretty sure he means that someone shouldn't be able to use something like Postman to send a request with someone else's user id and retrieve all of their tweets.

  • @marspark6351
    @marspark6351 2 years ago

    I have a question about 23:46, on how that "update of the feed upon request instead of when a tweet is created" would work.
    So would the feed of a user keep continuously get updated via the message queue whenever there's a new tweet, except for the tweets of the popular one? And when that user requests the feed, it will somehow just fetch that missing tweet and fill it in the feed? How would that work?
    Isn't that the same issue as what it's described at 19:57 where 19 of your 20 tweets could be cached but you'll have to go to the disk to find that one tweet?

    • @TheSdl79
      @TheSdl79 9 months ago

      Before returning the feed, the app server would check if the user follows any celebrity (one query to follow table). Then get the tweets of the celebrities which user follows from the cache, and inject them into the feed based on the timestamp. This approach has significant downsides like increasing latency for all users, so I believe this problem is addressed differently in real world.
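The read-time injection described in this reply can be sketched as a merge of sorted lists. This is an illustrative sketch only: `build_feed`, its parameters, and the `(timestamp, tweet_id)` tuple layout are assumptions, and it presumes that both the cached feed and each celebrity's recent tweets are already sorted newest first.

```python
import heapq
from itertools import islice

def build_feed(precomputed, celebrity_tweets, limit=20):
    """Merge a cached feed with celebrity tweets, newest first.

    precomputed: list of (timestamp, tweet_id), sorted by timestamp descending.
    celebrity_tweets: one such list per followed celebrity.
    """
    merged = heapq.merge(
        precomputed, *celebrity_tweets, key=lambda t: t[0], reverse=True
    )
    # islice stops after `limit` items, so the rest of the inputs is never consumed.
    return [tweet_id for _, tweet_id in islice(merged, limit)]

cached = [(50, "t50"), (30, "t30"), (10, "t10")]
celebs = [[(40, "c40"), (20, "c20")], [(45, "k45")]]
print(build_feed(cached, celebs, limit=4))
```

The extra lookups on every read are the latency cost the reply points out; the merge itself is cheap because it only walks as far as `limit`.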

  • @OswinChou
    @OswinChou 2 years ago +6

    If you have the capacity for asynchronously pre-building timelines for all (active) users, why don't you increase the capacity of the cache layer for the RLDB, or store the tweets in a fast KV NoSQL?

    • @TheSdl79
      @TheSdl79 9 months ago

      Probably, having NoSQL KV-store with such massive reads you'd have to deal with its sharding anyways. Don't think you'd just set up Cassandra and start throwing in nodes to the cluster mindlessly. So, author, choosing SQL DB, just makes that logic explicit.

  • @garthbartin
    @garthbartin 6 months ago

    Something I didn't understand:
    You suggest sharding on user ID, as then the people a user follows will be grouped on the same shard.
    However, users can have a lot of followers and their followers will be distributed across different shards. So you have to duplicate a user's tweets across every shard that has someone following them in it in which case you probably have enough fanout that you're not really sharding anymore, it's just replicas with more steps (at least for the read case, writes would be meaningfully sharded).
    Am I missing something here? It feels like to get any value out of sharding you'd have to do something MUCH more complicated like assign users to shards based off similarity graphs.

  • @chessmaster856
    @chessmaster856 7 months ago +5

    Don't guess the capacity, there are infinite servers, infinite ram, infinite disk. Don't calculate. Only poor calculate. Is the design horizontally scalable? Yes. Go home now

  • @ankitsheoran1788
    @ankitsheoran1788 7 months ago +1

    If user A follows B and C and B follows A back, then all three should be on the same shard. In the same way, if B follows 10 more people and even one person follows back, then all those 10 should be on the same shard, and it goes on until all the data is on a single shard. It looks like a very abstract approach; I am not sure why people don't think a little more rather than explaining it in that abstract way.

  • @KevinRabaev
    @KevinRabaev 1 year ago +1

    which tool are you using to draw the diagrams?

  • @imakhlaqXD
    @imakhlaqXD 6 months ago +1

    If we have read heavy system why are we not using slave and master design

  • @raylin2527
    @raylin2527 1 month ago

    I really enjoy watching your video!!

  • @ROBIN12JBJ
    @ROBIN12JBJ 2 years ago

    The abstract design is vital! Now I have realized this point.

  • @jenokalinszki1696
    @jenokalinszki1696 2 years ago +1

    I am a little confused about the DB schema - can someone explain why would we favour indexing based on the follower rather than the followee? What's the advantage here?
    I would assume the former is more logical to implement but I can be wrong.

    • @NeetCode
      @NeetCode  2 years ago +4

      If we wanted all the people that user1 follows (in order to generate their news feed) we could run a query like:
      SELECT *
      FROM followers
      WHERE followers.followerId = user1
      Notice we are filtering by followerId.

    • @ketanambati4413
      @ketanambati4413 2 years ago +2

      Whenever we create the newsfeed we want to populate it with tweets of people that a user follows(followee). So when we index the follower, we can query the followees relatively quickly, meaning that we can get all the people that a user follows, which makes it easier to create their newsfeed.

    • @ismailmo4
      @ismailmo4 2 years ago

      I understood it as: when the timeline is loaded and we need to populate tweets, you're going to be querying for tweets based on the follower of that tweet as opposed to the person being followed (followee).
      So if we say the user loading the timeline is the current_user, a query (in a simplified world) would be like:
      SELECT tweets.content FROM tweets WHERE tweets.follower_id = current_user.id
      Note how our WHERE clause is on follower_id of that tweet and NOT the person who wrote that tweet (followee).

    • @jenokalinszki1696
      @jenokalinszki1696 2 years ago

      I see. Thank you all for explaining this!
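To make the answers in this thread concrete, here is a small runnable sketch using Python's built-in `sqlite3` module. The schema is an assumption for illustration: the table and column names (`follows`, `follower_id`, `followee_id`, `tweets`, `author_id`, `created_at`) and the index name are hypothetical and not the exact schema from the video. The point is that the feed query filters on the follower column, so that is the column the index covers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE follows (follower_id INTEGER NOT NULL, followee_id INTEGER NOT NULL);
-- The feed query filters on follower_id, so that is the indexed column.
CREATE INDEX idx_follows_follower ON follows (follower_id);
CREATE TABLE tweets (id INTEGER PRIMARY KEY, author_id INTEGER NOT NULL,
                     created_at INTEGER NOT NULL, content TEXT NOT NULL);
INSERT INTO follows VALUES (1, 2), (1, 3), (4, 2);
INSERT INTO tweets VALUES (10, 2, 100, 'from user 2'),
                          (11, 3, 200, 'from user 3'),
                          (12, 4, 300, 'from user 4');
""")

def feed_for(user_id):
    """Return tweet contents from everyone user_id follows, newest first."""
    rows = conn.execute(
        """
        SELECT t.content
        FROM follows AS f
        JOIN tweets AS t ON t.author_id = f.followee_id
        WHERE f.follower_id = ?
        ORDER BY t.created_at DESC
        """,
        (user_id,),
    ).fetchall()
    return [content for (content,) in rows]

print(feed_for(1))
```

An index on `followee_id` would instead serve the opposite question ("who follows this author?"), which is the lookup a fan-out-on-write path needs, so a design that does both may want both indexes.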

  • @rizthetechie
    @rizthetechie 1 year ago

    Agree on the part that, the data is more on relational side. But why can't we put the tweet in any NoSql db like cassandra, scylla. As from our follow table i know which followee's tweet i have to fetch. Now that i know, i simply have to search in shards the followee's tweet stored.

  • @hitarthdesai5271
    @hitarthdesai5271 2 years ago +3

    Amazing video, this has made me curious about systems design roles in industry

  • @andrewlee7574
    @andrewlee7574 2 years ago +18

    What a nice video, I learnt a lot even being a junior developer.
    Btw, how can I find the official twitter engineering paper you mentioned at the end?

  • @lianchengzhang5604
    @lianchengzhang5604 2 months ago

    Splendid! Solid content with crystal clear pronunciation and comfortable speed. How did you practice your speaking? I wish I could speak no er----en-----aa those no meaning words in a system design interview.

  • @zuowang5185
    @zuowang5185 11 months ago

    why do you index on follower, instead of making 2 db index on both

  • @veliea5160
    @veliea5160 2 years ago

    I understood why the userId helps as a shard key, but I did not understand why choosing "tweetId" as the shard key does not help. Why do we have to query all the shards if we shard based on "tweetId"? Can someone explain, please?
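One way to see the difference is to count how many shards hold a single author's tweets under each key. The sketch below is a toy model under stated assumptions: `NUM_SHARDS`, the modulo routing in `shard_of`, and the sample ids are all made up for illustration, and real systems typically hash the key rather than take it modulo directly.

```python
NUM_SHARDS = 4  # hypothetical shard count

def shard_of(key):
    return key % NUM_SHARDS  # simple modulo routing on an integer id

# (tweet_id, author_id) pairs; the ids are made up for illustration.
tweets = [(100, 7), (101, 7), (102, 7), (103, 7), (104, 9)]

def shards_for_author(author_id, shard_key):
    """Shards that hold at least one tweet by author_id under the given shard key."""
    index = 0 if shard_key == "tweet_id" else 1
    return {shard_of(t[index]) for t in tweets if t[1] == author_id}

print(shards_for_author(7, "tweet_id"))  # tweets spread over many shards
print(shards_for_author(7, "user_id"))   # all tweets on one shard
```

Under tweet-id sharding, a request for "recent tweets by author 7" cannot know in advance which shards hold them, so it has to ask every shard and merge the answers. Under user-id sharding the same request goes to one shard, at the price of the hot-user problem raised elsewhere in the comments.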

  • @thatomotaung6891
    @thatomotaung6891 2 months ago

    Awesome video, what are you using as your board?

  • @charan775
    @charan775 25 days ago

    how would feed cache work when user scrolls indefinitely

  • @gazijarin8866
    @gazijarin8866 8 months ago +5

    There shouldn't be any userId in the POST /v1/tweet/create endpoint. This is because we will get the id of the user initiating the request from the authentication token in the request header. Putting sensitive information like authentication tokens in the request body is a security risk

    • @Mike.Zazharskiy
      @Mike.Zazharskiy 3 months ago

      There's no difference in security, whether you put the token in the headers or the body. But it's better to put it in the headers because your gateway can start checking it or sending the request to the destination API before it downloads the body. Putting the userId in the body doesn't make sense here, but it would allow you to have other features like "postponed tweets". And another service with an internal token (without the userId) could call the existing API to post those messages.
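The point made at the top of this thread, that the author's id should come from the authentication token rather than from the request body, can be sketched as follows. Everything here is hypothetical: `SESSIONS` stands in for a real token verifier, and `authenticated_user` and `create_tweet` are illustrative names, not part of any real framework or of Twitter's API.

```python
# Hypothetical in-memory session store: bearer token -> user id.
SESSIONS = {"token-abc": 42}

def authenticated_user(headers):
    """Return the user id for a 'Bearer <token>' Authorization header, or None."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer":
        return None
    return SESSIONS.get(token)

def create_tweet(headers, body):
    """Handle a create-tweet request; returns (status_code, response_body)."""
    user_id = authenticated_user(headers)
    if user_id is None:
        return 401, {"error": "unauthorized"}
    # Any userId in the body is ignored; the author always comes from the token.
    return 201, {"authorId": user_id, "content": body["content"]}

print(create_tweet({"Authorization": "Bearer token-abc"},
                   {"userId": 7, "content": "hello"}))
```

Because the handler never reads `userId` from the body, a caller cannot post on behalf of another user by editing the payload. The internal-service case from the reply would need a separate, explicitly trusted code path.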

  • @asdfasdf9477
    @asdfasdf9477 1 year ago

    One of the defining features of twitter is timely notifications about new tweets from people you follow. Could you please describe how could it be implemented in this architecture? Likes and comments allow users to attach their content to a potentially popular tweet. How would it affect our storage layer? What challenges, if any, we would face with multi-az deployment of such system? Thank you for your time and interest in our company.

  • @roxhensm.8071
    @roxhensm.8071 2 years ago +1

    What software and device do you use for the drawing?

  • @punarvdinakar
    @punarvdinakar 1 year ago

    That initial diss on twitter is everything 😂😂

  • @Angelslo690
    @Angelslo690 1 year ago

    Why you need relational db, all this relationship data you can store in document db as json for high performance, low latency and scalability. Usage of relational db will not be efficient in this scenario because we need to achieve high availability, we need eventual consistency so NoSql Mongo db is preferred over relational db in this scenario. Correct me if I am wrong.

  • @ivanjermakov
    @ivanjermakov 2 years ago

    Speaking of popular users. We can separate tweet data by some follower threshold (say 10k followers) and, when popular profile post a new tweet, we only need to update that feed. Every normal profile will check that feed in case they follow popular profiles.

    • @SameAsAnyOtherStranger
      @SameAsAnyOtherStranger 2 years ago

      So...use the average Twitterer's tweets as load dampening. They should do that. It will make Twitter even less popular.

  • @spork1125
    @spork1125 1 year ago

    Where do those caches live? Are they separate servers? Or are we caching on the app servers?

    • @alexs591
      @alexs591 2 months ago

      Twitter uses Redis. Separate servers, sharded by tweet ID, with read replicas.

  • @programadorpython
    @programadorpython 1 year ago

    omg, you're insane. thank you!

  • @dannysi1234
    @dannysi1234 9 months ago

    This is great! Thank you!

  • @tyronedamasceno6706
    @tyronedamasceno6706 1 year ago +1

    It's amazing how this kind of large-scale system can grow and become so complex, with so many components and "moving parts". It's also impressive how it works with a massive number of users and petabytes of stored data. In the end, I didn't understand whether your solution shards the database or not. If it does, how do you solve the sharding-key issue? It doesn't look possible to use the "every account followed by someone" strategy, for the reasons you talked about yourself.
    Is it possible to have sharding and read replicas at the same time? And how would you handle it: many load balancers, one per shard, each in front of that shard's cluster of replicas?

    • @lucassaarcerqueira4088
      @lucassaarcerqueira4088 1 year ago

      I was left with the same impression. I don't see how this sharding could work

  • @alexs591
    @alexs591 2 months ago

    Use a DB like Cassandra: users, tweets, followers, follows, feed. Everything sharded by user ID to colocate relevant data.
    Fan out to followers' feeds on tweet. For celebrity users, fetch the celebrity tweets from cache when building the feed. Have some background jobs pre-populate some other good feed candidates. Rank the feed by some scoring system.
    Push likes and retweets to an event stream and update cached like counters in Redis from the stream every so often. Shard on tweet ID and spin up some read replicas if needed.
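
    The "update cached like counters from the stream every so often" step could look roughly like the sketch below. Everything here is an assumption for illustration: `LikeCounterAggregator` is a hypothetical name, a plain dict stands in for Redis, and events are fed in by direct method calls instead of a real stream consumer.

```python
from collections import Counter


class LikeCounterAggregator:
    """Buffers like/unlike events and applies them to the cache in batches."""

    def __init__(self, cache, flush_every=100):
        self.cache = cache            # dict standing in for a Redis-like counter store
        self.flush_every = flush_every
        self.pending = Counter()      # tweet_id -> net change since the last flush
        self.seen = 0                 # events buffered since the last flush

    def on_event(self, tweet_id, delta=1):
        # delta is +1 for a like and -1 for an unlike.
        self.pending[tweet_id] += delta
        self.seen += 1
        if self.seen >= self.flush_every:
            self.flush()

    def flush(self):
        # One cache write per tweet instead of one write per event.
        for tweet_id, delta in self.pending.items():
            self.cache[tweet_id] = self.cache.get(tweet_id, 0) + delta
        self.pending.clear()
        self.seen = 0
```

    The cached counts lag behind reality by at most one batch, which is usually an acceptable trade for far fewer writes on hot tweets.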

  • @somebodyoulove
    @somebodyoulove 2 years ago

    This is great. I loled at 0:48. This video is neet.

  • @jacksonashby7471
    @jacksonashby7471 2 years ago +1

    we need more of these for sure

  • @Sudarshansridhar
    @Sudarshansridhar 2 years ago

    I don't even have a Twitter account, nor did I ever see the real need for one.
    So do interviewers explain what Twitter is used for?

  • @_jko
    @_jko 1 year ago

    Don't forget ads. Imagine how complex this whole thing becomes when we add in ads.

  • @zertbrown4642
    @zertbrown4642 2 years ago

    my IT classes coming in clutch

  • @Brosales1414
    @Brosales1414 1 year ago

    I'm confused about how pub/sub works. Can anyone explain to me what it's supposed to do? If you can explain it like I'm five, that would be great! Thx.

  • @chrishabgood8900
    @chrishabgood8900 1 year ago +1

    considering how many joins you would have to do in a relational DB, it would be hard to justify that for twitter.

  • @tukkanen
    @tukkanen 2 years ago

    I wouldn't combine reads of tweets with "reads" of videos into a single number for the data we're going to read from our "storage". Storing and streaming videos and storing and reading text tweets plus metadata are completely different tasks that access and deal with data in completely different ways.

  • @ourangzebkhan6516
    @ourangzebkhan6516 1 year ago

    Which hardware do you use for writing?

  • @smittyplusplus
    @smittyplusplus 1 year ago

    I would send the tweet timestamp from the client. If you handle it server-side and something breaks and delays the server-side ingestion of the tweet, you'd have an incorrect timestamp. ("Wow, what an amazing touchdown!" posted 2 hours after the touchdown and way out of context on feeds etc)

  • @alexanderradzin1224
    @alexanderradzin1224 1 year ago

    Thank you for the interesting video. However, I doubt that a relational database can store the tweets. I was just asked to design Twitter during a job interview and came up with something very similar, but I suggested using Aerospike for messages with the following schema: id -> list of messages. Aerospike scales horizontally, so there is no need to think about sharding.

  • @Thomas-lv1dc
    @Thomas-lv1dc 2 years ago

    Ngl, as an aspiring software engineer, I find this video helpful in terms of macro design. New video style covering the different duties of a software engineer? 👀👀

  • @IAmNumber4000
    @IAmNumber4000 2 years ago +1

    If the interviewer is Elon, all you need to do is remember the word “turboencabulator”.

  • @neurocat6453
    @neurocat6453 2 years ago

    I don't understand the logic behind the statement at 18:20: "All the people this user follows will be on the single shard". Why so? One user's tweets will be on one shard, but the accounts they follow (and their tweets) can be scattered across all the shards. Or maybe that's what you meant by "logic of our sharding", but it would be impossible to maintain that sharding across every user's follows and unfollows.

  • @dantedt3931
    @dantedt3931 1 year ago

    Great video. Learnt a lot.

  • @mehrdadk.6816
    @mehrdadk.6816 2 years ago

    Thank you so much for this video and its good content. One thing to correct, maybe: at 12:24, it's not good to save the authorization token in the DB, for security reasons. If someone says that in an interview, the interviewer may think the interviewee doesn't care about security and reject him/her.

    • @zhenghaohe4727
      @zhenghaohe4727 2 years ago

      Do you mean that passing user IDs along with the request implies that the auth token is stored in the DB? I don't see him explicitly mention in the video where auth tokens are stored. Also, out of curiosity, where do we store the auth tokens then?

    • @cesarvspr
      @cesarvspr 2 years ago +2

      @@zhenghaohe4727 we don't store them, we validate them against our secrets
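
      One way to read "validate them against our secrets" is a stateless, signed token (the idea behind JWTs): the server signs the claims with a secret only it knows and later recomputes the signature instead of looking the token up. The sketch below is an illustration under that assumption, using only the Python standard library. `SECRET`, `issue_token`, and `verify_token` are hypothetical names, and a production system would normally rely on a vetted token library.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-secret"  # assumption: a server-side secret, never sent to clients


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode()


def issue_token(user_id, ttl_seconds=3600, now=None):
    """Returns 'payload.signature'; nothing is stored on the server."""
    now = time.time() if now is None else now
    payload = json.dumps({"sub": user_id, "exp": now + ttl_seconds}).encode()
    signature = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(signature)


def verify_token(token, now=None):
    """Returns the user id if the token is authentic and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None  # payload was altered or signed with a different secret
    claims = json.loads(payload)
    if claims["exp"] < now:
        return None  # expired
    return claims["sub"]
```

      Because the user ID travels inside the signed payload, the API can take the identity from the token rather than trusting a separate userId field in the request.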

  • @BartomiejKrzywania
    @BartomiejKrzywania 1 year ago

    Yes, you must really dislike Elon Musk (to put it mildly).
    > Who is most popular on Twitter? Kim Kardashian, probably over 100 million followers.
    .....
    --
    Putting the subject aside - you made good content - thank you!

  • @thekauer
    @thekauer 2 years ago +1

    Would it make sense to only store the tweetId of the tweets in the feed cache, so when someone popular edits their tweet, the edited version will probably be in the tweet cache already, from where we can quickly grab it?

    • @merick1453
      @merick1453 2 years ago

      You’re already pushing tweet related info to the feed cache why would we limit it to tweet id only? That’s actually more of an overhead since we’ll need to do another request to actually fetch the tweet detail. Also for update we can always use a lastUpdate timestamp to compare and only push to the cache if it changed
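
      The lastUpdate comparison mentioned above might be sketched like this, assuming each cached tweet is a dict carrying a `last_update` timestamp. The function name, the field names, and the dict standing in for the feed cache are all hypothetical.

```python
def upsert_if_newer(cache, tweet):
    """Writes the tweet to the cache only if it is newer than the cached copy."""
    cached = cache.get(tweet["id"])
    if cached is not None and cached["last_update"] >= tweet["last_update"]:
        return False  # the cache already holds this version or a newer one
    cache[tweet["id"]] = tweet
    return True
```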

  • @yilmazbingol4838
    @yilmazbingol4838 2 years ago +1

    People watch netflix, I watch neetcode.

  • @MrRetroboyish
    @MrRetroboyish 1 year ago

    I appreciate the effort and care you put into this video but I think it could use a little more focus. Especially at the sharding-for-writes portion. You jumped around a lot to digressions that made that line of thought hard to follow.

  • @Nonchalant2023
    @Nonchalant2023 2 years ago +1

    so what's a batch RPC? Asking for a friend...

    • @alexs591
      @alexs591 2 months ago

      inserting/retrieving multiple things at once rather than separately
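
      To make the difference concrete, here is a toy sketch that counts round trips. `TweetService` and its methods are hypothetical, and the "remote call" is simulated by a counter rather than a real RPC framework.

```python
class TweetService:
    """Toy stand-in for a remote service; each method call is one round trip."""

    def __init__(self, tweets):
        self.tweets = tweets      # tweet_id -> text
        self.round_trips = 0

    def get_tweet(self, tweet_id):
        self.round_trips += 1     # one network round trip per tweet
        return self.tweets.get(tweet_id)

    def batch_get_tweets(self, tweet_ids):
        self.round_trips += 1     # one round trip for the whole list
        return {tid: self.tweets[tid] for tid in tweet_ids if tid in self.tweets}
```

      Fetching 20 tweets one by one costs 20 round trips in this model, while a single batched call costs one.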

  • @sangdang9666
    @sangdang9666 4 months ago

    The problem right now is not about designing a workable system but a system that works smoothly without spending much $$$ on the infrastructure.

  • @thatsJD
    @thatsJD 2 years ago +1

    I loved your video very much, and thanks a lot for the effort you put in. These are the questions we actually face when working on the BE side.
    One small question:
    If someone asks you what kind/type of architecture this is, what will your answer be?

  • @JohnSaenzc
    @JohnSaenzc 1 year ago

    Gracias - Thanks, great video.

  • @kewtomrao
    @kewtomrao 2 years ago

    What tools do you use to draw?

  • @ayumi5621
    @ayumi5621 2 years ago

    Can someone please tell me what drawing tool he uses here?

  • @KellsCode
    @KellsCode 2 years ago +2

    What software do you use to record drawings like this?

    • @NeetCode
      @NeetCode  2 years ago +6

      Paint3d and streamlabs obs

  • @SreekantShenoy
    @SreekantShenoy 2 years ago +1

    Neet: How hard could it be?
    Candidate: *sweats profusely seeing Elon* 😥

  • @NightCityBeats
    @NightCityBeats 4 months ago

    Bro how do you draw so good with the mouse

  • @doBobro
    @doBobro 9 months ago +1

    System design interviews are such a silly, cursed situation. Usually you have functional requirements to support 10e6+ users, but you can't produce even a remotely viable design that supports them. It's always a hand-wavy "thing" that you can't apply to real life in any way. And the most outrageous part: in real life you never design for scale without an already working product. It's always after-the-fact tweaks for current and near-future loads, based on numbers you have on hand.

  • @michaelscofield2652
    @michaelscofield2652 10 months ago

    I would argue you need an index on both followee and follower, because on Twitter you can look the relationship up both ways.
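
    A small sketch of the two-index idea, using SQLite from the Python standard library. The `follows` table, its column names, and the index name are assumptions for illustration and not the schema from the video.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE follows (
        follower_id INTEGER NOT NULL,
        followee_id INTEGER NOT NULL,
        -- The primary key doubles as the index for "who does X follow?"
        PRIMARY KEY (follower_id, followee_id)
    );
    -- A second index serves the reverse lookup: "who follows X?"
    CREATE INDEX idx_follows_followee ON follows (followee_id, follower_id);
""")
conn.executemany("INSERT INTO follows VALUES (?, ?)", [(1, 2), (1, 3), (4, 2)])


def following(user_id):
    rows = conn.execute(
        "SELECT followee_id FROM follows WHERE follower_id = ? ORDER BY followee_id",
        (user_id,),
    )
    return [row[0] for row in rows]


def followers(user_id):
    rows = conn.execute(
        "SELECT follower_id FROM follows WHERE followee_id = ? ORDER BY follower_id",
        (user_id,),
    )
    return [row[0] for row in rows]
```

    Without the second index, the followers query would have to scan the whole table, since the primary key is ordered by follower_id first.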