System Design Interview: Design Dropbox or Google Drive w/ a Ex-Meta Staff Engineer

  • Published: Sep 26, 2024

Comments • 208

  • @abhijeet8710
    @abhijeet8710 3 months ago +72

    "Have you done any System Design course? How are you so good with this subject?" These were the words of my interviewer. I had a High Level + Low Level system design round with a start-up recently. Surprisingly, the question was to design a file sharing system such as Google Drive, as described in this video, with some additional features. I explained the HLD with the diagram as I had learned from the concepts in this video. After the HLD was over, the interviewer told me that I had created a very robust & elegant system. He further said, he was so satisfied with the HLD, that he no longer wants to go into the LLD.
    Folks, these videos are absolutely everything you will ever need to ace a system design interview. Do remember to learn the fundamentals used in the system. A huge thanks to #Hello Interview for putting out the best content out there.

    • @JohnVandivier
      @JohnVandivier 3 months ago +7

      "he was so satisfied with the HLD, that he no longer wants to go into the LLD. "
      GOALS! kudos and congrats

    • @hello_interview
      @hello_interview 3 months ago +8

      This is epic!

  • @EamonLinskey
    @EamonLinskey 4 months ago +30

    These are the best System Design videos I have found. Great framework for approaching problems, clear explanations, helpful diagrams. And I really appreciate the notes about how different seniority levels might approach specific parts.

  • @andjelaarsic9217
    @andjelaarsic9217 3 months ago +6

    My mind is absolutely blown by how beautifully everything is explained. I love how you understand what would be possible questions/confusions from people watching and you address them by explaining pros and cons.
    Thank you so much for the content! Your walkthroughs are by far the most useful and interesting.

    • @hello_interview
      @hello_interview 3 months ago +1

      High praise! Appreciate you taking the time to share this 😊

  • @GauravGupta-op8ol
    @GauravGupta-op8ol 5 months ago +16

    With my systems design interview coming up, I was looking forward to your video.
    It's great as always.

  • @Wololowizz
    @Wololowizz 1 month ago +2

    I must say that this is the best system design video I've seen so far. You covered the problem and solution step by step, while other videos just throw a bunch of ideas at you right away. Sometimes I feel overwhelmed watching other videos, thinking that it's impossible to know all of that, but watching this video we can see what the expectation is for each level, and the most important thought: you don't need to know everything. And that's gold.

    • @hello_interview
      @hello_interview 1 month ago +1

      Glad you liked it! Check out our others if you haven’t already. Same format :)

  • @yourssachin
    @yourssachin 4 months ago +8

    Love the content and explanation. I have watched hundreds of videos on system design over the last 4-5 years and also have paid subscriptions to a few. I don't have any doubt that your channel can become the premier system design platform in no time if you keep the content quality high (just like the last 3 videos).
    For the next video, I'd recommend covering a messaging platform like WhatsApp or FB Messenger. There are so many videos on this topic, but I didn't find any that explain the details and really help in the interview.

  • @madhurnsit
    @madhurnsit 1 month ago +1

    This is the best content I have come across on System Design interviews. Wish I had landed here sooner. Thank you so much!

  • @levimatheri7682
    @levimatheri7682 2 months ago +2

    Wow, by far the best system design videos anywhere. I love how simple you make it, and the invaluable tips!

  • @batusun717
    @batusun717 16 days ago

    Please upload more stuff like this. This is literally the BEST on YouTube. Very much appreciate all the great efforts!

  • @alexandergordon9286
    @alexandergordon9286 4 months ago +2

    It's pure gold! Especially the parts where you stop the debates about which DB to choose or whether the calculations are needed.
    The deep dives are the best part. No one goes that deep, and that's actually what matters in an interview.

  • @md_dm490
    @md_dm490 5 months ago +4

    This channel has the best system design content on youtube. Keep up the good work.

  • @anuragtiwari3032
    @anuragtiwari3032 4 months ago +1

    I don't comment much, but for this kind of explanation I gotta give it to you. Hands down the best explanation on YouTube. Please continue making these kinds of videos. This channel will blow up.

  • @noobu
    @noobu 5 months ago +1

    Great stuff again!
    Not only good for interview but also for daily work
    1) Clear and concise structure
    2) Weigh trade-offs rigorously and explain the final decision clearly. Every single component is well thought out with real-world considerations

  • @indreshgahoi7103
    @indreshgahoi7103 5 months ago +2

    Hey Evan, thank you so much for providing the great content. I really love the way you organize and put content across the board. ❤

  • @JShaker
    @JShaker 2 months ago

    I'm so grateful for all of your videos. I've been practicing using the Hello Interview AI interviews, booked one mock with one of your interviewers, watched all the videos.
    The quality is so far beyond any other content out there, and I've successfully passed 5 system design interviews. Keep up the good content, your YouTube channel deserves to blow up and your website too #wouldinvest

  • @venkatamunnangi1287
    @venkatamunnangi1287 5 months ago +3

    Thanks for the effort and videos. Easily one of the best in business for mocks and educational material.

  • @pragatimodi950
    @pragatimodi950 3 months ago

    Hi Evan, this is my first time giving system design interviews. Really glad I found this channel to learn from. Most of my prior feedback from mocks and system design interviews has been framework related, about how I explain my design. This really helps with that, and I think even at work this is a really good approach to follow for most things. Awesome content, thanks a lot!!!

  • @tushargoyal554
    @tushargoyal554 23 days ago

    This is the best channel for learning system design. I've gone through a lot of explanations but found them talking about things in isolation, making it very hard to connect them into a full picture. The popular system design interview book also doesn't help much due to very discrete and sometimes inconsistent sharing of knowledge.

  • @allenputich4192
    @allenputich4192 29 days ago

    You do an amazing job of explaining the thought process, technical details, and growth opportunities!

  • @prasidmitra6859
    @prasidmitra6859 3 months ago

    These are like a gift from God. The best SD resources I've found in the last 3 years.

  • @adeeshacharya7520
    @adeeshacharya7520 3 months ago

    This is really good. Irrespective of whether we are interviewing or not, any person looking at this level of explanation and detail would try to picture software differently. Thanks for making such videos, would love to see some more.

  • @evangeloskostopoulos8173
    @evangeloskostopoulos8173 5 months ago +2

    This is really awesome, thank you. Please keep them coming!

  • @ashutoshrana9998
    @ashutoshrana9998 3 months ago

    Will be the best system design interview channel for sure. Neat content. Keep up with the quality Man!

  • @chongxiaocao5737
    @chongxiaocao5737 4 months ago

    One of the best system design preparation videos I have seen online.

  • @groovymidnight
    @groovymidnight 4 months ago

    I really like the 5-step structure, it's the best I've seen and it effectively helps me think through the designs in a methodical way.

  • @smalladi78
    @smalladi78 3 months ago

    Thanks for posting these! Great interview as always! I am learning a lot from these interviews.
    I found it interesting that you jumped ahead in order for the non-functional requirements since you knew the large file upload requirement would impact the design enough that doing the other ones first was not beneficial since they would become irrelevant. Obviously, this comes with actual experience of working on the job.
    May I suggest doing a follow up that uses the final design from this interview and consider how it may change if you piled on a more advanced feature like syncing only a partial set of folders or sharing folders with other people.

  • @aldogutierrezalcala3047
    @aldogutierrezalcala3047 28 days ago

    Bro, again me, just had a system design interview using your framework, still don't have the result but definitely this framework is basically pure gold to lead a conversation that i would keep using even in a daily job.

  • @jimitshah7636
    @jimitshah7636 3 months ago

    Great video for system design preparation.
    Methodology, the way he approached the question was good. 5 steps. Pretty good

  • @ahmedkhan25
    @ahmedkhan25 3 months ago

    Excellent sys design interviews - I like the informative tone and clear approach - thanks

  • @VyasaVaniGranth
    @VyasaVaniGranth 2 months ago

    First - please continue making and sharing these videos, this is incredible. Very few high quality sources available out there and this is probably the best one in my eyes.
    Second - how realistic is it that the download and upload happen directly b/w client and S3?
    Are there security concerns with this approach that should be considered? For reference, there's a Dropbox engineer's talk where uploads go through an intermediate service - this does mean additional copies of the data meaning more memory / compute but seems more realistic.
    In general, for any design that has media upload (eg. newsfeed), would you recommend direct upload to S3?

    • @hello_interview
      @hello_interview 2 months ago

      Yah, it's a good point, most major systems don't do this for a number of reasons. While it is largely academically correct and optimal, at YouTube/Dropbox/etc. scale they prefer more control, so they're rolling their own systems here.

  • @mehdisaffar
    @mehdisaffar 3 months ago

    I love the content. It has been frustrating to watch some other system design videos where they just brush over important details and act like everything is straightforward and easy, and just make tens of services and never really explain the nitty-gritty details of how those things would work and IF they would actually work, be efficient, etc. Thank you!

    • @mehdisaffar
      @mehdisaffar 3 months ago

      I wish you had mentioned the challenges of 2-way syncing in this context. Because this is akin to master-master replication, in case of network partition (for example user makes changes to remote, hops on another offline device, makes changes, then comes back online) there is a chance of inconsistencies (user makes different changes on device 1 vs device 2). There would probably need to be a way to offer merging changes together or have the user choose between version 1 or 2.

    • @mehdisaffar
      @mehdisaffar 3 months ago

      I think I talked too fast! You did mention reconciliation
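
The conflict case raised in this thread (the same file edited on two devices before they sync) can be detected with a three-way comparison against the last version both sides agreed on. The following is a minimal sketch, not the design from the video: the function name `reconcile` and the idea of passing file fingerprints as plain strings are assumptions made for illustration.

```python
def reconcile(base: str, local: str, remote: str) -> str:
    """Decide what a client should do for one file.

    base   -- fingerprint at the last successful sync (hypothetical field)
    local  -- fingerprint of the file on this device now
    remote -- fingerprint of the file on the server now
    """
    if local == remote:
        return "in_sync"   # nothing to transfer
    if local == base:
        return "pull"      # only the remote side changed
    if remote == base:
        return "push"      # only the local side changed
    return "conflict"      # both changed differently: merge or ask the user


# Device edited offline while the remote copy was also changed elsewhere.
print(reconcile(base="v1", local="v2", remote="v3"))
```

Only the `conflict` branch needs a resolution strategy (merge, keep both copies, or prompt the user); the other three outcomes can be handled automatically.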

  • @god_of_blunder
    @god_of_blunder 1 month ago

    these are the best Design videos i ever found, Thanks and Kudos.

  • @jherreria
    @jherreria 3 months ago

    I really appreciate your help in this topic. I'm learning a lot! Keep the videos coming!

  • @jeremyklein953
    @jeremyklein953 4 months ago

    Really good approach. I love how you build up to the full solution. It makes a lot of sense to me and helps me reason about these complex systems as well.

  • @puppy851226
    @puppy851226 17 days ago

    Amazing content! Thank you hello interview!

  • @MrSnackysmorez
    @MrSnackysmorez 1 month ago

    I love the videos and these are some of the best explanations. I love the flow and how everything builds on each other. It makes it much more manageable to do these problems. However you are driving and dictating this and this is so much harder to do when the interviewer wants to constantly interrupt and ask questions while you are doing these steps without first letting you explain what you are doing. I have this happen pretty often. How can you tell them to just chill and let you proceed?
    Appreciate these videos!

  • @suri4Musiq
    @suri4Musiq 5 months ago +1

    Loved this resource, thank you so much! But I just wanted to point out that in my interview I was asked about sharing files with other users, and I feel like this design concentrated more on just syncing files across multiple devices. For the former, I think we can talk a little more about CDN/other approaches, which were hand-waved here.

    • @hello_interview
      @hello_interview 5 months ago +3

      Checkout the write up I linked! I go into sharing there.

  • @vaibhavsharma1653
    @vaibhavsharma1653 3 months ago

    Amazing.
    Some Notes:
    DeepDive:
    Chunking
    CDNs
    Adaptive Polling with only updated chunks
    Compression.

  • @phavelar
    @phavelar 4 months ago

    one can argue that "supporting 50gb upload file size" is a functional requirement (you placed it under non-functional requirement) - just a call out. great video!

  • @theoshow5426
    @theoshow5426 1 month ago

    Keep going man! This is great!

  • @jmms49
    @jmms49 4 months ago

    Great videos, thanks for uploading these. Easily the best content about system design interviews I've found.
    I would probably suggest using Merkle trees for the sync functionality, seems like a natural way to diff and sync large file systems.
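
To illustrate the Merkle tree suggestion: if each side hashes its chunk fingerprints pairwise up to a single root, two replicas can compare one hash to learn whether anything differs at all. This is a minimal sketch under assumptions not stated in the video (SHA-256, duplicating the last node on odd levels); it computes only the root and leaves out the descent that would locate the differing subtree.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaf_hashes: list[bytes]) -> bytes:
    """Fold a list of leaf hashes into a single root hash."""
    if not leaf_hashes:
        return _h(b"")
    level = list(leaf_hashes)  # copy so the caller's list is not mutated
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate last node on odd levels
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


local_leaves = [_h(c) for c in (b"chunk-a", b"chunk-b", b"chunk-c")]
remote_leaves = [_h(c) for c in (b"chunk-a", b"chunk-B", b"chunk-c")]

# Equal roots mean no sync is needed; unequal roots mean some chunk differs.
print(merkle_root(local_leaves) == merkle_root(remote_leaves))
```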

  • @surojitsantra7627
    @surojitsantra7627 4 months ago

    One of the best and most detailed explanations.
    Thank you so much for this content. Please upload more such videos.

  • @YeetYeetYe
    @YeetYeetYe 1 month ago +10

    Simply amazing. I don't mean to throw shade to other channels, but this is by FAR the best system design interview prep. So many other channels are just people with a couple of months of experience at FAANG and it really shows the difference between junior FAANG engineers and Staff FAANG engineers. Extremely high quality work.

  • @IshaZaka
    @IshaZaka 5 months ago

    Hi Evan, thank you so much for providing this type of content. Please make a system design video on a payment system.

  • @ediancomachio2783
    @ediancomachio2783 4 months ago +1

    this is pure gold thank you so much

  • @AlbaraaAlHiyari
    @AlbaraaAlHiyari 4 months ago

    I truly appreciate all the effort you've put into making these amazing videos. Please keep them coming. One insignificant (not important) nitpick. 50 GB @ 100Mbps = ~ 1hr 7min. I think you just forgot to convert the decimal to minutes. You have it correct in the write up, as in 1.11 hours (0.11 * 60 = 6.6 minutes).

    • @hello_interview
      @hello_interview 4 months ago +2

      Mental math is hard 😛

    • @AlbaraaAlHiyari
      @AlbaraaAlHiyari 4 months ago

      @@hello_interview tell me about it... Also not fun under the pressure of an interview 🤣
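
The arithmetic in this thread can be checked in a few lines. This is a small sketch assuming decimal units (1 GB = 1000 MB) and a link that sustains its full rated speed; the function name is made up for illustration.

```python
def transfer_time_seconds(size_gb: float, link_mbps: float) -> float:
    """Seconds needed to move size_gb gigabytes over a link_mbps link."""
    size_megabits = size_gb * 1000 * 8  # 1 GB = 1000 MB, 1 byte = 8 bits
    return size_megabits / link_mbps


seconds = transfer_time_seconds(50, 100)
hours, remainder = divmod(seconds, 3600)
print(f"{seconds:.0f} s = {int(hours)} h {remainder / 60:.1f} min")
```

The result is 4000 seconds, which is about 1.11 hours, or 1 hour and roughly 7 minutes, matching the correction above.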

  • @robert23kim
    @robert23kim 1 month ago

    Thanks. Great learning.

  • @castulo
    @castulo 5 months ago

    👏Bravo, on point as always. Thanks Evan, keep up the good work man!

  • @KITTU1623
    @KITTU1623 5 months ago +2

    Thank you very much for the videos. One small nitpick: DynamoDB supports a maximum of 400 KB per item, and if we are storing all the chunk metadata in the item, then for a 50 GB file with a 5 MB chunk size, assuming we need 100 bytes of metadata per chunk, our item size would be around 1 MB.
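
The estimate in this comment works out as follows. This is a back-of-the-envelope sketch: the 100 bytes per chunk entry is the commenter's assumption, and treating 400 KB as 400 × 1024 bytes is an assumption of this example.

```python
DYNAMODB_ITEM_LIMIT_BYTES = 400 * 1024  # 400 KB limit, read here as KiB


def chunk_metadata_size(file_bytes: int, chunk_bytes: int, entry_bytes: int):
    """Return (chunk count, total bytes of per-chunk metadata)."""
    chunk_count = (file_bytes + chunk_bytes - 1) // chunk_bytes  # ceiling division
    return chunk_count, chunk_count * entry_bytes


count, total = chunk_metadata_size(50 * 10**9, 5 * 10**6, 100)
print(count, total, total > DYNAMODB_ITEM_LIMIT_BYTES)
```

With 10,000 chunks at 100 bytes each, the metadata alone is about 1 MB, so it would not fit in a single item; one common workaround is storing each chunk as its own item keyed by file ID and chunk index.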

  • @dark-knight494
    @dark-knight494 4 months ago

    Big fan of this channel and Evan. Please solve whatsapp/messenger type chat system next if you get some time.

  • @rajeshkishore7119
    @rajeshkishore7119 1 month ago

    Awesome content, full depth

  • @pujamishra1475
    @pujamishra1475 4 months ago

    I have a product architecture interview coming up. I was really looking for some good product architecture/design examples and then came across this. This is very helpful because you talk about the client, user experience, malicious users and relate it to the design decisions made. Thank you!
    One question: for a product architecture interview, should we go into more detail about the APIs, like explicitly writing out requests, responses, and failure/success codes, or is the amount of discussion you did on APIs enough for senior level?
    Can you also tell me what topics/ points would you add over the discussion in this video if this was asked in a product architecture design round. Thanks again!

  • @Anonymous-ym6st
    @Anonymous-ym6st 5 months ago +1

    Thanks for the great content! Your video really helps me understand the "flow" of a good system design interview (which I do feel is very important for staff+ engineers to direct the interview!)
    a few general questions:
    1. Is it beneficial to mention the selection of specific DB (like dynamo), and mention to use Kafka/Spark for microbatching snapshot update etc. after evaluating the QPS / replication etc.? or keep it at abstract level would give equivalent signals for interviewers?
    2. what signals / thinking might make you feel it even beyond staff (just out of curiosity, as feeling it has already been very perfect from the staff requirement)?

    • @hello_interview
      @hello_interview 5 months ago +2

      1. You'll need to show technical excellence somewhere, these could be two good places. Ideally, you go deep in the places you know well and have hands on experience. If this is where that is, then go for it. If you keep it abstract, then the depth needs to come from somewhere else.
      2. Hard to say. Staff candidates can usually teach me something, which is a key sign. They know some part of a system better from work experience than I do, so we can go deep there and I end up learning. It's abstract, but this is usually the best sign that a candidate is staff+.

  • @danielkling4647
    @danielkling4647 10 days ago

    First I would like to say that this content is excellent. Why though would you implement chunking yourself instead of using S3's multipart upload?

  • @tvmanikandan835
    @tvmanikandan835 5 months ago

    the content is good, keep up the good work. expecting more SD videos in more details

  • @aforty1
    @aforty1 2 months ago

    Thank you for these videos!

  • @allenxxx184
    @allenxxx184 4 months ago

    Thank you for your effort! Excellent content!! Love it!

  • @dragonpearl8244
    @dragonpearl8244 5 months ago

    Very easy to understand. Please keep making new videos, thank you so much.

  • @omprakashpatel2079
    @omprakashpatel2079 4 days ago

    Great explanation

  • @adithyabhat4770
    @adithyabhat4770 12 days ago

    Very informative video!

  • @depengluan7222
    @depengluan7222 3 months ago

    Love the content! Thanks for putting the efforts! nit, fingerprint probably does not fit as a good id for chunk, as hash value can change over time when the content changes.

  • @fragrancias972
    @fragrancias972 23 days ago

    Excellent content. Please tell me if I’m mistaken, but I believe GET /files/:fileid would return a list of chunk s3 links, not the file itself.
    Also, I don’t think merely filtering chunks by update time would work for syncing. You would need a tombstone for when chunks are removed. You didn’t quite specify how “polling the DB”/ update time filtering works with delta sync.
    Merkle trees could be used to optimize the reconciliation you mentioned, right?
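
The tombstone point can be made concrete. If deleted chunks are simply removed from the table, a "changed since T" query never reports them; keeping a soft-deleted row lets the same query return both updates and removals. This is a hypothetical sketch: `ChunkRecord`, its fields, and the integer timestamps are illustrative and not taken from the video.

```python
from dataclasses import dataclass


@dataclass
class ChunkRecord:
    chunk_id: str
    updated_at: int        # e.g. epoch seconds of the last change
    deleted: bool = False  # tombstone: row is kept, marked as removed


def changes_since(records: list[ChunkRecord], since: int):
    """Split records changed after `since` into chunk IDs to fetch and to drop."""
    changed = [r for r in records if r.updated_at > since]
    upserts = [r.chunk_id for r in changed if not r.deleted]
    deletes = [r.chunk_id for r in changed if r.deleted]
    return upserts, deletes


records = [
    ChunkRecord("c1", updated_at=100),
    ChunkRecord("c2", updated_at=250),
    ChunkRecord("c3", updated_at=300, deleted=True),
]
print(changes_since(records, since=200))
```

A client that last synced at time 200 learns it must download `c2` and discard `c3`, which plain timestamp filtering on live rows could not tell it.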

  • @HandsomeLancerYouTube
    @HandsomeLancerYouTube 23 days ago

    Amazing stuff!!
    Side note: please use dark mode.

  • @dannyryngler6425
    @dannyryngler6425 2 months ago

    Question - what should the file id be? It can't be based on the file name, as names can change. It also couldn't be a hash of the whole file, as the file itself can obviously change. Amazing content, thank you!!

    • @hello_interview
      @hello_interview 2 months ago +1

      Depends on if you want versioning or not. Can be the fingerprint or a random uuid, depends on requirements

  • @TatianaRacheva
    @TatianaRacheva 1 month ago

    IIRC, low latency was specifically low priority for Dropbox because they (like email) rely on the client syncing the data and user accessing the local copy when it is ready. Also, I question whether consistency is less important than availability. I don’t know, but I’m curious how the answer would be different if latency could be high and consistency had to be strong.

  • @aslgomes
    @aslgomes 2 months ago +1

    Hey Stefan, awesome video, congrats! I've got a quick question though. Around the 49:46 mark, you mention adding an "updatedAt" to a chunk at a specific id/fingerprint. If a chunk changes, its fingerprint/hash/checksum would change too, right? So that id wouldn't really match the changed chunk anymore, would it? Doesn't that mean the old chunk gets "invalidated" and a new chunk id appears? Sorry if I'm missing something obvious here.

    • @hello_interview
      @hello_interview 2 months ago +1

      No this is spot on, good call out. I was loose here. If the fingerprint is the ID, then an updatedAt does not make sense. If the fingerprint is not the ID, then it of course does. Trade off here of whether you want to keep old chunks around for versioning.
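
The point settled in this thread, that a changed chunk gets a new fingerprint and therefore a new ID, is what makes delta sync cheap: the client only uploads fingerprints the server has not seen. This is a minimal sketch assuming fixed-size chunks and SHA-256 fingerprints; both are illustrative choices and the function names are hypothetical.

```python
import hashlib


def fingerprints(data: bytes, chunk_size: int) -> list[str]:
    """Content-addressed IDs: one SHA-256 hex digest per fixed-size chunk."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def chunks_to_upload(old_manifest: list[str], new_manifest: list[str]) -> list[str]:
    """Fingerprints in the new version that are not already stored."""
    seen = set(old_manifest)
    missing = []
    for fp in new_manifest:
        if fp not in seen:
            seen.add(fp)  # avoid uploading a repeated new chunk twice
            missing.append(fp)
    return missing


old = fingerprints(b"aaaabbbbcccc", chunk_size=4)
new = fingerprints(b"aaaaXXXXcccc", chunk_size=4)
print(len(chunks_to_upload(old, new)))
```

Only the middle chunk is uploaded. The old chunk is not updated in place; it either stays around for versioning or is garbage-collected once no file manifest references it.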

  • @HiraMalik-r3i
    @HiraMalik-r3i 1 month ago

    Thanks for such a detailed video. Query: If File service is pushing the change events to Event Bus and also updating Db someway, wouldn't this lead to dual write problem? I do not see the same event being consumed by both( no message queues in that flow in the diagram). Shouldn't we instead use CDC or some other solution for this problem? What are your thoughts on that?

  • @arnavahuja310
    @arnavahuja310 4 days ago

    im confused why the sync service needs to query the file service to see what changed. why can't it just look at the db itself and compare? and in that sense, why can't the file service just have an api to see what changed?

  • @Ynno2
    @Ynno2 5 months ago +3

    Do you suggest a different delivery framework for system design interviews which aren't necessarily "product"?

    • @hello_interview
      @hello_interview 5 months ago +1

      Topical! Was chatting about updating the site with that soon. I’d recommend very similar, but core entities and api are what may change as they could be less relevant. Instead I’d frame it as focusing on the inputs and outputs of the system more generally. And then still thinking about the data persisted

    • @hello_interview
      @hello_interview 5 months ago +1

      I’ll do a pure infra question next

  • @whosgotrythm
    @whosgotrythm 5 months ago

    Thanks great content. Probably the best.

  • @eforeyerman
    @eforeyerman 10 days ago

    Are there any nitty-gritty details we need to know about auth for when the client talks to S3 directly on behalf of the file workflow? Or is that all handled by the pre-signed URL?

  • @nobodyknows228
    @nobodyknows228 2 months ago

    1. How can we handle write conflicts when we have a folder which is supposed to be consistent across multiple devices?
    2. Also, when two devices are disconnected from the internet and the users update some files, how does the sync happen when they come back online and both try to write their changes at the same time to the same file path?
    I am not sure if these solutions work but I think
    1. We can use a Redis lock for writes with TTL same as the timeout or a little more of the pre-signed url. If connection fails in between we can just resume the upload when connected back. But this might be a problem when a user is trying to upload big files with large timeout durations since other users might have to wait till the user uploading currently is done.
    2. When the user comes back online we should probably first fetch all the changes that are executed on the device and raise conflicts with the user asking what action to perform(similar to git) and acquire lock to write if required.

  • @SunilKumar-jl6dl
    @SunilKumar-jl6dl 1 month ago

    Hey there, I have some questions. Would be great to get your thoughts:
    1. S3 supports multipart upload and all the chunks would get reconstructed into a single file at S3. Isn't this correct? If yes, then having file chunks in the database would be redundant right? Or would S3 have the chunks always and give access to the download at the client end?
    2. At the client end should we know how the updated/deleted chunks of a previously uploaded file be stitched back together?
    3. Would folder sharing with other users be a possible follow up question? Like what Google drive offers.

  • @truthSaty
    @truthSaty 4 months ago

    Your videos are v. good but your mock interview is costly for certain markets (BFS can give you more return). Wish to get interviewed soon !

  • @pankajk9073
    @pankajk9073 4 months ago +1

    one question- how do we merge chunks in order after downloading to local device? is it a good idea to keep some kind of sequence number for each chunk for a file?

    • @hello_interview
      @hello_interview 4 months ago

      Yah!

    • @Sandeepg255
      @Sandeepg255 3 months ago

      @@hello_interview Won't this merging logic be too heavy on the client side?
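
On the question of ordering and client-side cost: with a sequence number stored per chunk, reassembly is a sort followed by concatenation. This is a minimal sketch in which `Chunk` and `reassemble` are hypothetical names, and chunks are held in memory for brevity, whereas a real client would more likely stream each chunk to disk at its offset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    sequence: int  # position of this chunk within the file, starting at 0
    data: bytes


def reassemble(chunks: list[Chunk]) -> bytes:
    """Order chunks by sequence number and join them into the original bytes."""
    ordered = sorted(chunks, key=lambda c: c.sequence)
    if [c.sequence for c in ordered] != list(range(len(ordered))):
        raise ValueError("missing or duplicate chunk sequence number")
    return b"".join(c.data for c in ordered)


# Chunks may finish downloading in any order.
downloaded = [Chunk(2, b"!"), Chunk(0, b"hello "), Chunk(1, b"world")]
print(reassemble(downloaded))
```

The check on sequence numbers also catches a missing chunk before a truncated file is written out.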

  • @kamalsmusic
    @kamalsmusic 2 months ago

    For the client to know how to stitch together chunks, doesn't it need to know the starting offset & length for each one?

  • @ramannanda
    @ramannanda 3 months ago

    For the delta sync bit, probably should go a bit deeper into rechunking for an existing file, to perform the delta sync.

  • @bqrkhn
    @bqrkhn 29 days ago +1

    Very nice video.
    A question: You added an updatedAt to each chunk. But chunks are identified by their ID, which is calculated from a fingerprint. When the file changes, the fingerprint changes, so how do we update the updatedAt?
    Possible Answer: From client we send both old and new chunk IDs and then update both id and updatedAt. Is this the correct strategy?

    • @fragrancias972
      @fragrancias972 23 days ago

      Same question here.

    • @bqrkhn
      @bqrkhn 23 days ago

      @@fragrancias972 what do you think about my possible answer ?

  • @yuuhameaw1510
    @yuuhameaw1510 2 months ago

    Thanks for the great content!
    One question though: if we use the chunk fingerprint as an ID, when the chunk changes the fingerprint would change too. How are we going to sync them?

    • @hello_interview
      @hello_interview 2 months ago +1

      Add a new chunk. Good to keep the old around for versioning (not a requirement here, so either way works)

  • @maharshishah4840
    @maharshishah4840 3 months ago

    Some interviewers really like seeing estimation numbers early on, before going into the high-level design, despite what you generally suggest we should do. Can you maybe also create a video where you do some relevant capacity estimations early on and use them later to guide the high-level design? I would love to see if there is a good way to get some numbers in during the "requirements" phase.

    • @hello_interview
      @hello_interview 3 months ago +1

      In general I’d say ask up front. Explain your reasoning and ask if they’d like you to then proceed from there. It’s pretty hard for me to think of what estimations I’d do directly up front that would inform anything. The classic bandwidth and storage is far too crude to inform in my opinion and I can already estimate based on DAU alone until later in the design.

  • @davidoh0905
    @davidoh0905 2 months ago

    One question about the chunk status verification is that with NoSQL, we cannot directly access the information about the chunk. Is it okay for us to assume that we will always be parsing through the entire list of chunk metadata? Additionally, what would happen if one chunk gets big enough that a new chunk has to be created? And maybe the new chunk is super super small. Are there any issues coming out of a continuously updating file where chunks start to fragment? Thanks for such amazing content! I'd just like a bit more detail on how the database-based status updates would look!

  • @UnderratedMomentsfromStarWars
    @UnderratedMomentsfromStarWars 2 months ago

    For chunking on the client side... would we literally chunk it all on the client, send a request to make the chunkId rows in the DB, then upload to s3?
    I'm confused about that process and then how we update the status of each chunk in the DB after a chunk has been uploaded. I've been reading the s3 documentation, and couldn't see a great way to have events per chunk.

  • @damluar
    @damluar 8 days ago

    If we change a file on 2 different clients at about the same time, in the current design we might end up with a mixture of chunks from both files, right? How would we avoid this? Versioning?

  • @mindrust203
    @mindrust203 5 months ago +1

    Hey Evan, this content is fantastic, thank you!
    I have a question regarding your solution to chunking around the 39 minute mark
    When we ask S3 to fetch us a pre-signed URL, do we do that for all our chunks as well? Does this happen on initial request to upload the file (metadata)?
    The way the File Metadata entity schema is described, it looks like we have a top-level S3Link, but also chunk-level S3 links embedded in the file metadata, so the upload flow is a little unclear to me

    • @hello_interview
      @hello_interview 5 months ago +5

      Good question, you're right to be a little confused here. So as I alluded to, S3 offers an API called multi-part upload. It requires just 1 presigned URL, but multi-part upload re-stitches the chunks back into a single file in S3, so this does not allow us to send over chunk deltas for syncing.
      As a result, we have to upload as chunks manually without relying on multi-part upload. So, long answer, but yes, you'd actually need to request a presigned url for each chunk, I should have made that clearer but tbh was not sure in the moment if multi-part upload could be configured to not re-stitch the file, so I omitted :)

  • @JuggleDrum
    @JuggleDrum 7 days ago

    Please do "Design Gmail" next.

  • @VahidOnTheMove
    @VahidOnTheMove 3 months ago

    Thanks for the videos. 47:45 I would like to know your opinion on push approach? By push approach I meant when the File service knows there is a change in a chunk, Sync service will let the client know. And, then the client will send a request to sync/download the chunk.

  • @fmagarik
    @fmagarik 5 days ago

    I didn't get the event bus part. What's actually stored there?

  • @Resocram
    @Resocram 4 months ago +1

    If you split the files into chunks are you able to upload all chunks using the same pre-signed URL? Or do you need to generate a new URL for each chunk? How would you piece together the file from S3 when you download it through chunks?

    • @hello_interview
      @hello_interview 4 months ago

      S3 has a multi-part upload api that only requires 1 pre-signed url. Depending on if you need the chunks as chunks in S3 or not you can use that (it stitches them back together automatically). If you want to save the chunks as chunks, you need N pre-signed urls

  • @sachinpateltech
    @sachinpateltech 2 months ago

    Amazing content

  • @BlunderMunchkin
    @BlunderMunchkin 2 months ago

    Huh. I would have prioritized consistency over availability. So much so, in fact, that I didn't even think it was a question. Some of the biggest headaches I've experienced as a developer have been caused by having an out-of-date file. I would much rather be temporarily unable to retrieve a file than to be fooled into thinking that the file I retrieved is the correct version.

  • @coderzlife
    @coderzlife 4 months ago

    Please make a video around Designing a distributed login

  • @shantanusoni4311
    @shantanusoni4311 26 days ago

    Thanks for such informative content. I have watched and read other Dropbox designs, and no one went into this much detail. The software that you used is also very good. Can you please tell me which software you are using for teaching?

  • @prateeksingh7994
    @prateeksingh7994 5 months ago +1

    Great content!

  • @supragya8055
    @supragya8055 4 months ago

    I had a few questions and suggestions:
    1. When a really large file is divided into chunks and uploaded to different S3 file paths, will a different pre-signed URL be requested for each chunk of that large file? Will this logic be maintained in the file service? If we are using trust-and-verify along with the S3 multipart API, I guess in that case S3 would create different paths for each chunk, which we also need to store in metadata. Would S3 return the actual file path of each chunk to the client so that it can be updated by the file service in the metadata DB?
    2. Also, generally Google Drive doesn't allow updating the same file; it will create a new version if a new upload is made to the same path, so syncing chunks for a 50 GB file that has been modified is probably not a valid use case? (thoughts) A more valid use case is the addition of files to the same folder/path (thoughts), which I think here would work by the sync service using user_id, folder_id, and timestamp, returning to the client all metadata for what was added to the folder/directory for that user.
    3. While responses from the sync service or file service for downloads are being sent to the client, I guess some pre-processing would be needed to get pre-signed URLs from the actual S3 path stored in DDB, which clients can use for downloads.
    But thanks for the video, it's super helpful.

  • @sharansrivatsa210
    @sharansrivatsa210 4 months ago

    Thank you for this video! It is very helpful.
    I do have a question though - lets say the same file is modified by 2 devices at the same time. How do we handle sync in that case?

  • @deathbombs
    @deathbombs 4 months ago

    45:45 I wonder how syncing would change if instead of folder status, it's for database writes with many writers

  • @li-xuanhong3698
    @li-xuanhong3698 5 months ago

    Love your channel !

  • @EstherKim-463
    @EstherKim-463 4 months ago

    Maybe since data integrity was a non-functional requirement it would be worth mentioning some strategy for handling two clients modifying the same file concurrently - otherwise there's a high chance you end up with a corrupt file. Since you don't handle versioning, that pretty much necessitates locking.

    • @hello_interview
      @hello_interview 4 months ago

      Yup, definitely worth mentioning a resolution strategy there.

  • @YatharthJohari-uu1yb
    @YatharthJohari-uu1yb 4 months ago

    It would be really helpful if you can share which whiteboard tool you are using here.

  • @maxvettel7337
    @maxvettel7337 2 months ago

    Hi Evan, can you explain how we keep track of the order of chunks in a file? I mean the case when the client has downloaded the chunks from S3 and is about to assemble them.