Algorithms behind Modern Storage Systems

  • Published: 19 Jan 2025

Comments • 17

  • @Xeoncross
    @Xeoncross 2 years ago +7

    Starts at 16:00 with LSM-trees, if you're already aware of sequential vs. random access.

  • @ameynaik2743
    @ameynaik2743 3 years ago +12

    Not for a beginner. Good talk for revising the concepts. Highly recommend reading Chapter 3 of the DDIA book.

    • @rapoliit
      @rapoliit 1 year ago

      What's the DDIA book, please?

    • @nokibulislam9423
      @nokibulislam9423 1 year ago +1

      @rapoliit Designing Data-Intensive Applications

  • @charan7240
    @charan7240 1 year ago

    One of the best talks about database reads and writes.

  • @mullergyula4174
    @mullergyula4174 2 years ago +1

    It was a joy to watch.

  • @arijit_ad
    @arijit_ad 6 years ago +6

    Enjoyed the talk. Thanks.

  • @mr-boo
    @mr-boo 4 years ago +4

    Great talk, much appreciated! :)

  • @benevolent6705
    @benevolent6705 4 years ago +5

    At 19:44 it is assumed that SSTables have synchronized clocks, because their entries have a key and a timestamp. What method is used to synchronize the clocks of the separate nodes that hold the SSTables?
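To make the role of the timestamp concrete, here is a minimal sketch of merging SSTables so that the newest entry for each key wins. It assumes each table is a list sorted by key with at most one entry per key; the `Entry` type and the `merge_sstables` name are hypothetical and not taken from the talk.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Entry(NamedTuple):
    """Hypothetical SSTable record: a key, a write timestamp, and a value."""
    key: str
    timestamp: int
    value: str


def merge_sstables(*tables: Iterable[Entry]) -> Iterator[Entry]:
    """Merge key-sorted tables, keeping the highest-timestamp entry per key.

    Assumes every input table is sorted by key and holds each key at most once.
    """
    # Order by key, then by descending timestamp, so the newest entry
    # for a key is the first one seen in the merged stream.
    merged = heapq.merge(*tables, key=lambda e: (e.key, -e.timestamp))
    last_key = None
    for entry in merged:
        if entry.key != last_key:
            last_key = entry.key
            yield entry
        # Older entries for the same key are shadowed and skipped.


older = [Entry("a", 1, "1"), Entry("b", 1, "2")]
newer = [Entry("a", 2, "3"), Entry("c", 2, "4")]
result = list(merge_sstables(older, newer))
```

The comparison is only meaningful if the timestamps come from a comparable source, which is exactly the concern raised in the question.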

    • @altanozlu8268
      @altanozlu8268 3 years ago

      Use NTP.

    • @SimonBuchanNz
      @SimonBuchanNz 3 years ago +2

      It's handy to think about what it actually looks like for this to matter: you have multiple nodes being written to with different values for the same key at close to the same time, so this is essentially just the multiple master/primary node problem. Either it's fine for one of those writes to win, or you already need something like an optimistic-update mechanism, where the nodes can agree on which existing latest value is being replaced and that the incoming write came from a client that knew about it.
      The simple answer is to have a single primary node that all writes go to, and use its clock. You can be more clever and pick a different primary for each key based on a hash to spread the load, with each primary then replicating to the other nodes for resilience. You can still have multiple primaries for a key, but generally that involves them knowing about each other and pushing any received updates to each other along with the common timestamp, so that communication has to take clock differences, time lag, and concurrency issues into account.
      Note that it doesn't even need to be a timestamp: an auto-incrementing version number works too with most of these approaches, but a timestamp can be handy.
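The per-key primary and version-number ideas above can be sketched as follows. This is only an illustration under stated assumptions: the node names, `primary_for`, and the `Primary` class are hypothetical, and replication to other nodes is left out.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical node names


def primary_for(key, nodes=NODES):
    """Pick the primary for a key from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


class Primary:
    """Single writer for its keys; orders writes with a per-key version counter."""

    def __init__(self):
        self.store = {}  # key -> (version, value)

    def write(self, key, value, expected_version=None):
        """Apply a write and return the new version.

        If expected_version is given and does not match the current version,
        the write is rejected (optimistic update) and None is returned.
        """
        current_version = self.store.get(key, (0, None))[0]
        if expected_version is not None and expected_version != current_version:
            return None
        new_version = current_version + 1
        self.store[key] = (new_version, value)
        return new_version


primary = Primary()
v1 = primary.write("user:1", "alice")
v2 = primary.write("user:1", "bob", expected_version=v1)
stale = primary.write("user:1", "carol", expected_version=v1)  # rejected
```

Because only one node assigns versions for a given key, no clock synchronization between nodes is needed to order writes to that key.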

  • @manan4436
    @manan4436 2 years ago +1

    Amazing talk

  • @jonnytheponny5753
    @jonnytheponny5753 3 years ago +4

    Good talk, but it has one flaw: he has too few slides. It is not good (for beginners/learners) if too much is explained without being backed by slides.

    • @KPTalksStuff
      @KPTalksStuff 3 years ago

      Yeah, true. A lot of talking with the same slide up; the slide just becomes a distraction, and boring too, I guess. I can see people talking about the lack of visualizations when talking about databases. A lot of scope for improvement and content for databases, I guess! ;)

  • @subusrable
    @subusrable 3 years ago

    Awesome

  • @milossimicsimo
    @milossimicsimo 2 years ago

    Great talk