Lesson 174 - Replicated Caching and Data Collisions

  • Published: 25 Oct 2024

Comments • 14

  • @dimitrikalinin3301
    @dimitrikalinin3301 3 months ago

    Great episode, thank you! This is actually just one of the possible applications of the CQRS approach in eventually consistent systems.

  • @nrprabhu555
    @nrprabhu555 11 months ago +4

    Good one. However, in the second case, what happens if the two requests have requested quantities that total more than the quantity on hand?

    • @gogolglaus
      @gogolglaus 11 months ago

      Yes, I had the same thought ;). You need a strategy to avoid going below zero.

    • @markrichards5014
      @markrichards5014 11 months ago +2

      That is the trade-off of the single-update solution. It guarantees consistency at the cost of eventual consistency. That's when backorders occur 🙂
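
A minimal sketch of the "don't go below zero" guard discussed in this thread, assuming a single service owns the inventory update (the class and method names here are illustrative, not from the video):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical single-writer inventory item: only one service applies updates,
// and a reservation that would push the quantity below zero is rejected
// (that is the point where a backorder would be triggered instead).
public class InventoryItem {
    private final AtomicInteger quantityOnHand;

    public InventoryItem(int initialQuantity) {
        this.quantityOnHand = new AtomicInteger(initialQuantity);
    }

    /** Tries to reserve the requested quantity; returns false if stock is insufficient. */
    public boolean tryReserve(int requested) {
        while (true) {
            int current = quantityOnHand.get();
            if (requested > current) {
                return false; // not enough stock -> backorder path
            }
            if (quantityOnHand.compareAndSet(current, current - requested)) {
                return true; // reservation applied atomically
            }
            // another request changed the value first; retry against the fresh value
        }
    }
}
```

With a guard like this, two concurrent requests whose quantities total more than the quantity on hand cannot both succeed; one of them falls through to the backorder path.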

  • @MultiProceed
    @MultiProceed 11 months ago

    Great topic, Mark. Just a remark on the computation: with the input example the result is 0.4, not 0.2. Did I miss something, or is it just a typo?

  • @petrvasilyev6843
    @petrvasilyev6843 8 months ago

    Thanks a lot for the video. But how is a 0.2% collision rate, or even a 0.00001% collision rate, OK? What is the point of our system if we cannot be entirely sure that the numbers in our memory or DB represent the actual number of books in stock? Are there specific kinds of systems where approximate numbers are enough?

    • @markrichards5014
      @markrichards5014 8 months ago

      It's not a guarantee that it will happen, only a probability that it will happen. Systems that have this tolerance sacrifice performance for data integrity.

  • @VishnuPulivelikunnel
    @VishnuPulivelikunnel 10 months ago

    Sharding, and routing requests based on it, can avoid collisions to some extent. But the main challenge is partitioning the data based on the writes.
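
A small sketch of the key-based routing idea in the comment above, assuming each cached item gets exactly one node that is allowed to write it (the ShardRouter name and the node list are made up for illustration):

```java
import java.util.List;

// Minimal sketch of key-based routing: every item key has exactly one "owner"
// node that applies writes for it, which removes write-write collisions for
// that key across replicas.
public class ShardRouter {
    private final List<String> nodes;

    public ShardRouter(List<String> nodes) {
        this.nodes = nodes;
    }

    /** Returns the node responsible for all writes to the given item key. */
    public String ownerOf(String itemKey) {
        int bucket = Math.floorMod(itemKey.hashCode(), nodes.size());
        return nodes.get(bucket);
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(List.of("cache-node-1", "cache-node-2", "cache-node-3"));
        System.out.println(router.ownerOf("book-978-1492043454")); // always routes to the same node
    }
}
```

Note that a plain hash like this spreads keys evenly but says nothing about how the write load is distributed across them, which is the "partitioning the data based on the writes" challenge the comment points out.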

  • @VladOleniuk
    @VladOleniuk 11 months ago

    Hi Mark. Thanks a lot for the video and for the whole series.
    Would you be so kind as to clarify one concern? In the video you discussed that if the update rate is low enough and the replication is fast enough, we do not have to worry about collisions. While I agree that on average this is true, in edge cases it might still happen that the replication takes longer than expected (a network glitch) or that two consecutive updates arrive faster than the replication latency. In that case a collision is still possible, and the distributed cache will find itself in an inconsistent state (with no way out of it except a restart).
    So, do we always have to care about data collisions when we decide to use a replicated caching solution?

    • @markrichards5014
      @markrichards5014 11 months ago

      Most certainly, regardless of how low the update rate is, you could POSSIBLY get an edge case where within the same millisecond two customers purchase the same item, possibly resulting in a data collision. However, the probability at such a low update rate is so low that it's usually not worth addressing. The trade-offs of caching and distributed computing... 🙂

  • @davideb4258
    @davideb4258 6 months ago

    Where does the collision formula come from? Is there a paper, study, something that demonstrates the formula?

    • @markrichards5014
      @markrichards5014 6 months ago

      You bet! Here you go: www.shadowbasesoftware.com/wp-content/uploads/2016/08/Resolving-Data-Collisions.pdf

    • @davideb4258
      @davideb4258 6 months ago

      @markrichards5014 thanks!
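
For readers who want to try the numbers themselves, here is a rough sketch of the kind of estimate the linked paper discusses. The exact formula should be taken from the paper; the version below is an assumed, commonly cited approximation (collisions per second ≈ update rate squared times replication latency, divided by the number of replicated items), and the input values are illustrative only, not the example from the video:

```java
// Rough estimate of the data-collision rate in an active/active replicated cache.
// ASSUMPTION: uses the approximation
//   collisions per second ≈ (updateRate^2 * replicationLatency) / numberOfItems
// Consult the referenced paper for the exact formula and its derivation.
public class CollisionRateEstimate {
    public static double collisionsPerSecond(double updateRatePerSecond,
                                             double replicationLatencySeconds,
                                             double numberOfItems) {
        return (updateRatePerSecond * updateRatePerSecond * replicationLatencySeconds)
                / numberOfItems;
    }

    public static void main(String[] args) {
        // Illustrative inputs: 10 updates/sec, 500 ms replication latency, 1,000 cached items.
        double rate = collisionsPerSecond(10.0, 0.5, 1000.0);
        System.out.printf("Estimated collisions per second: %.4f%n", rate);
    }
}
```

Because the update rate is squared in this approximation, doubling the update rate roughly quadruples the expected collisions, which lines up with the discussion above: a low update rate makes collisions rare, but never impossible.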