How Booking.com designed and scaled their highly available and performant User Review System

  • Published: Sep 10, 2024

Comments • 50

  • @susantaghosh504
    @susantaghosh504 17 days ago +1

    I believe all these complexities will be abstracted by distributed database providers like DynamoDB, ScyllaDB, or CockroachDB.

  • @abhis3kh
    @abhis3kh 1 year ago +1

    Hi Arpit, I've got some questions. It would be awesome if you could answer them.
    1. Consistent hashing will also be used when writing data to the DB, right?
    2. What happens if we remove one node from a ring of 4 and that node holds some data? How do we determine which node the removed node's data should be transferred to?
    3. How will the cache be updated in real time? Does it get updated when a user creates/updates a rating, or does it fetch the data from the DB? The cache has limited size, so what would be the best approach for storing data? I think it should be LFU. What are your views?
    4. How do we route requests when a node goes down? Do we add a replacement (another node from another AZ) to the consistent hashing ring, or how?

    • @AsliEngineering
      @AsliEngineering 1 year ago +4

      1. Yes. It spits out data ownership.
      2. Consistent hashing spits out that info
      3. Depends on the use case; there's no single correct answer here. LFU with exponential decay can also work fine.
      4. That's the recovery handler written in the routing layer (API servers) in this case. Or a separate component.
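
A minimal sketch of how consistent hashing can "spit out" ownership for both reads and writes, and what happens when a node leaves the ring. This is not the implementation from the video: the shard names, the `accommodation:<id>` key format, the MD5-based `ring_position` helper, and the `HashRing` class are all assumptions made for illustration, and each node gets a single point on the ring to keep it short.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Map a string to a point on the ring (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes):
        # Sorted (position, node) pairs; one point per node (an assumption).
        self._points = sorted((ring_position(n), n) for n in nodes)

    def owner(self, key: str) -> str:
        """Return the first node at or clockwise from the key's position."""
        positions = [p for p, _ in self._points]
        i = bisect.bisect_left(positions, ring_position(key))
        return self._points[i % len(self._points)][1]  # wrap around at the end

    def remove(self, node: str) -> None:
        self._points = [(p, n) for p, n in self._points if n != node]


# Hypothetical keys and shard names.
keys = [f"accommodation:{i}" for i in range(1000)]
ring = HashRing(["shard-a", "shard-b", "shard-c", "shard-d"])
before = {k: ring.owner(k) for k in keys}

ring.remove("shard-c")
after = {k: ring.owner(k) for k in keys}

# Keys whose owner changed because of the removal.
moved = [k for k in keys if before[k] != after[k]]
```

In this sketch every key in `moved` was owned by the removed node, and all of them land on that node's clockwise successor, so the ring itself answers "where should the data go". The same `owner()` call is used for writes and for reads.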

    • @abhis3kh
      @abhis3kh 1 year ago

      @AsliEngineering Thank you :)
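
The "LFU with exponential decay" option from the reply above can be sketched roughly as follows. This is one possible reading, not something shown in the video: the `DecayingLFUCache` class, the `half_life` parameter, and passing the current time in as `now` (instead of reading a clock) are all assumptions. Each hit adds 1 to an entry's score, and the score halves every `half_life` seconds, so old popularity fades.

```python
class DecayingLFUCache:
    """LFU cache whose hit scores halve every `half_life` seconds (capacity >= 1)."""

    def __init__(self, capacity: int, half_life: float):
        self.capacity = capacity
        self.half_life = half_life
        self._data = {}  # key -> (value, score, last_touched)

    def _decayed(self, score: float, last: float, now: float) -> float:
        return score * 0.5 ** ((now - last) / self.half_life)

    def get(self, key, now: float):
        if key not in self._data:
            return None
        value, score, last = self._data[key]
        # Decay the old score to `now`, then count this hit.
        self._data[key] = (value, self._decayed(score, last, now) + 1.0, now)
        return value

    def put(self, key, value, now: float) -> None:
        if key in self._data:
            _, score, last = self._data[key]
            self._data[key] = (value, self._decayed(score, last, now) + 1.0, now)
            return
        if len(self._data) >= self.capacity:
            # Evict the entry with the lowest decayed score (O(n) scan, fine for a sketch).
            victim = min(
                self._data,
                key=lambda k: self._decayed(self._data[k][1], self._data[k][2], now),
            )
            del self._data[victim]
        self._data[key] = (value, 1.0, now)


cache = DecayingLFUCache(capacity=2, half_life=60.0)
cache.put("hotel:1", "4.5 stars", now=0.0)
for t in range(1, 6):
    cache.get("hotel:1", now=float(t))        # popular early on
cache.put("hotel:2", "3.9 stars", now=600.0)
cache.get("hotel:2", now=601.0)
cache.put("hotel:3", "4.1 stars", now=602.0)  # evicts hotel:1, whose old hits have decayed
```

A plain LFU would have evicted `hotel:2` here because it has fewer total hits; with decay, the entry that was popular ten minutes ago loses to the one being read right now.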

  • @gauravraj2604
    @gauravraj2604 1 year ago +1

    Liked your explanation, Arpit. Thanks a ton.

  • @Amritanjali
    @Amritanjali 1 year ago +1

    thanks🙏

  • @architshukla8076
    @architshukla8076 1 year ago

    Very informative video, Arpit... Thanks :)

  • @user-bs7dh4nq4i
    @user-bs7dh4nq4i 8 months ago

    In Japan last year (2023), some hotel owners filed a large lawsuit against Booking over payment delays. Booking's excuse was system trouble.

  • @Aditya-us5gj
    @Aditya-us5gj 1 year ago

    Hi Arpit, thanks for this channel. I was using it to the fullest before I dropped off because of a lack of dedication and consistency. And now when I visit the videos section, I'm f**ked figuring out how to cover all these gems.
    Not sure whether to binge-watch them on weekends or to set them aside for a while and start covering the daily videos consistently.

    • @adianimesh
      @adianimesh 1 year ago

      30 minutes a day! That's how I do it... will catch up eventually.

  • @AbdurraffaySyed
    @AbdurraffaySyed 20 days ago

    Hi Arpit, thank you for such an amazing explanation.
    I wanted to ask you a question.
    Is it possible for our hash space to get skewed at some point, e.g. the files are inserted on the left side of the hash space and the storage nodes on the right, or vice versa? If yes, wouldn't that result in the celebrity problem?
    I might be missing something here. Would love to hear your thoughts. Thank you.

    • @sarthuaksharma9609
      @sarthuaksharma9609 1 day ago

      Hash functions are chosen so that we get a close-to-uniform distribution, but if such a situation does happen, you can shift some data to another node. That will, however, require some changes in the review service's hashing logic.
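
One common way to reduce this kind of skew without hand-shifting data is virtual nodes: each physical node is placed on the ring many times. Virtual nodes are not mentioned in the thread, so treat this as a swapped-in technique. The sketch below only counts how many keys each node receives with 1 versus 200 points per node; the shard names, key format, and helper functions are hypothetical.

```python
import bisect
import hashlib
from collections import Counter


def ring_position(value: str) -> int:
    """Map a string to a point on the ring (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def build_ring(nodes, vnodes_per_node):
    """Place each node on the ring `vnodes_per_node` times under derived labels."""
    return sorted(
        (ring_position(f"{node}#{i}"), node)
        for node in nodes
        for i in range(vnodes_per_node)
    )


def load_per_node(ring, keys):
    """Count how many keys fall to each physical node."""
    positions = [p for p, _ in ring]
    counts = Counter()
    for key in keys:
        i = bisect.bisect_left(positions, ring_position(key)) % len(ring)
        counts[ring[i][1]] += 1
    return counts


nodes = ["shard-a", "shard-b", "shard-c", "shard-d"]
keys = [f"accommodation:{i}" for i in range(20_000)]

single = load_per_node(build_ring(nodes, 1), keys)
virtual = load_per_node(build_ring(nodes, 200), keys)
print("1 point per node:   ", dict(single))
print("200 points per node:", dict(virtual))
```

With a single point per node, the share each node gets depends entirely on where four hashes happen to land, so it can be quite uneven. With many points per node the counts typically come out much closer to each other, which is why skew from node placement is usually treated as a tuning problem and not as a reason to change the key-hashing logic.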

  • @koustavdas2519
    @koustavdas2519 1 year ago +1

    Hi Arpit. If we use NoSQL here, which DB would you suggest: Cassandra or MongoDB? I know Cassandra is write-efficient, but what about reads? And personally, which DB would you choose? @AsliEngineering

  • @atuljoshi6182
    @atuljoshi6182 1 year ago

    Excellent explanation.

  • @utsavprabhakar5072
    @utsavprabhakar5072 1 year ago +1

    Do we shard databases ourselves? In DynamoDB, for example, sharding is handled on its own. I guess there is a provision to shard manually, but that's not usually used by anyone. Can you give some real-world examples of where consistent hashing is done nowadays, given that DB requirements are handled by cloud service providers and autoscaling, provisioning, and sharding are done by the database automatically?

    • @AsliEngineering
      @AsliEngineering 1 year ago +1

      Depends on the database you are using. With a managed DB we need not do anything, but if you are self-hosting, then yes.

    • @utsavprabhakar5072
      @utsavprabhakar5072 1 year ago

      @AsliEngineering Got it, thanks!

  • @microtech2448
    @microtech2448 1 year ago

    How would data integrity be handled when shards are added or removed?

  • @shubhamtyagi5219
    @shubhamtyagi5219 1 year ago

    When we create the new ring, how do we know which keys need to be relocated, since some keys will move from an older shard to the new shard? How do we find which keys?
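
One way to answer "which keys?" is to compute the hash range the new shard takes over: everything between its predecessor on the ring and its own position, all of which previously belonged to a single donor shard. The sketch below assumes one ring point per shard, an MD5-based `ring_position` helper, and hypothetical shard names and keys; none of this is taken from the video.

```python
import hashlib


def ring_position(value: str) -> int:
    """Map a string to a point on the ring (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


def owner(points, pos):
    """First node at or clockwise from `pos`; `points` is a sorted list of (position, node)."""
    for p, node in points:
        if p >= pos:
            return node
    return points[0][1]  # wrap around


def range_taken_by(points, new_node):
    """Return (start, end, donor): keys with start < hash <= end move from donor to new_node."""
    end = ring_position(new_node)
    before = [p for p, _ in points if p < end]
    start = before[-1] if before else points[-1][0]  # predecessor, wrapping if needed
    return start, end, owner(points, end)


def in_range(pos, start, end):
    # The range may wrap past the top of the hash space.
    if start < end:
        return start < pos <= end
    return pos > start or pos <= end


old = sorted((ring_position(n), n) for n in ["shard-a", "shard-b", "shard-c"])
start, end, donor = range_taken_by(old, "shard-d")
new = sorted(old + [(end, "shard-d")])

keys = [f"accommodation:{i}" for i in range(1000)]
to_move = [k for k in keys if in_range(ring_position(k), start, end)]
```

Only `donor` has to be scanned, and only for keys whose hash falls in `(start, end]`. If the hash is stored alongside each row, a migration script could select that range directly instead of rehashing every key.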

  • @ng.manisha
    @ng.manisha 4 months ago

    What will the structure of the Redis key and value be? Also, we don't generally recommend sharding for RDBMS systems, right, because joining data across multiple shards is costly? How does this get managed? Shouldn't we use Cassandra here?

    • @sarthuaksharma9609
      @sarthuaksharma9609 1 day ago

      Cross-shard querying is frowned upon, but if you shard on the basis of accommodation ID you will never have to do cross-shard queries. I think he used SQL in the example because that's what Booking also does. Also, Cassandra is generally a good fit when there is a high write-to-read ratio; here the case is different. That said, I am not sure how pre-materialized view performance compares with any other NoSQL DB. I think that is the deciding factor here.

  • @jayantprakash6425
    @jayantprakash6425 1 year ago

    When the new ring is being prepared, are reads/writes served by the old ring? Is there any downtime associated?

  • @mahendratonape27
    @mahendratonape27 10 months ago

    What if the routing decision is made by a DB router instead of a consistent hashing algorithm written in the routing service? I think a routing algorithm built into the DB is better than our own consistent hashing algorithm.

    • @maheshkumartangella5516
      @maheshkumartangella5516 5 months ago +1

      I had the exact same question: why would we want to recreate what the database can already do, routing its requests to the corresponding shards?

  • @boombasach
    @boombasach 11 months ago

    Thanks

    • @AsliEngineering
      @AsliEngineering 11 months ago

      Thank you so much for the kind gesture :)

  • @robinpaulification
    @robinpaulification 4 months ago

    08:50
    It is two regions, not AZs.

  • @d4devotion
    @d4devotion 1 year ago +1

    So it turns out that even reviews on your System-Design-Master-Class play a vital role. (You will see what I mean in your next cohort :) )

  • @tesla1772
    @tesla1772 1 year ago

    Hi Arpit, I have one question: how is all this data migration done? Do we write our own scripts to migrate first and then inform the services about the new node? What is the right way of handling this?

  • @tarunpahuja3443
    @tarunpahuja3443 1 year ago

    What if the master shard goes down during a write request? I guess here we could give up strong consistency for availability.

    • @sarthuaksharma9609
      @sarthuaksharma9609 1 day ago

      As he already said, there are read replicas. You can have leader election: the master goes down and a slave takes over. And yes, consistency can be compromised for such use cases, since availability and latency have been prioritised.

  • @piyush3168
    @piyush3168 1 year ago

    Very good explanation!
    Will the write service scale, given that we have only one master node? Or would Cassandra (multiple writers) be a good choice for that, since we are not looking for ACID-compliant behaviour?

    • @girishanker3796
      @girishanker3796 2 months ago

      A Cassandra cluster would be a great choice, +1. But since he mentioned trade-offs, I think we can trade off consistency for availability (so maybe 2 write masters which will be eventually consistent).

  • @SwikarP
    @SwikarP 9 months ago

    Best best best, awesome

  • @hackwithharsha
    @hackwithharsha 1 year ago

    00:19:00 Thank you… If we use asynchronous replication, how do we handle replication lag? For example, imagine I have written a review for a hotel, which goes to the master, and then an immediate read query goes to a replica which hasn't received the replicated data yet. In the user interface, the user might think something went wrong with the submitted review and resubmit it. How do we handle this situation?

    • @AsliEngineering
      @AsliEngineering 1 year ago +2

      You will ensure Read Your Write consistency by redirecting the critical reads to the master.
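
A rough sketch of what "redirecting the critical reads to the master" could look like in the routing layer. The reply does not say how a read is classified as critical, so the rule here is an assumption: any read by a user who wrote within the last `lag_window_s` seconds goes to the primary (master). The `ReadRouter` class, the window length, and the injectable `clock` are hypothetical.

```python
import time


class ReadRouter:
    """Send a user's reads to the primary for a short window after they write."""

    def __init__(self, lag_window_s: float = 5.0, clock=time.monotonic):
        self.lag_window_s = lag_window_s  # assumed upper bound on replication lag
        self._clock = clock
        self._last_write = {}             # user_id -> time of that user's last write

    def record_write(self, user_id: str) -> None:
        self._last_write[user_id] = self._clock()

    def target_for_read(self, user_id: str) -> str:
        wrote_at = self._last_write.get(user_id)
        if wrote_at is not None and self._clock() - wrote_at < self.lag_window_s:
            return "primary"
        return "replica"


# A fake clock keeps the example deterministic.
now = [100.0]
router = ReadRouter(lag_window_s=5.0, clock=lambda: now[0])

router.record_write("user-42")
first = router.target_for_read("user-42")  # just wrote, so read from the primary
other = router.target_for_read("user-7")   # never wrote, so a replica is fine
now[0] += 10.0
later = router.target_for_read("user-42")  # window has passed, back to a replica
```

The author of a review sees it immediately, while everyone else keeps reading from replicas and may see it a little later. In a multi-server deployment the last-write marker would need to live somewhere shared (a cookie or a cache, for instance), which this sketch leaves out.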

    • @gauravraj2604
      @gauravraj2604 1 year ago

      @AsliEngineering Read Your Write consistency: is this a specific case we should watch out for when using master-slave replication?

  • @bharatarya7929
    @bharatarya7929 1 year ago

    Hi! Nice video. Can you share your thoughts on how Rapido keeps a constant 4-digit OTP for all users?

    • @AsliEngineering
      @AsliEngineering 1 year ago

      Have never used Rapido. So no idea about it.

    • @bharatarya7929
      @bharatarya7929 1 year ago

      @AsliEngineering So while booking a cab via an app like Uber, Ola, etc., they all share an OTP for the ride which keeps changing. But Rapido keeps a constant 4-digit OTP per user, so how are they managing that at large scale?

    • @AsliEngineering
      @AsliEngineering 1 year ago +1

      @bharatarya7929 Then the OTPs are not unique; each is just a random number assigned to a user.

  • @niksgupt
    @niksgupt 1 year ago

    Great use case and amazing presentation.
    Wondering, any specific reason for using relational db?
    Can't we solve it using NoSQL db?

    • @AsliEngineering
      @AsliEngineering 1 year ago

      You can, so long as it fits the use case. No hard bounds.

  • @abhis3kh
    @abhis3kh 1 year ago

    Can we get a Twitter design video based on the diagram shared by Elon Musk?

    • @AsliEngineering
      @AsliEngineering 1 year ago +3

      I thought about it, but that diagram is really very, very high level. It is literally just a bunch of microservices.
      If I create a video on it, no one will learn anything concrete from it. It will just be a list of microservices and guesses about what each one does.
      Say I create a 20-minute video on it: that will definitely get me views, but it will add nothing for the viewer.
      Thank you so much for suggesting this, but I really do not want to waste the time of anyone watching my videos.
      I like to keep the videos information-dense and useful.
      Hope you understand.

    • @abhis3kh
      @abhis3kh 1 year ago

      @AsliEngineering Sure. That makes sense :)