Rate Limiter Design Deep Dive with Google SWE! | Systems Design Interview Question 7

  • Published: 3 Jul 2024
  • Only rate I've ever had to limit was the rate at which I beat my meat per day
    00:00 Introduction
    01:53 Functional Requirements
    02:51 Capacity Estimates
    04:52 API Design
    05:58 Database Schema
    07:00 Architectural Overview
  • Science

Comments • 29

  • @pl5778
    @pl5778 1 year ago +4

    This video is awesome. I think there are a lot of little gold nuggets in how you explain things. For example, when you talked about the single/multi-threading for Redis.

  • @kaushikbv
    @kaushikbv 1 year ago +4

    Was asked this exact question in a coding interview.
    I was able to come up with the exact rolling-window approach that you mentioned.
    Wrote the pseudocode for it.
    But the interviewer wanted working code for this in the 45-minute interview.
    I was a bit baffled by the expectation.
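
    For what it's worth, a minimal in-memory sliding-window-log sketch of that rolling-window idea (illustrative Python, not the code from the video):

      import time
      from collections import deque

      class SlidingWindowLogLimiter:
          # Keeps one timestamp log per user, so memory grows with the request rate.
          def __init__(self, max_requests, window_seconds):
              self.max_requests = max_requests
              self.window_seconds = window_seconds
              self.logs = {}  # user_id -> deque of request timestamps

          def allow(self, user_id):
              now = time.time()
              log = self.logs.setdefault(user_id, deque())
              # Drop events that have fallen out of the rolling window.
              while log and log[0] <= now - self.window_seconds:
                  log.popleft()
              if len(log) >= self.max_requests:
                  return False
              log.append(now)
              return True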

    • @jordanhasnolife5163
      @jordanhasnolife5163  1 year ago +2

      First of all, glad the video helped! You can't always pass every interview, as sometimes the expectations really are high, but sounds like you're getting closer! Just use the interview as a way to self reflect and figure out what you need to work on.

  • @serhiisoboliev2385
    @serhiisoboliev2385 1 year ago +1

    Nice content, please, keep going

  • @6eeykwr2
    @6eeykwr2 1 year ago +4

    lol love the “rate limited” segments in the video lmao

  • @ouvreboite87
    @ouvreboite87 5 months ago +1

    Nice video :)
    To extend a bit (if someone is looking to implement an actual distributed rate limiter):
    The « list of events » approach for the rolling window is not optimal: memory and CPU usage grow as you get closer to the limit (more events to store and iterate over in the list).
    The « leaky bucket » (or the similar « token bucket ») algorithm would be a better solution: constant time + constant memory + bursting is built in.
    It also requires some sort of « transactional » support.
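
    A minimal token-bucket sketch along those lines (illustrative Python; the capacity and refill rate are made-up parameters):

      import time

      class TokenBucket:
          # Constant memory per key: just a token count and the last refill time.
          def __init__(self, capacity, refill_rate_per_sec):
              self.capacity = capacity            # maximum burst size
              self.refill_rate = refill_rate_per_sec
              self.tokens = capacity
              self.last_refill = time.time()

          def allow(self):
              now = time.time()
              # Refill in proportion to the time elapsed, capped at capacity.
              elapsed = now - self.last_refill
              self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
              self.last_refill = now
              if self.tokens >= 1:
                  self.tokens -= 1
                  return True
              return False

      bucket = TokenBucket(capacity=10, refill_rate_per_sec=2)  # made-up numbers
      bucket.allow()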

    • @jordanhasnolife5163
      @jordanhasnolife5163  5 months ago

      Thanks for that, perhaps I should have covered leaky bucket! Nonetheless, could definitely be useful to reduce space overhead here, maybe it would allow us to do more caching on the load balancer.

  • @username911
    @username911 1 month ago +1

    AFAIK Redis supports keyspace notifications, and that includes expire events as well. That makes me think of the following algorithm for the sliding-window approach:
    unless it already exists, create a consistent IP- and/or user-based hash => a Redis set with an expiry of the allowed rate-limit duration
    if the hash already exists, create the key "guid:random_hash" with the consistent hash as its value and the rate-limit duration as its TTL, and add it to the set too
    remove the "guid:random_hash" from the set when the expire event fires for that key
    when a request comes in, check in O(1) whether the set has fewer than rate-1 items; if not, reject, else carry on with the above implementation.
    What do you think?
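
    A rough Python sketch of that idea (assuming the redis-py client and Redis DB 0; I've put the user id inside the per-request key name so the expiry listener knows which set to clean up):

      import uuid
      import redis  # assumes the redis-py client and a local Redis

      r = redis.Redis()
      r.config_set("notify-keyspace-events", "Ex")  # publish key-expired events

      WINDOW_SECONDS = 60   # made-up numbers for illustration
      LIMIT = 100

      def allow(user_id):
          set_key = f"rl:{user_id}"
          if r.scard(set_key) >= LIMIT:            # O(1) set-size check
              return False
          member = f"guid:{user_id}:{uuid.uuid4()}"
          r.set(member, 1, ex=WINDOW_SECONDS)      # per-request key; its TTL ends the window
          r.sadd(set_key, member)
          return True

      def expiry_listener():
          # Background worker: when a per-request key expires, remove it from its set.
          pubsub = r.pubsub()
          pubsub.psubscribe("__keyevent@0__:expired")
          for msg in pubsub.listen():
              if msg["type"] != "pmessage":
                  continue
              member = msg["data"].decode()
              if member.startswith("guid:"):
                  user_id = member.split(":")[1]
                  r.srem(f"rl:{user_id}", member)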

    • @jordanhasnolife5163
      @jordanhasnolife5163  1 month ago +1

      Seems reasonable, I imagine that for TTLs Redis is basically doing the process that we've described

  • @raj_kundalia
    @raj_kundalia 4 months ago +1

    thank you!

  • @stephenchan7944
    @stephenchan7944 1 year ago +1

    Hey Jordan,
    Do you have any advice for knowing what metrics to track for capacity estimates? I often find it challenging to pin down exactly what I'm calculating capacity estimates for, and at what granularity.
    As always, thanks for the great content!

    • @jordanhasnolife5163
      @jordanhasnolife5163  1 year ago +1

      I still have difficulty with this too! I think the main thing is not to focus too much on a specific number as a result, but more so to get a sense of where the bottlenecks in your system lie, whether that's reads or writes or anything else. The estimates themselves can be very crude.

  • @aniruddhkhera510
    @aniruddhkhera510 1 year ago +1

    Hey Jordan, great video as always!
    I have a question on the write-back cache approach. The benefit is that it saves the network call to Redis, but it will complicate the system.
    How would you keep this local cache updated in a distributed rate-limiter setup? If the LB picks a rate-limiter instance based on how busy it is, then there are a lot of problems around consistency, and you probably need a mechanism to sync the local caches.
    The other way I could think of is using hashing, or better, consistent hashing to solve the problem of hotspots, yet we land on the same problem of keeping the local cache updated.
    What do you think?
    I would think of a gossip protocol to sync the local caches, or setting a low threshold to expire the local record, probably around the rate-limit window.

    • @jordanhasnolife5163
      @jordanhasnolife5163  1 year ago

      I think that using consistent hashing on something like a userID will make sure that all of a user's requests go to the same server, so that you can cache them there.
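
      A tiny illustration of that routing idea (hypothetical node names; a real setup would use whatever ring library your load balancer supports):

        import bisect
        import hashlib

        class ConsistentHashRing:
            # Maps a userID to a stable rate-limiter node, so its counts can be cached locally.
            def __init__(self, nodes, vnodes=100):
                self.ring = []
                for node in nodes:
                    for i in range(vnodes):  # virtual nodes smooth out hotspots
                        h = int(hashlib.md5(f"{node}#{i}".encode()).hexdigest(), 16)
                        self.ring.append((h, node))
                self.ring.sort()
                self.hashes = [h for h, _ in self.ring]

            def node_for(self, user_id):
                h = int(hashlib.md5(user_id.encode()).hexdigest(), 16)
                idx = bisect.bisect(self.hashes, h) % len(self.ring)
                return self.ring[idx][1]

        ring = ConsistentHashRing(["limiter-1", "limiter-2", "limiter-3"])  # hypothetical nodes
        ring.node_for("user-123")  # every request for user-123 routes to the same node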

    • @sanjivsingh091
      @sanjivsingh091 5 months ago

      @jordanhasnolife5163 are we suggesting a sticky-session thing here?

  • @karenronen6618
    @karenronen6618 1 year ago +1

    Thanks for making this video :) I was wondering about the bit on the write-back cache: isn't Redis already a cache? It is an in-memory key store, so I'm trying to understand the benefit of using the write-back cache if we already have a low-latency cache ...

    • @jordanhasnolife5163
      @jordanhasnolife5163  1 year ago +1

      That's correct but you still induce a network call to use it, whereas caching the result on the actual server itself avoids that
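
      A rough sketch of that local write-back cache (illustrative Python assuming the redis-py client; the sync interval is a made-up knob, and the approach trades a little accuracy between syncs for fewer network calls):

        import time
        import redis  # assumes the redis-py client

        r = redis.Redis()
        SYNC_INTERVAL = 0.1   # how long we trust the locally cached global count
        local = {}            # key -> {"count": last global count, "delta": local adds, "synced_at": ts}

        def allow(user_id, limit, window_seconds):
            now = time.time()
            key = f"rl:{user_id}:{int(now // window_seconds)}"   # fixed window for simplicity
            entry = local.get(key)
            if entry is None or now - entry["synced_at"] > SYNC_INTERVAL:
                # Sync with Redis: push whatever we counted locally, pull the fresh global count.
                pipe = r.pipeline()
                pipe.incrby(key, entry["delta"] if entry else 0)
                pipe.expire(key, int(2 * window_seconds))
                count, _ = pipe.execute()
                entry = {"count": int(count), "delta": 0, "synced_at": now}
                local[key] = entry
            if entry["count"] + entry["delta"] >= limit:
                return False
            entry["delta"] += 1   # counted locally; written back on the next sync
            return True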

  • @mrdanp123
    @mrdanp123 1 year ago +1

    Another great video. I've been reading about this a bit, and Redis can give atomicity with Lua scripts or MULTI/EXEC commands. Also, a great data structure you can use for the rolling window is the Redis sorted set, where you use the timestamp as the score. Then you can perform an efficient range query to get all elements in your window.
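
    A small sketch of that sorted-set approach (illustrative Python with the redis-py client; transaction=True issues the commands inside MULTI/EXEC):

      import time
      import uuid
      import redis  # assumes the redis-py client

      r = redis.Redis()

      def allow(user_id, limit, window_seconds):
          key = f"rl:{user_id}"
          now = time.time()
          pipe = r.pipeline(transaction=True)                   # MULTI ... EXEC
          pipe.zremrangebyscore(key, 0, now - window_seconds)   # evict events outside the window
          pipe.zadd(key, {f"{now}:{uuid.uuid4()}": now})        # member is unique, score is the timestamp
          pipe.zcard(key)                                       # count of events in the window
          pipe.expire(key, window_seconds)
          _, _, count, _ = pipe.execute()
          return count <= limit   # note: this simple version records the event even when rejected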

  • @faceTheCandidates
    @faceTheCandidates 1 year ago +1

    Great content.
    I think in your performance estimate you make an underlying assumption that there is only one API endpoint you are guarding. In reality, you might want to at least mention that you are rate limiting across multiple services or multiple APIs within a service.

    • @jordanhasnolife5163
      @jordanhasnolife5163  1 year ago

      Fair point! Though for the sake of the design all else should remain more or less the same (you just have to duplicate it lol)
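
      For instance, "duplicating it" can be as simple as namespacing the counter key per service and endpoint (hypothetical naming scheme):

        # Hypothetical key naming: one counter/window per (service, endpoint, user).
        def rate_limit_key(service, endpoint, user_id):
            return f"rl:{service}:{endpoint}:{user_id}"

        rate_limit_key("payments", "POST /charge", "user-123")  # -> "rl:payments:POST /charge:user-123"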

  • @stephenirving9846
    @stephenirving9846 11 months ago

    This feels like a weird service to make. I'm glad I watched this to go into it, but having an API and load balancer layer is going to add 30 ms at a minimum in my experience when it comes to API Gateway on AWS. Even up to 100 ms if your team uses Lambdas for your API layer.
    On our AWS account, we just made our own ElastiCache table and hit that directly, because ultra-low latency is really important for us.
    I totally get that if you are offering this service to people outside your AWS account, you would need to have a load balancer and API layer.

    • @jordanhasnolife5163
      @jordanhasnolife5163  11 months ago

      Yep I think you've accurately described why the write back cache is so important

  • @francisliubin
    @francisliubin 11 months ago +1

    So is a race condition a real problem here? If yes, is there any way to solve it besides a lock... Thank you.

    • @jordanhasnolife5163
      @jordanhasnolife5163  11 months ago

      You could always just hash things with userID and timestamp included and hope for no collisions, otherwise you'll have to assume that key collisions are possible which lead to write conflicts

  • @unknowrabbit666
    @unknowrabbit666 4 months ago +1

    Just finished an interview about rate limiting... I am so f*cked, but next time I will get it right, tks a lot!

    • @jordanhasnolife5163
      @jordanhasnolife5163  4 months ago +1

      That's how it goes! No worries, you gotta fail a few times to know where to improve!

  • @mickeyp1291
    @mickeyp1291 5 months ago +1

    Hey Jordan, quality video, sloppy backend this time.
    You got so into the rate limiter algo that you forgot the service-per-endpoint implementation.
    It needs another cache/Redis so that when the user connects to the rate limiter, it pushes the user's rate (or the default rate) into a hash and the rate limiter knows what to do. It also needs a MySQL database for the CRUD of the rate limiter, which you also forgot about.
    API routes:
    /rates/userid//service/
    /rates/ip//service/
    - add POST {ms: or day/ }
    - del DEL
    - update PUT {ms: or day/ }
    - get GET -> response status (200 (ok), 400 (bad format), 422 (invalid))
      info_message: { state: [specified | default]
                      type:  [in ms | per day]
                      count: long
                      id:    [ip | userkey] : }

    • @jordanhasnolife5163
      @jordanhasnolife5163  5 months ago

      I think the re-made version of this video (from like 2 weeks ago) was probably less shabby in this regard. Good point though, you can totally do some partitioning per service. I didn't touch too much on rate limits per user, but I do agree that if these were to exist you would need another form of storage to initiate those.