Bloom Filters | Algorithms You Should Know #2 | Real-world Examples

Поделиться
HTML-код
  • Опубликовано: 19 окт 2024
  • Subscribe to our weekly system design newsletter: bit.ly/3tfAlYD
    Checkout our bestselling System Design Interview books:
    Volume 1: amzn.to/3Ou7gkd
    Volume 2: amzn.to/3HqGozy
    Digital version of System Design Interview books: bit.ly/3mlDSk9
    The Secret Sauce Behind NoSQL: • The Secret Sauce Behin...
    Animation tools: Illustrator and After Effects
    ABOUT US:
    Covering topics and trends in large-scale system design, from the authors of the best-selling System Design Interview series.

Комментарии • 113

  • @Mutual_Information
    @Mutual_Information 2 года назад +119

    Asking/answering the right questions, information dense and technical. I love this channel.

  • @an_other_world
    @an_other_world 2 года назад +31

    Levelling the playing field by making info like this easily accessible to anyone

  • @CommandantNOVA
    @CommandantNOVA 2 года назад +66

    If you like Bloom Filters - I highly recommend reading the SuRF : Practical Range Query Filters from CMU. It's a little more complicated but shows how you can take the same concept and apply it to eliminate arbitrary size ranges.

    • @andreikhalipau2796
      @andreikhalipau2796 Год назад

      For someone who does not like to read there is a presentation ruclips.net/video/OD29hZww-DM/видео.html

  • @MrX-nc8cm
    @MrX-nc8cm 2 года назад +42

    Yesterday I binge watched all your videos 😂 to prepare for an interview, and then had nothing left to watch!
    I’m glad that you uploaded again, you’re videos are awesome well explained, your channel will grow fast. Keep the good work! 👏👏

    • @TheAyushSomani
      @TheAyushSomani 2 года назад +4

      How did the interview go?

    • @MrX-nc8cm
      @MrX-nc8cm Год назад +8

      @@TheAyushSomani sorry I didn’t see your reply. I was hired on September! After 4 months of job hunting and countless applications. I did more than 100 applications 😂 and I received 2 offers. Practicing interviews and getting rejections definitely made me improve a lot. It’s been a great experience so far on my job.

    • @jathebest2835
      @jathebest2835 Год назад

      @@MrX-nc8cm Wow! Congrats from Korea:) What country do you live in? and I guess it's pretty competitive to get into a software engineering job there.

  • @emreboga
    @emreboga 2 года назад +46

    Definitely more exciting to watch than many Netflix series 👌🏼

    • @andherium
      @andherium Год назад +9

      Comparing a DSA tutorial to a netflix series doesn't make sense.

    • @TRoss-ru6sg
      @TRoss-ru6sg Год назад

      Stop the cap

  • @ciberman
    @ciberman 2 года назад +8

    This video is awesome. I read about Bloom filters a long time ago in wikipedia but I didn't get it at that moment. This video explains it so clearly. I love it.

  • @gattuvideo
    @gattuvideo Год назад +1

    I love how you make any topic so easy to understand. I tried reading the same topic on wikipedia and got utterly confused but the way you explained it, it clicked immediately.

  • @Athmarr
    @Athmarr Год назад +2

    I've looked up bloom filters before and knew what they were used for but this actually made them make sense to me. Thank you so much!

  • @benoitleger-derville6986
    @benoitleger-derville6986 2 года назад +7

    Very good realization, the animations are remarkable and very useful.

  • @essamgouda1609
    @essamgouda1609 2 года назад +3

    This channel is a gem !

  • @daviddickey9832
    @daviddickey9832 6 месяцев назад

    This channel is an absolute gold mine of knowledge for software devs

  • @JayAntoney
    @JayAntoney 20 дней назад

    2 years after the video release, but browers are using cascading bloom filters to do SSL revocation checks now! Supper interesting when I learnt that.
    Steve Gibson on a podcast called security now (ep989) did a great chat on it

  • @kapilricky
    @kapilricky Год назад +1

    I feel smarter everyday after watching ur design videos.

  • @ibidathoillah2051
    @ibidathoillah2051 2 года назад +1

    thank you, easy to learn, this videos should get million viewers

  • @TesserId
    @TesserId 2 года назад +7

    Been fascinated with this data structure for some time. Hearing the Akamai perspective is very interesting, particularly since I manage a large proxy installation that does subscribe to an advanced rating service. Imagine needing a quick, memory efficient data structure that can give you a no/maybe answer very quickly. Sure, on the 16 proxy servers we have 128GB of RAM each. But imagine the memory requirements for virtually every FQDN on the planet.

    • @yatinarora1252
      @yatinarora1252 5 месяцев назад +1

      but kinda look for false positives

    • @TesserId
      @TesserId 5 месяцев назад +1

      @@yatinarora1252 It's been a while since I was taught about this, but I'm pretty sure that was said to be one of the weaknesses. It essentially caches positive findings, but lookups are called for otherwise (or maybe it's the other way around). And, collisions can accumulate (or something like that), so there has to be some way of clearing things when they get stale. So, it's not that's it perfect. It's fast and compact, but you need extra algorithmic handling to manage it's weaknesses.

  • @jo42715
    @jo42715 2 года назад +1

    Short and clear ! I love it.

  • @yp5387
    @yp5387 2 года назад +4

    I was confused until you explained it with the food example. And, suddenly it clicked for me. It was Ahha moment for me.

    • @DemPilafian
      @DemPilafian 2 года назад +1

      I was actually more confused... loving pork chops must always be true!

  • @iamcute69
    @iamcute69 2 года назад +6

    The animation at 4:21 is wrong. Ribeye hashes to indexes 1, 3, and 4, but indexes 0, 3, and 4 are highlighted. The explanation is still correct, though.

  • @daniel860305
    @daniel860305 Год назад

    Thanks for your easy-to-understand explanation! Amazing to see a data structure applied to so many different uses.

  • @FLAMESpl
    @FLAMESpl 2 года назад +2

    Great stuff, never heard of bloom filter before.

  • @bittu007ize
    @bittu007ize Год назад

    I don't know why but my every morning starts with one of these short videos

  • @venkataramaraoemmadi378
    @venkataramaraoemmadi378 2 года назад +1

    Big Fan of your books, Alex. These Videos are very much useful. One concern I have is.. these videos are playing like 2x fast :). You can slowdown, while explaining stuff. So our mind will get time to understand what you are saying.

    • @gcash49
      @gcash49 2 года назад

      Just to provide a differing opinion, I found the pace of the video perfect personally

  • @modolief
    @modolief 2 года назад

    Today I learned that the Bloom Filter relies on using several different hash functions (in this video the example uses 3 different hash functions). This is a very creative idea for an algorithm!

    • @CommandantNOVA
      @CommandantNOVA 2 года назад

      This is actually a simplification for this video. Hashing from scratch is expensive. In most actual bloom filters I've seen use two hashes to produce polynomial coefficients A and B. You can then produce as many "hashes" as you want by using these coefficients without running through a whole hash function.

  • @MaxPicAxe
    @MaxPicAxe 2 года назад +3

    Wow that was very interesting!

  • @v_iancu
    @v_iancu Год назад +1

    Techically, you could make the bloom filter store integer numbers instead of one and zero and use increment/decrement approach when adding/removing elements.

  • @vikasbhutra9400
    @vikasbhutra9400 2 года назад +2

    Amazing video... Very Well Explained. !!

  • @tomtomsiesie5436
    @tomtomsiesie5436 2 года назад +2

    Another great video

  • @andreffrosa
    @andreffrosa 2 года назад +1

    The original bloom filters don't allow removals but there are a lot of proposed improvements over the original idea and one of them allows removals by using integer buckets instead of bit buckets and incrementing/decrementing the value of each bucket (trade-off of more space to allow removals)

    • @ZorakWars
      @ZorakWars 2 года назад +3

      What if you remove a false positive? You break the assumption that the data store cannot give false negatives.

    • @andreffrosa
      @andreffrosa 2 года назад +2

      @@ZorakWars yes, that is true. Another trade-off you have to make.

  • @thesadanand6599
    @thesadanand6599 Год назад

    awesome, truely knowledge delivered in byte size :)

  • @miettoisdev
    @miettoisdev 6 месяцев назад

    good, good stuff right here!

  • @BharatChandak1
    @BharatChandak1 2 года назад +3

    Great content as usual.
    Just little pun on title of video.. 😉 What else should i know before i die ?

  • @thinhle5771
    @thinhle5771 2 года назад

    For me, I always learn new things from this channel😅

  • @DelShott
    @DelShott 2 года назад

    Thanks very clear explanation, but I like lemons.

  • @rampage241
    @rampage241 11 месяцев назад

    Very useful, thank you!

  • @ViktorWahlberg
    @ViktorWahlberg 2 года назад

    Nice videos you're putting out. Would have been good mentioning cascading bloom filters. Maybe out of scope for video, but just as a mention for people to do further research on their own.

  • @PaladinMansouri
    @PaladinMansouri Год назад

    I really enjoyed that, thanks a lot

  • @mohammadkhalafIraqi
    @mohammadkhalafIraqi Год назад

    I love this channel

  • @shivd862
    @shivd862 6 месяцев назад

    Thank you so much !

  • @ricardopassos1180
    @ricardopassos1180 Год назад

    Bloom filters are also used in the Blockchain (EVM-based) context to query for smart contract events.

  • @growie000
    @growie000 3 месяца назад

    Thank you so much.

  • @tshawtshi3040
    @tshawtshi3040 Год назад

    You can delete if you make the buckets numbers and whe adding you increment the bucket and when deleting you decrement it

  • @StephenGillie
    @StephenGillie 2 года назад

    This would be a good first step to slim down a huge data set, before more complex processing on the "probably Yes" subset.

  • @DrRishabhGarg
    @DrRishabhGarg 2 года назад +2

    Great video! can I ask something how do you make this lovely animations?

    • @ByteByteGo
      @ByteByteGo  2 года назад +1

      It takes a team. We have some talented editors for illustration and animation, with the help of tools like Adobe After Effects, Adobe Illustrator, etc. Each video takes many hours to make.

  • @ChrisHalden007
    @ChrisHalden007 2 года назад

    Great video. Thanks

  • @axa993
    @axa993 2 года назад +2

    Awesome videos

  • @learnersparadise7492
    @learnersparadise7492 2 года назад +6

    How do you create such a nice animation?

    • @devenderyadav6209
      @devenderyadav6209 2 года назад +4

      From description : Animation tools: Illustrator and After Effects

  • @ReflectionOcean
    @ReflectionOcean Год назад

    Implemented with buckets and fast hash function

  • @javisartdesign
    @javisartdesign 2 года назад +1

    thanks. interesting stuff

  • @krishnakeshav23
    @krishnakeshav23 Год назад

    How is this more efficient in terms of memory access? - HashTable returns 1 location and thus O(3), but BF can return n locations, thus access time is O(n). I guess it gives huge advantage in terms memory required - to store n elements, BF will only need - log(n) locations.

  • @bamboosnow2605
    @bamboosnow2605 Год назад

    If the entries are not booleans but integer accumulators, then the deletion of keys can go away. This would be useful for deleting old entries.

  • @10e999
    @10e999 2 года назад +3

    Love that series.
    I'm interested in doing video form teaching and I wanted to know : how do you make your animation ?

    • @ByteByteGo
      @ByteByteGo  2 года назад +8

      It takes a team. We have some talented editors for illustration and animation, with the help of tools like Adobe After Effects, Adobe Illustrator, etc. Each video takes many hours to make.

  • @hadiesmaeli6124
    @hadiesmaeli6124 Год назад

    Thank alot, very helpfull

  • @vikramadityakukreja6725
    @vikramadityakukreja6725 2 года назад

    Excellent video

  • @jiakai7254
    @jiakai7254 Год назад +1

    3:51 explanation starts

  • @hosseineyvazi3122
    @hosseineyvazi3122 Год назад

    awesome !

  • @RichardCurrie
    @RichardCurrie 2 года назад +2

    Why are multiple hash functions required? A single one would be enough no? Having 3 seems to reduce the capacity by x3?

    • @romannasuti25
      @romannasuti25 2 года назад +1

      IIRC, this prevents patterns from producing false positives too frequently without using much more expensive hash functions. You get to use fast, naive hash functions and still get decent space savings. Plus, it’s not really 3x storage, as elements can reassign bits and often do as the bloom filter fills up.

  • @boringmanager9559
    @boringmanager9559 2 года назад

    I was out for 2 weeks and when I come back, bytebytego has like 10 new videos to watch 🙂

    • @ByteByteGo
      @ByteByteGo  2 года назад

      😂 You sure you were only gone for 2 weeks?

    • @boringmanager9559
      @boringmanager9559 2 года назад

      @@ByteByteGo to be honest, no, not so sure 😅

  • @darklajid
    @darklajid 2 года назад

    Tiny error in the animation. The ribeye lookup looks at 0 3 and 4 (not 1 3 and 4) according to the red highlights. But nicely done!

  • @yanwugu
    @yanwugu Год назад

    If it never forgets, will bloom filter full? eventually it will return might be true to any query.

  • @JiamingFan
    @JiamingFan 8 месяцев назад

    In 1:17, it should be trade off between less memory and more accurate.

  • @alirezakhorami
    @alirezakhorami 5 месяцев назад

    Amazing, AMAIZINGNGNGNGNGNGN

  • @swatibhartiya7457
    @swatibhartiya7457 3 месяца назад

    @byte byte go, can you do a video on token bucket algorithm ?

  • @bytetrip114
    @bytetrip114 Год назад

    Can I know how to make video such like this style? What tools does Alex use?

  • @amertia
    @amertia Год назад

    Thanks for this info. There is also something called cuckoo filter. Can you make a video on that as well with usecases.

  • @EhsanIrshad
    @EhsanIrshad 2 года назад

    Please make video on Raft amd Paxos Algorithms as well.. pleaee

    • @ByteByteGo
      @ByteByteGo  2 года назад +1

      Ooof... Raft is doable, but Paxos...
      Can we just do raft, please? LOL

    • @EhsanIrshad
      @EhsanIrshad 2 года назад

      @@ByteByteGo yes please you may opt with raft, i will give it my name of paxos LOL,

    • @Chessmasteroo
      @Chessmasteroo 2 года назад

      @@ByteByteGo and Ehsan, there an excellent paper on Paxos Made Simple 01 Nov 2001 by Leslie Lamport. It's only 14 pages. Great read.

  • @ryan_roga
    @ryan_roga 2 года назад +1

    Can I please know how you made this presentation? I'd very much like to do a final presentation this semester that looks this good. 🙏

    • @yzhang2008
      @yzhang2008 2 года назад +1

      I'm interested in how to make presentations as pretty as this one too.

    • @ByteByteGo
      @ByteByteGo  2 года назад +4

      It takes a team. We have some talented editors for illustration and animation, with the help of tools like Adobe After Effects, Adobe Illustrator, etc. Each video takes many hours to make.

  • @hmls3579
    @hmls3579 7 месяцев назад

    1:30 that was personal

  • @RahulSoniiitb
    @RahulSoniiitb 3 месяца назад

    I don't understand. in the give buckets, if he would have added 5-6 more foods of his liking then entire range would have become 1.. then it doesn't matter what you search it will always be positive. What am I missing ? is the bucket size always much larger than the dataset ?

  • @mahantalebi7192
    @mahantalebi7192 2 года назад

    Keep up the good work! your content is just amazing and one of the best that could be found on the internet. Though the narration has some room for improvement.

  • @edgarhernandezparra7021
    @edgarhernandezparra7021 23 дня назад

    One animation is wrong, when you check the indexes for "ribeye" you're highlighting in red 0, 3 and 4 but then you make the arrows point to 1, 3 and 4

  • @crikxouba
    @crikxouba Год назад +1

    How can it be firm no and probably yes? I mean, if it returns probably yes, and the element is not there, then that no cannot be so firm.

    • @JAlbertSG
      @JAlbertSG 5 месяцев назад

      It’s all in the hash function, since it can have collisions then it’s a probably yes, if no collision is found is a definitely no

  • @akhila5858
    @akhila5858 2 года назад

    Which software is used for these presentations

    • @ByteByteGo
      @ByteByteGo  2 года назад

      It takes a team. We have some talented editors for illustration and animation, with the help of tools like Adobe After Effects, Adobe Illustrator, etc. Each video takes many hours to make.

  • @reactdeveloper2368
    @reactdeveloper2368 Год назад

    If anyone has worked with mongoose implemeting bloom filters for full text search it will be nice! Thank you in advance

  • @ReflectionOcean
    @ReflectionOcean Год назад

    No false negative but probably false positive

  • @Software.Engineer
    @Software.Engineer 2 года назад +1

    🤩

  • @Overthought7
    @Overthought7 2 года назад

    Sorry, what? How is a bloom filter more space efficient than a simple set when the filter has to store 3x the bits AND give extra buffer zeroes to prevent false positives?

    • @ByteByteGo
      @ByteByteGo  2 года назад +1

      A simple set's memory requirement scales linearly with the number of elements.
      A bloom filter's space requirement is capped at the size of the bit-set for the bloom filter itself, which is usually a fraction what a simple set would need.
      Let's go through an example.
      Say there is a set with 1 million keys, with each key hashes to a 64-bit number.
      A simple set would need 8-million bytes to just hold all the keys, ignoring collision handling and other details.
      A bloom filter with about a 1 in 100 chance of collision would need a bit-set with 10 million bits and 7 hash functions. 10 million bits is 1.25-million bytes.
      This is the main purpose of a bloom filter. We trade space for sometimes slightly inaccurate answers.

    • @Overthought7
      @Overthought7 2 года назад

      @@ByteByteGo Thank you for taking the time! I finally figured it out on Stack-O when I read a comment saying basically, "But. A. Set. Stores. The. Keys." lol. That's just NOT how I think about a set. Why do sets store the keys anyway? Is it only for dynamic sizing? (not saying that's unimportant, rather that it's not inherent to a set - you COULD have a fixed-size set that doesn't do that)

  • @chizuru1999
    @chizuru1999 2 года назад

    I barely managed to solve box blur and now theres more... 😂💀

  • @jackjacobs2960
    @jackjacobs2960 11 месяцев назад

    Came for my Fujifilm left with a ?coding degree

  • @nateshrager512
    @nateshrager512 Год назад

    Trade off animation doesn't exactly make sense, but great video otherwise. The trade off is between memory and accuracy, not between a lack of both.

  • @SphereofTime
    @SphereofTime Год назад

    1:45

  • @shreyageek
    @shreyageek 2 месяца назад

    no one explains a single question with bloom filter

  • @daitedve1984
    @daitedve1984 2 года назад

    Is it so hard to put proper description what is that filter and why we need it??

  • @sleepysheep1015
    @sleepysheep1015 Год назад

    How does a hash function return multiple values? A hash function should only return one value for a function. It doesn't make sense. What hash functions are typically used for Bloom filters.