A simple introduction to basics of hashing & need for Consistent Hashing|System Design Primer Course

Поделиться
HTML-код
  • Опубликовано: 25 окт 2024

Комментарии • 81

  • @sudocode
    @sudocode  3 года назад +79

    Guys and gals, let me know if this new teaching style without white board works or not? Is it better and easier to understand? Though the graphics need a bit of work.

    • @samiraghouri4599
      @samiraghouri4599 3 года назад +5

      The new teaching style with graphics is really good. Would love to have more such videos.

    • @rishavsrestha6683
      @rishavsrestha6683 3 года назад +2

      This one looks better imo

    • @vickyvivek3286
      @vickyvivek3286 3 года назад +2

      It's really good. Keep up the great work!

    • @MadrasCofee
      @MadrasCofee 3 года назад +1

      Yess! This is better!

    • @sanjana8358
      @sanjana8358 3 года назад

      This is some amazing content! make more such videos and also some mock sys design interviews using common topics.

  • @sudocode
    @sudocode  3 года назад +29

    Correction at 2:48: Binary search complexity is O(logn). Thanks to Kapil and Jurgen Van for pointing this out.

    • @vickyvivek3286
      @vickyvivek3286 3 года назад +2

      No. She is right. Unlike integers, string has characters within to compare. Here, a different variable could have been used like O(mlogn) where m is the number of characters in the string. But for discussion purpose we can state it O(nlogn). Hope this is clear.

    • @hritijrana1409
      @hritijrana1409 Год назад

      @@vickyvivek3286 yup

  • @_HariHaranR
    @_HariHaranR 11 месяцев назад

    I'm a ECE student and more interested in the software field and I chose a path in the software field with good logical knowledge and less theoretical knowledge and I've been searching for all these days how hash works and this helped me a lot. Thanks❤

  • @tanveerbaba1155
    @tanveerbaba1155 2 года назад +10

    For DIY(x, y) function, we can have loop inside of it and counts the remainders of each keys with x and y. If remainders of key(i) % x and key(i) % y is not equal, means move is required and we can increment the count. This way we can achieve number of moves. Function name can be anything.

  • @kapil8660
    @kapil8660 3 года назад +14

    Good stuff!!
    Just wanted to highlight a correction at 2:48
    Time complexity for binary search is O(logN)

    • @vickyvivek3286
      @vickyvivek3286 3 года назад +4

      No. She is right. Unlike integers, string has characters within to compare. Here, a different variable could have been used like O(mlogn) where m is the number of characters in the string. But for discussion purpose we can state it O(nlogn). Hope this is clear.

    • @akki1703
      @akki1703 3 года назад

      Data has to be sorted first, so overall TC is nLogn

  • @Sathwik0
    @Sathwik0 Год назад +2

    I believe the reason its n/k for hash loopkup is because, we are roughly dividing the 'n' data values we have into 'k' groups/hashcodes.
    So instead of looking at the whole n values while searching, we're only looking at 'n/k' values present in one of the hashcode groups.

  • @vineetrohatgi564
    @vineetrohatgi564 2 года назад +2

    I liked how you related Hashbrown with hashing..!! Never thought about it !! Thanks for the nice explanation.

    • @sudocode
      @sudocode  2 года назад +2

      You are welcome Vineet :)

  • @amanlatkar2961
    @amanlatkar2961 3 года назад +5

    Appreciation for the efforts and explaintion of complexity stuff 👏👏

    • @sudocode
      @sudocode  3 года назад

      Thanks for liking

  • @sumonmal009
    @sumonmal009 3 года назад +10

    collision 5:50
    hash distribution 7:10
    server number change (rehashing problem with hashing) 9:20

  • @shashikantpawar7069
    @shashikantpawar7069 3 года назад +2

    The explanation is easy to understand. Thank you for your efforts

    • @sudocode
      @sudocode  3 года назад +1

      Glad it was helpful!

  • @wizardgaming163
    @wizardgaming163 3 года назад +8

    Your channel is very underrated

    • @sudocode
      @sudocode  3 года назад +3

      Glad you think so! Help spread the word please :)

    • @colinmaharaj
      @colinmaharaj 2 года назад

      Shhh

    • @sallbro786
      @sallbro786 2 года назад

      @@sudocode sure

  • @anonymoushuman3575
    @anonymoushuman3575 3 года назад +6

    Two questions:
    * What ends up being the solution to N = DIY(x, y) ?
    * At 4:51, when we are using Modulus 8, why are there 9 boxes shown numbered 0-8 instead of 8 boxes numbered 0-7?

  • @shyamsundervr5749
    @shyamsundervr5749 2 года назад +1

    One of the simplest explanations for hashing

  • @Engineer_With_A_Life
    @Engineer_With_A_Life 2 года назад +1

    This is awesome! Beautifully explained.. explanation was so easy that I can code it out this whole logic in single shot! Thanks for this video... Subscribed!

  • @seemasehrawat6229
    @seemasehrawat6229 2 года назад

    Appka positive point the jo aap white board p samne khade ho kr teacher krte the, usse feel tha like me college me professor se lecture le rhe, ese accha feel nhi ho ra

  • @sushmitagoswami7320
    @sushmitagoswami7320 3 года назад +1

    I would like to discuss one idea with since you are already doing a great job and it is really helpful for any person in Software industry. Thank you so much for all your efforts! Will you kindly consider all the details of a topic like instead of keeping some points of a topic for a next video, we can consider that first and the gradually move to some other topics which might be a most asked question. It will make the understanding clear and will help to connect all the points. I think this will make the topic more organized. For example, instead of keeping the topic like hash collision for some other video, if you can include that part also before moving to consistent hashing, I think it will be more organized. Because, there are may videos which talks about consistent hashing, but there are not many video which caters all the aspect. But I really appreciate for your great job.

    • @nkunam
      @nkunam 2 года назад +1

      You are asking for spoon feeding

    • @gurupreetsingh8347
      @gurupreetsingh8347 Год назад

      @@nkunam no this is not spoon feeding bro, @sushmitagoswami7320 is right here , need to have much details else no use of half information and since after long time we didn't yet get any remaining videos

  • @vickyvivek3286
    @vickyvivek3286 3 года назад +4

    I comment very less on youtube.
    Your efforts are brilliant and of great quality. Kindly continue and add videos explaining hands on stuff with Mongo and MySQL.
    I see some comments clarifying the time complexity of Binary search. I think you are right as string has characters say m within to compare after finding the string in the mid position. It could have been stated as O(mlogn) but here for discussion purposes this is good.

    • @sudocode
      @sudocode  3 года назад

      Thank you so much Vivek, this really means a lot and keeps me motivated.

  • @manishkrai
    @manishkrai 2 года назад +1

    Good explanation

  • @roshanshukla5667
    @roshanshukla5667 Месяц назад

    One thing that I am not understanding is, Reassigning Ruchi is a problem for sure. But why is the existing data being tinkered with? I understand that the hash function will change, but that should affect the new data that is coming in, right? Whatever was stored previously was also distributed evenly, and hence I don't see the point of changing that. Please let me know what am I missing.

  • @vasu-arora
    @vasu-arora 3 года назад +3

    DIY(x, y, N) = N - min(x,y) - (int)(N/(x*y))
    Where N is the no of keys, x is the initial number of servers and y is the final number of servers.

    • @sudarshanrbhat7686
      @sudarshanrbhat7686 3 года назад

      could you please elaborate on how u reached this conclusion?

    • @yashmathur1389
      @yashmathur1389 3 года назад

      Lets take the example when initially 4 servers are there and finally 5 servers are there. Initially the keys were A1 with hash value 18, A2 with 35, A3 with 73, A4 with 46 and A5 with 91. So initially In S1 server A3 would be there, in S2 server A1 And A4 would be there and in S3 server A2 and A5 would be there. Finally after 1 more server is added the positions would be as follows : in S0 server A2 would be there, in S1 server A4 and A5 would be there, in S3 server A1 and A3 would be there. Now if you see the positions of A1 is changed, A2 is changed, A3 is changed, A4 Is changed, A5 is also changed. So number of keys changed equals 5 but your formula gives 1 as the answer. I don't know How you have arrived at this formula but It doesn't seem right to me.

    • @mohithrgowda8682
      @mohithrgowda8682 2 года назад

      there is a simple approach loop through all the keys,compare(key%x,key%y) if value is different increment count

  • @sautap4u
    @sautap4u 3 года назад

    Good stuff. Just a small correction time complexity of Binary search is O(log N)

  • @saurabhkukreja1982
    @saurabhkukreja1982 Год назад

    which device are you using to make this video ? I find it very helpful! Please sharre

  • @subee128
    @subee128 4 месяца назад

    Thank you very much

  • @bharath_v
    @bharath_v 2 года назад +1

    Good One!

  • @pratyushkhare2830
    @pratyushkhare2830 10 месяцев назад

    Assuming DIY(x,y) gives the percentage of keys moved, it can be given by
    DIY(x,y) = 100 * (LCM(x,y) - min(x,y)) / LCM(x,y)
    where,
    LCM is Least Common Multiple
    min is the minimum fn

  • @SoloGamer-23
    @SoloGamer-23 3 года назад +1

    Just a curious question , what programming language you use while working

  • @jsj14
    @jsj14 3 года назад +1

    could you please share the amazon link to your mic and what software you and arpit use for video editing?

  • @pranavbhat29
    @pranavbhat29 3 года назад +5

    For the question on number of keys in each bucket in a sample space of n entries and k possible hash values, assuming a uniform distribution, the number of keys per bucket it n/k.
    Going by the same logic as above, Let us assume that there are a total of m keys which were placed in a s1 servers. Now they get placed in s2 servers lets say ( with s1 > s2, or s1 < s2 ), Let n be the number of keys moved.
    then we know initially each server had roughly m/s1 keys, and later each server has roughly m/s2 keys, So for each server the number of keys moved per server is,
    m/s1 - m/s2
    Now since the movement happened from s1 servers, the total number of keys is s1(m/s1 - m/s2), which is n, hence
    n = ms1(1/s1 - 1/s2)
    Since we need to find the percentage of total keys moved which is 100 n/m we do the multiplication and division,
    100n/m = s1(1/s1 - 1/s2)
    100 n/m = 1 - s1/s2;
    Hence roughly 1 - s1/s2 percentage of keys are moved on a change in the cardinality of servers from s1 to s2, assuming that s1 < s2. If the reverse case is true, since this operation is symmetric ( in the sense number of keys moved while changing from s1 to s2 servers is same as that while moving from s2 to s1 servers ), just flip the numerator and denominator.
    This as mentioned is the rough estimate, is this correct?

  • @manojr40
    @manojr40 Год назад

    Omg she is my tech crush now!

  • @psmohammedali
    @psmohammedali 2 года назад

    5:36 Time complexity for hash table is O(1) for insertion and lookup.

  • @chetanmotamarri6942
    @chetanmotamarri6942 3 года назад

    Really appreciate your efforts. Very informative but please do double check and validate before posting. Cost of Binary search is O(log n)

  • @amitbhaira2492
    @amitbhaira2492 2 года назад

    Very nice video

  • @myLifeSk98
    @myLifeSk98 2 года назад

    #50k soon ❤️ thanks

  • @kollivenkatamadhukar5059
    @kollivenkatamadhukar5059 3 года назад

    What is the answer for the DIY function, BTW Great Video.

  • @abhisheksahu3277
    @abhisheksahu3277 3 года назад

    Really Great one

    • @sudocode
      @sudocode  3 года назад

      Thank you! Cheers!

  • @kvaddadi
    @kvaddadi 2 года назад

    I think the answer to the DIY is based on whether the final number of servers is even or odd number. i.e.
    If its an odd value then the Number of keys that get rearranged = max(x,y)
    If its an even then the Number of keys that get rearranged = min(x,y)
    Where x and y are initial and final number of servers.

  • @anushasingh5562
    @anushasingh5562 2 месяца назад

    Mam are notes available??

  • @justvenkyy...3423
    @justvenkyy...3423 3 года назад

    can you post on data structures? and what main skills are needed for FAANG interviews?

    • @sudocode
      @sudocode  3 года назад

      something coming up soon.

  • @-MayurDewangan
    @-MayurDewangan 2 года назад

    I can't think of the function even after trying hard, can you pls share the answer

  • @VY-zt3ph
    @VY-zt3ph Год назад

    Hi let's say we had x servers which are now reduced to y. x -> y. Let's say total number of keys are N. Then only first y keys will be remain unchanged, and rest all of them have to be displaced. N-y have to be replaced. The total percentage will be N-y/N. Please correct me if I'm wrong as I am preparin for an upcoming interview.

  • @amansaxena4446
    @amansaxena4446 Год назад

    sorting is nlogn but binary search is logn

  • @teemrahb7612
    @teemrahb7612 3 года назад

    I love hash browns ! :-)

  • @davidespinosa1910
    @davidespinosa1910 2 года назад +1

    Consistent Hashing is a made-up story told by Akamai to get a patent. The ring that everyone draws is really an ordered map, invented around 1960.

    • @sudocode
      @sudocode  2 года назад +1

      Is that so? Care to share the paper from 60s ?

    • @davidespinosa1910
      @davidespinosa1910 2 года назад

      @@sudocode Wikipedia, Binary Search Tree (P. F. Windley, Trees, Forests and Rearranging, 1960). We want to map ranges of keys to servers (and back). If we implement consistent hashing, we'll find a BST sitting behind it (or some other ordered map, like a B-tree).

  • @jvsg6
    @jvsg6 2 года назад

    The video is great, but complexity of binary search is log(n), not n*log(n);)

  • @yaswanthmareddy3930
    @yaswanthmareddy3930 3 года назад +2

    Hi Yogita,
    Really appreciate your efforts and I am awaiting for the videos, please try to make videos consistently atleast one video a week.🙂

  • @AbhishekYadav-fb3uh
    @AbhishekYadav-fb3uh 3 года назад

    Yarrrrr 2 video private kun hai

  • @TheSridharraj
    @TheSridharraj 3 года назад +2

    Answer is n/k

  • @Hassainsyed
    @Hassainsyed 3 месяца назад

    Hashing comes from the dish name Hash brown , wow😂😂😂😂

  • @engineer-x7419
    @engineer-x7419 Год назад

    int DIY (int x,int y){
    // function returns % of keys which needs to be maved
    return (|x-y|/x) *100;
    }

  • @ankurbhanot496
    @ankurbhanot496 3 года назад

    dont say home work again !!! i am gona unsubscribe hahaha !