Guys and gals, let me know if this new teaching style without the whiteboard works or not. Is it better and easier to understand? Though the graphics need a bit of work.
The new teaching style with graphics is really good. Would love to have more such videos.
This one looks better imo
It's really good. Keep up the great work!
Yess! This is better!
This is some amazing content! make more such videos and also some mock sys design interviews using common topics.
Correction at 2:48: Binary search complexity is O(logn). Thanks to Kapil and Jurgen Van for pointing this out.
No, she is right. Unlike integers, strings have characters within them to compare. A different variable could have been used here, like O(m log n) where m is the number of characters in the string, but for discussion purposes we can state it as O(n log n). Hope this is clear.
@@vickyvivek3286 yup
I'm an ECE student who is more interested in the software field, and I chose a path in software with good logical knowledge and less theoretical knowledge. I've been searching all these days for how hashing works, and this helped me a lot. Thanks❤
For the DIY(x, y) function, we can have a loop inside it that compares the remainders of each key with x and y. If key(i) % x and key(i) % y are not equal, a move is required and we increment a count. This way we get the number of moves. The function name can be anything.
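A minimal sketch of that loop in Python (the function name and key values are placeholders):

```python
def count_moves(keys, x, y):
    """Count keys whose assigned server changes when going from x to y servers."""
    return sum(1 for key in keys if key % x != key % y)

# hypothetical hash values, moving from x = 4 to y = 5 servers
keys = [18, 35, 73, 46, 91]
print(count_moves(keys, 4, 5))  # -> 5: every one of these keys moves
```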
Good stuff!!
Just wanted to highlight a correction at 2:48
Time complexity for binary search is O(logN)
No, she is right. Unlike integers, strings have characters within them to compare. A different variable could have been used here, like O(m log n) where m is the number of characters in the string, but for discussion purposes we can state it as O(n log n). Hope this is clear.
The data has to be sorted first, so the overall time complexity is O(n log n).
I believe the reason it's n/k for hash lookup is that we are roughly dividing the 'n' data values we have into 'k' groups/hashcodes.
So instead of looking at all n values while searching, we're only looking at the 'n/k' values present in one of the hashcode groups.
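A quick sketch of that intuition (hypothetical values; using `value % k` as the hash):

```python
import random
from collections import Counter

random.seed(1)
n, k = 100_000, 10  # n data values, k possible hashcodes
values = [random.randrange(10**9) for _ in range(n)]

# group values by hashcode; with a roughly uniform hash each
# group ends up with about n/k values, so a search only scans
# the ~n/k values in one group instead of all n
buckets = Counter(v % k for v in values)
for code in range(k):
    print(code, buckets[code])
```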
I liked how you related Hashbrown with hashing..!! Never thought about it !! Thanks for the nice explanation.
You are welcome Vineet :)
Appreciation for the efforts and the explanation of the complexity stuff 👏👏
Thanks for liking
collision 5:50
hash distribution 7:10
server number change (rehashing problem with hashing) 9:20
The explanation is easy to understand. Thank you for your efforts
Glad it was helpful!
Your channel is very underrated
Glad you think so! Help spread the word please :)
Shhh
@@sudocode sure
Two questions:
* What ends up being the solution to N = DIY(x, y) ?
* At 4:51, when we are using Modulus 8, why are there 9 boxes shown numbered 0-8 instead of 8 boxes numbered 0-7?
One of the simplest explanations for hashing
This is awesome! Beautifully explained... the explanation was so easy that I could code out this whole logic in a single shot! Thanks for this video... Subscribed!
Your strong point was when you used to stand in front of the whiteboard and teach; it felt like taking a lecture from a professor in college. It doesn't feel as good this way.
I would like to discuss one idea with you, since you are already doing a great job and it is really helpful for anyone in the software industry. Thank you so much for all your efforts! Would you kindly consider covering all the details of a topic, instead of keeping some points for a later video, and only then gradually move on to other topics that might be frequently asked? It would make the understanding clearer, help connect all the points, and make the topic more organized. For example, instead of saving a topic like hash collisions for another video, including that part before moving on to consistent hashing would be more cohesive. There are many videos that talk about consistent hashing, but not many that cover all its aspects. I really appreciate your great work.
You are asking for spoon feeding
@@nkunam No, this is not spoon feeding, bro. @sushmitagoswami7320 is right here; we need more detail, otherwise half the information is of no use, and even after a long time we still haven't got the remaining videos.
I rarely comment on YouTube.
Your efforts are brilliant and of great quality. Kindly continue, and add videos explaining hands-on stuff with Mongo and MySQL.
I see some comments clarifying the time complexity of binary search. I think you are right: a string has, say, m characters to compare after finding the string at the mid position. It could have been stated as O(m log n), but for discussion purposes this is good.
Thank you so much Vivek, this really means a lot and keeps me motivated.
Good explanation
One thing I am not understanding: reassigning Ruchi is a problem for sure, but why is the existing data being tinkered with? I understand that the hash function will change, but that should affect only the new data coming in, right? Whatever was stored previously was also distributed evenly, so I don't see the point of changing it. Please let me know what I am missing.
DIY(x, y, N) = N - min(x,y) - (int)(N/(x*y))
where N is the number of keys, x is the initial number of servers, and y is the final number of servers.
could you please elaborate on how u reached this conclusion?
Let's take the example where there are initially 4 servers and finally 5. The keys were A1 with hash value 18, A2 with 35, A3 with 73, A4 with 46, and A5 with 91. So initially, server S1 would hold A3, server S2 would hold A1 and A4, and server S3 would hold A2 and A5.

After one more server is added, the positions would be: server S0 holds A2, server S1 holds A4 and A5, and server S3 holds A1 and A3.

Now if you look at the positions, A1 changed, A2 changed, A3 changed, A4 changed, and A5 changed as well. So the number of keys changed equals 5, but your formula gives 1 as the answer. I don't know how you arrived at this formula, but it doesn't seem right to me.
There is a simpler approach: loop through all the keys and compare (key % x, key % y); if the values differ, increment a count.
Good stuff. Just a small correction time complexity of Binary search is O(log N)
Which device are you using to make this video? I find it very helpful! Please share.
Thank you very much
Good One!
Assuming DIY(x,y) gives the percentage of keys moved, it can be given by
DIY(x,y) = 100 * (LCM(x,y) - min(x,y)) / LCM(x,y)
where,
LCM is Least Common Multiple
min is the minimum function
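Assuming the keys are uniformly distributed, this formula can be sanity-checked against a brute-force count (a sketch; `math.lcm` needs Python 3.9+):

```python
from math import lcm

def pct_moved_formula(x, y):
    # keys stay put exactly when key % x == key % y, which holds for
    # min(x, y) of the lcm(x, y) possible residues
    return 100 * (lcm(x, y) - min(x, y)) / lcm(x, y)

def pct_moved_brute(x, y, n=100_000):
    moved = sum(1 for key in range(n) if key % x != key % y)
    return 100 * moved / n

print(pct_moved_formula(4, 5))  # -> 80.0
print(pct_moved_brute(4, 5))    # -> 80.0
```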
Just a curious question , what programming language you use while working
could you please share the amazon link to your mic and what software you and arpit use for video editing?
For the question on the number of keys in each bucket: with a sample space of n entries and k possible hash values, assuming a uniform distribution, the number of keys per bucket is n/k.
Going by the same logic, assume there are a total of m keys placed across s1 servers, which then get placed across s2 servers (with s1 < s2, say). Let n be the number of keys moved.
Initially each server held roughly m/s1 keys, and afterwards each server holds roughly m/s2 keys, so the number of keys moved per server is
m/s1 - m/s2
Since the movement happened from s1 servers, the total number of keys moved is s1(m/s1 - m/s2), which is n, hence
n = m * s1(1/s1 - 1/s2)
Since we need the percentage of total keys moved, which is 100n/m, we divide through by m:
n/m = s1(1/s1 - 1/s2)
n/m = 1 - s1/s2
Hence roughly 100(1 - s1/s2) percent of the keys are moved when the number of servers changes from s1 to s2, assuming s1 < s2. If the reverse is true, since this operation is symmetric (the number of keys moved while changing from s1 to s2 servers is the same as while moving from s2 to s1), just flip the numerator and denominator.
As mentioned, this is a rough estimate; is this correct?
Is this correct?
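One way to sanity-check an estimate like this is to simulate simple modular placement directly (a sketch with hypothetical uniform keys; note the derivation counts only the net load imbalance, whereas `key % s` rehashing can reshuffle far more keys):

```python
def pct_estimate(s1, s2):
    # the rough estimate derived above, for s1 < s2
    return round(100 * (1 - s1 / s2), 2)

def pct_simulated(s1, s2, m=100_000):
    # fraction of keys whose assignment key % s actually changes
    moved = sum(1 for key in range(m) if key % s1 != key % s2)
    return 100 * moved / m

print(pct_estimate(4, 5))   # -> 20.0
print(pct_simulated(4, 5))  # -> 80.0
```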
Omg she is my tech crush now!
5:36 Time complexity for a hash table is O(1) for insertion and lookup.
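For instance, Python's dict behaves this way (average case; the key here is just an example):

```python
table = {}
table["Ruchi"] = 7       # insertion: O(1) on average
print(table["Ruchi"])    # lookup: O(1) on average -> 7
print("Ruchi" in table)  # membership check: O(1) on average -> True
```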
Really appreciate your efforts. Very informative, but please do double-check and validate before posting. The cost of binary search is O(log n).
Very nice video
#50k soon ❤️ thanks
What is the answer for the DIY function, BTW Great Video.
Really Great one
Thank you! Cheers!
I think the answer to the DIY depends on whether the final number of servers is even or odd, i.e.:
If it's odd, then the number of keys that get rearranged = max(x, y).
If it's even, then the number of keys that get rearranged = min(x, y).
Where x and y are the initial and final number of servers.
Ma'am, are the notes available?
Can you post on data structures? And what main skills are needed for FAANG interviews?
something coming up soon.
I can't think of the function even after trying hard; can you please share the answer?
Hi, let's say we had x servers which are now reduced to y (x -> y), and the total number of keys is N. Then only the first y keys will remain unchanged, and all the rest have to be displaced: N - y keys have to be moved. The total percentage will be (N - y)/N. Please correct me if I'm wrong, as I am preparing for an upcoming interview.
Sorting is O(n log n), but binary search is O(log n).
I love hash browns ! :-)
me too!
Consistent Hashing is a made-up story told by Akamai to get a patent. The ring that everyone draws is really an ordered map, invented around 1960.
Is that so? Care to share the paper from 60s ?
@@sudocode Wikipedia, Binary Search Tree (P. F. Windley, Trees, Forests and Rearranging, 1960). We want to map ranges of keys to servers (and back). If we implement consistent hashing, we'll find a BST sitting behind it (or some other ordered map, like a B-tree).
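In that spirit, a consistent-hashing lookup is just a successor query on an ordered structure. A minimal sketch using a sorted list and `bisect` (the server names and the md5-based placement are illustrative assumptions, not from the video):

```python
import bisect
from hashlib import md5

RING = 2**32

def ring_pos(name: str) -> int:
    # place a name on a ring of RING positions (md5 purely for illustration)
    return int(md5(name.encode()).hexdigest(), 16) % RING

servers = ["S0", "S1", "S2", "S3"]
ring = sorted((ring_pos(s), s) for s in servers)

def lookup(key: str) -> str:
    # first server clockwise from the key's position, wrapping around
    i = bisect.bisect_left(ring, (ring_pos(key),))
    return ring[i % len(ring)][1]

print(lookup("Ruchi"))  # deterministic: same key, same server, until the ring changes
```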
The video is great, but the complexity of binary search is log(n), not n*log(n) ;)
Hi Yogita,
Really appreciate your efforts and I am waiting for the videos; please try to make videos consistently, at least one a week.🙂
Yaar, why are 2 of the videos private?
Answer is n/k
no, n/k is when consistent hashing is implemented
Hashing comes from the dish name Hash brown , wow😂😂😂😂
int DIY(int x, int y) {
    // returns the % of keys which need to be moved
    return (abs(x - y) * 100) / x;  // multiply before dividing so integer division doesn't truncate to 0
}
Don't say homework again!!! I am gonna unsubscribe hahaha!