Yes, it will approximate the distinct element count to the nearest power of 2. It's an approximation, so for 7 distinct elements, FM algorithm would return 4 (i.e. an approximate power of 2)
What if we have two elements 4,3 compute it's hash h(4) = (6x+1) mod 5= 25 mod 5 = 0 h(2) = (6x+1) mod 5= 13 mod 5 = 3 binary h(4)=0=000 h(3)=3=011 Trailing zero: h(4)=000=0 h(3)=011=0 Compute distinct element r=0 2^r = 2^0 = 1 It’s wrong answer ;-;
I'm sorry to mention this, but the algorithm you describe is not Flajolet-Martin algorithm (a.k.a. Probabilistic Counting). The algorithm that you explain is the basis of the LogLog and HyperLogLog algorithms, also by Flajolet and coauthors (but not Martin). For example, LogLog was invented by Marianne Durand and Philippe Flajolet around 2003, some 18 years after FM. Moreover, you skip all the part of "stochastic averaging", which is a must to avoid large deviations in the estimations. This idea was already used in FM algorithm and it was also used in their successors. In stochastic averaging, you maintain m variables max_R[i], i=0..m-1; the first log_2 m bits of each h(x) are used to choose one of m counters, and the remaining bits of h(x) to update (or not) the corresponding max_R[i]. Once you have your m max_R variables you combine them to obtain a value avg_max_R to be used in the estimation of the cardinality. Depending on the way you obtain avg_max_R you have LogLog or HyperLogLog. A correction factor is also needed to avoid bias in the estimation. The original Flajolet-Martin algorithm is similar to LogLog and HyperLogLog, but it finds the largest r such that hashes with i trailing zeros have been observed for all i, 0
Seriously! Bohot help hui iss video se.
Kal exam hai or ek to ye big data dimag ke upar se ja raha hai. Thanks for your detailed explanation !
bahut acha laga video aur bahut achhi hi apki awwaz
Excellent Notes and Example!
Another masterpiece content.thanks a ton ma'am, u made algo's damn easy.
thanks!!! Most helpful during my exam time
Bahut Acha Explain kiya hai ❣❣🔥🔥
Mam aap bhut accha samjhate ho apka easily samagh aa jata hai ❤
Pretty good explanation..keep it up 👍🏻👍🏻
Katai jahar 😎
Really nice video.
Kafi acche se explain keya
Easy to understand 💯
good job
the simplest explanation ever
thanks a much...
So FM algorithm can only approximate the unique values to the powers of 2. If we have say 7 unique elements this would give an incorrect result?
Yes, it will approximate the distinct element count to the nearest power of 2.
It's an approximation, so for 7 distinct elements, FM algorithm would return 4 (i.e. an approximate power of 2)
Only if you don't use stochastic averaging, see my recent comment above
Mam you got h(3) = 4 but while calculating the hash function but in last second h(3) you have written a h(3) = 3
Yesss
@@abdullahmukri4555 inka bhi agaya bhai!!!
Thank you for explaining
good
Thank you.#helpful.
Thankyou so muchhhh😭❤️this was much needed
How did we get hash function like h(x)=6x+1 mode5...?
that will be given in the question
If value 16mod16 gives value 0000 then value of r should be r= 4 or 0 please give the correct answer....Is it 0 or 4
but h4 me 3 trailing zeros hai na ? to 2 kyu likha hai
Thanks✨
Good teaching
What if we have 6 distinct elements last step would never give any other number except power of 2 so?
Nice
Thankyou mam you, supub explanation.
Thanks a lot.......
Thank you so much
best
how to identify the distinct element are 1234 only at the end
maam aapne 11th wala bit ka mod galat likha hai h(3)=3 likha h but h(3)=4 aayega baki explaintion is simple and easy
already shared ma'am
❤
thank you
Will this type of question come in aktu
thanks
Thank u very much mam
What if we have two elements 4,3
compute it's hash
h(4) = (6x+1) mod 5= 25 mod 5 = 0
h(2) = (6x+1) mod 5= 13 mod 5 = 3
binary
h(4)=0=000
h(3)=3=011
Trailing zero:
h(4)=000=0
h(3)=011=0
Compute distinct element
r=0
2^r = 2^0 = 1
It’s wrong answer ;-;
Hash function isn't fixed, for this specific one hash is 6x+1 mod 5
I'm sorry to mention this, but the algorithm you describe is not Flajolet-Martin algorithm (a.k.a. Probabilistic Counting). The algorithm that you explain is the basis of the LogLog and HyperLogLog algorithms, also by Flajolet and coauthors (but not Martin). For example, LogLog was invented by Marianne Durand and Philippe Flajolet around 2003, some 18 years after FM. Moreover, you skip all the part of "stochastic averaging", which is a must to avoid large deviations in the estimations. This idea was already used in FM algorithm and it was also used in their successors. In stochastic averaging, you maintain m variables max_R[i], i=0..m-1; the first log_2 m bits of each h(x) are used to choose one of m counters, and the remaining bits of h(x) to update (or not) the corresponding max_R[i]. Once you have your m max_R variables you combine them to obtain a value avg_max_R to be used in the estimation of the cardinality. Depending on the way you obtain avg_max_R you have LogLog or HyperLogLog. A correction factor is also needed to avoid bias in the estimation. The original Flajolet-Martin algorithm is similar to LogLog and HyperLogLog, but it finds the largest r such that hashes with i trailing zeros have been observed for all i, 0
Don't care+we passing out next exam
This video has all the errors as the video where it got its content from, namely: "Flajolet Martin [FM] Algorithm" by Anuradha Bhatia.
So its unreliable video
😂🎉
Thank you
Thanks