@10:10 in bucket 1, shouldn't it be a 4?
you are absolutely correct. Sorry for the error
Didn't understand Neetcode so came here. This is very well explained. Instantly subscribed.
thanks for the sub
Same! Neetcode's explanation on this one is a bit confusing.
Out of all the videos I watched over this problem, yours is the one I was able to truly understand. Thank you!
So happy you feel that.
I just found your channel. Both you & neetcode do amazing work. Thank you so much for these!
Please don't stop teaching. Crystal clear explanation, brother.
It's certainly clever. But when k is small and n is large, it's wasteful of both time and space (or at least of time and memory allocations). When k == n, it's at least wasteful of space.
Essentially, it suffers from the same class of problem as bucket sort does. It's great when the data is evenly distributed, but it can have some real drawbacks when the data is not. Unfortunately, for this problem, the data cannot be evenly distributed. Here's why:
Consider the case where n == k (or is very close); one of two "edge cases" is possible:
a) each element occurs once, so you have a bucket for every possible frequency between 0 and n, but the only frequency that gets used is 1. Because k cannot exceed n, if all elements are to be included then all must be of equal frequency, hence, only 1 of the buckets will get used.
b) there is only one element, and it occurs n times. Again, you will use only 1 bucket: the bucket for frequency n. And again, the extra buckets are pointless.
Knowing this, we can see that it will never be possible to actually use all of the buckets, because n elements simply cannot produce enough distinct frequencies to fill all the slots this approach accounts for.
I believe (although I don't feel like doing the math right this second) that the absolute best you could hope for would be that about sqrt(n) buckets get used.
So we know that even if k == 1 and n is extremely large, we won't be able to use all the frequency buckets; k has no impact on that. In fact, the larger n becomes, the more "wasted" buckets there will be, since the ratio of a value v to its square decreases as v grows. The progression from 2 is 1/2, 1/3, 1/4, 1/5, 1/6, etc. So if at most about sqrt(n) buckets are used, the fraction that go unused is roughly 1 - 1/sqrt(n), which approaches 1 as n reaches the maximum of a 64-bit machine. And that's a lot of buckets.
As I said, same issues as bucket sort. Great if the data actually fills all the buckets. Unfortunately, given the constraints of this problem, you'll never fill all the buckets and I suspect that's why it wasn't included in the editorial.
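To make the sqrt(n) intuition above concrete: the distinct frequencies are distinct positive integers that sum to n, so at most about sqrt(2n) of them can exist. Here's a quick sketch that counts used buckets (a hypothetical helper, not from the video or editorial):

```java
import java.util.*;

public class BucketUsage {
    // Count how many distinct frequency buckets an input actually uses.
    static int usedBuckets(int[] nums) {
        Map<Integer, Integer> freq = new HashMap<>();
        for (int num : nums) {
            freq.merge(num, 1, Integer::sum);
        }
        return new HashSet<>(freq.values()).size();
    }

    public static void main(String[] args) {
        // Worst case for bucket usage: frequencies 1, 2, 3, 4 summing to n = 10.
        // Even then, only 4 of the n + 1 = 11 allocated buckets are used.
        int[] worst = {1, 2, 2, 3, 3, 3, 4, 4, 4, 4};
        System.out.println(usedBuckets(worst)); // 4
    }
}
```

With 10 elements, 4 used buckets is already close to the sqrt(2n) ≈ 4.5 ceiling; the gap only widens as n grows.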
Just want to say: we're splitting hairs here (as quite frankly, the most readable solution is a count with a sort and then taking a k-sized slice, and that's only barely slower than a heap in the worst case and about the same in the average case). Quicksort and quickselect have always been complex. I've been doing this 25 years; I know no one who could implement either without a quick refresher and a little debugging. It was included in the editorial because it's useful to know and understand. But in real life, you'd use an existing implementation.
the problem is finding the top 2, so according to LeetCode, if we have two values with the same frequency we should return only the first one
I was also thinking that😂😂😂
Exactly. According to LeetCode, "It is guaranteed that the answer is unique" means that there is no ambiguity in identifying the k most frequent elements in the array.
That's why his solution was accepted, I think. His res array is of length k, and if two elements had the same frequency the result would have more than k elements and it would give an error.
What a clear explanation. Thank you Nikhil!!!
I don't think the solution is going to be [1,2,3], since the loop is going to stop iterating as soon as counter becomes more than or equal to k. Since k is equal to 2 and you are starting counter from 0, adding 1 at res[0] and then 2 at res[1], as soon as counter hits 2 it's going to stop, and the output will be [1,2].
correct
you're just so underrated, dude
You are such a hard worker! Appreciate your content.
So nice of you
Great explanation of the logic. I am purely on Python, not Java, but the way you explained this, I won't have difficulty implementing it in Python, since the logic is clear. Btw, you've explained the logic better than NeetCode.
Brother, what an explanation. Absolutely goated, bro.
🤘🏻
12:45 I think you are referring to the frequency values as keys, when they should ideally be values. If you look, the frequency 2 comes up twice, which should not be possible if frequencies were keys, since keys are meant to be unique.
outstanding lecture!
@nikoo28 Thanks for the video. Very nice explanation. Your videos are very useful. Just a correction: in this test case, {1,1,1,2,2,3,3}, if k = 2, the expected output is either [1,2] or [1,3], not [1,2,3]. So to return the correct output, in the above code we can check if (counter < k) and only then do res[counter++] = integer. The condition counter < k can then be removed from the for loop.
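A minimal sketch of that suggested guard (hypothetical helper and variable names following the comment above, not the video's exact code):

```java
import java.util.*;

public class GuardedCollect {
    // Collect at most k elements from the frequency buckets, guarding the
    // result index so a tie at the k-th frequency cannot overflow the array.
    static int[] collectTopK(List<Integer>[] bucket, int k) {
        int[] res = new int[k];
        int counter = 0;
        for (int pos = bucket.length - 1; pos >= 0; pos--) {
            if (bucket[pos] == null) {
                continue;
            }
            for (int value : bucket[pos]) {
                if (counter < k) {          // the suggested guard
                    res[counter++] = value;
                }
            }
        }
        return res;
    }

    public static void main(String[] args) {
        // {1,1,1,2,2,3,3}: frequencies 1 -> 3, 2 -> 2, 3 -> 2; k = 2.
        List<Integer>[] bucket = new List[8];
        bucket[3] = List.of(1);
        bucket[2] = List.of(2, 3);
        System.out.println(Arrays.toString(collectTopK(bucket, 2))); // [1, 2]
    }
}
```

The guard simply drops the tied element 3 instead of throwing, which matches the "answer is unique" guarantee on LeetCode.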
Thank you, brother. Great explanation.
Hi Nikhil, great explanation!! At the end we have a for loop inside another for loop, so why is it still O(n), not O(n*k)?
just having a nested loop does not necessarily mean O(n*k) complexity. Try to analyze, what is happening with the values in the loop
Wonderful explanation!
Superb explanation, thank you. I hope you have the LeetCode Blind 75 solutions.
in the final example, the result array is defined as int[] res = new int[k], where k = 2. So only 2 elements can be added. However, the answer is [1, 2, 3]. Won't this throw an index out of bounds for this example?
can you give me a sample test case?
@nikoo28 the same one in the video. [1,1,1,1,2,2,3,3,4] gave an IndexOutOfBoundsException. Try it.
thanks, I fixed the code in the github link now. Basically add all elements to a list, and then return it as an array.
This particular test case is kinda unique, the value of k=2 but we have 3 elements. Hence, needed to handle it separately. Sorry for the confusion.
Nice and simple thank you Sir
Can you please make a video on 658. Find K Closest Elements too ?
Sure..gradually though :)
Understood, thanks for the content!
If we get three numbers in the result, it throws an index out of bounds, as the size of the array has been limited to k.
Is your testcase within the problem constraints?
Nikhil, your code would not work for the test case you mentioned:
[1,1,1,1,2,2,3,3,4] & k = 2
This code gets accepted on LeetCode because it is mentioned there that answers are unique.
But in the above test case:
We should get [1,2,3] as the answer for k = 2.
You cannot assume the res array is of size k, since there might be duplicate frequencies.
Otherwise, the solution works fine for the LeetCode problem.
Here is the code, which will cover duplicates as well.
class Solution {
    public static int[] topKFrequent(int[] nums, int k) {
        int n = nums.length;
        List<Integer>[] bucket = new ArrayList[n + 1];
        HashMap<Integer, Integer> frequencyMap = new HashMap<>();
        ArrayList<Integer> resultList = new ArrayList<>();
        for (int num : nums) {
            frequencyMap.put(num, frequencyMap.getOrDefault(num, 0) + 1);
        }
        for (int i = 0; i <= n; i++) {
            bucket[i] = new ArrayList<>();
        }
        frequencyMap.forEach((element, frequency) -> {
            bucket[frequency].add(element);
        });
        for (int i = n; i >= 0; i--) {
            if (bucket[i] != null) {
                resultList.addAll(bucket[i]);
                if (resultList.size() >= k) {
                    break;
                }
            }
        }
        int[] result = new int[resultList.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = resultList.get(i);
        }
        return result;
    }
}
good job well explained :)
Brother, the way you solve the problem makes it as easy as ABCD. How do I develop that kind of thinking in DSA?
It is so wonderful once you start piecing things together :)
Great explanation u r amazing dude ❤😊keep it up
Awesome explanation, Nikhil. Thank you so much for the time and effort and for sharing your knowledge. I tested your code with this input: int[] arr = new int[]{1, 1, 1, 1, 2, 2, 3, 3, 4, 4}; the output should be [1] [2,3,4], but I found an error since you have int[] res = new int[k];, so we need to change this line to int[] res = new int[nums.length];
What is your value of k in your test case?
@@nikoo28 Hi Nikhil. K value is 2. Please correct me if my understanding is wrong.
@mamu11111 that is a very good catch, and I verified it myself. Thanks for pointing that out; I will correct it. :) And I think even LeetCode does not have that test case 👍
Your changes are wrong. The question asks for the top k frequent elements; that's why your test case is not valid for the question.
Great explanation, but I have never seen a List initialized like an array. Is there an alternative way to do that? I understand now how it works and why it is needed, but it's just not that intuitive to me. Probably I am dumb. Probably a Map would be more intuitive to me.
no approach is dumb, just a preference...as long as you work within the expected time limits...
you're the goat
Thank you for your extremely clear and concise video. Please rest assured that the YouTube algorithm will take notice of your quality, and your channel will gain very quick upward traction.
Thank you so much
Just curious, doesn't bucket sort have n^2 in the worst case and only n in the average case? While a heap would have n log k in the worst case? Shouldn't a heap be more efficient?
It depends on your input constraints…with a smaller range, you can expect better time complexity.
In the question it's mentioned that "It is guaranteed that the answer is unique."
so the example test cases you have taken are wrong, or your problem statement is different from LeetCode 347.
Thanks alot
Thanks a ton!
You're welcome!
How is this not O(n+k) because of the nested for loop?
Your line of code in the dry run, when populating the result array:
res[counter++] = integer;
^ Shouldn't the above line just be res[counter] = integer, without incrementing counter first? When you do counter++, res[1] will be populated.
I think populating the result array should be:
for (Integer integer : bucket[pos]) {
    res[counter] = integer;
    counter++;
}
Please let me know what you think.
counter++ is a post-increment, so it doesn't matter whether it's
res[counter++] = integer;
or
res[counter] = integer;
counter++;
Both are the same: on the first iteration counter will be 0 for the array access, and only after the write does it increment by 1.
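In other words, the two forms are equivalent because post-increment yields the old value before incrementing. A tiny demonstration (hypothetical helper names):

```java
import java.util.Arrays;

public class PostIncrement {
    // Fill using post-increment inside the index expression.
    static int[] fillCombined() {
        int[] a = new int[2];
        int i = 0;
        a[i++] = 10; // writes a[0], then i becomes 1
        a[i++] = 20; // writes a[1], then i becomes 2
        return a;
    }

    // Fill using a separate increment statement.
    static int[] fillSeparate() {
        int[] b = new int[2];
        int j = 0;
        b[j] = 10;
        j++;
        b[j] = 20;
        j++;
        return b;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.equals(fillCombined(), fillSeparate())); // true
    }
}
```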
Brother, for [1,1,1,1,2,2,3,3,4] and k = 2, the test case itself is wrong because the answer is not unique. It is clearly mentioned in the constraints: "It is guaranteed that the answer is unique." So 2 and 3 cannot have the same frequency, and if they did, the value of k would be 3.
I have a silly doubt here. You are saying that for your test case the answer is 1, 2, 3: that is three elements, yet the size of the res array is 2 because k is 2. That makes me confused. It might be a stupid question to ask!
There are 3 types of elements: 1, 2 and 3.
We need only the top k (2) frequent elements, so I only give the answer as 1 and 2.
You are returning 2 elements.
Hi, what if nums = [-1,-1]? At that point hashMap = {-1: 2}, but the bucket array starts from 0. How do we handle this test case? Thanks.
can you please elaborate?
You find the index using the frequency, not the key. In your case the frequency of -1 is 2, so -1 is inserted at index 2.
❤
Amazing
Thank you! Cheers!
Why create a bucket of length nums.length + 1? Why not just nums.length?
Because of 0 based indexing.
Because every number in the given array appears at least once, the possible frequencies run from 1 up to nums.length. If you created a bucket of length nums.length, then for a single-element array (e.g. array = [1], k = 1) the bucket would have only index 0, which corresponds to frequency 0, and elements with frequency 0 are something we never need to store. The extra slot makes index nums.length valid, so the highest possible frequency has a bucket.
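A small sketch of that point (hypothetical helper, mirroring the video's bucket layout): with array = [1], the single element has frequency 1 == nums.length, so the bucket array must have nums.length + 1 slots for index 1 to exist.

```java
import java.util.*;

public class BucketLength {
    // Buckets indexed by frequency; the highest possible frequency is nums.length.
    static List<Integer>[] buildBuckets(int[] nums) {
        Map<Integer, Integer> freq = new HashMap<>();
        for (int num : nums) {
            freq.merge(num, 1, Integer::sum);
        }
        List<Integer>[] bucket = new List[nums.length + 1]; // slots 0..n inclusive
        freq.forEach((element, frequency) -> {
            if (bucket[frequency] == null) {
                bucket[frequency] = new ArrayList<>();
            }
            bucket[frequency].add(element);
        });
        return bucket;
    }

    public static void main(String[] args) {
        List<Integer>[] bucket = buildBuckets(new int[]{1});
        System.out.println(bucket.length); // 2
        System.out.println(bucket[1]);     // [1]
    }
}
```

Slot 0 simply stays unused, which is the small price for 0-based indexing by frequency.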
14:03 You are creating an array of size k, so how can you add 3 elements if k is 2, as stated in the example at 6:09?
Where am I adding 3 elements?
this solution will certainly not work for the input nums= [-1,-1] & k =1
thanks for the test case. I had missed these cases while making the video. However, if you check the code on Github, I have updated it to handle such cases. :)
Hope it helps
Why is the solution O(n) when there is a nested loop at the end? I don't understand.
Just because there is a nested loop does not mean a time complexity of O(n ^ 2).
You need to think about how many iterations will happen. In the last loop, you can have a maximum of n iterations, when all elements of the array are different and the value of k = n.
Hence the time complexity will be O(n)
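The key point is that the inner loops together touch each distinct element at most once, so the total work across the nested loop is bounded by n. A counting sketch (hypothetical helper, not the video's code):

```java
import java.util.*;

public class IterationCount {
    // Count the total inner-loop iterations of the final collection pass.
    static int collectionIterations(int[] nums) {
        Map<Integer, Integer> freq = new HashMap<>();
        for (int num : nums) {
            freq.merge(num, 1, Integer::sum);
        }
        List<Integer>[] bucket = new List[nums.length + 1];
        freq.forEach((element, frequency) -> {
            if (bucket[frequency] == null) {
                bucket[frequency] = new ArrayList<>();
            }
            bucket[frequency].add(element);
        });
        int iterations = 0;
        for (int i = nums.length; i >= 0; i--) {
            if (bucket[i] == null) {
                continue;
            }
            iterations += bucket[i].size(); // each distinct element counted once
        }
        return iterations;
    }

    public static void main(String[] args) {
        // 3 distinct elements => 3 total inner iterations, never more than n = 6.
        System.out.println(collectionIterations(new int[]{1, 1, 1, 2, 2, 3})); // 3
    }
}
```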
First view, first like, first comment
how do you handle negative numbers?
That will be a different problem
6:37 The test case you have taken to demonstrate the problem is not correct, because according to the problem statement the answer should be unique.
yes, I realized it a while ago. Have fixed the code in github link to handle that particular case. Thanks for pointing that out :)
But hope you get the idea, how to solve the problem.
this can be solved by PriorityQueue also
yes
it is better if you use a mic
Am I missing something here? The same code is giving an ArrayIndexOutOfBoundsException for input {1,1,1,1,2,2,3,3,4}, 2 in my IDE, in the last for loop, but it is accepted on LeetCode.
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: Index 2 out of bounds for length 2
at TopKFrequentElements.topKFrequent(TopKFrequentElements.java:30)
at TopKFrequentElements.main(TopKFrequentElements.java:39)
try having a look again, maybe you are missing something
@sakishakkari You are correct. For that test case, this code does throw an exception, since int[] res = new int[k].