Top K Frequent Elements - Bucket Sort - Leetcode 347 - Python

Поделиться
HTML-код
  • Опубликовано: 11 сен 2024

Комментарии • 481

  • @Xeoncross
    @Xeoncross 2 года назад +456

    I appreciate the time you put making and sharing all your content for free. Here is the $10 I might have spent on your udemy course.

    • @NeetCode
      @NeetCode  2 года назад +62

      Thank you so much!!!

    • @onlysubscriptions2152
      @onlysubscriptions2152 Год назад

      Does he have udemy course???

    • @PeterPan-xe7qw
      @PeterPan-xe7qw Год назад +10

      @@onlysubscriptions2152 nah, just a hypothetical $10 he would’ve spent since most people pay wall this content, but he does it for free.

    • @hamdi_
      @hamdi_ Год назад +78

      As a side note, consider donating directly to the creators if they have a donation link, because RUclips takes a whopping 30% of your donation.
      In this case, Neetcode accepts Patreon donations, which takes a more reasonable commission of about 8%.

    • @mostinho7
      @mostinho7 Год назад +1

      @@hamdi_even 8% is too high tbh. Not for neetcode specifically, he sells his own courses and is set. But for someone else who might be in need of the money 8% is ridiculous

  • @rhitamdutta1996
    @rhitamdutta1996 6 месяцев назад +67

    I have never practiced DSA in my life, not even in college. After getting laid off, I stumbled across your videos to learn DSA. They are so crisp, informative, and to the point. I can't thank you enough.

  • @tweefeety
    @tweefeety 3 года назад +336

    I love you man. You're an actual angel. Your explanations are always so clear. And your drawings are so easy to understand.

    • @NeetCode
      @NeetCode  3 года назад +33

      Thanks, appreciate the kind words 🙂

    • @fortitude242
      @fortitude242 2 года назад

      @@NeetCode you are n angel. :)

  • @rohananjaria1009
    @rohananjaria1009 3 года назад +49

    Best youtube channel for leetcode problems hands down.

  • @justsimple6333
    @justsimple6333 2 года назад +30

    i used your previous video on groupAnagrams to solve this, just hashmapped the array in to a defaultdict(int) den sorted the dictionary entirely in a descending order. Your videos have been really helpful, first time i solved a medium all by myself

    • @trenvert123
      @trenvert123 Год назад +9

      That's so cool that python has a convenient way to sort hashmap by value. I looked into it for java, and it is a nightmare. I would have to create my own comparator. I think if I'm doing that, I may as well just learn bucket sort at this point.

    • @robinfelix3879
      @robinfelix3879 Год назад +3

      haha, did the same, cheers

    • @moveonvillain1080
      @moveonvillain1080 Год назад +2

      @@trenvert123 Java is sooooo verbose......

    • @Albert-nc1rj
      @Albert-nc1rj 7 месяцев назад

      @@trenvert123 In Go I faced the same problem, spent 20 minutes trying to sort a hash map by values (failed). So I just copied values into new array and sorted them there lol.

    • @dumdum407
      @dumdum407 5 месяцев назад

      @@Albert-nc1rj you can also insert the contents of your hashMap into a second hashMap, but use the values from frequency hashMap as the keys of the second hashMap.
      Once you have done that, take all the keys out into an array, sort the array and retrieve first k values, use first hashMap to get integers and return that.

  • @joshmarion8640
    @joshmarion8640 Год назад +168

    Am I the only one who is a little confused as to how this solution is O(N). If you loop through the array which is the size of the array, and then in each index you might have to loop through up to N times. So how is this not o(n^2)
    Edit: Nevermind, I think I realize it now, I figured I would write it out for anyone who might still be confused. As we traverse through the array, we go through the whole array. So this is O(n). But we aren't doing an operation n times at each stop. We are doing N more operation throughout the entire array. So even though the for loops are nested, we are doing N more operations throughout a for loop which is N, so the total is just N+N, which simplifies to O(N)

    • @abhishekdhyade7500
      @abhishekdhyade7500 Год назад +20

      Thanks a lot buddy! I was scratching my head off to find out this same doubt.
      Now that I saw your comment, I was able to understand it.
      Thanks again!!

    • @ahmedmansour5032
      @ahmedmansour5032 Год назад +4

      So essentially the inner loop is just operating on the subset of N elements?

    • @quanmai5759
      @quanmai5759 Год назад +22

      I would say it's n+k rather than n+n, because the size of the res array is k. So after looping through the freq array of size n, you only need to fill the res array k times then stop, so k more operations. Still it's O(n)

    • @ubermensch_1111
      @ubermensch_1111 Год назад

      @@quanmai5759 Yah I also think so it will be n+k

    • @DiaaHaresYusf
      @DiaaHaresYusf Год назад

      @@ahmedmansour5032 but at worest case you will face frequency = 1 for each element in nums .. and O(N) is always calcualted in worest case, I have made a commend on video please go through it , you will understand what I am saying

  • @sandeshpaudel9665
    @sandeshpaudel9665 2 года назад +54

    for the heap solution, it's better to use a min heap of size k rather than using a max heap and then removing max k times. Using the min heap, you would remove min and add the next frequency. by the end, you are left with k most frequent ones and removing the min gives you the answer. You can reduce this to n log k and not n log n

    • @gurmukhsinghnirman4935
      @gurmukhsinghnirman4935 2 года назад +4

      and even for values like k = 1e9, logk is around 30 so the complexity is around O(30*n) which is basically O(n)

    • @sandeshpaudel9665
      @sandeshpaudel9665 2 года назад

      @@gurmukhsinghnirman4935 indeed!

    • @PippyPappyPatterson
      @PippyPappyPatterson 2 года назад

      How would you cap the size of the heap `h` at size `k`?
      As you're adding frequencies, `if len(h) > k: heapq.heappop(h)`?

    • @sandeshpaudel9665
      @sandeshpaudel9665 2 года назад +2

      @@PippyPappyPatterson so let's suppose k = 3 and you have numbers [1 , 2 , 3, 4, 5]. You can find the k-largest or in this case 3rd largest using a min-heap of size 3. As you add in numbers, your heap can grow like this:
      [ 1 ]
      [ 1 , 2 ]
      [ 1, 2 , 3 ] ** you're capped at 3 ***
      [2, 3, 4] ** add next( 4 ) and remove min (1)**
      [3, 4, 5] ** add 5, remove 2 **
      Now the head of the heap will be the 3rd largest element.

    • @MinhNguyen-lz1pg
      @MinhNguyen-lz1pg 2 года назад

      @@PippyPappyPatterson here
      class Solution:
      def topKFrequent(self, nums: List[int], k: int) -> List[int]:
      num_to_count = collections.defaultdict(int)
      for num in nums:
      num_to_count[num] += 1
      min_heap = []
      for num in num_to_count:
      if len(min_heap) < k:
      heapq.heappush(min_heap, (num_to_count[num], num))
      else:
      heapq.heappushpop(min_heap, (num_to_count[num], num))
      res = []
      while min_heap:
      _, val = heapq.heappop(min_heap)
      res.append(val)
      return res

  • @maierdanefan6998
    @maierdanefan6998 2 года назад +33

    Amazing contents! The best algorithms channel that focus on logic and thinking in a clear way. Happy to have found this channel, been writing neetcode ever since.

  • @Number_Crunch
    @Number_Crunch 2 года назад +38

    The algorithm that you explained at 3:15 was counting sort and not bucket sort. What you did, however, towards the end was similar (not same as) bucket sort.

  • @leofastov9205
    @leofastov9205 Месяц назад +2

    I'd like to suggest a minor improvement:
    for n in nums:
    count[n] = 1 + count.get(n, 0)
    if count.get(n, 0) > max_val:
    max_val = count.get(n, 0)
    freq = [[] for i in range(max_val + 1)]
    The max_val variable is used to track the maximum frequency instead of using the length of nums.
    This can potentially save some space if the maximum frequency is significantly less than the length of nums.
    Addition of max_val adds a small const time overhead

  • @Thrashmetalman
    @Thrashmetalman 3 года назад +11

    the one thing I dont like about usage of heap questions is that most of the times you havge to default to some library to do it cause I doubt any of us could code up a heap in a phone screen.

  • @netanelkaye3014
    @netanelkaye3014 8 месяцев назад +28

    People say you are supposed to learn enough to be able to figure out leetcode problems, as opposed to memorizing leetcode. Are we seriously supposed to have been figured this method out? This was so specific...

    • @edd4851
      @edd4851 3 месяца назад +3

      When you encounter a similar problem next time, you will think , well, i already saw it somewhere. I can do this.

    • @netanelkaye3014
      @netanelkaye3014 3 месяца назад +2

      @@edd4851 Really? How many leetcode problems are solved this way?

    • @_carrbgamingjr
      @_carrbgamingjr 3 месяца назад

      @@netanelkaye3014 1😅

    • @wotizit
      @wotizit 2 месяца назад

      its best not to think about it lol

    • @beaglesnlove580
      @beaglesnlove580 Месяц назад

      @@netanelkaye3014none of

  • @user-ui2hr6cl3q
    @user-ui2hr6cl3q 2 дня назад

    this is how i did it, i basically converted the list in a dict, sorted it using values and the took out the k most frequent values but i really appreciate your videos.
    class Solution:
    from collections import Counter
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
    value = []
    my_dict = dict(Counter(nums))
    my_dict2 = dict(sorted(my_dict.items(), key=lambda item: item[1], reverse = True))
    mlis = list(my_dict2.keys())
    for i in range(k):
    value.append(mlis[i])
    return value

  • @andrepinto7895
    @andrepinto7895 2 года назад +12

    While counting, you can keep track of the max occurrences and then you only need to initialize freq to that max instead of len(nums)

    • @amitkoushik5504
      @amitkoushik5504 Год назад

      Good one ...I spent a lot of time understanding this but finally got it..🤗🤗

    • @arneishprateek6444
      @arneishprateek6444 Год назад +1

      Sure but it's still O(N).

    • @s1kebeats
      @s1kebeats Год назад

      thx

    • @MrHarryGaming
      @MrHarryGaming Год назад

      count = {}
      maxFreq = 0 # or 1
      for each in nums:
      count[each] = count.get(each, 0) + 1
      maxFreq = max(maxFreq, count[each])

      freq = [set() for i in range(maxFreq + 1)]

  • @wow_donnie
    @wow_donnie 2 года назад +2

    I came up with this solution originally but really appreciated the thoughtful description of the linear solution!
    result = defaultdict(int)
    for num in nums:
    result[num] += 1
    result = dict(sorted(result.items(), key=lambda item: item[1]))
    return list(result.keys())[-k:]

  • @md-ayaz
    @md-ayaz Месяц назад +1

    One thing to note in python. Please do not intialize like this ( if you are )
    bucket = [[]] * (len(nums) + 1)
    this references to the same list and your output will be wrong.
    Use the bucket = [[] for _ in range(len(nums)+ 1)] , as mentioned in the video.

  • @johnzheng849
    @johnzheng849 2 года назад +11

    Your video got a me a job as an SDE at AWS!!

    • @NeetCode
      @NeetCode  2 года назад +1

      Congratulations 🎉🎉

    • @jwastken8814
      @jwastken8814 2 года назад +1

      Hey John, which neetcode 150 questions did they ask? I have a phone interview coming up

  • @awesome_ashu
    @awesome_ashu Год назад +11

    We can optimize it more by storing the maxFrequency while creating the HashMap (which has the integer and their corresponding frequency). Then, the next iteration to get the required elements can start from this maxFrequency instead of N.

  • @naveennvnkumar4615
    @naveennvnkumar4615 2 года назад +3

    I am bit confused, at 12:25 and lines 12,13 wouldn't it become a N^2 solution if all the elements are distinct

    • @ParryHotter73
      @ParryHotter73 2 года назад +3

      yeah, the example he showed is actually the worst case and it may be O(n²) but we can't say n² cause we don't find n elements at every index, this is an exceptional and not like general nested loop

    • @naveennvnkumar4615
      @naveennvnkumar4615 2 года назад

      @@ParryHotter73 ok got it. thanks buddy

  • @SHARMATUSHAR1_
    @SHARMATUSHAR1_ 3 месяца назад +1

    Also, you can do res += freq[i] in line 13. The problem description mentions the solution will be unique. So, we know that all the elements added if will either match k or will be lesser. So, no need to run a loop.

  • @bryanleow5024
    @bryanleow5024 2 года назад +2

    for the return function, an alternate way is to use the extend() method in python:
    res = [ ]
    ptr = len(frequency)-1
    while len(res)

    • @kestiv2429
      @kestiv2429 Год назад +1

      Result size will be wrong if len(frequency[-1]) > k I think

    • @vachannadupalli6133
      @vachannadupalli6133 Год назад +1

      @@kestiv2429 The question guarantees that there is a unique solution. Hence every time we extend the result array, at some point it will(has to be) be equal to k. If it were not guaranteed, then you would be right.

  • @SHARMATUSHAR1_
    @SHARMATUSHAR1_ 3 месяца назад +1

    You can also use count = Counter(list)
    This will count in itself. No need for a loop. Another day thanking Guido van Rossum for making my life easier.

  • @mohamedeltawab
    @mohamedeltawab 3 года назад +11

    Your explanation is like art! Thank you!

  • @pinakadhara7650
    @pinakadhara7650 Год назад +2

    Thanks for the video! I came up with the same solution except I assumed each element is "repeated unique number of times" from the problem statement - "It is guaranteed that the answer is unique.". So instead of looping over each lists, I just considered the first element.

  • @xrdevelopment1390
    @xrdevelopment1390 2 года назад +4

    is it just me or after struggling on a particular part for a while.... then it hits!!!!! best feeling ever. NeetCode, I am following along every problem and have the confidence that I'll get my dream tech position. thank you, its the same feeling I had when I found out about khan academy in high school

    • @Milan-vi1bq
      @Milan-vi1bq 2 года назад

      we're both gonna get the dream job homie!

  • @tszyinshirleycheung4040
    @tszyinshirleycheung4040 3 года назад +8

    I think the runtime of using heap is O(n log k), we need O(n) to construct the heap and remove an item cost O(log k) ?

    • @MrACrazyHobo
      @MrACrazyHobo 3 года назад

      Yes, this is even what the leetcode official answers says

    • @michaelchen9275
      @michaelchen9275 3 года назад

      Isn't it O(log n) to remove from a heap with n elements? And we do that k times, so that makes O(k log n).

    • @theniknik0999
      @theniknik0999 2 года назад

      @@michaelchen9275 If we restrict the heap to be of size k (since we only care about k most frequent), at worst case we'll end up popping n elements. i.e. O(n log k)

  • @symbol767
    @symbol767 2 года назад +5

    This is perfect, thank you bro, took me a while to understand this problem, gonna need to redo it without looking at the solution in a couple days.
    Thank you, liked and commented again to support

  • @kwakukusi4094
    @kwakukusi4094 2 года назад +2

    I got a similar question on my onsite interview with amazon (not the same question but same concept). I did not know bucket sort so I used the sorting method. The interviewer said there was a way of getting a linear time complexity and I did not know what to do .

  • @YNA64
    @YNA64 2 года назад +3

    Holy this is so much clearer than the quick select one.... Thank you so much

  • @saifmohamed1776
    @saifmohamed1776 3 года назад +4

    i didn't see any body came up with this solution in the discussion on leetcode , all of the solutions were use heap or may be some of them use quick select ;
    so i was afraid that i analysis my algorithm wrong but after watching you i know that i was right about my solution .

    • @CarlJohnson-iv7sn
      @CarlJohnson-iv7sn 2 года назад

      Infact the top solution in the discuss is using bucket sort itself.

  • @chesea9790
    @chesea9790 6 месяцев назад +1

    Awesome video! One small thing, shouldn't it be O(n log k) instead of O(k log n) for the heap solution since there's n elements, heap of size k means log k time to heapify and n calls so n * log k?

    • @AustinCS
      @AustinCS 3 месяца назад

      Yes, it isn’t O(n) like he said

  • @VarunMittal-viralmutant
    @VarunMittal-viralmutant 2 года назад +5

    The final array that you create may have a lot of holes. Say there are 1000 elements, but consisting of only 1's and 2's, then the list will be full of holes. It can be further optimized by keeping a 'max_freq' variable which will provide an upper bound on the size of the array. This max_freq can be updated while creating the hash-map.

    • @mearaftadewos8508
      @mearaftadewos8508 2 года назад

      or may be this may not be the most efficient way to do this for all kinds of inputs like the one you said and unsorted numbers. It doesn't give the top frequent one's. Dictionary with maxheap is the one that can handle all possible inputs.

  • @yarnehermann
    @yarnehermann 2 года назад +5

    I'm thinking you can just invert the frequencyMap to {frequency: list of values with that frequency} and then sort the keys in that inverted map. This sorting would be O(sqrt(n) log(sqrt(n))) (which is < O(n)) because there cannot be more than O(sqrt(n)) different frequencies (if each value has a different frequency, then n = O(c * (c+1) /2), with c being the number of distinct frequencies).
    Then it's just a matter of iterating over the reverse sorted keys and adding values to a resultArray until that array reaches length k
    lookupDict = defaultdict(int)
    for n in nums:
    lookupDict[n] += 1
    inverseDict = defaultdict(list)
    for key, v in lookupDict.items():
    inverseDict[v].append(key)
    sortedKeys = sorted(inverseDict.keys(), reverse = True)
    sortedKeysIndex = 0
    res = []
    while k > 0:
    values = inverseDict[sortedKeys[sortedKeysIndex]]
    if len(values) > k:
    res.extend(values[:k])
    else:
    res.extend(values)
    k -= len(values)
    sortedKeysIndex += 1
    return res

    • @pragnyatata491
      @pragnyatata491 10 месяцев назад

      really love the approach taken, thank you

  • @prithvirajchavan9186
    @prithvirajchavan9186 2 года назад +1

    So basically the map of frequency to values can actually be of size O(n*n)? The first 'n' is due to worst case frequency being 1 to n and the second 'n' is because at most 'n' numbers can have the same frequency of 1. I kinda understand how its not really O(n*n) and a bit less than that as the duplicates are being consumed into a single key, but its still more than O(n) right? Another observation: The only possible values of frequency that have a value attached to them are the subsets that add up to N. In the given example, 100 occurs once, 2 occurs twice and 1 occurs thrice, so sum of frequencies is actually 1+2+3=6 which is N in this case. So the worst case space complexity is actually O(n*length of longest subset of frequencies that sum to N). Dont really know how I ended up with a Subset Sum problem just because i couldnt justify the O(n) space complexity lol. Any clarifications are appreciated :D

  • @haarismirza2044
    @haarismirza2044 2 года назад +2

    How come the solution code doesn't work when we instantiate our freq array like this:
    freq = [[]] * (len(nums) + 1)

  • @bundayyolayinka3352
    @bundayyolayinka3352 2 года назад +8

    My current status: Understands the Questions, Have an Idea of how to start (not how to finish), Watches just your drawing explanation, Realize what is missing, Writes the whole solution.

  • @VineetKrGupta
    @VineetKrGupta 4 месяца назад

    I got my first job after following your neetcode 150, 2 years ago. now after the layoff i am here again learning the dsa.

  • @ronakpatel7911
    @ronakpatel7911 11 месяцев назад +2

    What I don't quite understand is that when I implemented this solution, the speed and memory is not as efficient compared to my initial solution in
    ```
    class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
    count = {}
    for num in nums:
    count[num] = 1 + count.get(num, 0)
    res = sorted(count, key = count.get, reverse = True)
    return res[:k]
    ```
    Even though using sorted() here causing it to be n log n... Can someone explain why this one is appearing to be much quicker than the solution in the video?

    • @pruthvipegasus
      @pruthvipegasus 5 месяцев назад

      sorted() uses quick sort which has O(n log n) time complexity in the python interpreter.

  • @numberonep5404
    @numberonep5404 2 года назад +4

    I love how your videos are always so damn clear :) ! Thanks alot
    Btw, here is a weird(?) quicksort version that beats 93%:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
    count=list(Counter(nums).items())
    def quick(l,r):
    pivot,p = count[r][1],l
    for i in range(l,r):
    if pivot>=count[i][1]:
    count[i], count[p] = count[p], count[i]
    p+=1
    count[r], count[p] = count[p], count[r]
    if p>len(count)-k:
    return quick(l,p-1)
    if p=ind]

    • @NeetCode
      @NeetCode  2 года назад +3

      Nice!

    • @heathergray4880
      @heathergray4880 2 года назад +2

      Mine beats 98.62 time and 90 on space and is two lines long :)

    • @lucaslau8379
      @lucaslau8379 2 года назад

      @@heathergray4880 would you share your code for learning please?

  • @akarsan9121
    @akarsan9121 Месяц назад +1

    questions: lets assume we have array [1,2,2,3,3,4,4,4] and k = 2. Then for first most frequent we will have 4 but what about the 2nd most frequent element? cuz count of both 2,3 is 2 so how do we decide which value to insert into our resulting list???
    Please help
    ps: "The test cases are generated such that the answer is always unique." is this statement accounting for that edge case?

  • @eulier1
    @eulier1 Месяц назад

    That's was a very interesting way to solve a specific real life problems, by counting and working with hash and array as a way to identify K most frequent elements. This can be useful for small to big business handling inventory or when you need to pack-up your stuff to travel.

  • @sammyj29
    @sammyj29 2 года назад +8

    Amazing content as usual!! But I still don't understand, why is it O(n) even if there are 2 nested for loops? Thanks for creating these helpful videos!

    • @nathanwailes
      @nathanwailes 2 года назад

      The inner "for" loop on line 13 generally won't be iterating through an O(n) list, but instead just the number of input numbers that occurred a particular number of times.

    • @sravankumar4195
      @sravankumar4195 2 года назад +2

      I think, in worst case, it will be O(n^2). Let's consider the case, nums = [1,2,3,4,5,6] and k =6, then on line 13 the inner loop freq[1] = [1,2,3,4,5,6] will iterate "n" times. But in Avg. case it is O(n).

    • @chanpreetsingh007
      @chanpreetsingh007 2 года назад +1

      @@sravankumar4195 nope its o(n) only.

    • @Logan-mj3wx
      @Logan-mj3wx 2 года назад

      @@sravankumar4195 I believe you are correct, in the worst case this would be O(n^2) for the situation you described. You have to iterate over the whole frequency list(size n) and then when you get to the solution set(in your case index 1) you have to iterate over n elements once more. I had the same thought

    • @SlakOffs
      @SlakOffs 2 года назад

      @@Logan-mj3wx What about, if instead of a for loop, you simply did res += freq[i]. Is that operation faster O(1)?

  • @vasumahalingam5162
    @vasumahalingam5162 11 месяцев назад +3

    Nice algorithm but it looks like it will fail when the nums list is [-1,-1] which makes c as -1 throwing index out of range exception.

  • @hiota45
    @hiota45 29 дней назад

    I used a dictionary and pair of while loops in python. Basically, while the nums list > 0, I would .pop the 0th number off the list and check to see if the number was in the dict as a key. If it was, add one to the value for that key. If not add it to dict as a key with a value of one. When the nums list was exhausted, another while loop for k > 0. Use the python max function to give a variable equal to the key with the highest value. Add that key to the results list, set the key value to 0, and -1 on k. I'm a novice though so this may be suboptimal?

  • @denshaSai
    @denshaSai 2 года назад +1

    Got this question for google, what to do then if input is streaming (like a log)? guess we keep updating the count (histogram), and rebuild the freq array everytime?

  • @ARJUN-op2dh
    @ARJUN-op2dh 2 года назад +1

    More simpler way
    from collections import Counter
    def dx(nums: list, k: int):
    x = Counter(nums).most_common()[:k]
    ls = []
    for i,j in x:
    ls.append(i)
    return ls

  • @arungowda
    @arungowda 2 года назад +2

    We can do a little space optimization by having max(counts) size for bucket instead of nums.length

  • @theornament
    @theornament 7 месяцев назад

    I did the solution with priority queue and hashmap and it seemed to have better time complexity and space efficiency than using bucketsort. I feel like this is tricky because, when we are solutions for problems, we start analyzing which data structure we are going to use, its time complexity, etc. based on how those data structures are regularly implemented. The thing is, algorithms and built in functions in languages have improved drastically that they take less time than what theoretically they should take.
    It's tricky but are those are things that we should consider as well?

  • @javidabderezaei3632
    @javidabderezaei3632 Год назад

    Heap solution (faster than 95% of solutions on leetcode):
    dic = {}
    for i in nums:
    if i not in dic.keys():
    dic[i] = 1
    else:
    dic[i] += 1
    heaped = []
    for key in dic:
    heaped.append([-dic[key], key])
    heapq.heapify(heaped)
    res = []
    for i in range(k):
    freq, x = heapq.heappop(heaped)
    res.append(x)
    return res

  • @rakshitkumar1141
    @rakshitkumar1141 Год назад +1

    I think your solution is O(n2). because in last part you performed n operations inside n.
    for i in range(len(freq) -1, 0, -1): ==== O(n)
    for n in freq[i]: ==== O(n)

  • @leetcodemermaid9804
    @leetcodemermaid9804 2 года назад +2

    For the first heap solution, you can do a total complexity of n Log(k). Instead of heapify the whole dictionary ,you can heap push and heap pop when you go through the dictionary. When len(heap) > k, heap pop.

    • @MehmetDemir-xi3yy
      @MehmetDemir-xi3yy 6 месяцев назад

      nlogk generally bigger than klogn
      k=1200 and n=1500 nlogk ≈1500×3.0792≈4618.8

      klogn=1200×log1500 ≈ 3811.2

  • @vdyb745
    @vdyb745 2 года назад +1

    I was wondering where you were going with the bucket. This is so clever !!! Brilliant !!

  • @sangramshinde2211
    @sangramshinde2211 2 года назад +1

    nick white, kevin, neetcode best guys to get the perfect explanation...

  • @amrojjeh
    @amrojjeh 5 месяцев назад

    In this case, a heap sort might be more optimal. Even though your code is O(n), it's making a ton of allocations, which may make it practically slower in some cases.

  • @yxngboypolo
    @yxngboypolo 26 дней назад

    11:42 you could also use `for i in reversed(freq):` and modify the code as needed

  • @dadisuperman3472
    @dadisuperman3472 2 года назад +1

    Actually there is a small improvement on this solution.
    Given the following observation:
    The length of the hashmap is the number of distinct elements in the array, all the rest are repetition of some or all of the elements of the array.
    So if we suppose that all elements are repeated only once except one elements is repeated enough to fill the array size, we can calculate the largest frequency achievable as follow:
    maxFreq = Len(arr) - Len(hashmap) + 1
    that means the size of the frequency array is just : maxFreq, instead of len(nums)+1
    freq = [[] for in in range(MaxFreq)]
    That was my 50cents contribution ;)

  • @WinnerSingh
    @WinnerSingh Год назад

    Well what mistake I did is - I watched other python tutorials
    What good thing happen is - I am watching your videos
    But you are geniuses you explain well but I have to watch two times every video to understand correctly.
    Thank you

  • @venkatshiva8275
    @venkatshiva8275 2 месяца назад

    First of all thank you very much @neetCode for this amazing explanation. Anyone has suggestions or improvements on the below solution??
    class Solution {
    topKFrequent(nums, k) {
    let countObj = {};
    const resArr = [];
    let ctr = nums.length;
    for(const i of nums){
    countObj[i] = (countObj[i] || 0) + 1;
    }
    while(ctr > 0){
    for(const key in countObj){
    if(countObj[key] == ctr && resArr.length < k){
    resSet.push(key);
    }
    }
    if(resArr.length == k){
    break;
    }
    ctr--;
    }
    return resArr;
    }
    }

  • @AJ-ju7tl
    @AJ-ju7tl 2 месяца назад +1

    there is a mistake at the end, heap solution is O(N logK ) not O(K logN)!!! otherwise this is a very good explanation, thanks!

  • @sharvitomar3809
    @sharvitomar3809 Год назад +3

    Firstly, thank you for being so amazing with your videos!
    Actually I was wondering in the for loop on line 12, since inside that loop we keep checking for the length of res, isn't that increases the time complexity from the linear expectation?

    • @jambajuice07
      @jambajuice07 Год назад

      nooo actually the inner runs only for n times . soo thats n(outer loop) + n(inner loop) = 2n which is O(n).

  • @shreyaskaup
    @shreyaskaup 2 года назад +2

    i had actually thought of the 2nd array implementation you said with N array size.. but i didnt think on how I would extract the top K as you did by going backwards! you're a genius!

  • @nishanttripathy8275
    @nishanttripathy8275 2 года назад +3

    For the heap the time complexity should be n.Log(k) since the max size of the heap can only be k and n is the number of elements

    • @firasyousfi2269
      @firasyousfi2269 Год назад

      Nope, if you are using a maxHeap it will be k log n. Because for that you need to heapify the whole thing first so n elements would be in the heap.
      Then you would pop 'k' times from the heap of size 'n'. So O(k log n).
      This is Valid if you use a maxHeap!!!
      If you are using a minHeap then you would be correct, then the heap would have a max size of k as you said. And you would loop n times and push then poll when size reaches k.

  • @freddy5638
    @freddy5638 2 года назад +1

    My man! I've spent HOURS watching you. FYI you can do the counter with collections, and save few lines of implementation

    • @yossarian2909
      @yossarian2909 2 года назад +1

      How is that?? Can you pls share

    • @hamzasayyid8152
      @hamzasayyid8152 2 года назад

      @@yossarian2909 from collections import Counter, then Counter(nums) gives the frequency dict

    • @balavikashkandukuri6139
      @balavikashkandukuri6139 2 года назад

      @@hamzasayyid8152 continue the code

  • @devashishubale1565
    @devashishubale1565 10 месяцев назад

    I was solving 692, and I got AC, I remembered this video came back to this. Thanks for such detailed videos.

  • @34535fff
    @34535fff 9 месяцев назад

    Man I could not do it myself because I didn't understand problem clearly, after 50 seconds of the video I understand and did it with ease, thanks a lot. Now I am going to watch rest of the video to learn optimal solution)

  • @ShreksSpliff
    @ShreksSpliff 2 месяца назад

    Watching this after attempting on NeetCode is soooo good!

  • @xinyiwu2837
    @xinyiwu2837 2 года назад +3

    Man you light up my leetcode

  • @srinadhp
    @srinadhp 3 года назад +1

    Got the same question in one of the phone screens. I came up with map and heap approach. The interviewer asked for an optimization for memory. And, I could not think of any other solution. It costed me next round unfortunately. Only if I notice your solution!!!! :-(
    Again! Great explanation and "neat" solution!!

    • @harigovind11
      @harigovind11 2 года назад +4

      This solution is also not memory optimised. Using "min" heap instead of max heap with size k would be memory optimised.

    • @kobeissi721
      @kobeissi721 2 года назад

      @@harigovind11 Wouldn't it still be O(N) since you still need to add them to a map to store the frequencies?

    • @sampatkalyan3103
      @sampatkalyan3103 2 года назад +1

      @@harigovind11 "Using "min" heap instead of max heap with size k would be memory optimised. " can you explain how ?

  • @tarunkathuria2705
    @tarunkathuria2705 6 месяцев назад

    You don't need bucket sort for linear time. You can just store the number, freq. pairs and run quickselect.

  • @angelinazhou667
    @angelinazhou667 3 месяца назад

    If we were to use the max heap approach, how would we apply heapfiy if heapify only takes in a list and not a dictionary? Thanks for the amazing video!

  • @indoor-adventurer
    @indoor-adventurer 3 месяца назад

    Does this become linear time because, we iterate the list of nums (n), then the map of counts (n), and eventually the frequency list (n). Which ends up as O(n + n +n) = O(3n) = O(n) ???

  • @FieryDragon883
    @FieryDragon883 Год назад +1

    Why would we initialise freq with (length of array + 1) empty lists instead of just len(array)

  • @1000marcelo1000
    @1000marcelo1000 3 месяца назад +1

    It should be `for i in range(len(freq)-1, -1, -1):` instead of `for i in range(len(nums)-1, 0, -1):` right?

  • @archmad
    @archmad Год назад

    you dont really need the to loop the line 13, just get the length of freq[i] and return if the res length is equal or greater than k

  • @CodeWithRVR
    @CodeWithRVR 6 месяцев назад +1

    man this was the video , great great explanation

  • @jugsma6676
    @jugsma6676 4 месяца назад

    The simplest method is:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
    cache = {}
    for c in nums:
    if c not in cache:
    cache[c] = 1
    else:
    cache[c] = cache[c] + 1
    cache = dict(sorted(cache.items(), key=lambda i:i[1], reverse=True))
    return list(cache.keys())[:k]
    O(nlogn)

  • @tomskrovan6622
    @tomskrovan6622 Год назад

    I believe the length of the frequency/bucket array could be lower than len(nums) + 1 because of the rule that the # of distinct elements >= k. This tells us that there will be at least k distinct elems and since each one must have at least one frequency, no single number could occupy all len(nums) spots (unless k is 1). Therefore I think it could be further optimized (albeit minimally lol) to:
    freq = [ [ ] for i in range( (len(nums) + 1) - (k - 1) ) ]

    • @ichigokurosaki7762
      @ichigokurosaki7762 8 месяцев назад

      Could you explain how the for loop works within the array, I didn't understand that part?
      Freq=[[] for i in range(Len(nums)+1)]

  • @fakruu
    @fakruu 2 года назад +5

    how does it work for negative numbers ? For eg : arr [ -1, -1] and k = 1

  • @akashp4863
    @akashp4863 2 года назад +1

    is log(n) better solution than nlog(n) and log(n)?.... can you create a video on how to find big o and which one is better than which? i know there are many videos in youtube, but yours will be the best

  • @geekydanish5990
    @geekydanish5990 2 года назад +1

    class Solution:
    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
    _map = {}
    res = []
    for num in nums:
    _map[num] = 1 + _map.get(num,0)
    heap = [(val,key) for key,val in _map.items()]
    for val,key in (sorted(heap,reverse=True)[0:k]):
    res.append(key)
    return res

  • @pratikmhatre4815
    @pratikmhatre4815 Год назад +2

    Also, in the first approach, its mentioned that if we use a max heap the total time will be O(k log n) which seems incorrect , as first to add all items to the max heap we have to traverse through all items which in worst case could be O(n) and adding takes O(log n) so in total it becomes O(n log n).
    If we use min heap and only hold max k items then it will give O(N log k)

    • @wernerheisenberg9653
      @wernerheisenberg9653 Год назад +1

      Time complexity for building heap is actually O(n) and not O(n log n). O(n log n) is not the tightest bound.

  • @avadheshsingh4255
    @avadheshsingh4255 9 месяцев назад

    without knowing what is bucket sort I was only able to come with the hashmap sorting solution thanks mate for the o(n) soln.. great explanation

  • @John-z7m
    @John-z7m Месяц назад

    I guess this is what makes the question medium? I just did a quick hash count of numbers and then Object.entries()'d it out and sorted by second index, then pulled the first k values out of that. So O(n log m). I wonder if this is weak in an interview. Offering bucket sort solution (or sometimes other niche solutions) just feels like I musta done the same question or a nearly identical question to pull it quickly out of an interview though... I'm surprised Neet code starts with two mediums whose solution feel a bit hyper specific. Is it intentional?

  • @seanbarel2
    @seanbarel2 4 месяца назад

    Is this solution also works when theres no limit on the values of the initial array? Or it assumes that the values in the array are bounded?

  • @shivani5882
    @shivani5882 2 года назад

    Hey! Could I receive a bit of clarification please? Why is your approach at 3:10 not O(n)? I thought it would be as would first need to find the max value in the given array (nums), then create our bucket array with that number as the upper limit? But since max( ) takes O(n) time - I'm quite confused. Thanks in advance!

  • @cgqqqq
    @cgqqqq 3 года назад +2

    fantastic explanation! the only issue I think is the first for loop should be for i in range(len(freq) - 1, -1, -1)

    • @Avuvos
      @Avuvos 3 года назад

      Not needed because that last spot is for values who appear 0 times in the array, so it’d be an emtpy list either way

  • @atulkumar-bb7vi
    @atulkumar-bb7vi Год назад +1

    Trust me bro, you are amazing explaining things. Thanks a lot for such content. Pls keep posting..

  • @haardpatel4242
    @haardpatel4242 8 месяцев назад

    using the two loops in the last actually increases the time complexity.

  • @arnavgoel1022
    @arnavgoel1022 Год назад

    This won't work for negative values. The question you are referring to from Leetcode also has negative values as inputs in the array

  • @shuvbhowmickbestin
    @shuvbhowmickbestin 9 месяцев назад

    Can we not use a heap/priority queue instead of using an arrya? Wouldn't that automatically keep the most frequent elements at the top if the sorting was done according to the count?

  • @tanayshah275
    @tanayshah275 3 года назад +3

    Made it simple but efficient as always!

  • @shawnlin9873
    @shawnlin9873 2 года назад +1

    I used a dict to maintain the num->occurance pair for all items. Barely got the AC but the sorting part threw me off...had to use a lambda function to sort the dict by value.
    The optimal solution is a lot more subtle.
    Great content, very informative and well explained.

    • @lemonke8132
      @lemonke8132 2 года назад

      i did the exact same thing lmao

  • @syedzami-ul-haquenavid9392
    @syedzami-ul-haquenavid9392 Год назад

    The explanation was so amazing that I understood how to solve half way through the video!

  • @AsafVertman53
    @AsafVertman53 5 месяцев назад

    Note that for the last loop using reversed(freq) is faster and more readable

  • @KardboardCode
    @KardboardCode 8 месяцев назад

    Hey everyone, I have one question.
    Why is this solution considered O(n) where n is the size of the nums array? Why are we not looking at the worst case complexity for lines 11-16 at 12:52
    If all items are unique
    Line 12: i will iterate over the range (0 , n) //Size of freq is n+1
    Line 13: if all items inside nums are unique, every item will be present at the first index
    This will lead to looping through the entire length of the array n
    => this gives us O(n^2)

  • @Nohope__
    @Nohope__ Год назад

    or alternatively sort the count table by values, and then return first k elements :
    # create hash_table where num in nums is ke
    # increment value for each key repeatation
    table = dict()
    for n in nums:
    if n in table:
    table[n] += 1
    else:
    table[n] = 1
    # sort by values
    table = sorted(table.items(), key=lambda item: item[1], reverse=True)
    # create a list
    answer = list()
    # store to answers
    for item in table[:k]:
    answer.append(item[0])
    return answer

    • @basedasuka
      @basedasuka 11 месяцев назад +1

      ur solution is much faster for me, thank u

  • @rohanaurangabadkar951
    @rohanaurangabadkar951 3 месяца назад

    Best explanation this helped me in solving Top K Frequent Elements and Sort Characters By Frequency

  • @mgst4699003
    @mgst4699003 Год назад

    Thank you man! Blessed to have you!

  • @il5083
    @il5083 Год назад

    The solution and thought process is genius! Can't come up with this optimal solution by myself, thanks a lot.

  • @jamescolon4478
    @jamescolon4478 2 года назад

    I don't understand why its considered bucket sort. I get that each indicie acts as a bucket for # of occurrences, but the actual sorting part never takes place as far as I can tell