Back to Arrays And Hashing

Top K Frequent Elements

Top K Frequent Elements

🧩 Problem Statement

Given an integer array nums and an integer k, return the k most frequent elements. You may return the answer in any order.

šŸ”¹ Example 1:

Input:

nums = [1,1,1,2,2,3], k = 2

Output:

[1,2]

šŸ”¹ Example 2:

Input:

nums = [1], k = 1

Output:

[1]

šŸ” Approaches

1. Bucket Sort (Optimized)

Instead of sorting the frequencies ($O(n \log n)$), we can use the fact that the maximum possible frequency is $n$ (length of array).

✨ Intuition

  • Count the frequency of each number. Map: 1 -> 3, 2 -> 2, 3 -> 1.
  • We can create "buckets" where the index represents the frequency.
  • bucket[3] will contain [1] (numbers that appeared 3 times).
  • bucket[2] will contain [2].
  • bucket[1] will contain [3].
  • Iterate from the highest frequency bucket down to 1 and collect k elements.

šŸ”„ Algorithm Steps

  1. Create a count hash map to store frequency of each number.
  2. Create freq (arrays of lists), where freq[i] stores list of numbers that appear i times.
  3. Iterate through nums to populate count.
  4. Iterate through count to populate freq.
  5. Iterate freq backwards (from $n$ to 1). Append elements to res.
  6. Stop when len(res) == k.

ā³ Time & Space Complexity

  • Time Complexity: $O(n)$, since we iterate through the array to count, then map to buckets, then iterate buckets. All are linear passes.
  • Space Complexity: $O(n)$, to store the hash map and the bucket array.

šŸš€ Code Implementations

C++

class Solution {
public:
vector<int> topKFrequent(vector<int>& nums, int k) {
unordered_map<int, int> count;
for (int n : nums) count[n]++;
vector<vector<int>> bucket(nums.size() + 1);
for (auto& p : count) {
bucket[p.second].push_back(p.first);
}
vector<int> res;
for (int i = bucket.size() - 1; i >= 0; i--) {
for (int n : bucket[i]) {
res.push_back(n);
if (res.size() == k) return res;
}
}
return res;
}
};

Python

class Solution:
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
count = {}
freq = [[] for i in range(len(nums) + 1)]
for n in nums:
count[n] = 1 + count.get(n, 0)
for n, c in count.items():
freq[c].append(n)
res = []
for i in range(len(freq) - 1, 0, -1):
for n in freq[i]:
res.append(n)
if len(res) == k:
return res

Java

class Solution {
public int[] topKFrequent(int[] nums, int k) {
Map<Integer, Integer> count = new HashMap<>();
for (int n : nums) {
count.put(n, count.getOrDefault(n, 0) + 1);
}
List<Integer>[] bucket = new List[nums.length + 1];
for (int key : count.keySet()) {
int frequency = count.get(key);
if (bucket[frequency] == null) {
bucket[frequency] = new ArrayList<>();
}
bucket[frequency].add(key);
}
int[] res = new int[k];
int index = 0;
for (int i = bucket.length - 1; i >= 0 && index < k; i--) {
if (bucket[i] != null) {
for (int n : bucket[i]) {
res[index++] = n;
if (index == k) return res;
}
}
}
return res;
}
}

šŸŒ Real-World Analogy

Video Trending Chart:

Imagine YouTube ranking videos.

  • Every time a video is watched (num), we increment its view count (frequency).
  • To find the "Top 10 Trending" (k=10), we don't need to sort ALL videos.
  • We group them: "Videos with 1M views", "Videos with 900k views"...
  • We just pick from the top buckets until we have 10 videos.

šŸŽÆ Summary

āœ… O(n) Time: Beating the $O(n \log n)$ sorting approach.
āœ… Bucket Sort: Useful technique when values are bounded (freq $\le n$).

Solution Code
O(n) TimeO(1) Space
#include <iostream>
#include <unordered_map>
#include <vector>

using namespace std;

class Solution {
public:
  vector<int> topKFrequent(vector<int> &nums, int k) {
    unordered_map<int, int> count;
    for (int n : nums)
      count[n]++;

    vector<vector<int>> bucket(nums.size() + 1);
    for (auto &p : count) {
      bucket[p.second].push_back(p.first);
    }

    vector<int> res;
    for (int i = bucket.size() - 1; i >= 0; i--) {
      for (int n : bucket[i]) {
        res.push_back(n);
        if (res.size() == k)
          return res;
      }
    }
    return res;
  }
};

int main() {
  Solution sol;
  vector<int> nums = {1, 1, 1, 2, 2, 3};
  int k = 2;
  vector<int> result = sol.topKFrequent(nums, k);
  cout << "[";
  for (int i = 0; i < result.size(); ++i) {
    cout << result[i] << (i < result.size() - 1 ? ", " : "");
  }
  cout << "]" << endl;
  return 0;
}
SYSTEM STABLE
UTF-8 • STATIC_RENDER