Top K Frequent Elements #
Problem Statement #
Given a non-empty array of integers, return the k most frequent elements.
Constraints:
- You may assume k is always valid, 1 ≤ k ≤ number of unique elements
- Your algorithm’s time complexity must be better than O(n log n), where n is the array’s size
- It’s guaranteed that the answer is unique, in other words the set of the top k frequent elements is unique
- You can return the answer in any order
Example:
Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2]
Understanding the Constraints #
Let’s break down what these constraints mean for our solution:
1. K is always valid (1 ≤ k ≤ number of unique elements)
This means you won’t encounter impossible scenarios. For instance, if your input array is `[1,1,2,2]` (only 2 unique elements), you won’t be asked to return the top 3 most common elements. This eliminates the need for error checking around invalid k values.
2. Time complexity must be better than O(n log n)
This is a critical constraint that eliminates certain approaches. A simple “sort the entire array and count” approach is O(n log n), which violates the requirement. Note that the provided solution below is also O(n log n) in the worst case, so it doesn’t technically satisfy this constraint either; we’ll cover a true O(n) approach later.
3. The answer is guaranteed to be unique
This means you won’t have ties that make the answer ambiguous. For example, `[1,1,1,3,3,3,4,4,4]` with k=2 would violate this guarantee, because all three numbers appear equally often (3 times each) and any two of them would be a valid answer. The problem guarantees this won’t happen, so we don’t need to handle tie-breaking logic.
4. You can return the answer in any order
The output doesn’t need to be sorted: `[1,2]` and `[2,1]` are both valid answers for the example above.
Initial Approach Analysis #
Sorting Approach (Doesn’t work for this problem) #
One might initially think: “What if I sort the array and count occurrences?” For example, if you have `[1,1,1,2,2,3]`, after sorting you’d get `[1,1,1,2,2,3]` (already sorted here), and you could iterate through it counting runs of consecutive equal elements.
However, this approach has problems:
- Sorting alone is O(n log n), which violates our time constraint
- While it works for finding THE most common element, it becomes complicated for finding the k most common elements
- You’d still need additional data structures to track the top k elements
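To see why the sorted-scan idea only goes so far, here’s a minimal sketch for the k = 1 case (the `mostFrequent` name is just for illustration): one pass over the sorted array tracks the length of the current run of equal elements. Extending this to the top k would mean maintaining k candidates during the scan, which is where the bookkeeping gets messy.

```javascript
// Sort-and-scan sketch, k = 1 only: after sorting, equal elements
// are adjacent, so one pass with a run counter finds the most
// frequent value. Sorting alone already costs O(n log n).
const mostFrequent = (nums) => {
  const sorted = [...nums].sort((a, b) => a - b);
  let best = sorted[0];
  let bestCount = 0;
  let run = 0;
  for (let i = 0; i < sorted.length; i++) {
    // Extend the current run, or start a new one of length 1.
    run = (i > 0 && sorted[i] === sorted[i - 1]) ? run + 1 : 1;
    if (run > bestCount) {
      bestCount = run;
      best = sorted[i];
    }
  }
  return best;
};
```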
Frequency Count + Sort Approach (The provided solution) #
The solution provided takes a smarter approach:
- Count frequencies: Create a hash map (object) where keys are array elements and values are their frequencies
- Convert to sortable structure: Transform the object into an array of [element, frequency] pairs
- Sort by frequency: Sort this array by frequency in descending order
- Extract top k: Take the first k elements from the sorted array
Code Implementation #
```javascript
const topKFrequent = (nums, k) => {
  const counts = {};
  const sortable = [];
  const result = [];

  // Step 1: Get counts - O(n)
  for (let num of nums) {
    counts[num] = (counts[num] || 0) + 1;
  }

  // Step 2: Convert to array and sort - O(m log m) where m = unique elements
  function sortBy(a, b) {
    return b[1] - a[1];
  }
  for (let i in counts) {
    // Object keys are strings, so convert back to a number
    sortable.push([Number(i), counts[i]]);
  }
  sortable.sort(sortBy);

  // Step 3: Create result - O(k)
  for (let i = 0; i < k; i++) {
    result.push(sortable[i][0]);
  }
  return result;
};
```
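As a side note, the same algorithm can be written more compactly with `Object.entries`, which yields the `[key, value]` pairs directly (the `topKFrequentCompact` name is just for illustration; keys come back as strings, hence the `Number(...)` conversion at the end):

```javascript
// Same frequency-count-then-sort algorithm, in compact form.
const topKFrequentCompact = (nums, k) => {
  const counts = {};
  for (const num of nums) counts[num] = (counts[num] || 0) + 1;
  return Object.entries(counts)
    .sort((a, b) => b[1] - a[1]) // descending by frequency
    .slice(0, k)                 // keep the top k pairs
    .map(([num]) => Number(num)); // keep only the element, as a number
};
```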
Detailed Code Breakdown #
Step 1: Building the Frequency Map #
```javascript
for (let num of nums) {
  counts[num] = (counts[num] || 0) + 1;
}
```
This creates an object where each unique number maps to its frequency. The `(counts[num] || 0)` is a clever pattern:
- If `counts[num]` doesn’t exist (first time seeing this number), it’s `undefined`, which is falsy, so we use `0`
- If it does exist, we use the existing value
- Then we add 1 to increment the count
Example: For `[1,1,1,2,2,3]`, this produces:

```javascript
{
  1: 3,
  2: 2,
  3: 1
}
```
Step 2: Converting to Sortable Array #
```javascript
function sortBy(a, b) {
  return b[1] - a[1];
}
for (let i in counts) {
  sortable.push([Number(i), counts[i]]);
}
sortable.sort(sortBy);
```
Why convert to an array? Because we need to order entries by frequency. Modern JavaScript does specify an enumeration order for object properties (integer-like keys in ascending numeric order, then string keys in insertion order), but that order is determined by the keys, not the values, so it can’t give us “highest frequency first.” An array can be sorted by any criterion we choose.
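A quick demonstration of how object key enumeration behaves (integer-like keys come back in ascending numeric order, regardless of insertion order):

```javascript
// Keys were inserted as 3, 1, 2, but integer-like keys
// enumerate in ascending numeric order - ordered by key, not by value.
const freq = { 3: 1, 1: 3, 2: 2 };
const keys = Object.keys(freq); // ["1", "2", "3"]
```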
The `sortBy` function is a comparator for descending order:
- `b[1] - a[1]` compares the second element (the frequency) of each pair
- A positive result means `b` comes first (descending order)
- `a[1]` and `b[1]` access the frequency count in each `[element, count]` pair
Example: For our frequency map above, this produces:

```javascript
[
  [1, 3],
  [2, 2],
  [3, 1]
]
```
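If the string-key conversion feels awkward, a `Map` can do the same counting while preserving the numeric key type. A sketch of the same two steps:

```javascript
// Count with a Map: keys keep their original number type,
// so no string-to-number conversion is needed later.
const counts = new Map();
for (const num of [1, 1, 1, 2, 2, 3]) {
  counts.set(num, (counts.get(num) || 0) + 1);
}
// A Map is directly iterable as [key, value] entries.
const sortable = [...counts].sort((a, b) => b[1] - a[1]);
// sortable: [[1, 3], [2, 2], [3, 1]]
```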
Step 3: Extracting the Result #
```javascript
for (let i = 0; i < k; i++) {
  result.push(sortable[i][0]);
}
```
Since our array is sorted in descending order by frequency, the first k elements are our answer. We use `sortable[i][0]` to extract just the element (not the frequency count).
Example: For k=2, this gives us `[1, 2]`.
Time Complexity Analysis #
Let’s analyze each step:
- Building frequency map: O(n) - we iterate through all n elements once
- Converting to array: O(m) where m = number of unique elements
- Sorting: O(m log m) where m = number of unique elements
- Extracting result: O(k)
Total complexity: O(n + m log m)
In the worst case, where all elements are unique, m = n, giving us O(n log n). Typically, though, m ≪ n (far fewer unique elements than total elements), making this approach quite efficient in practice.
Important note: The problem states the complexity must be BETTER than O(n log n). While this solution works correctly, it doesn’t technically meet that constraint in the worst case.
Better Approach: Bucket Sort (O(n) solution) #
To truly satisfy the time complexity constraint, we can use bucket sort:
```javascript
const topKFrequent = (nums, k) => {
  // Step 1: Build frequency map - O(n)
  const counts = {};
  for (let num of nums) {
    counts[num] = (counts[num] || 0) + 1;
  }

  // Step 2: Create buckets - O(n)
  const buckets = Array(nums.length + 1).fill(null).map(() => []);
  for (let num in counts) {
    buckets[counts[num]].push(parseInt(num));
  }

  // Step 3: Collect top k from buckets - O(n)
  const result = [];
  for (let i = buckets.length - 1; i >= 0 && result.length < k; i--) {
    if (buckets[i].length > 0) {
      result.push(...buckets[i]);
    }
  }
  return result.slice(0, k);
};
```
Key insight: The maximum frequency any element can have is n (when all elements are the same). So we create n+1 buckets, where `buckets[i]` contains all elements with frequency i. Then we iterate from the highest-frequency bucket down, collecting elements until we have k items.
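To make that concrete, here is what the buckets look like for the example input, built with the same counting step as before:

```javascript
const nums = [1, 1, 1, 2, 2, 3];
const counts = {};
for (const num of nums) counts[num] = (counts[num] || 0) + 1;

// One bucket per possible frequency, from 0 up to n.
const buckets = Array.from({ length: nums.length + 1 }, () => []);
for (const num in counts) buckets[counts[num]].push(Number(num));
// buckets: [[], [3], [2], [1], [], [], []]
// i.e. 3 occurs once, 2 occurs twice, 1 occurs three times.
```

Scanning from the top, bucket 3 yields `1` and bucket 2 yields `2`, giving `[1, 2]` for k=2.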
This achieves true O(n) time complexity and satisfies the problem’s constraint!
Conclusion #
The provided solution uses a frequency counting and sorting approach that’s intuitive and practical, achieving O(n + m log m) complexity. While this doesn’t strictly meet the “better than O(n log n)” constraint when all elements are unique, it performs very well in typical cases.
For a true O(n) solution that meets all constraints, the bucket sort approach is optimal. Both solutions demonstrate important algorithmic concepts: hash maps for counting, sorting with custom comparators, and the clever use of bucket sort when the range of values is known and limited.