Top K Frequent Elements

Top K Frequent Elements #

Problem Statement #

Given a non-empty array of integers, return the k most frequent elements.

Constraints:

  • You may assume k is always valid, 1 ≤ k ≤ number of unique elements
  • Your algorithm’s time complexity must be better than O(n log n), where n is the array’s size
  • It’s guaranteed that the answer is unique, in other words the set of the top k frequent elements is unique
  • You can return the answer in any order

Example:

Input: nums = [1,1,1,2,2,3], k = 2
Output: [1,2]

Understanding the Constraints #

Let’s break down what these constraints mean for our solution:

1. K is always valid (1 ≤ k ≤ number of unique elements)

This means you won’t encounter impossible scenarios. For instance, if your input array is [1,1,2,2] (only 2 unique elements), you won’t be asked to return the top 3 most common elements. This eliminates the need for error checking around invalid k values.

2. Time complexity must be better than O(n log n)

This is a critical constraint that eliminates certain approaches. A simple “sort the entire array and count” approach would be O(n log n), which violates this requirement. We need to find a more efficient solution, though the provided code actually uses O(n log n) which technically doesn’t meet this constraint - we’ll discuss better approaches later.

3. The answer is guaranteed to be unique

This means you won’t have ties that make the answer ambiguous. For example, if you have [1,1,1,3,3,3,4,4,4] and k=2, this violates the constraint because all three numbers appear equally (3 times each). The problem guarantees this won’t happen, so we don’t need to handle tie-breaking logic.

4. You can return the answer in any order

The output doesn’t need to be sorted - [1,2] and [2,1] are both valid answers for the example above.

Initial Approach Analysis #

Sorting Approach (Doesn’t work for this problem) #

One might initially think: “What if I sort the array and count occurrences?” For example, if you have [1,1,1,2,2,3], after sorting you’d get [1,1,1,2,2,3], and you could iterate through counting consecutive elements.

However, this approach has problems:

  1. Sorting alone is O(n log n), which violates our time constraint
  2. While it works for finding THE most common element, it becomes complicated for finding the k most common elements
  3. You’d still need additional data structures to track the top k elements

Frequency Count + Sort Approach (The provided solution) #

The solution provided takes a smarter approach:

  1. Count frequencies: Create a hash map (object) where keys are array elements and values are their frequencies
  2. Convert to sortable structure: Transform the object into an array of [element, frequency] pairs
  3. Sort by frequency: Sort this array by frequency in descending order
  4. Extract top k: Take the first k elements from the sorted array

Code Implementation #

const topKFrequent = (nums, k) => {
  const counts = {};
  const sortable = [];
  const result = [];
  
  // Step 1: Get counts - O(n)
  for (let num of nums) {
    counts[num] = (counts[num] || 0) + 1;
  }
  
  // Step 2: Convert to array and sort - O(m log m) where m = unique elements
  function sortBy(a, b) {
    return b[1] - a[1];
  }
  
  for (let i in counts) {
    sortable.push([i, counts[i]]);
  }
  sortable.sort(sortBy);

  // Step 3: Create result - O(k)
  for (let i = 0; i < k; i++) {
    result.push(sortable[i][0]);
  }

  return result;
};

Detailed Code Breakdown #

Step 1: Building the Frequency Map #

for (let num of nums) {
  counts[num] = (counts[num] || 0) + 1;
}

This creates an object where each unique number maps to its frequency. The (counts[num] || 0) is a clever pattern:

  • If counts[num] doesn’t exist (first time seeing this number), it’s undefined, which is falsy, so we use 0
  • If it does exist, we use the existing value
  • Then we add 1 to increment the count

Example: For [1,1,1,2,2,3], this produces:

{
  1: 3,
  2: 2,
  3: 1
}

Step 2: Converting to Sortable Array #

function sortBy(a, b) {
  return b[1] - a[1];
}

for (let i in counts) {
  sortable.push([i, counts[i]]);
}
sortable.sort(sortBy);

Why convert to an array? JavaScript objects don’t maintain a guaranteed order for their properties (though modern engines do preserve insertion order for string keys, it’s not officially guaranteed). Arrays, however, maintain order explicitly.

The sortBy function is a comparator for descending order:

  • b[1] - a[1] compares the second element (frequency) of each pair
  • Positive result means b comes first (descending order)
  • a[1] and b[1] access the frequency count in each [element, count] pair

Example: For our frequency map above, this produces:

[
  [1, 3],
  [2, 2],
  [3, 1]
]

Step 3: Extracting the Result #

for (let i = 0; i < k; i++) {
  result.push(sortable[i][0]);
}

Since our array is sorted in descending order by frequency, the first k elements are our answer. We use sortable[i][0] to extract just the element (not the frequency count).

Example: For k=2, this gives us [1, 2].

Time Complexity Analysis #

Let’s analyze each step:

  1. Building frequency map: O(n) - we iterate through all n elements once
  2. Converting to array: O(m) where m = number of unique elements
  3. Sorting: O(m log m) where m = number of unique elements
  4. Extracting result: O(k)

Total complexity: O(n + m log m)

In the worst case where all elements are unique, m = n, giving us O(n log n). However, note that typically m « n (much fewer unique elements than total elements), making this approach quite efficient in practice.

Important note: The problem states the complexity must be BETTER than O(n log n). While this solution works correctly, it doesn’t technically meet that constraint in the worst case.

Better Approach: Bucket Sort (O(n) solution) #

To truly satisfy the time complexity constraint, we can use bucket sort:

const topKFrequent = (nums, k) => {
  // Step 1: Build frequency map - O(n)
  const counts = {};
  for (let num of nums) {
    counts[num] = (counts[num] || 0) + 1;
  }
  
  // Step 2: Create buckets - O(n)
  const buckets = Array(nums.length + 1).fill(null).map(() => []);
  for (let num in counts) {
    buckets[counts[num]].push(parseInt(num));
  }
  
  // Step 3: Collect top k from buckets - O(n)
  const result = [];
  for (let i = buckets.length - 1; i >= 0 && result.length < k; i--) {
    if (buckets[i].length > 0) {
      result.push(...buckets[i]);
    }
  }
  
  return result.slice(0, k);
};

Key insight: The maximum frequency any element can have is n (all elements are the same). So we create n+1 buckets where bucket[i] contains all elements with frequency i. Then we iterate from the highest frequency bucket down, collecting elements until we have k items.

This achieves true O(n) time complexity and satisfies the problem’s constraint!

Conclusion #

The provided solution uses a frequency counting and sorting approach that’s intuitive and practical, achieving O(n + m log m) complexity. While this doesn’t strictly meet the “better than O(n log n)” constraint when all elements are unique, it performs very well in typical cases.

For a true O(n) solution that meets all constraints, the bucket sort approach is optimal. Both solutions demonstrate important algorithmic concepts: hash maps for counting, sorting with custom comparators, and the clever use of bucket sort when the range of values is known and limited.