Contains Duplicate

Problem Statement #

Given an array of integers, find if the array contains any duplicates. Your function should return true if any value appears at least twice in the array, and it should return false if every element is distinct.

Examples #

Example 1:

Input: [1,2,3,1]
Output: true
Explanation: The value 1 appears twice in the array.

Example 2:

Input: [1,2,3,4]
Output: false
Explanation: All elements are distinct.

Example 3:

Input: [1,1,1,3,3,4,3,2,4,2]
Output: true
Explanation: Multiple values appear more than once.

Solution #

Approach: Hash Set/Object Tracking #

The most efficient approach to solve this problem is to use a hash-based data structure to track elements we’ve already encountered. As we iterate through the array, we check if each element has been seen before. If it has, we immediately return true. If we complete the iteration without finding duplicates, we return false.

This approach leverages the O(1) average-case lookup time of hash-based structures, making it significantly more efficient than comparing every element with every other element (which would be O(n²)).

JavaScript Implementation #

var containsDuplicate = function(nums) {
    let seen = {};
    for (let i = 0; i < nums.length; i++) {
        if (seen[nums[i]]) {
            return true;
        }
        seen[nums[i]] = 1;
    }
    return false;
};

Code Breakdown #

Let’s analyze each component of the solution:

1. Initialize the tracking object:

let seen = {};

We create an empty object to store the elements we’ve encountered. This serves as our hash set.

2. Iterate through the array:

for (let i = 0; i < nums.length; i++) {

We use a standard for loop to examine each element in the input array.

3. Check for duplicates:

if (seen[nums[i]]) {
    return true;
}

For each element, we check if it exists as a key in our seen object. If it does, we’ve found a duplicate and can immediately return true. This early return optimization prevents unnecessary iterations once a duplicate is found.

4. Mark element as seen:

seen[nums[i]] = 1;

If the element hasn’t been seen before, we add it to our seen object with any truthy value (in this case, 1). The specific value doesn’t matter—we only care about the presence of the key.

5. Return false if no duplicates found:

return false;

If we complete the entire loop without finding any duplicates, we return false.

Alternative Implementation: Using Set #

JavaScript’s Set data structure provides a more idiomatic solution:

var containsDuplicate = function(nums) {
    return new Set(nums).size !== nums.length;
};

This one-liner works because a Set automatically removes duplicates. If the size of the set is less than the length of the original array, duplicates must have existed.

Alternative Implementation: Using Map #

We could also use JavaScript’s Map for more explicit tracking:

var containsDuplicate = function(nums) {
    let seen = new Map();
    for (let num of nums) {
        if (seen.has(num)) {
            return true;
        }
        seen.set(num, true);
    }
    return false;
};

Complexity Analysis #

Time Complexity: O(n) #

We iterate through the array once, and each lookup/insertion operation in the hash table takes O(1) average time. Therefore, the overall time complexity is linear relative to the input size.

Space Complexity: O(n) #

In the worst case (when all elements are unique), we store every element in our hash structure, resulting in O(n) space complexity.

Why This Approach? #

Why use an object/Map instead of an array?

While we could theoretically use an array with array indices corresponding to the values in nums, this approach has significant drawbacks:

  1. Sparse arrays: If the input contains values like [1, 1000000], we’d need to allocate an array with 1,000,000 slots, wasting enormous amounts of memory for just two values.

  2. Negative numbers: Arrays can’t have negative indices in JavaScript, so we’d need additional logic to handle negative values.

  3. Non-integer values: If the problem were extended to handle non-integer types, an array-based approach wouldn’t work at all.

Hash-based structures (objects, Maps, Sets) elegantly handle all these cases with consistent O(1) performance and minimal memory overhead.

Edge Cases #

The solution correctly handles several edge cases:

  • Empty array: Returns false (no duplicates in an empty array)
  • Single element: Returns false (one element can’t be a duplicate)
  • All duplicates: Returns true on the second element
  • Negative numbers: Works correctly due to hash-based storage
  • Zero: Handled like any other number

Conclusion #

The hash set approach provides an optimal solution to the contains duplicate problem, balancing time and space efficiency. While the Set-based one-liner is more concise, the explicit loop implementation offers better clarity for understanding the underlying algorithm and can be more easily adapted if we need to track additional information about the duplicates (such as their positions or frequency).