Intersection of Two Arrays: A Comprehensive Guide

Problem Statement #

Given two integer arrays, write a function to compute their intersection. The intersection should include all elements that appear in both arrays, including duplicates.

Example #

Input:  nums1 = [1,2,2,1], nums2 = [2,2]
Output: [2,2]

Note: Each element in the result should appear as many times as it shows in both arrays. The result can be in any order.

Understanding the Problem #

The intersection of two arrays refers to finding common elements that exist in both arrays. This concept comes from set theory in mathematics, where the intersection of two sets contains only elements that belong to both sets. However, unlike mathematical sets where each element is unique, arrays can contain duplicate values, and we need to preserve these duplicates in our result based on their frequency in both arrays.

For instance, if nums1 = [1,2,2,1] and nums2 = [2,2], the element 2 appears twice in both arrays, so our output should be [2,2].

Solution Approaches #

Approach 1: Brute Force (Nested Loop) #

The most straightforward approach would be to use a nested loop structure. For every element in the first array, we check if it exists in the second array. When we find a match, we add it to our result and mark that element as used to handle duplicates correctly.

Time Complexity: O(N × M) where N and M are the lengths of the two arrays.

Space Complexity: O(1) excluding the output array.

Analysis: While this approach is intuitive, it’s inefficient for large datasets because for each element in the first array, we potentially scan the entire second array.

Approach 2: Binary Search Optimization #

We can improve upon the brute force method by sorting the second array first. This allows us to use binary search instead of linear search when looking for elements from the first array.

Time Complexity: O(M log M + N log M) where sorting takes O(M log M) and searching N elements takes O(N log M).

Space Complexity: O(1) excluding the output array.

Analysis: This is better than brute force but still not optimal. The sorting overhead and repeated binary searches add unnecessary complexity.

Approach 3: Two-Pointer Method (Optimal) #

The most efficient approach is to sort both arrays and then use a two-pointer technique. We maintain pointers for both arrays and move them based on the comparison of elements at each position.

Algorithm:

Sort both arrays in ascending order
Initialize two pointers, one for each array, starting at index 0
Compare elements at both pointers:
- If they’re equal, add to result and move both pointers forward
- If element in first array is smaller, move its pointer forward
- If element in second array is smaller, move its pointer forward
Continue until one array is exhausted

Time Complexity: O(N log N + M log M) for sorting, plus O(N + M) for the traversal = O(N log N + M log M)

Space Complexity: O(1) excluding the output array (or O(log N + log M) if we count the sorting stack space).

Analysis: This is the optimal solution because we only traverse each array once after sorting. The sorting step dominates the time complexity, but the linear traversal ensures we don’t waste time on unnecessary comparisons.

Implementation #

Here’s the optimized two-pointer solution implemented in JavaScript:

var intersect = function(nums1, nums2) {
    // Custom comparison function for sorting in ascending order
    // While JavaScript's default sort works for numbers, 
    // explicitly defining the comparator ensures consistent behavior
    function comparator(a, b) {
        if (a < b) {
            return -1;
        }
        else if (a > b) {
            return 1;
        }
        return 0;
    }
    
    // Sort both arrays in ascending order
    nums1 = nums1.sort(comparator);
    nums2 = nums2.sort(comparator);
    
    let result = [];
    let i = 0; // pointer for nums1
    let j = 0; // pointer for nums2
    
    // Traverse both arrays using two pointers
    while (i < nums1.length && j < nums2.length) {
        if (nums1[i] === nums2[j]) {
            // Found a common element
            result.push(nums1[i]);
            i++;
            j++;
        }
        else if (nums1[i] < nums2[j]) {
            // Element in nums1 is smaller, move to next element in nums1
            i++;
        }
        else {
            // Element in nums2 is smaller, move to next element in nums2
            j++;
        }
    }
    
    return result;
};

Code Walkthrough #

Let’s trace through an example to see how this works:

Input: nums1 = [4,9,5], nums2 = [9,4,9,8,4]

Step 1: Sort both arrays
nums1 = [4,5,9]
nums2 = [4,4,8,9,9]

Step 2: Initialize pointers
i = 0 (pointing to 4 in nums1)
j = 0 (pointing to 4 in nums2)

Step 3: Traverse
- nums1[0] === nums2[0]: 4 === 4 → add 4 to result, i=1, j=1
- nums1[1] vs nums2[1]: 5 < 4 is false, 5 > 4 → j=2
- nums1[1] vs nums2[2]: 5 < 8 → i=2
- nums1[2] vs nums2[2]: 9 > 8 → j=3
- nums1[2] === nums2[3]: 9 === 9 → add 9 to result, i=3, j=4
- i reached end of nums1, stop

Result: [4,9]

JavaScript Sorting Considerations #

Important Note about JavaScript’s Sort Method:

JavaScript’s Array.sort() method has some quirks that are important to understand:

Default Behavior: Without a comparator function, sort() converts elements to strings and sorts them lexicographically. This means [1, 2, 10, 21] becomes [1, 10, 2, 21] which is usually not what we want for numbers.
Custom Comparator: By providing a comparator function, we ensure numerical sorting:
- Return negative value if a should come before b
- Return positive value if a should come after b
- Return 0 if they’re equal
In-Place Sorting: The sort() method sorts the array in place and returns a reference to the same array. In the code above, nums1 = nums1.sort(comparator) is technically redundant but makes the intent clear.

Alternative Approach: Hash Map #

For scenarios where one array is significantly smaller than the other, or when you can’t modify the input arrays, a hash map approach might be more suitable:

var intersect = function(nums1, nums2) {
    // Build frequency map for nums1
    const map = new Map();
    for (let num of nums1) {
        map.set(num, (map.get(num) || 0) + 1);
    }
    
    // Find intersection by checking nums2 against the map
    const result = [];
    for (let num of nums2) {
        if (map.has(num) && map.get(num) > 0) {
            result.push(num);
            map.set(num, map.get(num) - 1);
        }
    }
    
    return result;
};

Time Complexity: O(N + M)

Space Complexity: O(min(N, M)) for the hash map

This approach is optimal when you cannot modify the input arrays or when one array is much smaller than the other.

Follow-up Questions #

What if the given array is already sorted?
- Skip the sorting step and directly apply the two-pointer technique, reducing time complexity to O(N + M).
What if nums1’s size is small compared to nums2’s size?
- Use the hash map approach, building the map from the smaller array to minimize space usage.
What if elements of nums2 are stored on disk and memory is limited?
- Sort nums2 on disk using external sorting, then use the two-pointer approach by loading chunks of nums2 into memory.
- Alternatively, if nums1 fits in memory, use the hash map approach and stream nums2 from disk.

Conclusion #

The intersection of two arrays problem demonstrates the importance of choosing the right algorithm based on constraints. While the two-pointer method after sorting provides an optimal general solution with O(N log N + M log M) time complexity, understanding alternative approaches like hash maps or binary search allows you to optimize for specific scenarios. Always consider the characteristics of your input data—whether arrays are pre-sorted, their relative sizes, and memory constraints—to select the most appropriate solution.

Complexity Summary #

Approach	Time Complexity	Space Complexity	Best Use Case
Brute Force	O(N × M)	O(1)	Very small arrays
Binary Search	O(N log M + M log M)	O(1)	When one array is small
Two Pointers	O(N log N + M log M)	O(1)	General purpose, large arrays
Hash Map	O(N + M)	O(min(N,M))	Cannot modify arrays, one is smaller