Reference: LeetCode

Difficulty: Easy

## Problem

Given two arrays, write a function to compute their intersection.

**Note:**

- Each element in the result should appear as many times as it shows in both arrays.
- The result can be in any order.

**Example:**

1 | Input: nums1 = [1,2,2,1], nums2 = [2,2] |

**Follow up:**

- What if the given array is already sorted? How would you optimize your algorithm?

- Binary search
- Two pointers

- What if
`nums1`

‘s size is small compared to`nums2`

‘s size? Which algorithm is better? (Reference)

- The size of
`nums1`

is $M$ ($M < N$). - Two Pointers: The worst case runtime is $O(N)$.
- Binary Search: The worst case runtime is $O(M\log{N})$.
- So the binary search method is better.

- What if elements of
`nums2`

are stored on disk, and the memory is limited such that you cannot load all elements into the memory at once?

- If only
`nums2`

cannot fit in memory, put all elements of`nums1`

in a hash map. Read chunks of array that fit into the memory and calculate the intersections. It’s better to use the smaller array (`nums1`

) to construct the counter hash map. - If both
`nums1`

and`nums2`

are so large that neither fit into the memory, sort them individually, then read two elements from each array at a time and calculate their intersections.

## Analysis

### Hash Map

Let `nums2`

be the smaller array ($M > N$) and convert it to a hash map. We use this hash map to count the occurrence of elements.

**Note:**

1 | public int[] intersect(int[] nums1, int[] nums2) { |

**Time:** $O(M + N)$**Space:** $O(N)$ where $M > N$.

### Binary Search

Assume the arrays `nums1`

and `nums2`

are sorted, otherwise it takes $O(M\log{M})$ and $O(N\log{N})$ time to sort them.

Let the size of `nums1`

is smaller than `nums2`

($M < N$). For each element in `nums1`

, it takes $O(M)$ and for each step we do a lower bound binary search in `nums2`

so it takes $O(\log{N})$. In total, the runtime should be $O(M\log{N})$ if the arrays are pre-sorted.

**Note:**

- Write an extra boundary checking function since we check boundary usually.
- Consider the following cases:
- Case 1:
- nums1:
`[1 1 1 2 3]`

- nums2:
`[1 2 3 4 5 6 7]`

- nums1:
- Case 2:
- nums1:
`[1 2 3]`

- nums2:
`[1 1 1 2 3 4]`

- nums1:
- Case 3:
- nums1:
`[1 2 3]`

- nums2:
`[1 2 3]`

- nums1:

- Case 1:
- So for each iteration in
`nums1`

we need to do the following checking:- Continue adding to result list when
`nums1[idx1] == nums2[idx2]`

. - In each iteration we are inspecting
`curr`

in`nums1`

. As`idx1`

increases, we make sure`nums1[idx1] == curr`

. (Case 3)

- Continue adding to result list when
- When
`nums1[idx1] != nums2[idx2]`

, we need to update`idx1`

until`nums1[idx1] != curr`

since we have done with`curr`

. (Case 1)

1 | public int[] intersect(int[] nums1, int[] nums2) { |

**Time:** $O(M\log{N})$**Space:** $O(1)$

### Two Pointers

Based on the solution in `349. Intersection of Two Arrays`

, we change the hash set to a list.

1 | public int[] intersect(int[] nums1, int[] nums2) { |

**Time:** $O(M)$ or $O(N)$**Space:** $O(1)$

As for the time complexity, consider:

- Case 1:
`[1 2 3 4 5 6 7 8]`

$O(M)$`[9 10]`

- Case 2:
`[1 2 3]`

$O(M)$`[4 5 6 7 8 9 10]`

So we say the time complexity is $O(M)$ `OR`

$O(N)$. In the worst case, it is $O(\text{max}(M, N))$