Dmytro Chyzhykov's Blog

Yet another programmer.

First Occurrence Binary Search

How to implement Binary Search algorithm which returns index of the first occurrence of a given value in a sorted array with repeated values.

Classic Binary Search algorithm

The well-known classic iterative binary search algorithm from the standard JDK works fine for sorted arrays with unique values. Here is its possible implementation:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
private int binarySearch(int[] source, int needle) {
    int low = 0;
    int high = source.length - 1;

    while (low <= high) {
        int middle = low + ((high - low) >>> 1);

        if (source[middle] == needle) {
            return middle;  // key found
        } else if (source[middle] < needle) {
            low = middle + 1;
        } else {
            high = middle - 1;
        }
    }

    return -(low + 1);      // key not found
}

Problem

When we feed the algorithm a sorted array with non-unique values, it not necessary returns us an index of the first occurrence of a given key, but one from a range of possible indexes. For instance, the next code returns us 1.

1
2
int[] array = { 1, 1, 1, 1, 2, 2, 3, 3, 3 };
binarySearch(array, 1);  // 1

How can we get the first occurrence of a given key without impact on performance?

Solution

Our modified binary search algorithm has got a few simple modifications which apply when we found a given key: we remember found index, decrease the high bound and search again while we have where to search. Finally, if any occurrence found we return one, or return default binary algorithm not found message value, otherwise. Algorithm’s complexity is O(log n).

Here is its possible implementation:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
private int firstOccurrenceBinarySearch(int[] source, int needle) {
    int low = 0;
    int high = source.length - 1;
    int firstOccurrence = Integer.MIN_VALUE;

    while (low <= high) {
        int middle = low + ((high - low) >>> 1);

        if (source[middle] == needle) {
            // key found and we want to search an earlier occurrence
            firstOccurrence = middle;
            high = middle - 1;
        } else if (source[middle] < needle) {
            low = middle + 1;
        } else {
            high = middle - 1;
        }
    }

    if (firstOccurrence != Integer.MIN_VALUE) {
        return firstOccurrence;
    }

    return -(low + 1);  // key not found
}

and with unit test:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
package com.ffbit.fun.array;

import static org.hamcrest.CoreMatchers.is;
import static org.junit.Assert.assertThat;

import java.util.Arrays;
import java.util.Collection;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

@RunWith(Parameterized.class)
public class FirstOccurrenceBinarySearchTest {
    private int[] array;
    private int key;
    private int expectedIndex;

    public FirstOccurrenceBinarySearchTest(int[] array, int key, int expectedIndex) {
        this.array = array;
        this.key = key;
        this.expectedIndex = expectedIndex;
    }

    @Parameters
    public static Collection<Object[]> init() {
        return Arrays.asList(new Object[][] {
                { new int[] {}, 0, -1 },

                { new int[] { 1 }, 0, -1 },
                { new int[] { 1 }, 1, 0 },
                { new int[] { 1 }, 2, -2 },

                { new int[] { 1, 2 }, 0, -1 },
                { new int[] { 1, 2 }, 1, 0 },
                { new int[] { 1, 2 }, 2, 1 },
                { new int[] { 1, 2 }, 3, -3 },

                { new int[] { 1, 1, 1, 1, 2, 2, 3, 3, 3 }, 1, 0 },
                { new int[] { 1, 1, 1, 1, 2, 2, 3, 3, 3 }, 2, 4 },
                { new int[] { 1, 1, 1, 1, 2, 2, 3, 3, 3 }, 3, 6 },
                { new int[] { 1, 1, 1, 1, 2, 2, 3, 3, 3 }, 4, -10 }
        });
    }

    @Test
    public void binarySearchTest() throws Exception {
        assertThat(firstOccurrenceBinarySearch(array, key), is(expectedIndex));
    }

    private int firstOccurrenceBinarySearch(int[] source, int needle) {
        int low = 0;
        int high = source.length - 1;
        int firstOccurrence = Integer.MIN_VALUE;

        while (low <= high) {
            int middle = low + ((high - low) >>> 1);

            if (source[middle] == needle) {
                // key found and we want to search an earlier occurrence
                firstOccurrence = middle;
                high = middle - 1;
            } else if (source[middle] < needle) {
                low = middle + 1;
            } else {
                high = middle - 1;
            }
        }

        if (firstOccurrence != Integer.MIN_VALUE) {
            return firstOccurrence;
        }

        return -(low + 1);  // key not found
    }

}