“Although the basic idea of binary search is comparatively straightforward, the details can be surprisingly tricky”. This statement is famously attributed to Donald Knuth, and I’ve recently learned the hard way how right he was.

Consider the following problem:

You’re given an inequality |A-X| < P – where A and P are some integer constants. Find how many integers satisfy that inequality. For simplicity let’s limit ourselves to X > – 2^31 and X < 2^31-1 Simple enough, we can divide the function |A-X| to two sections (- 2^31,A> where the function is strictly decreasing and <A,2^31-1) where it’s strictly increasing.

This can be solved easily analytically – X has to be between P-A and P+A, but let’s think about a generic monotonic inequality that can be specified on runtime.

To solve this more general problem,we only need to do a binary search for the minimum X satisfying inequality in the left section and maximum X satisfying inequality in the right one.

With the typical ‘wikipedia’ implementation of binary search by default we won’t do either (it assumes decreasing order, so it won’t do for the second section, and it returns invalid value if there isn’t an exact edge-value satisfying the inequality).

So there are two major considerations for our search function:

  • increasing / decreasing order of our search domain
  • edge values – should the function return the last value that satisfies conditions, the first one that doesn’t or some value indicating error.

For our example we need both increasing / decreasing order of search and last value that satisfies the conditions. Please note that in this case, we’re not really looking for a particular value (we don’t know what that value is). So, how can we achieve it? For example like this:

 

First a couple of explanations:

BinarySearchOrder is just an enum of Increasing/Decreasing (and is not that necessary as we’ll see later on).  The ISatisfactionValidator is an interface with a single method isValid which takes an index (an X if you will) and returns true if the value produced for that X (whatever that would be) is still satisfies the conditions (whatever they may be) or not. This can easily be adapted to a single element search as we’ll see later on.

A couple of interesting points here:

  • long conversion for the calculation of middle point – this can be easily overlooked with disastrous consequences. M = L + (R-L)/2 – consider R = MAX and L = MIN (negative) – the R-L is way beyond 32-bits.
  • middle point selection depending on ordering – in case of increasing sequences we’re modifying the right edge so we need to make sure that whenever middle point is between two values, we choose the right one (ceil instead of floor)
  • finally – XOR for validity check is just to mirror the search pattern in case of increasing sequence.
  • One more note here – in case of multiple elements with the same value, in the decreasing sequence the algorithm will produce the first index (from the left) and in the increasing the last (first from right) with given value.

That is all well and good, but in this form, the algorithm deals nicely with monotonic functions and we usually deal with arrays. We can easily modify it to suit our needs:

Nothing really tricky here, we just determine the order and write our Stisfaction validator in such a way that it would be usable by the algorithm. The decision to throw an exception when an element is not found may in some cases not be the best, espectially in some performance critical code, but it can easy be modified.

Finally – just noticed right now that in generic search with all elements the same, a Decreasing option will be chosen (and therefore first element returned). It would be more appropriate to name the enum NonIncreasing (or modify the selection and choose Decreasing/NonDecreasing set)