Case Studies: Analyzing Standalone Functions

We'll illustrate the techniques we've learned with some of the array manipulation functions that we introduced earlier. In a later lesson, we'll look at analysis of class member functions.

1. Ordered Insert

We start with our function for inserting an element into a sorted array.

• We start from the high end of the array and check to see if that's where we want to insert the data. If so, fine. If not, we move the preceding element up one and then check to see if we want to insert `x` in the hole left behind. We repeat this step as necessary.

• Run this algorithm until you are comfortable with your understanding of how it works.

• From the Algorithm menu, select Generate an array. Create an array with 5-10 elements.

• Again, from the Algorithm menu, select Insert in order to try the algorithm. Press the forward button to step through the execution.

• Try inserting values that should fall near the middle of the array.

• Try inserting values that are smaller than any already in the array.

• Try inserting values that are larger than any already in the array.
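For readers without the animation at hand, here is a minimal C++ sketch of the kind of function being described. The signature, the `int` element type, and the convention that the sorted data occupies positions `first` through `last-1` (with room for one more element at position `last`) are assumptions made for illustration, not necessarily the course's actual listing.

```cpp
// Hypothetical sketch of an ordered insert.
// Assumes arr[first..last-1] is sorted and arr[last] is available as spare room.
// Returns the position at which target was placed.
int orderedInsert(int arr[], int first, int last, int target)
{
    // Start at the high end of the used portion of the array.
    int toBeMoved = last - 1;

    // While the element at toBeMoved belongs after target, shift it up one
    // slot, leaving a hole behind. This repeats at most (last - first) times.
    while (toBeMoved >= first && target < arr[toBeMoved]) {
        arr[toBeMoved + 1] = arr[toBeMoved];
        --toBeMoved;
    }

    // Drop target into the hole.
    arr[toBeMoved + 1] = target;
    return toBeMoved + 1;
}
```

Inserting a value larger than everything already present fails the loop test immediately, while inserting a value smaller than everything present shifts every element.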

Then we can move on to the analysis.

1.1. Analysis of orderedInsert

We start our analysis with the easy stuff -- mark all of the non-compound statements.

Next, looking at the `while` loop, we see that its condition can be evaluated in O(1) time, and that the loop repeats at most last - first times.

1.2. The loop

The loop body is O(1), so by our special case rule for loops we evaluate the loop complexity as (last-first)*O(1 + 1) = O(last-first).

Replacing the loop by this quantity…

1.3. The main sequence

…, we get down to a statement sequence composed of O(1) and O(last-first) terms.

1.4. Finishing up

The O(last-first) term dominates the sum, so we conclude that the entire function is O(last-first).

Or, if you prefer, we could say the function runs in O(n) time where n is the number of items in the portion of the array being used (= last - first).

Note that, because none of the inputs to this function are actually named n, it is only proper to say the function is O(n) if we explicitly define n in terms of the function inputs.

1.5. Special Case Behavior

A special point worth noting:

• If we are adding a value `target` that is greater than all elements already in `arr`, this algorithm does 0 iterations of the loop.

• Suppose we are given a series of `target` values to insert into an initially empty array, and that these values are already sorted.

• Then each new value will be greater than all the ones already inserted into the array.

• Each call to `orderedInsert` will use 0 iterations.

• So each call runs in O(1) time (for this special case of inserting already-sorted elements).

• We'll make use of this special case a little later when we incorporate this function into more complex algorithms.
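The special case above can be checked with a small instrumented sketch. Both function names here are hypothetical, and the insert logic is an assumed implementation (sorted data in positions `first` through `last-1`), modified only to report how many times its loop ran.

```cpp
#include <vector>

// Hypothetical instrumented ordered insert: same shifting logic as a plain
// ordered insert, but it returns the number of loop iterations performed.
int orderedInsertCountingIterations(int arr[], int first, int last, int target)
{
    int iterations = 0;
    int toBeMoved = last - 1;
    while (toBeMoved >= first && target < arr[toBeMoved]) {
        arr[toBeMoved + 1] = arr[toBeMoved];
        --toBeMoved;
        ++iterations;
    }
    arr[toBeMoved + 1] = target;
    return iterations;
}

// Inserts the already-sorted values 0, 1, ..., count-1 into an initially
// empty array and returns the total number of loop iterations over all calls.
int totalIterationsForSortedInput(int count)
{
    std::vector<int> arr(count + 1);
    int total = 0;
    for (int size = 0; size < count; ++size) {
        // The new value (size) is larger than everything inserted so far.
        total += orderedInsertCountingIterations(arr.data(), 0, size, size);
    }
    return total;
}
```

Because every new value exceeds all earlier ones, the loop test fails on its first evaluation in every call, so the total stays at zero no matter how many values are inserted.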

2. Sequential Search

You may run this algorithm if you wish.

2.1. Analysis

Again, we can start the analysis by marking the easy stuff -- the non-compound statements.

Question: How should the statements marked ? on the right be labeled?

1. 1*

2. O(1)

3. =O(1)

4. n*

5. O(n)

6. =O(n)

7. none of the above

2.2. Analysis (cont.)

These are simple statements with no function calls, so they are O(1).

2.3. Loop Condition

Question: Looking at the loop condition, we see that it has what complexity?

1. 1*

2. O(1)

3. =O(1)

4. n*

5. O(n)

6. =O(n)

7. none of the above

2.4. Loop Condition (cont.)

The loop condition contains only simple integer operations and so is O(1).

We mark the simple-statement portion of the loop header.

Next, we ask how often the loop repeats. `i` starts at `first` and goes up to `last-1`, for a total of at most last-first iterations.

Question: How do we mark the loop to reflect this?

1. 1*

2. O(1)

3. =O(1)

4. n*

5. O(n)

6. =O(n)

7. none of the above

2.5. Marking the loop iterations

We mark it with `(last-first)*` to indicate that the loop repeats that many times.

If you answered n*, you had the appropriate programming intuition, but your mathematics is sloppy. There is no symbol n defined here, so it makes no sense to say something happens n times or is O(n). You might as well say it is O(ragnarok) for all the sense that makes.

We can, of course, introduce a symbol n if we give it a proper definition, as we did at the end of the previous analysis. But in the absence of any such explicit definition, we can't say n.

OK, let's go ahead and define n as the number of elements in the portion of the array being searched. Let

n = last - first

The loop condition and body are O(1), and the loop repeats up to n times.

2.6. Loop Complexity

Question: We conclude that the entire loop is what complexity?

1. 1*

2. O(1)

3. =O(1)

4. n*

5. O(n)

6. =O(n)

7. none of the above

2.7. Loop Complexity

The loop is n*O(1), which simplifies to O(n).

Question: How do we mark the loop to reflect this?

1. 1*

2. O(1)

3. =O(1)

4. n*

5. O(n)

6. =O(n)

7. none of the above

2.8. Marking the loop

We mark it as =O(n).

Because this is a conclusion about the total cost of a compound statement, the label begins with =.

2.9. Finishing up

…and then replace the entire loop by an O(n) marker.

Question: Summing up the terms in the remaining statement sequence, we see that the entire function is what?

1. 1*

2. O(1)

3. =O(1)

4. n*

5. O(n)

6. =O(n)

7. none of the above

2.10. Finishing up (cont.)

The entire function is O(n).
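To tie the markings together, here is a hedged sketch of what a sequential search over positions `first` through `last-1` might look like, with each statement annotated in the style used above. The signature and `int` element type are assumptions made for illustration; the course's own listing may differ in detail.

```cpp
// Hypothetical sketch of a sequential search over arr[first..last-1].
// Returns the position of target, or -1 if it is not present.
// Let n = last - first.
int seqSearch(const int arr[], int first, int last, int target)
{
    int i = first;                          // O(1)
    while (i < last && arr[i] != target) {  // condition: O(1), (last-first)* = n*
        ++i;                                // O(1)
    }                                       // whole loop: =O(n)
    if (i < last) {                         // O(1)
        return i;                           // O(1)
    }
    return -1;                              // O(1)
}                                           // whole function: O(1) + O(n) + O(1) = O(n)
```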

It's worth noting that both of the examples we've looked at yielded results that are unsurprising, in that they match what we would have guessed under the nested loops sanity check.

Our next example, however, will have a bit more of a surprise.

2.11. Ordered Sequential Search

When arrays have been sorted, they can be searched using a slightly faster variant of the sequential search.

The function shown here differs from `seqSearch` only in replacing the `!=` operator by a `<`. This causes us to exit the loop as soon as we see a value that is equal to or greater than the `target`. If `arr` is sorted, then encountering a value greater than the `target` implies that the `target` value is not actually in the array.

Without this assurance, we would need to search the entire array if the element being looked for was not present, which is what happens in `seqSearch`.

The worst case for the two algorithms is the same. We might guess, however, that the ordered search does better on average when applied to sorted data. Of course, if the data in the array isn't sorted, then we can't use ordered search at all.
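As a rough illustration of the variant just described, the sketch below assumes a sequential search whose loop test compares with `!=` and changes that comparison to `<`. The name `seqOrderedSearch`, the signature, and the `int` element type are hypothetical.

```cpp
// Hypothetical sketch of an ordered sequential search over arr[first..last-1].
// Assumes that portion of arr is sorted in ascending order.
// Returns the position of target, or -1 if it is not present.
int seqOrderedSearch(const int arr[], int first, int last, int target)
{
    int i = first;
    // Stop as soon as we reach a value equal to or greater than target.
    while (i < last && arr[i] < target) {
        ++i;
    }
    // We may have stopped on a larger value, so confirm an exact match.
    if (i < last && arr[i] == target) {
        return i;
    }
    return -1;
}
```

Note that the test after the loop has to check for equality explicitly, because leaving the loop early no longer guarantees that the target was found.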

3. Binary Search

Run this algorithm until you are familiar with how it works.

3.1. Starting the Analysis

Working from the inside out, we begin with the statements nested within the `if`.

The then parts and else parts are clearly O(1), as are the if conditions.

3.2. Inner if

So we can conclude that the inner if is O(1).

3.3. Outer if

We can then replace it by its marker.

Now it's obvious that the outer if is also O(1).

So we can replace it by its marker as well.

3.5. Loop body

The remaining statements in the loop body are also O(1).

3.6. Loop Body (cont.)

So the entire loop body is O(1).

3.7. Loop condition

Continuing to work from the inside-out, we note that the loop condition is O(1).

Now, how many times does this loop repeat?

3.8. Loop repetitions

How often does the loop repeat?

To answer this question let's go back to the original listing. Remember how this function actually works.

• `first` and `last` define our current search area. Initially, there are last-first items in this area.

• The two values first and last define our current search range. If the value we are looking for is somewhere in the array, it is going to be at a position `>= first` and `< last`. So the difference, `last-first`, defines how many values we have left in the search range.

Reducing the search range

There are `last-first` values left in the search range.

• Each time around the loop, we cut this area in half.

• We stop when the search area has been reduced to a single item.

• Let n denote the value of `last-first`.

How many times can we divide n things into 2 equal parts before getting down to only 1?

That's the interesting question!

Logarithmic Behavior

If I start with, let's say, N things in this array, how many times can I keep cutting that search area of N things in half until I get down to a single item? (An item that must be the value we are looking for, if that value is anywhere in the array at all.)

Let's assume for the sake of argument that N is a power of 2 to start with. We cut N in half and get N/2. The next time, we cut that in half, which makes N/4, then N/8, then N/16, and so on. The question is, how often can we keep doing that until we reduce the number down to just 1?

The answer may be a bit clearer if we turn the problem around. Start at 1 and keep doubling until we get to N: 1, 2, 4, … The number of doubling steps is the same as the number of steps we take when we start at N and keep dividing by 2.

How many steps is that? Well, what power of 2 do we reach when we finally get to N? Suppose we took k steps. Then we are saying that N = 2^k. Solving for k, we get k = log_2 N.
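The halving argument can be tried out directly with a short sketch. The helper name `halvingSteps` is hypothetical; it simply counts how many times a positive integer can be cut in half (using integer division) before reaching 1.

```cpp
// Hypothetical helper: counts how many times n can be halved before it
// reaches 1. Assumes n >= 1. When n is a power of 2, the result is exactly
// the exponent k with n == 2^k; otherwise it is that logarithm rounded down.
int halvingSteps(int n)
{
    int steps = 0;
    while (n > 1) {
        n /= 2;   // cut the remaining range in half
        ++steps;
    }
    return steps;
}
```

For example, 16 goes to 8, 4, 2, 1, which is 4 steps, and 1024 takes 10 steps. Doubling the size of the input adds only one more step.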

3.9. Loop complexity

So we mark the loop with `log(n)*`, where n is defined to be the initial value of `last-first`. The loop complexity is therefore (log n)*O(1) = O(log n).

3.10. Collapsing the loop

And we can replace the entire loop by O(log n).

3.11. Function body sequence

The remaining statements are all O(1).

3.12. Function body sequence (cont.)

And so the entire function is O(log n) where n is the initial value of `last-first`. [1]
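As a summary of the analysis, here is a hedged sketch of a binary search with the markings written in as comments. It assumes the search range is positions `>= first` and `< last`, as described earlier; the signature, the `int` element type, and the exact arrangement of the nested `if` statements are assumptions and may differ from the course's listing.

```cpp
// Hypothetical sketch of a binary search over the sorted range
// arr[first..last-1]. Returns the position of target, or -1 if absent.
// Let n be the initial value of last - first.
int binarySearch(const int arr[], int first, int last, int target)
{
    while (first < last) {                     // condition: O(1), log(n)*
        int mid = first + (last - first) / 2;  // O(1)
        if (arr[mid] == target) {              // outer if: =O(1)
            return mid;                        // O(1)
        } else {
            if (target < arr[mid]) {           // inner if: =O(1)
                last = mid;                    // O(1): keep the lower half
            } else {
                first = mid + 1;               // O(1): keep the upper half
            }
        }
    }                                          // whole loop: =O(log n)
    return -1;                                 // O(1)
}                                              // whole function: O(log n)
```

Each pass through the loop discards at least half of the remaining range, which is what bounds the number of iterations by roughly log n.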