Shell Sort

The program from the test is a sort called Shell Sort (named after Donald Shell, who invented it in 1959). It is the most efficient of the O(n^2) algorithms (which include insertion sort, bubble sort, and selection sort).

The following explanation is taken from the web page: http://linux.wku.edu/~lamonml/algor/sort/shell.html

The shell sort is a "diminishing increment sort", better known as a "comb sort" to the unwashed programming masses. The algorithm makes multiple passes through the list, and each time sorts a number of equally sized sets using the insertion sort. The size of the set to be sorted gets larger with each pass through the list, until the set consists of the entire list. (Note that as the size of the set increases, the number of sets to be sorted decreases.) This sets the insertion sort up for an almost-best case run each iteration with a complexity that approaches O(n).

The items contained in each set are not contiguous - rather, if there are i sets then a set is composed of every i-th element. For example, if there are 3 sets then the first set would contain the elements located at positions 1, 4, 7 and so on. The second set would contain the elements located at positions 2, 5, 8, and so on; while the third set would contain the items located at positions 3, 6, 9, and so on.
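
The grouping described above can be sketched in a few lines of C. This is only an illustration: the helper name set_positions and its parameters are hypothetical and do not appear in the exam program. It uses the same 1-based positions as the paragraph above.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the exam program.
   Writes the 1-based positions that make up set number `set`
   (1 <= set <= num_sets) of an n-item list into out[] and returns
   how many positions were written. out[] needs room for
   n / num_sets + 1 entries. */
size_t set_positions(size_t n, size_t num_sets, size_t set, size_t out[])
{
    size_t count = 0;
    /* Start at the set's first position and step by the number of sets. */
    for (size_t pos = set; pos <= n; pos += num_sets)
        out[count++] = pos;
    return count;
}
```

With n = 9 and 3 sets, set 1 yields positions 1, 4, 7 and set 3 yields positions 3, 6, 9, matching the example in the text.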

void myProg(int numbers[], int array_size)
{
  int i, j, increment, temp;

  increment = 3;
  while (increment > 0)
  {
    for (i=0; i < array_size; i++)
    {
      j = i;
      temp = numbers[i];
      while ((j >= increment) && (numbers[j-increment] > temp))
      {
        numbers[j] = numbers[j - increment];
        j = j - increment;
      }
      numbers[j] = temp;
    }
    if (increment/2 != 0)
      increment = increment/2;
    else if (increment == 1)
      increment = 0;
    else
      increment = 1;
  }
}
a) Explain what the algorithm does. Explain how it does it, using examples and explaining the examples in good English. Most of the points for this problem will be in the explanation. (30 pts)

The algorithm sorts the array into ascending order. It does this by first sorting sets of items chosen by modular equality (p % increment = q % increment), that is, by "skipping" through the array in steps of increment. The increment, or space between items in a set, starts at 3 for the first pass and then goes down to 1. The second pass is an ordinary insertion sort, but because each set is already in order, an item never has to move past the earlier members of its own set. This makes what is essentially the insertion sort algorithm more efficient.
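
As a concrete example of the two passes, the sketch below pulls the inner loops of myProg out into a function that runs a single pass for a given increment. The name gap_pass is hypothetical and is used only for this illustration; the loop body is the same as in the exam program.

```c
/* One pass of the inner loops of myProg for a single increment.
   `gap_pass` is a hypothetical name used only for this illustration. */
void gap_pass(int numbers[], int array_size, int increment)
{
    for (int i = 0; i < array_size; i++) {
        int j = i;
        int temp = numbers[i];
        /* Slide larger members of the same set up by one step. */
        while (j >= increment && numbers[j - increment] > temp) {
            numbers[j] = numbers[j - increment];
            j -= increment;
        }
        numbers[j] = temp;
    }
}

/* Worked example on nine items:
     start:               9 8 7 6 5 4 3 2 1
     after increment = 3: 3 2 1 6 5 4 9 8 7
     after increment = 1: 1 2 3 4 5 6 7 8 9 */
```

In the first pass the three sets (9 6 3), (8 5 2) and (7 4 1) are each sorted in place, giving (3 6 9), (2 5 8) and (1 4 7). The second pass then finishes the job as a plain insertion sort on an array that is already close to sorted.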

b) What is the complexity of this algorithm? Show the derivation of the complexity and explain the steps in good English. Most of the points in this problem will be in the derivation and explanation. (30 pts)

To figure the complexity of the program, first notice that the outer while loop will run only twice, once when increment is 3 and once when increment is 1. So, we have

2 * (the complexity of the inner loops)

The for loop will run n times (i = 0 to n-1). But we know that the inner while loop will run a different number of times depending on which increment we are on. So we have a complexity of:

   n * (number of times inner while runs when increment = 3)
 + n * (number of times inner while runs when increment = 1)

It is useful to note that without the pass where increment = 3, the pass where increment = 1 is just an insertion sort, with complexity O(n^2).

The maximum number of operations in the inner while loop when increment = 3 is

n/3

since any one item can at most skip down past all the elements in its set.
The first three items have nothing before them in their sets, so a bound on the number of times the inner loop will run when increment = 3 is

 (n-3) * n/3
= n^2/3 - n

We can put a tighter bound on the complexity by noting that the checks performed at each position in the array involve only the items before it in its set. This is just like the case in insertion sort, but here it is confined to the item's set. So rather than using

n/3

for the number of moves that each item can make, we sum up the maximum possible moves for all the items in a set. Here, the total is

3 * (the number of operations in the pass through one set)

That number is:

3 * (((n/3 + 1)^2) / 2)
= n^2/6 + n + 3/2

When increment = 1, the for loop does useful work n-1 times (starting with the element at index 1). The element at index i is already in order with the earlier members of its set, which are 3 apart, so it never moves past them. There are about i/3 of those, so the element moves at most about 2i/3 places. For example, in 1, 100, 200, 2, 101, 201, 3 every set is already sorted, yet the 3 still has to move 4 places. Summing over all the elements, we get at most about

(2/3) * (n-1)*n/2
= n^2/3 - n/3

When we combine both passes, we get

n^2/3 - n + n^2/3 - n/3
= 2n^2/3 - 4n/3

or, if we use the tighter bound for the increment = 3 pass, we get

n^2/6 + n + 3/2 + n^2/3 - n/3
= n^2/2 + 2n/3 + 3/2

Either way, the complexity is O(n^2).
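
To compare these operation counts with an actual run, a sketch like the following tallies how often the body of the inner while loop executes during one pass. The function name count_shifts is hypothetical; the loops are the same as in myProg, with a counter added.

```c
/* Hypothetical instrumented pass: the same loops as myProg for one
   increment, but returns how many times the inner while body ran. */
long count_shifts(int numbers[], int n, int increment)
{
    long shifts = 0;
    for (int i = 0; i < n; i++) {
        int j = i;
        int temp = numbers[i];
        while (j >= increment && numbers[j - increment] > temp) {
            numbers[j] = numbers[j - increment];
            j -= increment;
            shifts++;               /* one element moved up by one step */
        }
        numbers[j] = temp;
    }
    return shifts;
}
```

For nine items in reverse order, the increment = 3 pass makes 9 shifts, which is within the (n-3) * n/3 = 18 bound, and the following increment = 1 pass makes 9 more. A plain insertion sort (increment = 1 only) on the same input makes 36.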

Loop invariant

For the inner while loop:

for 0 <= j < p < q <= i
and (p mod increment) = (q mod increment)
  numbers[p] <= numbers[q]

English:
  All numbers in each set between j and i are sorted within
  that set.

For the for loop:

for 0 <= p < q < i
and (p mod increment) = (q mod increment)
  numbers[p] <= numbers[q]

English: All numbers before i have been sorted within their sets.
  Thus, when i = array_size, all the sets are sorted within themselves.

For the outer while loop (at the end of each pass):

for 0 <= p < q < array_size
and (p mod increment) = (q mod increment)
  numbers[p] <= numbers[q]

English:  All numbers in the array have been sorted within their
  sets.  Thus, when increment = 1, the whole array is within one set,
  so the whole array is sorted.
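
The outer-loop invariant can be checked mechanically after each pass. The sketch below is one way to do that; the function name is_gap_sorted is hypothetical. It compares each element only with the next member of its set, which is enough because ordering within a set is transitive, and it uses a non-strict comparison so that arrays with duplicate values also pass.

```c
/* Hypothetical checker for the outer-loop invariant.
   Returns 1 if every set (elements `increment` apart) is in
   non-decreasing order, 0 otherwise. Assumes increment >= 1. */
int is_gap_sorted(const int numbers[], int array_size, int increment)
{
    for (int p = 0; p + increment < array_size; p++) {
        /* Neighbouring members of the same set must be in order. */
        if (numbers[p] > numbers[p + increment])
            return 0;
    }
    return 1;
}
```

For instance, 3 2 1 6 5 4 9 8 7 satisfies the invariant for increment = 3 but not for increment = 1, while a fully sorted array satisfies it for every increment.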