Sorting is a very common and useful process. We sort names, salaries, movie grosses, Nielsen ratings, home runs, populations, book sales, to name a few. It is important to understand how sorting works and how it can be done efficiently. By default, we will consider sorting in increasing order:
For all indices i, j: if i < j, then A[i] <= A[j].
Note that we are allowing for duplicates here. For decreasing order, we simply change the right side to A[i] >= A[j].
The basic idea for insertion sort is to "remove" the items one at a time from the original array and "insert" them into a new array, putting them into the correct sorted order as you insert.
We could accomplish this by using two arrays (as implied above), but that would double our memory requirements. We haven't been worrying much about memory needs (we've been worrying about runtime), but it would be nice to not use more memory than needed. So, we'd rather be able to sort in place, allowing us to use only a constant amount of extra memory. Space needs can be analyzed similarly to runtime.
To implement this, we need to think of this one array being in two parts, a sorted part and an unsorted part:
SORTED | UNSORTED |
In each iteration of our outer loop, we will take an item out of the UNSORTED section and put it into its correct relative location in the SORTED section. How do we pick the item from the UNSORTED section? For insertion sort, on iteration i of the outer loop, we take element i in the array (ignoring sorted/unsorted) and insert it into the correct position in the sorted section of the array.
For example, start with this array:
| i    | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|------|----|----|----|----|----|----|----|----|
| A[i] | 40 | 70 | 20 | 30 | 50 | 10 | 80 | 60 |
For insertion sort, we start by sorting the element at i=0 into the correct location in the sorted array. Since the sorted array is empty and grows from the left, this first element is already in its (temporarily) sorted position.
| i    | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|------|----|----|----|----|----|----|----|----|
| A[i] | 40 | 70 | 20 | 30 | 50 | 10 | 80 | 60 |
We now move on to i=1, and sort A[1] into the sorted section of the array. Since 40 < 70, no values need to switch positions.
| i    | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|------|----|----|----|----|----|----|----|----|
| A[i] | 40 | 70 | 20 | 30 | 50 | 10 | 80 | 60 |
Next is i=2, and we sort A[2] into the sorted section of the array. Since 20 is smaller than all of the values in the sorted part of the array, it gets inserted at the front of the sorted section, giving:
| i    | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|------|----|----|----|----|----|----|----|----|
| A[i] | 20 | 40 | 70 | 30 | 50 | 10 | 80 | 60 |
Next is i=3, and we sort A[3] into the sorted section of the array, where it belongs between 20 and 40:
| i    | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|------|----|----|----|----|----|----|----|----|
| A[i] | 20 | 30 | 40 | 70 | 50 | 10 | 80 | 60 |
This continues until the entire array has been sorted (i.e. after we finish with i=7 in the example). Notice that the sorted section grows on each iteration and the unsorted section shrinks. For each iteration, we take the next value in the unsorted section and insert it into the sorted section.
Let's take a look at how this all actually works. The code presented in a previous version of the textbook is a bit wordy (the authors present it that way to be more readable). The basic idea, though, is that the initial method (insertionSort) has only the array and its length as parameters. It calls an overloaded version with start and end index values as parameters, which allows us to sort only part of the array if we want. Each iteration in this method brings one more item from the unsorted portion of the array into the sorted portion. It does this by calling another method to actually move the value into its correct spot. Values are shifted from left to right, leaving a "hole" in the spot where the item should be.
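A minimal sketch along those lines (a reconstruction of the structure just described, not the textbook's exact listing):

```java
// Sketch of the structure described above (not the textbook's exact code).
public static <T extends Comparable<? super T>> void insertionSort(T[] a, int n) {
    insertionSort(a, 0, n - 1);                 // sort the whole array
}

// Overloaded version: sorts a[first..last] only.
public static <T extends Comparable<? super T>> void insertionSort(T[] a, int first, int last) {
    // Each pass moves one more item from the unsorted part into the sorted part.
    for (int unsorted = first + 1; unsorted <= last; unsorted++) {
        insertInOrder(a[unsorted], a, first, unsorted - 1);
    }
}

// Inserts item into the sorted range a[begin..end], shifting larger values right.
private static <T extends Comparable<? super T>> void insertInOrder(T item, T[] a, int begin, int end) {
    int index = end;
    while (index >= begin && item.compareTo(a[index]) < 0) {
        a[index + 1] = a[index];                // shift right, leaving a "hole"
        index--;
    }
    a[index + 1] = item;                        // drop the item into the hole
}
```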
What does extends mean in the context of a generic type? What does super mean in the context of a generic type?

The insertion sort code above makes use of generics (how can you tell?), but its generic type looks like <T extends Comparable<? super T>>. What does that mean? Why is it needed instead of just <T>? In recitation 2, you were briefly introduced to bounded generics. Bounded generics are when you put constraints on what types are allowed for a generic.
- <T> - This allows any class.
- <T extends Class> - T can be any class, as long as it is either:
  - the Class class or a descendant of it, or
  - a class that implements the Class interface (note, in this case, Class is an interface and not a class).

  This places an upper bound on T, saying it can be any type, as long as it can look like Class.
- <? super T> - This is basically the reverse of the above. You're saying that ? must be T or an ancestor of T. Said another way, you've placed a lower bound on ?, allowing it to be any type as long as T can look like it. The ? is a wildcard placeholder: we don't care exactly what it is, just that it must be an ancestor of T.

Let's put all this together to make sense of that generic type.
- <? super T> tells us that ? must be T or an ancestor of T. That may not mean much to us yet, but let's look at the context.
- Comparable<? super T> tells us that the Comparable interface (which is a generic interface) compares ? things (and ? can be anything, as long as T can look like it).
- <T extends Comparable<? super T>> tells us that T must implement the Comparable interface, and that this Comparable is allowed to compare against T itself or against some ancestor of T.

So basically, T can be any type as long as it knows how to be compared to things that T can look like. Why not just say <T extends Comparable<T>>? Take a look at the NumericPoint class in recitation 2. Or, think about insertion sort. The type T we sort may inherit its compareTo method from a parent class, so it implements Comparable<Parent> rather than Comparable<T>. Such a type would be rejected by <T extends Comparable<T>>, even though its objects are perfectly comparable to one another. Only when saying "comparable with things that T can look like" will insertion sort be able to work with these different, but related, classes.
So, what is the runtime of insertion sort? Recall the procedure for determining runtime.
What key instruction (or group of instructions) should we measure? Since we're sorting, we should look at the number of comparisons between array elements.
Now that we've decided on the instruction, let's try to come up with the number of comparisons needed in the worst case scenario. What is the worst case scenario for insertion sort? For the main loop (which is in insertionSort):

- unsorted = 1: 1 comparison in the insertInOrder method
- unsorted = 2: 2 comparisons in the insertInOrder method
- ...
- unsorted = N-1: N-1 comparisons in the insertInOrder method

What would the array look like (before sorting) to cause the worst case?
If we add up the comparisons, we get 1 + 2 + ... + (N-1) = N(N-1)/2, which is O(N²) comparisons in the worst case. Now, it turns out that on average the number of comparisons is a bit better, but the average case is still O(N²).
Can insertion sort be used with a linked list?
It turns out that insertion sort is probably more natural with a linked list than with an array. At each iteration, simply remove the front node from the list, and "insert it in order" into a second, new list. In this case, we are not creating any new nodes, just moving the ones we have around.
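Here's a minimal sketch of that idea, assuming a bare-bones node class holding ints (the node class and method names here are just for illustration, not the textbook's list implementation):

```java
// Bare-bones node class, just for illustration.
class Node {
    int data;
    Node next;
    Node(int data, Node next) { this.data = data; this.next = next; }
}

// Returns the head of the sorted list; no new nodes are created.
static Node insertionSortList(Node head) {
    Node sorted = null;                          // the second, sorted list
    while (head != null) {
        Node current = head;                     // remove the front node...
        head = head.next;
        if (sorted == null || current.data <= sorted.data) {
            current.next = sorted;               // ...insert at the front
            sorted = current;
        } else {                                 // ...or walk to its spot
            Node prev = sorted;
            while (prev.next != null && prev.next.data < current.data) {
                prev = prev.next;
            }
            current.next = prev.next;
            prev.next = current;
        }
    }
    return sorted;
}
```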
The worst-case runtime turns out to be O(N²). Why? What even is the worst-case for insertion sort on a linked list?
For more information, see Section 8.15 of the textbook.
Another simple sorting algorithm is selection sort. At iteration i of the outer loop, find the ith smallest item and swap it into location i. So:

- i = 0: find the 0th smallest value and swap it into location 0
- i = 1: find the 1st smallest value and swap it into location 1
- ...
- i = N-1: find the (N-1)th smallest value and swap it into location N-1

Like insertion sort, selection sort has a very simple implementation using nested for loops (or method calls, as shown in the text). We actually saw this algorithm earlier in the term with the PeopleSort example from lecture 4 (see the SortArray.java file, selectionSort method).
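A minimal sketch of selection sort (the SortArray.java version differs in its details):

```java
static <T extends Comparable<? super T>> void selectionSort(T[] a, int n) {
    for (int i = 0; i < n - 1; i++) {
        int smallest = i;                        // index of the ith smallest value
        for (int j = i + 1; j < n; j++) {
            if (a[j].compareTo(a[smallest]) < 0) {
                smallest = j;
            }
        }
        T temp = a[i];                           // swap it into location i
        a[i] = a[smallest];
        a[smallest] = temp;
    }
}
```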
With Bubble Sort, we loop through the array, "bubbling up" smaller values and "sinking" larger values until each is in its appropriate spot. Item i in the array is compared to item i+1: if they are out of order (A[i] > A[i+1]), they are swapped; otherwise they are left alone.

This continues until the array is sorted. Like insertion sort and selection sort, bubble sort uses two loops. The outer loop executes while the array is unsorted. The inner loop walks through the array and performs the swaps. Below is an example of one iteration of the inner loop (a code sketch follows the table). The first column is for i; the bold cells are being compared; and the last column indicates whether a swap occurred (you could also just compare the row to the next one to see if it changed).
| i | 0  | 1  | 2  | 3  | 4  | 5  | 6  | swapped? |
|---|----|----|----|----|----|----|----|----------|
| 0 | **50** | **30** | 40 | 70 | 10 | 80 | 20 | yes |
| 1 | 30 | **50** | **40** | 70 | 10 | 80 | 20 | yes |
| 2 | 30 | 40 | **50** | **70** | 10 | 80 | 20 | no |
| 3 | 30 | 40 | 50 | **70** | **10** | 80 | 20 | yes |
| 4 | 30 | 40 | 50 | 10 | **70** | **80** | 20 | no |
| 5 | 30 | 40 | 50 | 10 | 70 | **80** | **20** | yes |
|   | 30 | 40 | 50 | 10 | 70 | 20 | 80 | N/A |
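Here is a minimal sketch of bubble sort along those lines; the swapped flag plays the role of the outer loop's "is the array still unsorted?" test:

```java
static <T extends Comparable<? super T>> void bubbleSort(T[] a, int n) {
    boolean swapped = true;
    while (swapped) {                            // outer loop: repeat while still unsorted
        swapped = false;
        for (int i = 0; i < n - 1; i++) {        // inner loop: one pass like the table above
            if (a[i].compareTo(a[i + 1]) > 0) {  // out of order? sink the larger value
                T temp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = temp;
                swapped = true;
            }
        }
    }
}
```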
Bubble sort is famous for its inefficiency. Donald Knuth (famous computer scientist), in the 3rd volume of his book The Art of Computer Programming, said that "bubble sort seems to have nothing to recommend it". Wikipedia's article on Bubble Sort repeatedly points out how it is a bad choice for sorting. However, it is not the worst sorting algorithm. That award goes to Bogosort or a related algorithm. Basically, don't use Bubble Sort. We cover it here just so you've seen it before.
The text also discusses recursive implementations of InsertionSort and SelectionSort. As with Sequential Search and some other simple problems, this is more to show how it can be done rather than how it should be done (recall the recursion overhead and added implementation complexity with recursion). Read over these explanations and convince yourselves that the recursive versions do the same thing as the iterative versions.
All three of the sorting algorithms we've looked at have similar runtimes in the worst case: O(N²). However, insertion sort is actually a good choice for mostly-sorted arrays. Selection sort is a good choice when swapping values is expensive. Bubble sort is never a good choice. The runtime of O(N²) makes these algorithms ok for small arrays, but for a large number of items, it is too big. What we'll look at next is how to come up with faster sorting algorithms.
To improve on our simple sorts it helps to consider why they are not so good. Let's again consider Insertion Sort. What about the algorithm makes its performance poor? Consider what occurs with each comparison: at most one item moves, and it moves by only one position in the array.
If the data is greatly out of order, it will take a lot of comparisons to get into order. If we can move the data farther with one comparison, perhaps we can improve our runtime. This is the idea of Shellsort. Rather than comparing adjacent items, we compare items that are farther away from each other. Specifically, we compare and "sort" items that are K locations apart for some K. That is, we apply Insertion Sort to non-contiguous subarrays of our original array that are K locations apart. We gradually reduce K from a large value to a small one, ending with K = 1 (which is straight insertion sort). Let's take a look at an example:
We want to sort this array:
| 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|----|----|----|----|----|----|----|----|
| 40 | 20 | 70 | 60 | 50 | 10 | 80 | 30 |
For our first iteration, K = 4. In the array below, cells 0 and 4 form one sub-array, cells 1 and 5 another, cells 2 and 6 another, and cells 3 and 7 the last. How does shellsort know which cells belong to the same subarray?
| 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|----|----|----|----|----|----|----|----|
| 40 | 20 | 70 | 60 | 50 | 10 | 80 | 30 |
After this first iteration, each sub-array has had insertion sort applied to it, giving this array (bold values indicate values that moved):
| 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|----|----|----|----|----|----|----|----|
| 40 | **10** | 70 | **30** | 50 | **20** | 80 | **60** |
Why didn't 10 move to the very beginning of the array?
We now move on to the next iteration of shellsort. This time, K = 2, giving us new sub-arrays: the even indices (0, 2, 4, 6) form one sub-array and the odd indices (1, 3, 5, 7) form the other. Starting with the partially-sorted array from the first iteration:
| 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|----|----|----|----|----|----|----|----|
| 40 | 10 | 70 | 30 | 50 | 20 | 80 | 60 |
Applying insertion sort to these two sub-arrays, we end up with (bold values indicate values that moved):
| 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|----|----|----|----|----|----|----|----|
| 40 | 10 | **50** | **20** | **70** | **30** | 80 | 60 |
Finally, on the next iteration, K=1, giving us just a single sub-array:
| 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|----|----|----|----|----|----|----|----|
| 40 | 10 | 50 | 20 | 70 | 30 | 80 | 60 |
Applying insertion sort to this sub-array yields (again, bold values indicate values that moved):
| 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|----|----|----|----|----|----|----|----|
| **10** | **20** | **30** | **40** | **50** | **60** | **70** | **80** |
We now have a sorted array. But, how is this an improvement over regular insertion sort? The idea is that by the time K = 1, most of the data will not have very far left to move.
It seems like this algorithm should actually be worse than Insertion Sort - why? Its last "iteration" is a full Insertion Sort, and the previous iterations do Insertion Sorts of sub-arrays. However, when timed, it actually outperforms Insertion Sort. The exact analysis is tricky and depends on the initial value for K. The basic idea is that Shellsort moves the data closer to Insertion Sort's best-case arrangement. While best-case analysis is often ignored, if you look at Insertion Sort's best-case performance, it's O(N) (why?). So, Shellsort can benefit from this towards the end of its sorting. A good implementation will have about O(N^(3/2)) performance compared to O(N²) for regular Insertion Sort. See the text for more details.
You may wonder how K is picked. This is actually a complex question, with ongoing research, and it is beyond the scope of this course. In the example above (and the code below), we started with K = floor(N / 2) and updated K by dividing it by 2 and taking the floor. This sequence is easy to understand, but it actually still yields O(N²) in the worst case. More advanced courses go over this analysis; here we will focus instead on faster sorting algorithms with their own interesting analysis.
The code for Shellsort is:
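(The listing itself isn't reproduced here; below is a minimal sketch using the K = floor(N/2), floor(N/4), ..., 1 sequence from the example, as a reconstruction rather than the original code.)

```java
static <T extends Comparable<? super T>> void shellSort(T[] a, int n) {
    for (int k = n / 2; k > 0; k /= 2) {         // gap: N/2, N/4, ..., 1
        // Insertion sort applied to the sub-arrays of items k locations apart.
        for (int i = k; i < n; i++) {
            T item = a[i];
            int j = i;
            while (j >= k && item.compareTo(a[j - k]) < 0) {
                a[j] = a[j - k];                 // shift within the sub-array
                j -= k;
            }
            a[j] = item;
        }
    }
}
```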
If we approach sorting in a different way, we can improve the run-time even more. Let's take a look at divide and conquer approaches. The general idea is to define sorting an array of N items in terms of sorting one or more smaller arrays (for example, of size N/2). As we said previously (for Binary Search), this works well when implemented using recursion. So we will look at the next two sorting algorithms recursively.
How can we apply divide and conquer to sorting? There are two questions to consider:

- How do we divide the problem into sub-problems?
- How do we use the sub-problem solutions to solve the overall problem?
Merge sort is a divide and conquer sorting algorithm. Let's examine the questions above for Merge sort:
How do we "divide" the problem into sub-problems? Simply break the array in half based on index value. Given the array:
| 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|----|----|----|----|----|----|----|----|
| 40 | 80 | 60 | 20 | 30 | 10 | 70 | 50 |
We would divide it into these two arrays:
| 0  | 1  | 2  | 3  |
|----|----|----|----|
| 40 | 80 | 60 | 20 |

| 4  | 5  | 6  | 7  |
|----|----|----|----|
| 30 | 10 | 70 | 50 |
We then recursively divide each array in two, giving:
| 0  | 1  |
|----|----|
| 40 | 80 |

| 2  | 3  |
|----|----|
| 60 | 20 |

| 4  | 5  |
|----|----|
| 30 | 10 |

| 6  | 7  |
|----|----|
| 70 | 50 |
Then recursively divide each array in two again and again and again until we reach our base case. In the example array, dividing in two again gives:
| 0  |
|----|
| 40 |

| 1  |
|----|
| 80 |

| 2  |
|----|
| 60 |

| 3  |
|----|
| 20 |

| 4  |
|----|
| 30 |

| 5  |
|----|
| 10 |

| 6  |
|----|
| 70 |

| 7  |
|----|
| 50 |
This is our base case (where all arrays are single-element arrays). When implementing Merge sort, you often don't actually create new arrays when making recursive calls. Instead, you pass in the full array, and both the starting index and ending index into your recursive calls. This is why the arrays above still have their original indices.
Once you reach your base case, you need to decide how to sort a single-element array. How do you sort a single-element array?
Now with your single-element array sorted, we now need to "pick up the pieces and put them together again", which brings us to our second question: How do we use sub-problem solutions to solve the overall problem?
When the recursive calls complete, we will have two sorted sub-arrays, one on the left and one on the right. Let's look at this from the first call's point of view (this is after the two recursive calls have completed - that's why they're each sorted):
| 0  | 1  | 2  | 3  |
|----|----|----|----|
| 20 | 40 | 60 | 80 |

| 4  | 5  | 6  | 7  |
|----|----|----|----|
| 10 | 30 | 50 | 70 |
How do we produce a single sorted array from these two sorted subarrays? We "merge" them together, moving the next appropriate item into an overall sorted array. Note that this is where we are really doing the "work" of the sort. We are comparing items and moving them based on those comparisons.
Let's look at a pseudocode implementation:
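The original pseudocode isn't reproduced in these notes, but a rough Java-style sketch of the same idea (assuming an auxiliary array temp the same length as A) would be:

```java
// mergeSort(a, temp, 0, a.length - 1) sorts a; temp is scratch space of the same length.
static <T extends Comparable<? super T>> void mergeSort(T[] a, T[] temp, int first, int last) {
    if (first >= last) return;                   // base case: 0 or 1 items, already sorted
    int mid = (first + last) / 2;
    mergeSort(a, temp, first, mid);              // sort the left half
    mergeSort(a, temp, mid + 1, last);           // sort the right half
    merge(a, temp, first, mid, last);            // merge the two sorted halves
}

static <T extends Comparable<? super T>> void merge(T[] a, T[] temp, int first, int mid, int last) {
    int left = first, right = mid + 1, out = first;
    while (left <= mid && right <= last) {       // repeatedly take the smaller front item
        temp[out++] = (a[left].compareTo(a[right]) <= 0) ? a[left++] : a[right++];
    }
    while (left <= mid)   temp[out++] = a[left++];   // copy any leftovers
    while (right <= last) temp[out++] = a[right++];
    for (int i = first; i <= last; i++) {        // copy the merged result back into A
        a[i] = temp[i];
    }
}
```

One way to call this sketch would be mergeSort(arr, Arrays.copyOf(arr, arr.length), 0, arr.length - 1) (using java.util.Arrays to create the scratch array).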
Looking at the pseudocode, the algorithm seems pretty easy. The algorithm doesn't return the sorted array because the sorted array is stored in A (why does that mean we don't need to return it?). The only part that requires some thought is the merge. Let's take a look at how to merge on the board. See TextMergeQuick.java for a Java implementation.
How long does MergeSort take to run? Consider an original array of size N. The analysis is tricky due to the recursive calls, so let's think of the work "level by level". At each level of the recursion, we need to consider and possibly move O(N) items. Since the size is cut in half with each call, we have a total of O(log₂(N)) levels. Thus, we have N * log₂(N) work to do, so our runtime is O(N log(N)). Note that when multiplying Big-O terms, we do not throw out the smaller terms.
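If you prefer to see the same count as a recurrence (a sketch, where c stands for the constant per-item cost of splitting and merging):

$$T(N) = 2\,T(N/2) + cN, \qquad T(1) = c$$

$$T(N) = \underbrace{cN + cN + \cdots + cN}_{\log_2(N)\ \text{levels}} + cN = cN\log_2(N) + cN = O(N \log N)$$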
We are looking at MergeSort "level by level" simply to do the analysis. The actual execution of MergeSort is a tree execution. We recursively sort the left side of the array, going all the way down to the base case, then merging back up, before we even start working on the right side.
While the runtime of O(N log(N)) is a definite improvement over the simple sorting algorithms with O(N²), it comes at a cost. We are not sorting in place: the merge step requires an auxiliary array, giving us O(N) additional space required. While that's not much for today's computers, copying to/from this auxiliary array slows down the algorithm in real terms (i.e. timing the algorithm shows that it's slower than it should be).
Quick sort is a divide and conquer algorithm that makes a few different decisions than Merge Sort.
When dividing the problem into smaller sub-problems, Quick Sort breaks up the data based on how it compares to a special data value (called the pivot). (How does this differ from Merge Sort?) We compare all values to this pivot and place them into three groups:
data <= pivot | pivot | data >= pivot |
Since we are dividing the data by comparing values to a value from the data, the division may not be exactly half. Let's take a look at an example (same as Merge Sort's):
| 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
|----|----|----|----|----|----|----|----|
| 40 | 80 | 60 | 20 | 30 | 10 | 70 | 50 |
Before we can divide, we need to pick a pivot. For now, let's just take the last value of the array: A[last] (in this case, A[7], or 50). Note that after the "divide" step, the pivot may end up at a different index, since it'll sit between the two groups formed during division.
With our pivot picked, we now divide the data (called "partition"). We'll see later how this partitioning happens, but for now we'll just focus on understanding what Quick Sort does.
| data <= pivot  | pivot | data >= pivot |
|----------------|-------|---------------|
| 40, 20, 30, 10 | 50    | 80, 60, 70    |

(The exact order of the values within the two groups depends on how the partitioning is implemented.)
So, what does this achieve? The data isn't sorted yet, but at least we know the final position of one value (which one?) and the others are "more sorted" than they were before. So, we now make recursive calls to sort each of the two groups. Just like with Merge Sort, the base case for Quick Sort is when you try to sort a one-element array.
We're now ready for some pseudocode:
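Again, the original pseudocode isn't reproduced here; a rough sketch of the recursive structure (with partition as a helper, sketched after the partitioning discussion below) would be:

```java
static <T extends Comparable<? super T>> void quickSort(T[] a, int first, int last) {
    if (first >= last) return;                   // base case: 0 or 1 items
    int pivotIndex = partition(a, first, last);  // pivot is now in its final spot
    quickSort(a, first, pivotIndex - 1);         // sort the "<= pivot" side
    quickSort(a, pivotIndex + 1, last);          // sort the ">= pivot" side
}
```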
How do we use the sub-problem solution to solve the overall problem? We don't need to do anything! During the partition step, we move the pivot to its final position. Then, the two recursive calls sort the left and right sides. Once those are done, there's no work to convert those sub-problem solutions into the solution. So even though we need to consider this question, it turns out that for Quick Sort we don't need to actually do anything.
Now we just need to figure out the partition step. Well, it'd be nice to do this in-place so we don't need any extra memory. The basic idea for partitioning is: keep one index moving in from the left end of the array and another moving in from the right (just before the pivot). Advance the left index until it finds a value greater than the pivot, move the right index down until it finds a value less than the pivot, swap those two values, and repeat.
Once the two sides meet, swap the pivot into the first position of the right-side partition (for value greater than the pivot) -- why this partition? Now, recursively sort both partitions. Note that the pivot from this first partition is never again touched - it is in its absolute correct spot. The other items, however, could move considerably within their sides of the array.
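A sketch of that partitioning scheme, assuming the simple "pivot is A[last]" choice from the example (this is only a sketch; see Quick.java for a complete implementation):

```java
// Partitions a[first..last] using a[last] as the pivot; returns the pivot's final index.
static <T extends Comparable<? super T>> int partition(T[] a, int first, int last) {
    T pivot = a[last];
    int left = first;
    int right = last - 1;
    while (left <= right) {
        while (left <= right && a[left].compareTo(pivot) <= 0) left++;   // find a value > pivot
        while (left <= right && a[right].compareTo(pivot) >= 0) right--; // find a value < pivot
        if (left < right) {                       // out-of-place pair: swap and keep going
            T temp = a[left]; a[left] = a[right]; a[right] = temp;
            left++;
            right--;
        }
    }
    // left is now the first position of the ">= pivot" side; put the pivot there.
    T temp = a[left]; a[left] = a[last]; a[last] = temp;
    return left;
}
```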
Let's take a look at how that works on the board. For a complete implementation of Quick Sort, see Quick.java.
How long does QuickSort take to run? The performance of QuickSort depends on the "quality" of the partitions (i.e. how the other values relate to the pivots). Let's look at two different scenarios:

- If each partition splits its portion of the data roughly in half, we get about log₂(N) levels of O(N) partitioning work, for O(N log(N)) overall (just like Merge Sort).
- If each partition puts all (or almost all) of the remaining values on one side of the pivot, the problem only shrinks by one item at each level, giving roughly N + (N-1) + (N-2) + ... comparisons, which is O(N²).
So which run-time will we actually get? It depends on how the data is originally distributed and how the pivot is chosen. Our simple version of Quicksort picks A[last] as the pivot. Interestingly, this makes already-sorted data a worst case! The pivot is always the greatest element, so there is no data in the "greater than the pivot" partition. Reverse-sorted data is also a worst case (now the pivot is always the smallest item).
We can make the worst case less likely to occur by choosing the pivot in a more intelligent way. One technique is the Median of Three. With this technique, we don't pick the pivot from any one fixed index. Rather, we consider three values each time we partition: A[first], A[mid], A[last]. Order these items, putting the smallest value back into A[first], the middle into A[mid], and the largest into A[last]. So now we know that A[first] <= A[mid] <= A[last]. Now use A[mid] as the pivot. How does this affect the runtime?
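A sketch of the median-of-three idea (the method name sortFirstMiddleLast and the swap helper here are just illustrative):

```java
// Order a[first], a[mid], a[last] so that a[first] <= a[mid] <= a[last];
// a[mid] is then used as the pivot.
static <T extends Comparable<? super T>> void sortFirstMiddleLast(T[] a, int first, int last) {
    int mid = (first + last) / 2;
    if (a[first].compareTo(a[mid]) > 0)  swap(a, first, mid);
    if (a[mid].compareTo(a[last]) > 0)   swap(a, mid, last);
    if (a[first].compareTo(a[mid]) > 0)  swap(a, first, mid);
}

static <T> void swap(T[] a, int i, int j) {
    T temp = a[i]; a[i] = a[j]; a[j] = temp;
}
```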
Median of three does not guarantee that the worst case (N²) will not occur. It only reduces the likelihood and makes the situation in which it would occur not obvious. So we say that the expected runtime of QuickSort is O(N log₂(N)), but the worst case runtime of QuickSort is O(N²). For code, see TextMergeQuick.java.
What if we choose the pivot index randomly? For each call, choose a random index between first and last (inclusive) and use that as the pivot. The worst case could be just as bad as the simple pivot choice. But in general, it is very unlikely that a random pivot will always be bad. Overall, this should give good results. However, we have overhead of generating random numbers.
The idea behind dual pivot is to use two pivots, P1 and P2 (with P1 <= P2), and create three partitions:

- values less than P1
- values between P1 and P2
- values greater than P2
This yields three subarrays that must be sorted recursively. As long as the pivots are chosen wisely, this actually has an incremental improvement over traditional QuickSort. In fact, dual pivot has been incorporated into Java 7's JDK.
Simple QuickSort stops when the logical size of the array is 1. However, the benefit of divide and conquer decreases as the problem size gets smaller. At some point, the cost of the recursion outweighs the divide and conquer savings. So, choose a size > 1 to stop recursing and switch to another (good) algorithm at that point.
What good sorting algorithm should we pick? Insertion Sort. Even though it is poor overall, if the data is "mostly" sorted due to QuickSort, we will be close to the best case for Insertion Sort and maybe we will get better overall results! See TextMergeQuick.java.
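A sketch of that hybrid idea, using a hypothetical cutoff constant and reusing the insertionSort and partition sketches from earlier:

```java
static final int MIN_SIZE = 10;                  // hypothetical cutoff; tuned by experiment in practice

static <T extends Comparable<? super T>> void hybridQuickSort(T[] a, int first, int last) {
    if (last - first + 1 <= MIN_SIZE) {
        insertionSort(a, first, last);           // small pieces: switch to insertion sort
    } else {
        int pivotIndex = partition(a, first, last);
        hybridQuickSort(a, first, pivotIndex - 1);
        hybridQuickSort(a, pivotIndex + 1, last);
    }
}
```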
So which do we prefer, Merge Sort or Quick Sort? Quick Sort is usually faster in practice and partitions in place, but it has an O(N²) worst case and is not stable (equal items may not keep their original relative order). Merge Sort guarantees O(N log(N)) and is stable, but it needs the O(N) auxiliary array for merging.
Thus, for complex (Object) types, it may be better to use Merge Sort even if it is a bit slower. For example, JDK 6 Java used Merge Sort for objects and Quick Sort for primitive types. Since stability does not matter for primitive types, they picked the faster algorithm. For objects, where stability could be important, they chose the less-fast (but definitely not slow) sorting algorithm. However, in JDK 7 they switched to TimSort which is much more complicated but a bit faster.