Quicksort
Quicksort is an efficient sorting algorithm. Developed by British computer scientist Tony Hoare in 1959 and published in 1961, it is still a commonly used algorithm for sorting. When implemented well, it can be about two or three times faster than its main competitors, merge sort and heapsort.
Quicksort is a divide-and-conquer algorithm. It works by selecting a 'pivot' element from the array and partitioning the other elements into two sub-arrays, according to whether they are less than or greater than the pivot. The sub-arrays are then sorted recursively. This can be done in-place, requiring small additional amounts of memory to perform the sorting.
Quicksort is a comparison sort, meaning that it can sort items of any type for which a "less-than" relation is defined. Efficient implementations of Quicksort are not a stable sort, meaning that the relative order of equal sort items is not preserved.
Mathematical analysis of quicksort shows that, on average, the algorithm takes $O(n \log n)$ comparisons to sort $n$ items. In the worst case, it makes $O(n^2)$ comparisons, though this behavior is rare.
History
The quicksort algorithm was developed in 1959 by Tony Hoare while in the Soviet Union, as a visiting student at Moscow State University. At that time, Hoare worked on a project on machine translation for the National Physical Laboratory. As a part of the translation process, he needed to sort the words in Russian sentences prior to looking them up in a Russian-English dictionary that was already sorted in alphabetic order on magnetic tape. After recognizing that his first idea, insertion sort, would be slow, he quickly came up with a new idea that was Quicksort. He wrote a program in Mercury Autocode for the partition but could not write the program to account for the list of unsorted segments. On return to England, he was asked to write code for Shellsort as part of his new job. Hoare mentioned to his boss that he knew of a faster algorithm and his boss bet sixpence that he did not. His boss ultimately accepted that he had lost the bet. Later, Hoare learned about ALGOL and its ability to do recursion, which enabled him to publish the code in Communications of the Association for Computing Machinery, the premier computer science journal of the time.
Quicksort gained widespread adoption, appearing, for example, in Unix as the default library sort subroutine. It lent its name to the C standard library subroutine qsort and was used in the reference implementation of Java.
Robert Sedgewick's Ph.D. thesis in 1975 is considered a milestone in the study of Quicksort. In it he resolved many open problems related to the analysis of various pivot selection schemes, including Samplesort and adaptive partitioning by Van Emden, and derived the expected number of comparisons and swaps. Jon Bentley and Doug McIlroy incorporated various improvements for use in programming libraries, including a technique to deal with equal elements and a pivot scheme known as pseudomedian of nine, where a sample of nine elements is divided into groups of three and then the median of the three medians from three groups is chosen. Bentley described another simpler and more compact partitioning scheme in his book Programming Pearls that he attributed to Nico Lomuto. Later Bentley wrote that he used Hoare's version for years but never really understood it, whereas Lomuto's version was simple enough to prove correct. Bentley described Quicksort as the "most beautiful code I had ever written" in the same essay. Lomuto's partition scheme was also popularized by the textbook Introduction to Algorithms, although it is inferior to Hoare's scheme because it does three times more swaps on average and degrades to $O(n^2)$ runtime when all elements are equal.
In 2009, Vladimir Yaroslavskiy proposed a new dual-pivot Quicksort implementation. In the Java core library mailing lists, he initiated a discussion claiming his new algorithm to be superior to the runtime library's sorting method, which was at that time based on the widely used and carefully tuned variant of classic Quicksort by Bentley and McIlroy. After extensive empirical performance tests, Yaroslavskiy's Quicksort was chosen as the new default sorting algorithm in Oracle's Java 7 runtime library.
Algorithm
Quicksort is a divide-and-conquer algorithm. It first divides the input array into two smaller sub-arrays: the low elements and the high elements. It then recursively sorts the sub-arrays. The steps for in-place Quicksort are:
- Pick an element, called a pivot, from the array.
- Partitioning: reorder the array so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it. After this partitioning, the pivot is in its final position. This is called the partition operation.
- Recursively apply the above steps to the sub-array of elements with smaller values and separately to the sub-array of elements with greater values.
The pivot selection and partitioning steps can be done in several different ways; the choice of specific implementation schemes greatly affects the algorithm's performance.
Lomuto partition scheme
This scheme is attributed to Nico Lomuto and popularized by Bentley in his book Programming Pearls and Cormen et al. in their book Introduction to Algorithms. This scheme chooses a pivot that is typically the last element in the array. The algorithm maintains index i as it scans the array using another index j such that the elements at lo through i-1 (inclusive) are less than the pivot, and the elements at i through j (inclusive) are equal to or greater than the pivot. As this scheme is more compact and easy to understand, it is frequently used in introductory material, although it is less efficient than Hoare's original scheme, for example when all elements are equal. This scheme degrades to $O(n^2)$ when the array is already in order. Various variants have been proposed to boost performance, including different ways to select the pivot, to deal with equal elements, and to use other sorting algorithms such as insertion sort for small arrays. In pseudocode, a quicksort that sorts elements at lo through hi (inclusive) of an array A can be expressed as:
algorithm quicksort(A, lo, hi) is
    if lo < hi then
        p := partition(A, lo, hi)
        quicksort(A, lo, p - 1)
        quicksort(A, p + 1, hi)
algorithm partition(A, lo, hi) is
    pivot := A[hi]
    i := lo
    for j := lo to hi do
        if A[j] < pivot then
            swap A[i] with A[j]
            i := i + 1
    swap A[i] with A[hi]
    return i
Sorting the entire array is accomplished by quicksort(A, 0, length(A) - 1).
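The Lomuto scheme can be sketched in Python roughly as follows. This is a minimal illustration rather than a tuned implementation; the function names are chosen here for the example, and the scan stops at hi - 1 because comparing the pivot with itself never triggers a swap.

```python
def lomuto_partition(a, lo, hi):
    """Partition a[lo..hi] around a[hi] and return the pivot's final index."""
    pivot = a[hi]
    i = lo  # a[lo..i-1] holds the elements found to be < pivot
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]  # place the pivot between the two groups
    return i


def quicksort_lomuto(a, lo=0, hi=None):
    """Sort the list a in place using the Lomuto partition scheme."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = lomuto_partition(a, lo, hi)
        quicksort_lomuto(a, lo, p - 1)   # elements smaller than the pivot
        quicksort_lomuto(a, p + 1, hi)   # elements greater than or equal to it


data = [5, 2, 9, 1, 5, 6]
quicksort_lomuto(data)
print(data)  # [1, 2, 5, 5, 6, 9]
```

Because the recursion depth grows linearly on sorted input with this pivot choice, the sketch is only suitable for small or shuffled inputs.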
Hoare partition scheme
The original partition scheme described by C.A.R. Hoare uses two indices that start at the ends of the array being partitioned, then move toward each other, until they detect an inversion: a pair of elements, one greater than or equal to the pivot, one less than or equal, that are in the wrong order relative to each other. The inverted elements are then swapped. When the indices meet, the algorithm stops and returns the final index. Hoare's scheme is more efficient than Lomuto's partition scheme because it does three times fewer swaps on average, and it creates efficient partitions even when all values are equal. Like Lomuto's partition scheme, Hoare's partitioning would also cause Quicksort to degrade to $O(n^2)$ for already sorted input if the pivot were chosen as the first or the last element. With the middle element as the pivot, however, sorted data results in no swaps and equally sized partitions, leading to the best-case behavior of Quicksort, i.e. $O(n \log n)$. Like others, Hoare's partitioning doesn't produce a stable sort. In this scheme, the pivot's final location is not necessarily at the index that is returned, as the pivot and elements equal to the pivot can end up anywhere within the partition after a partition step, and may not be sorted until the base case of a partition with a single element is reached via recursion. The next two segments that the main algorithm recurs on are (lo..p) and (p+1..hi), as opposed to (lo..p-1) and (p+1..hi) as in Lomuto's scheme. However, the partitioning algorithm guarantees lo ≤ p < hi, which implies both resulting partitions are non-empty, hence there's no risk of infinite recursion. In pseudocode,
algorithm quicksort(A, lo, hi) is
    if lo < hi then
        p := partition(A, lo, hi)
        quicksort(A, lo, p)
        quicksort(A, p + 1, hi)
algorithm partition(A, lo, hi) is
    pivot := A[⌊(hi + lo) / 2⌋]
    i := lo - 1
    j := hi + 1
    loop forever
        do
            i := i + 1
        while A[i] < pivot
        do
            j := j - 1
        while A[j] > pivot
        if i ≥ j then
            return j
        swap A[i] with A[j]
An important point in choosing the pivot item is to round the division result towards zero. This is the implicit behavior of integer division in some programming languages, hence rounding is omitted in implementing code. Here it is emphasized with explicit use of a floor function, denoted with a ⌊ ⌋ symbol pair. Rounding down is important to avoid using A[hi] as the pivot, which can result in infinite recursion.
The entire array is sorted by quicksort(A, 0, length(A) - 1).
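A Python sketch of the Hoare scheme is given below, assuming the middle element is used as the pivot. The names are illustrative only. Python's `//` operator rounds down, which matches the floor required above, and each do-while loop of the pseudocode is written as an increment followed by a while loop.

```python
def hoare_partition(a, lo, hi):
    """Partition a[lo..hi]; return j such that a[lo..j] <= pivot <= a[j+1..hi]."""
    pivot = a[lo + (hi - lo) // 2]  # middle element, index rounded down
    i, j = lo - 1, hi + 1
    while True:
        i += 1
        while a[i] < pivot:   # advance to an element >= pivot
            i += 1
        j -= 1
        while a[j] > pivot:   # retreat to an element <= pivot
            j -= 1
        if i >= j:
            return j
        a[i], a[j] = a[j], a[i]


def quicksort_hoare(a, lo=0, hi=None):
    """Sort the list a in place using the Hoare partition scheme."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = hoare_partition(a, lo, hi)
        quicksort_hoare(a, lo, p)       # note: p is included on the left
        quicksort_hoare(a, p + 1, hi)


data = [3, 1, 2]
quicksort_hoare(data)
print(data)  # [1, 2, 3]
```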
Implementation issues
Choice of pivot
In the very early versions of quicksort, the leftmost element of the partition would often be chosen as the pivot element. Unfortunately, this causes worst-case behavior on already sorted arrays, which is a rather common use case. The problem was easily solved by choosing a random index for the pivot, choosing the middle index of the partition, or choosing the median of the first, middle and last element of the partition for the pivot. This "median-of-three" rule counters the case of sorted input, and gives a better estimate of the optimal pivot than selecting any single element, when no information about the ordering of the input is known.
Median-of-three code snippet for Lomuto partition:
    mid := ⌊(lo + hi) / 2⌋
    if A[mid] < A[lo]
        swap A[lo] with A[mid]
    if A[hi] < A[lo]
        swap A[lo] with A[hi]
    if A[mid] < A[hi]
        swap A[mid] with A[hi]
    pivot := A[hi]
It puts a median into A[hi] first, then that new value of A[hi] is used for a pivot, as in the basic algorithm presented above.
Specifically, the expected number of comparisons needed to sort $n$ elements with random pivot selection is $1.386\, n \log n$. Median-of-three pivoting brings this down to $C_{n,2} \approx 1.188\, n \log n$, at the expense of a three-percent increase in the expected number of swaps. An even stronger pivoting rule, for larger arrays, is to pick the ninther, a recursive median-of-three (Mo3), defined as
    ninther(a) = median(Mo3(first third of a), Mo3(middle third of a), Mo3(final third of a))
Selecting a pivot element is also complicated by the existence of integer overflow. If the boundary indices of the subarray being sorted are sufficiently large, the naïve expression for the middle index, (lo + hi)/2, will cause overflow and provide an invalid pivot index. This can be overcome by using, for example, lo + (hi − lo)/2 to index the middle element, at the cost of more complex arithmetic. Similar issues arise in some other methods of selecting the pivot element.
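The median-of-three rule for a last-element pivot might look like this in Python. The helper name is hypothetical. Python integers do not overflow, so the `lo + (hi - lo) // 2` form is shown only because it carries over unchanged to fixed-width languages.

```python
def median_of_three_to_hi(a, lo, hi):
    """Move the median of a[lo], a[mid], a[hi] into a[hi] and return it."""
    mid = lo + (hi - lo) // 2      # overflow-safe midpoint (matters in fixed-width languages)
    if a[mid] < a[lo]:
        a[lo], a[mid] = a[mid], a[lo]
    if a[hi] < a[lo]:
        a[lo], a[hi] = a[hi], a[lo]
    # a[lo] is now the smallest of the three; the median is min(a[mid], a[hi])
    if a[mid] < a[hi]:
        a[mid], a[hi] = a[hi], a[mid]
    return a[hi]


sample = [9, 4, 7, 1, 5]           # first = 9, middle = 7, last = 5
print(median_of_three_to_hi(sample, 0, len(sample) - 1))  # 7
```

A Lomuto-style partition can call this helper first and then proceed with a[hi] as its pivot, as described above.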
Repeated elements
With a partitioning algorithm such as the Lomuto partition scheme described above, quicksort exhibits poor performance for inputs that contain many repeated elements. The problem is clearly apparent when all the input elements are equal: at each recursion, the left partition is empty, and the right partition has only decreased by one element. Consequently, the Lomuto partition scheme takes quadratic time to sort an array of equal values. However, with a partitioning algorithm such as the Hoare partition scheme, repeated elements generally result in better partitioning, and although needless swaps of elements equal to the pivot may occur, the running time generally decreases as the number of repeated elements increases. In the case where all elements are equal, the Hoare partition scheme needlessly swaps elements, but the partitioning itself is best case, as noted in the Hoare partition section above.
To solve the Lomuto partition scheme problem, an alternative linear-time partition routine can be used that separates the values into three groups: values less than the pivot, values equal to the pivot, and values greater than the pivot. The values equal to the pivot are already sorted, so only the less-than and greater-than partitions need to be recursively sorted. In pseudocode, the quicksort algorithm becomes
algorithm quicksort(A, lo, hi) is
    if lo < hi then
        p := pivot(A, lo, hi)
        left, right := partition(A, p, lo, hi)  // note: multiple return values
        quicksort(A, lo, left - 1)
        quicksort(A, right + 1, hi)
The partition algorithm returns indices to the first and to the last item of the middle partition. Every item of the middle partition is equal to p and is therefore sorted. Consequently, the items of the middle partition need not be included in the recursive calls to quicksort.
The best case for the algorithm now occurs when all elements are equal. In the case of all equal elements, the modified quicksort will perform only two recursive calls on empty subarrays and thus finish in linear time.
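One way to realize the three-way partition is the single-pass "Dutch national flag" scan sketched below in Python. This is an assumed implementation choice, not something the text prescribes, and the function names and the middle-element pivot are likewise illustrative.

```python
def partition3(a, pivot, lo, hi):
    """Rearrange a[lo..hi] into <pivot, ==pivot, >pivot.

    Returns (left, right), the first and last index of the ==pivot group.
    """
    lt, i, gt = lo, lo, hi
    while i <= gt:
        if a[i] < pivot:
            a[lt], a[i] = a[i], a[lt]
            lt += 1
            i += 1
        elif a[i] > pivot:
            a[i], a[gt] = a[gt], a[i]
            gt -= 1              # a[i] is unexamined, so i does not advance
        else:
            i += 1
    return lt, gt


def quicksort3(a, lo=0, hi=None):
    """Sort the list a in place, skipping all elements equal to the pivot."""
    if hi is None:
        hi = len(a) - 1
    if lo < hi:
        p = a[lo + (hi - lo) // 2]          # pivot value (arbitrary choice here)
        left, right = partition3(a, p, lo, hi)
        quicksort3(a, lo, left - 1)
        quicksort3(a, right + 1, hi)


equal = [4, 4, 4, 4]
print(partition3(equal, 4, 0, 3))  # (0, 3): the whole range is the middle group
```

With all elements equal, the middle group spans the entire range, so both recursive calls receive empty subarrays, which is the linear-time best case described above.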
Optimizations
Two other important optimizations, also suggested by Sedgewick and widely used in practice, are:
- To make sure at most $O(\log n)$ space is used, recur first into the smaller side of the partition, then use a tail call to recur into the other, or update the parameters to no longer include the now sorted smaller side, and iterate to sort the larger side.
- When the number of elements is below some threshold, switch to a non-recursive sorting algorithm such as insertion sort that performs fewer swaps, comparisons or other operations on such small arrays. The ideal threshold will vary based on the details of the specific implementation.
- An older variant of the previous optimization: when the number of elements is less than the threshold $k$, simply stop; then after the whole array has been processed, perform insertion sort on it. Stopping the recursion early leaves the array $k$-sorted, meaning that each element is at most $k$ positions away from its final sorted position. In this case, insertion sort takes $O(kn)$ time to finish the sort, which is linear if $k$ is a constant. Compared to the "many small sorts" optimization, this version may execute fewer instructions, but it makes suboptimal use of the cache memories in modern computers.
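The first two optimizations can be combined as in the following Python sketch. The cutoff value of 16 is a placeholder assumption rather than a recommended setting, and the Hoare-style partition is just one possible choice.

```python
CUTOFF = 16  # hypothetical threshold; the best value depends on the implementation


def insertion_sort(a, lo, hi):
    """Sort a[lo..hi] in place by insertion."""
    for k in range(lo + 1, hi + 1):
        x, m = a[k], k - 1
        while m >= lo and a[m] > x:
            a[m + 1] = a[m]
            m -= 1
        a[m + 1] = x


def partition(a, lo, hi):
    """Hoare partition with a middle pivot; returns j with lo <= j < hi."""
    pivot = a[lo + (hi - lo) // 2]
    i, j = lo - 1, hi + 1
    while True:
        i += 1
        while a[i] < pivot:
            i += 1
        j -= 1
        while a[j] > pivot:
            j -= 1
        if i >= j:
            return j
        a[i], a[j] = a[j], a[i]


def quicksort_opt(a, lo=0, hi=None):
    """Quicksort with a small-range cutoff and recursion on the smaller side only."""
    if hi is None:
        hi = len(a) - 1
    while hi - lo + 1 > CUTOFF:
        p = partition(a, lo, hi)
        if (p - lo + 1) < (hi - p):      # left part a[lo..p] is smaller
            quicksort_opt(a, lo, p)
            lo = p + 1                   # iterate on the larger right part
        else:                            # right part a[p+1..hi] is smaller or equal
            quicksort_opt(a, p + 1, hi)
            hi = p                       # iterate on the larger left part
    insertion_sort(a, lo, hi)            # finish the small remaining range
```

Since each recursive call handles at most half of the current range, the recursion depth stays logarithmic even when the partitions are badly unbalanced.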
Parallelization
Quicksort has some disadvantages when compared to alternative sorting algorithms, like merge sort, which complicate its efficient parallelization. The depth of quicksort's divide-and-conquer tree directly impacts the algorithm's scalability, and this depth is highly dependent on the algorithm's choice of pivot. Additionally, it is difficult to parallelize the partitioning step efficiently in-place. The use of scratch space simplifies the partitioning step, but increases the algorithm's memory footprint and constant overheads.
Other more sophisticated parallel sorting algorithms can achieve even better time bounds. For example, in 1991 David Powers described a parallelized quicksort that can operate in $O(\log n)$ time on a CRCW PRAM with $n$ processors by performing partitioning implicitly.
Formal analysis
Worst-case analysis
The most unbalanced partition occurs when one of the sublists returned by the partitioning routine is of size $n - 1$. This may occur if the pivot happens to be the smallest or largest element in the list, or in some implementations when all the elements are equal.
If this happens repeatedly in every partition, then each recursive call processes a list of size one less than the previous list. Consequently, we can make $n - 1$ nested calls before we reach a list of size 1. This means that the call tree is a linear chain of $n - 1$ nested calls. The $i$th call does $O(n - i)$ work to do the partition, and $\sum_{i=0}^{n} (n - i) = O(n^2)$, so in that case Quicksort takes $O(n^2)$ time.
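The quadratic behavior can be observed directly by counting comparisons. The Python sketch below assumes a Lomuto partition with the last element as pivot and uses an explicit stack, purely so that the linear chain of calls does not hit Python's recursion limit.

```python
def count_comparisons(items):
    """Quicksort a copy of items (Lomuto, last-element pivot); return the comparison count."""
    a = list(items)
    comparisons = 0
    stack = [(0, len(a) - 1)]          # pending (lo, hi) ranges
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot, i = a[hi], lo
        for j in range(lo, hi):
            comparisons += 1
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        stack.append((lo, i - 1))
        stack.append((i + 1, hi))
    return comparisons


n = 1000
print(count_comparisons(range(n)), n * (n - 1) // 2)  # 499500 499500
```

On already sorted input every partition of size m spends m - 1 comparisons and peels off only the pivot, so the total is (n - 1) + (n - 2) + ... + 1 = n(n - 1)/2. The same count arises for this partition scheme when all elements are equal.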
Best-case analysis
In the most balanced case, each time we perform a partition we divide the list into two nearly equal pieces. This means each recursive call processes a list of half the size. Consequently, we can make only $\log_2 n$ nested calls before we reach a list of size 1. This means that the depth of the call tree is $\log_2 n$. But no two calls at the same level of the call tree process the same part of the original list; thus, each level of calls needs only $O(n)$ time all together. The result is that the algorithm uses only $O(n \log n)$ time.
Average-case analysis
To sort an array of $n$ distinct elements, quicksort takes $O(n \log n)$ time in expectation, averaged over all $n!$ permutations of $n$ elements with equal probability. We list here three common proofs of this claim, providing different insights into quicksort's workings.
Using percentiles
If each pivot has rank somewhere in the middle 50 percent, that is, between the 25th percentile and the 75th percentile, then it splits the elements with at least 25% and at most 75% on each side. If we could consistently choose such pivots, we would only have to split the list at most $\log_{4/3} n$ times before reaching lists of size 1, yielding an $O(n \log n)$ algorithm.
When the input is a random permutation, the pivot has a random rank, and so it is not guaranteed to be in the middle 50 percent. However, when we start from a random permutation, in each recursive call the pivot has a random rank in its list, and so it is in the middle 50 percent about half the time. That is good enough. Imagine that a coin is flipped: heads means that the rank of the pivot is in the middle 50 percent, tails means that it isn't. Now imagine that the coin is flipped over and over until it gets $k$ heads. Although this could take a long time, on average only $2k$ flips are required, and the chance that the coin won't get $k$ heads after $100k$ flips is highly improbable. By the same argument, Quicksort's recursion will terminate on average at a call depth of only $2 \log_{4/3} n$. But if its average call depth is $O(\log n)$, and each level of the call tree processes at most $n$ elements, the total amount of work done on average is the product, $O(n \log n)$. The algorithm does not have to verify that the pivot is in the middle half; if we hit it any constant fraction of the times, that is enough for the desired complexity.
Using recurrences
An alternative approach is to set up a recurrence relation for $T(n)$, the time needed to sort a list of size $n$. In the most unbalanced case, a single quicksort call involves $O(n)$ work plus two recursive calls on lists of size $0$ and $n - 1$, so the recurrence relation is
$$T(n) = O(n) + T(0) + T(n-1) = O(n) + T(n-1).$$
This is the same relation as for insertion sort and selection sort, and it solves to worst case $T(n) = O(n^2)$.
In the most balanced case, a single quicksort call involves $O(n)$ work plus two recursive calls on lists of size $n/2$, so the recurrence relation is
$$T(n) = O(n) + 2T\left(\frac{n}{2}\right).$$
The master theorem for divide-and-conquer recurrences tells us that $T(n) = O(n \log n)$.
The outline of a formal proof of the $O(n \log n)$ expected time complexity follows. Assume that there are no duplicates, as duplicates could be handled with linear time pre- and post-processing, or treated as cases easier than the one analyzed. When the input is a random permutation, the rank of the pivot is uniform random from $0$ to $n - 1$. Then the resulting parts of the partition have sizes $i$ and $n - i - 1$, and $i$ is uniform random from $0$ to $n - 1$. So, averaging over all possible splits and noting that the number of comparisons for the partition is $n - 1$, the average number of comparisons over all permutations of the input sequence can be estimated accurately by solving the recurrence relation:
$$C(n) = n - 1 + \frac{1}{n} \sum_{i=0}^{n-1} \bigl(C(i) + C(n-i-1)\bigr).$$
Solving the recurrence gives $C(n) = 2n \ln n \approx 1.39\, n \log_2 n$.
This means that, on average, quicksort performs only about 39% worse than in its best case. In this sense, it is closer to the best case than the worst case. A comparison sort cannot use fewer than $\log_2(n!)$ comparisons on average to sort $n$ items, and for large $n$, Stirling's approximation yields $\log_2(n!) \approx n(\log_2 n - \log_2 e)$, so quicksort is not much worse than an ideal comparison sort. This fast average runtime is another reason for quicksort's practical dominance over other sorting algorithms.
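The average-case recurrence can be checked numerically. The Python sketch below evaluates it exactly with rational arithmetic, using the fact that the two halves of the sum are symmetric, so $C(n) = n - 1 + \frac{2}{n}\sum_{i<n} C(i)$. It then compares the result with the closed form $2(n+1)H_n - 4n$, where $H_n$ is the $n$th harmonic number. That closed form is a standard result brought in here for comparison and is not stated in the text above; its leading term is $2n \ln n$.

```python
from fractions import Fraction
from math import log


def average_comparisons(n_max):
    """Return [C(0), ..., C(n_max)] for C(n) = n - 1 + (2/n) * sum(C(0..n-1))."""
    c = [Fraction(0)]
    running_sum = Fraction(0)            # C(0) + ... + C(n-1)
    for n in range(1, n_max + 1):
        c.append(n - 1 + 2 * running_sum / n)
        running_sum += c[n]
    return c


def closed_form(n):
    """2(n+1)H_n - 4n, computed exactly."""
    harmonic = sum(Fraction(1, k) for k in range(1, n + 1))
    return 2 * (n + 1) * harmonic - 4 * n


c = average_comparisons(200)
print(c[3])                               # 8/3
print(all(c[n] == closed_form(n) for n in range(1, 201)))  # True
for n in (10, 100, 200):
    # The ratio approaches 1 only slowly because of the lower-order linear term.
    print(n, float(c[n]) / (2 * n * log(n)))
```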
Using a binary search tree
To each execution of quicksort corresponds the following binary search tree (BST): the initial pivot is the root node; the pivot of the left half is the root of the left subtree, the pivot of the right half is the root of the right subtree, and so on. The number of comparisons of the execution of quicksort equals the number of comparisons during the construction of the BST by a sequence of insertions. So, the average number of comparisons for randomized quicksort equals the average cost of constructing a BST when the values inserted form a random permutation.
Consider a BST created by insertion of a sequence $(x_1, x_2, \ldots, x_n)$ of values forming a random permutation. Let $C$ denote the cost of creation of the BST. We have $C = \sum_{i} \sum_{j<i} c_{i,j}$, where $c_{i,j}$ is a binary random variable expressing whether during the insertion of $x_i$ there was a comparison to $x_j$.
By linearity of expectation, the expected value of $C$ is $\operatorname{E}[C] = \sum_{i} \sum_{j<i} \Pr(c_{i,j})$.
Fix $i$ and $j < i$. The values $x_1, x_2, \ldots, x_j$, once sorted, define $j + 1$ intervals. The core structural observation is that $x_i$ is compared to $x_j$ in the algorithm if and only if $x_i$ falls inside one of the two intervals adjacent to $x_j$.
Observe that since $(x_1, x_2, \ldots, x_n)$ is a random permutation, $(x_1, x_2, \ldots, x_j, x_i)$ is also a random permutation, so the probability that $x_i$ is adjacent to $x_j$ is exactly $\frac{2}{j+1}$.
We end with a short calculation:
$$\operatorname{E}[C] = \sum_{i} \sum_{j<i} \frac{2}{j+1} = O\left(\sum_{i} \log i\right) = O(n \log n).$$
Space complexity
The space used by quicksort depends on the version used.
The in-place version of quicksort has a space complexity of $O(\log n)$, even in the worst case, when it is carefully implemented using the following strategies:
- In-place partitioning is used. This unstable partition requires $O(1)$ space.
- After partitioning, the partition with the fewest elements is sorted first, requiring at most $O(\log n)$ space. Then the other partition is sorted using tail recursion or iteration, which doesn't add to the call stack. This idea, as discussed above, was described by R. Sedgewick, and keeps the stack depth bounded by $O(\log n)$.
From a bit complexity viewpoint, variables such as lo and hi do not use constant space; it takes $O(\log n)$ bits to index into a list of $n$ items. Because there are such variables in every stack frame, quicksort using Sedgewick's trick requires $O((\log n)^2)$ bits of space. This space requirement isn't too terrible, though, since if the list contained distinct elements, it would need at least $O(n \log n)$ bits of space.
Another, less common, not-in-place version of quicksort uses $O(n)$ space for working storage and can implement a stable sort. The working storage allows the input array to be easily partitioned in a stable manner and then copied back to the input array for successive recursive calls. Sedgewick's optimization is still appropriate.
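A not-in-place, stable variant can be sketched in Python as below. For brevity this version returns a new list instead of copying back into the input array, and the `key` parameter and the middle-element pivot are assumptions made for the example.

```python
def stable_quicksort(items, key=lambda x: x):
    """Return a new, stably sorted list using O(n) working storage per level."""
    if len(items) <= 1:
        return list(items)
    pivot = key(items[len(items) // 2])
    # Each pass keeps the original left-to-right order, which preserves stability.
    less = [x for x in items if key(x) < pivot]
    equal = [x for x in items if key(x) == pivot]
    greater = [x for x in items if key(x) > pivot]
    return stable_quicksort(less, key) + equal + stable_quicksort(greater, key)


records = [("b", 2), ("a", 2), ("c", 1), ("d", 2)]
print(stable_quicksort(records, key=lambda r: r[1]))
# [('c', 1), ('b', 2), ('a', 2), ('d', 2)]
```

Records with equal keys keep their input order ("b", "a", "d"), which is the stability property that in-place partitioning generally gives up.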
Relation to other algorithms
Quicksort is a space-optimized version of the binary tree sort. Instead of inserting items sequentially into an explicit tree, quicksort organizes them concurrently into a tree that is implied by the recursive calls. The algorithms make exactly the same comparisons, but in a different order. An often desirable property of a sorting algorithm is stability: that is, the order of elements that compare equal is not changed, which allows controlling the order of multikey tables in a natural way. This property is hard to maintain for in situ quicksort. For variant quicksorts involving extra memory due to representations using pointers or files, it is trivial to maintain stability. The more complex, or disk-bound, data structures tend to increase time cost, in general making increasing use of virtual memory or disk.
The most direct competitor of quicksort is heapsort. Heapsort's running time is $O(n \log n)$, but heapsort's average running time is usually considered slower than in-place quicksort. This result is debatable; some publications indicate the opposite. Introsort is a variant of quicksort that switches to heapsort when a bad case is detected to avoid quicksort's worst-case running time.
Quicksort also competes with merge sort, another $O(n \log n)$ sorting algorithm. Mergesort is a stable sort, unlike standard in-place quicksort and heapsort, and has excellent worst-case performance. The main disadvantage of mergesort is that, when operating on arrays, efficient implementations require $O(n)$ auxiliary space, whereas the variant of quicksort with in-place partitioning and tail recursion uses only $O(\log n)$ space.
Mergesort works very well on linked lists, requiring only a small, constant amount of auxiliary storage. Although quicksort can be implemented as a stable sort using linked lists, it will often suffer from poor pivot choices without random access. Mergesort is also the algorithm of choice for external sorting of very large data sets stored on slow-to-access media such as disk storage or network-attached storage.
Bucket sort with two buckets is very similar to quicksort; the pivot in this case is effectively the value in the middle of the value range, which does well on average for uniformly distributed inputs.
Selection-based pivoting
A selection algorithm chooses the $k$th smallest of a list of numbers; this is an easier problem in general than sorting. One simple but effective selection algorithm works nearly in the same manner as quicksort, and is accordingly known as quickselect. The difference is that instead of making recursive calls on both sublists, it only makes a single tail-recursive call on the sublist that contains the desired element. This change lowers the average complexity to linear or $O(n)$ time, which is optimal for selection, but the selection algorithm is still $O(n^2)$ in the worst case.
A variant of quickselect, the median of medians algorithm, chooses pivots more carefully, ensuring that the pivots are near the middle of the data, and thus has guaranteed linear time, $O(n)$. This same pivot strategy can be used to construct a variant of quicksort with $O(n \log n)$ time. However, the overhead of choosing the pivot is significant, so this is generally not used in practice.
More abstractly, given an $O(n)$ selection algorithm, one can use it to find the ideal pivot at every step of quicksort and thus produce a sorting algorithm with $O(n \log n)$ running time. Practical implementations of this variant are considerably slower on average, but they are of theoretical interest because they show that an optimal selection algorithm can yield an optimal sorting algorithm.
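Quickselect can be sketched in Python as follows. This illustrative version works on a copy, picks pivots at random, and replaces the single tail call with a loop. The zero-based meaning of `k` is a convention chosen for the example.

```python
import random


def quickselect(items, k):
    """Return the k-th smallest element of a non-empty sequence (k = 0 is the minimum)."""
    a = list(items)
    if not 0 <= k < len(a):
        raise IndexError("k out of range")
    lo, hi = 0, len(a) - 1
    while lo < hi:
        # Random pivot moved to the end, then a Lomuto partition of a[lo..hi].
        r = random.randint(lo, hi)
        a[r], a[hi] = a[hi], a[r]
        pivot, i = a[hi], lo
        for j in range(lo, hi):
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        if k == i:
            return a[i]
        if k < i:
            hi = i - 1        # continue only in the left part
        else:
            lo = i + 1        # continue only in the right part
    return a[lo]


values = [7, 2, 9, 4, 1]
print(quickselect(values, 2))  # 4, the median of the five values
```

Only one side of each partition is examined further, which is what reduces the expected work from $O(n \log n)$ to $O(n)$.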