In my previous post, I asked whether anybody knows what sorting algorithm they last used, and said that most real engineers probably don’t. That’s because in real life you usually use the built-in API sorting functions, and more often than not they don’t even tell you what specific algorithm those use.
On that note, here is my attempt to actually answer the original question. What follows is what I have been able to learn about the sorting algorithms used by various programming languages and compilers. Additions, corrections, and additional details are welcome in the comments.
C
The C Standard Library contains a function called qsort() for sorting arrays. While the name obviously hints at to which algorithm they had in mind, the standard itself apparently didn’t require that any particular algorithm be used.
C++
Like C, the C++ Standard Library‘s sort() function (which also has its own Wikipedia page) doesn’t specify any particular algorithm. Officially defined in the C++ ISO standard [pdf] (§25.4.1.1, page 868), sort() is referenced only as having O(n log n) average complexity, and is not guaranteed to be stable (another function, stable_sort() is provided for when stability is required).
Thus, the exact algorithm used is (like many things in C++), entirely implementation-dependent. GCC‘s sort() implementation uses a hybrid approach that starts with an introsort to a certain maximum depth, then does an insertion sort on the result. Since introsort is itself a hybrid algorithm, starting with a quicksort and then moving to a heapsort after a certain maximum depth, a single call to the sort() function can actually go through three different sorting algorithms when compiled with GCC.
Microsoft’s sort() documentation doesn’t say what algorithm it uses. SGI (who who apparently had an STL implementation at one point) says they used to use quicksort, but later switched to an introsort.
Go
Google’s Go language uses a quicksort, though this is considered by some to be a bug.
Haskell
Haskell98 has lists with a built-in sort function, though I haven’t been able to determine what algorithm it uses. It is however defined to be stable, which means that it isn’t quicksort or introsort.
Java
The java.util.Arrays.sort function officially uses a “Dual-Pivot Quicksort”, though the OpenJDK implementation recently switched from a “modified mergesort” to a Timsort.
JavaScript
The ECMAScript specification says that Array.prototype.sort is “not necessarily stable”, though web browsers have traditionally implemented it as a stable sort. Mozilla’s JavaScript implementation was originally an unstable quicksort, then a (still unstable) heapsort, but was switched to a (stable) mergesort in 2006.
Lisp
While I don’t pretend to understand the differences between Lisp dialects, it seems that common Lisp has two sorting functions: sort, and stable-sort, much like C++. The actual algorithm used is implementation dependent. GNU’s CLISP, for example, uses some kind binary tree sort that I can’t quite identify, though they source it as coming from C: A Reference Manual.
Lua
Lua’s table.Sort is defined to be unstable. The current implementation uses a quicksort.
Perl
The official Perl docs contain the following long explanation about the algorithm behind Perl’s sort() function:
Perl 5.6 and earlier used a quicksort algorithm to implement sort. That algorithm was not stable, and could go quadratic. (A stable sort preserves the input order of elements that compare equal. Although quicksort’s run time is O(NlogN) when averaged over all arrays of length N, the time can be O(N**2), quadratic behavior, for some inputs.) In 5.7, the quicksort implementation was replaced with a stable mergesort algorithm whose worst-case behavior is O(NlogN). But benchmarks indicated that for some inputs, on some platforms, the original quicksort was faster. 5.8 has a sort pragma for limited control of the sort. Its rather blunt control of the underlying algorithm may not persist into future Perls, but the ability to characterize the input or output in implementation independent ways quite probably will.
PHP
Somewhat typical of PHP’s API, there are actually thirteen different built-in array sorting functions. The manual page for sort says that most of those functions use quicksort, which means they’re all unstable. Prior to version 4.1.0 some of the sorting algorithms were stable, but that was changed for “efficiency reasons”. (It seems that the usort function was previously implemented as a bubble sort, the only case of that I’ve ever seen in a serious production language).
Python
Since version 2.3, Python’s build-in sort function uses the timsort algorithm (a tuned merge sort with a bit of binary insertion sort mixed in) invented by long-time Python developer Tim Peters (author of the Zen of Python principles) specifically for use in the Python language. Thus, timsort was designed based on the realities of what can and can not be done quickly and efficiently by the Python interpreter (though this hasn’t stopped it from being adopted by other languages and platforms also).
According to a mailing list reply by the same Tim, Python originally used the platform C library’s qsort() function, but that was found to be too slow and buggy for their purposes. They then switched to a binary insertion sort/sample sort hybrid, before finally settling on Timesort in 2003.
Ruby
The docs for Ruby’s ary.sort function don’t say anything about complexity, stability, or algorithms, but a quick check of the source code shows that the implementation (look for rb_ary_sort_bang) uses a quicksort (look for ruby_qsort).
SQL
Although not typically considered a programming language, SQL’s ORDER BY clause requires that implementations have some efficient way of sorting results. The MySQL manual describes their algorithm in detail (calling it “filesort”). It is based on quicksort, with a number of optimizations around when and how the rows to be sorted are selected and read.




The Haskell Report doesn’t specify anything beyond the sort’s stability, though it does contain a non-normative insertion sort. That version is available in GHC, but only used when GHC is compiled with a special option (USE_REPORT_PRELUDE: substitutes the Report’s simple example implementations for GHC’s optimized implementations).
By default GHC use a kind of merge sort. The source contains this comment on the algorithm:
GHC’s mergesort replaced by a better implementation, 24/12/2009.
This code originally contributed to the nhc12 compiler by Thomas Nordin
in 2002. Rumoured to have been based on code by Lennart Augustsson, e.g.
http://www.mail-archive.com/haskell@haskell.org/msg01822.html
and possibly to bear similarities to a 1982 paper by Richard O’Keefe:
“A smooth applicative merge sort”.
Benchmarks show it to be often 2x the speed of the previous implementation.
Fixes ticket http://hackage.haskell.org/trac/ghc/ticket/2143
No love for the iPhone and Mac developers? Where’s the Objective-C commentary. I’d love to know what the sortedArrayUsing… methods use under the cover.
According to the docs for .NET (http://msdn.microsoft.com/en-us/library/6tf1f0bc.aspx) it uses QuickSort. That would mean C#, VB.NET, F#, etc.
@Ben: If I remember correctly, Macs’ native obj-C toolchain is GCC, which uses qsort in its implementation
[...] This post was mentioned on Twitter by reddit_prog_hot. reddit_prog_hot said: Do you know what sorting algorithm you just used? http://bit.ly/fsQeqH http://bit.ly/gEsQyZ [28 comments] [...]
I can’t find any evidence of PHP using bubble sort for any function. Even as far back as 3.0, it used quicksort as the underlying algorithm. See php3_hash.c line 1069.
The Rosetta Code wiki has a task on the stability and documentation of the sort routines of over 25 languages, that may be of use:
http://rosettacode.org/wiki/Sort_stability
A few typos:
List->Lisp
common List->Common Lisp
Timesort->Timsort
Peace
FTR, SBCL, a Common Lisp implementation uses heapsort, just like CMUCL from which it is derived.
To clarify the Java entry:
* On arrays of primitives, quicksort is used (presumably because stability is meaningless for primitives).
* On arrays of Objects (i.e. references), the “modified mergesort” (and now Timesort) is used, meaning that sorts on non-primitive arrays are always stable.
The large number of overloads (one for each primitive type) of sort() can make this confusing, of course.
And also it’s worth noting Collections.sort(List) is also stable.
I would like to note that C++’s sort() and stable_sort() are not functions but function templates.
It’s an important difference and a source of much confusion for beginners in the language that this is so often misstated. You may have seen people wonder on programming fora why their “template(d) functions” cannot be virtual or why they cannot have a vector of their “template(d) class” and suchlike.
@Pete: Thanks! Someone was making fun of me over at reddit about that, but even after reading his quoting of what I wrote, I still didn’t catch the typos.
@Everyone else: Thanks for all the additions and clarifications. I’ll try to do an update of the post text incorporating all the new information soon.
As a beginner people have to go through lot of detailing. But later things will become much easier.
[...] 各种语言的排序算法。你想知道各种语言其默认的排序算法用的是哪种排序算法吗?看看这篇文章吧。 [...]
Many of them is quick sort ,but which one is better? May be it’s hard to say.
[...] 各种语言的排序算法。你想知道各种语言其默认的排序算法用的是哪种排序算法吗?看看这篇文章吧。 [...]
Very useful details and comparisons ( between the different languages )
For learning purposes , if you want to visualize sorting algorithms , also take a look at :
http://http://www.thelearningpoint.net/computer-science
( Java applet visualization for popular sorting algorithms – Bubble sort , merge , quick , insertion , shell sort )