### What I Talk About When I Talk About Sorting: Untangling Array#sort

Sorting things is a fundamental part of our daily lives—it’s something we do everyday, following all kinds of criteria to make our lives easier. Whether you’re looking for a person’s phone number, the location of your favorite book, or even matching up your socks, sorting allows us to find what we are looking for in a faster and more effective way.

This is also the case in the world of web development. But if you thought you knew exactly how JavaScript’s Array#sort works under the hood, think twice.

## Scratchin’ the surface

No matter your skill level, if you’re a JavaScript developer you’ve probably come across the Array#sort method at some point. Do you remember the first time you tried sorting numbers in JavaScript? You were probably astonished to discover (just like the rest of us) that the sort method does NOT sort things out quite as we might expect.  You don’t know what I’m talking about? Let’s dive into some code:

``````const myArray = [33, 2, 98, 25, 4]
myArray.sort() // [ 2, 25, 33, 4, 98 ]``````

Wait, what? Is JavaScript nuts? In which world is 25 and 33 smaller than 4? Before you start rethinking your whole life, let’s figure this out.

## Lexicographical sorting

What is actually happening here is that JavaScript is sorting our numerical array in a lexicographical manner—think alphabetical order, where every value is treated as a string.

The catch here is that Array#sort can actually take a compare function as a parameter. However, if said function is not supplied, “[…] elements are sorted by converting them to strings and comparing strings in Unicode code point order” (according to the MDN docs). This means that JavaScript will treat the following arrays in a similar fashion when calling the sort method:

``````const numbers = [80, 9]
numbers.sort() // [80, 9]

const strings = ['80', '9']
strings.sort() // ['80', '9']``````

In this case, “80” comes before “9” because it has a smaller Unicode code point. If you don’t believe me, let’s take a look at the code point value of the first character of each:

``````"8".codePointAt(0)  // 56
"9".codePointAt(0)  // 57``````

Basically the function `codePointAt()` is simply a method of the String object that is used to get the Unicode code point value of any character at a given index.

At this point, the following code shouldn’t be shocking to anybody because now we know that JavaScript is just converting all the elements in those arrays to strings and comparing their Unicode values. (Yes, Emojis also have Unicode code point values.)

``````const emojis = ["😍","😂","😰"]
emojis.sort() // ["😂", "😍", "😰"]

const wtfJavaScript = [390, "😂", 1, "2325"]
wtfJavaScript.sort() // [1, "2325", 390, "😂"]``````

## Numerical sorting

After all that mumbo jumbo, what if we actually JUST wanted to sort our array numerically? As stated before, we need to provide a compare function that will sort the array elements according to the return value of that compare function.

• If the return value of `compareFunction(a, b)` is less than 0, a will come before b
• If the return value is greater than 0, b will come before a
• If the return value is 0, a and b will remain unchanged

To compare numbers instead of strings, provide a function that subtracts b from a. Here is the example:

``````const myArray = [33, 2, 98, 25, 4]
myArray.sort((a,b) => a - b) // [ 2, 4, 25, 33, 98 ]``````

## Rollin’ in the deep

During all this JavaScript sorting fun, I bet you wondered at some point about the actual algorithm that’s used behind the JS native sort method. No? That was just me? Either way, let’s check it out.

Now, here’s the thing: the ECMAScript standard doesn’t really specify which algorithm should be used, so basically every JavaScript engine is allowed to use whatever algorithm they want. Why should you care about this? Keep reading, but first let’s find out which algorithms are used by which engines. (The good thing is that most of these engines are open source so we can actually look at the code and check what they are using.)

This isn’t a computer science article, but let’s get some things straight. Because JavaScript engines don’t use the same algorithm behind the native method, it is very likely that you’ll encounter different results when running the sort method in different browsers. You might find that elements are being sorted in a different way, or some sorts run faster than others (depending on the JavaScript engine). If your application relies more crucially on sorting data, you have to pay attention to these kinds of details.

For example, Google’s V8 that powers NodeJS uses the quick sort algorithm, which is not a stable algorithm. Stability in the context of sorting means that it preserves the original order of the input set when having elements with equal values. If you have to sort a list of people in your database who were previously sorted alphabetically by last name, you might want to preserve that original order when sorting again, this time according to age and looking for people of the same age. This means you will need a stable sort.

Since the Array#sort method is implemented using different algorithms (that may or may not be stable) depending on the JavaScript engine, stability is not guaranteed. Let’s check an example:

``````const people = [
{ name: 'Kei Akamatsu', age: 32 },
{ name: 'Fumiaki Haida', age: 42 },
{ name: 'Tengo Kawana', age: 26 },
{ name: 'Sara Kimoto', age: 11 },
{ name: 'Midori Kobayashi', age: 11 },
{ name: 'Eri Kurono', age: 54 },
{ name: 'Haruki Murakami', age: 6 },
{ name: 'Satoru Nakata', age: 26 },
{ name: 'Yoshio Oumi', age: 26 },
{ name: 'Miss Saeki', age: 17 },
{ name: 'Yuzuki Shirane', age: 26 },
{ name: 'Kafka Tamura', age: 26 },
{ name: 'Tsukuru Tazaki', age: 32 },
{ name: 'Toru Watanabe', age: 12 }
]

people.sort((a,b) => a.age - b.age)``````

In V8, this is the actual result:

``````{ name: 'Haruki Murakami', age: 6 },
{ name: 'Midori Kobayashi', age: 11 },
{ name: 'Sara Kimoto', age: 11 },
{ name: 'Toru Watanabe', age: 12 },
{ name: 'Miss Saeki', age: 17 },
{ name: 'Kafka Tamura', age: 26 },
{ name: 'Satoru Nakata', age: 26 },
{ name: 'Yuzuki Shirane', age: 26 },
{ name: 'Yoshio Oumi', age: 26 },
{ name: 'Tengo Kawana', age: 26 },
{ name: 'Tsukuru Tazaki', age: 32 },
{ name: 'Kei Akamatsu', age: 32 },
{ name: 'Fumiaki Haida', age: 42 },
{ name: 'Eri Kurono', age: 54 }``````

Notice how the previous alphabetical order for the people who are 26 years old is not preserved? In JavaScript engines implemented with stable algorithms, this is not a problem.  Just keep in mind that the sort method will sort differently depending on where you’re running it.

What if it’s crucial for your application to maintain consistent behavior across engines? Is implementing your own version of the sort method even an option? Maybe yes, maybe no.  If stability and performance are high on your priority list, it’s probably yes.

Actually, implementing your own homemade JavaScript sorting function is not that difficult and takes only a few lines of code. There are plenty of books that explain how to implement the most popular sorting algorithms. Two good resources are The Algorithm Design Manual by Steven S. Skiena and Foundations of Algorithms by Richard Neapolitan et al.

You can even extend the Array Prototype to call your shiny new implemented sort functions:

``````Array.prototype.InsertionSort = function() {
}

Array.prototype.MergeSort = function() {
}

Array.prototype.QuickSort = function() {
}

myArray.InsertionSort()
myArray.MergeSort()
myArray.QuickSort()``````

Believe it or not, self-implemented JavaScript sorting functions can definitely be faster than the native method, though that depends on various things, such as the amount of space that you have available, the kind and quantity of data that you will be sorting, and also the JavaScript engine that you will be using.

## Testing and benchmarking

Too hard to believe? Let’s do some benchmarking! The following table is the result of testing the native JavaScript sort method against my own implementations of insertion sort, merge sort, and quick sort for dynamically created arrays with x elements. The actual values shown on the table are the operations per second done by every method:

As mentioned before, the quantity of data to be sorted directly impacts the performance of every algorithm. Notice how insertion sort performs better than the other methods (including the native sort) for the first thousand elements. As the data input increases, insertion sort becomes slower and for a hundred thousand elements it becomes the least performant. At this point, however, quick sort takes the lead. For the following test cases, it proves to be still more performant than the native version. (It is very important to clarify that the previous benchmark was tested in Chrome 56 on a Mac OS X 10.12.3.)

Your homework now is to try performing the same benchmark on different machines, with different input sizes and different JavaScript engines to see what results you get!

## JavaScript inception

I might not be a fortune teller, but I bet you’re probably thinking: “How come self-implemented JavaScript sorting functions can beat the native sort C/C++ implementations?”

First of all, let’s backtrack a little bit. C/C++ implementations? Are we even sure about that?

If you peeked at the source code of the JavaScript engines, perhaps you noticed that in the case of V8 and Nitro, the implementation of the sort method is actually done in JavaScript itself. Wait again, what? Am I saying that those JavaScript engines are written in JavaScript itself? Is this some kind of JavaScript inception? Yes indeed.

In the world of computer science this is actually called self hosting: the art of implementing parts of a language in that very language itself. But then again, isn’t C++ suppose to be faster than JavaScript? Yes and no. C++ programs are definitely faster than JavaScript when they operate on C++ data structures but not on JavaScript ones.

JavaScript arrays have methods like `forEach`, `map`, `reduce`, and of course `sort`. All of these methods takes a callback as an argument. The method then iterates over every element of the list and invokes the callback during each iteration. This means that the execution has to switch between compiled C++ code and interpreted JavaScript, and this context switch is expensive. By staying in the same execution context within the same language we can boost performance. If you actually need some proof, try comparing the performance of running Array#forEach against a simple `for` loop.

In certain cases, you may notice that there is unneeded overhead due to the actual Array#sort method implementation in every engine. Calling callback functions for every element in our array adds an extra layer of complexity to the task of simply sorting numbers.  It seems that the native implementations just do more overall (in terms of error handling and features) than the simple self-implementations.

## TL;DR

Yes, most people are aware of Array#sort, its funky behavior, and the fact you’ll need a compare function to actually achieve a numerical sort. Because JavaScript. But did you know that the algorithms used to natively implement this method vary across engines? And that this produces different results depending on where you are running it?

If you care about performance and consistency, creating your own self-implementation of the sort method might not be a wild idea. Of course, you should learn how to choose your battles depending on the needs of your applications, whether you’re handling complex databases in a NodeJS application or sorting in the front end. Don’t implement your own versions for the sake of it, but take time to level the pros and cons. Only you can decide how pragmatic it is to spend time creating a homemade sort algorithm, as opposed to the native one.

Source: A List Apart