Sorting in JavaScript: Should every compare function have a “return 0” statement?

I recently read many answers about sorting in JavaScript and I often stumble upon a compare function that looks like this:

array.sort(function(a,b){ a > b ? 1 : -1; });

So it is a compare function that returns 1 if a is greater than b and -1 if a is less than OR EQUAL TO b. As described on MDN (link), a compare function can also return zero, to ensure that the relative position of two items remains unchanged:

If compareFunction(a, b) returns 0, leave a and b unchanged with respect to each other, but sorted with respect to all different elements.

So the official examples look more like this:

function compare(a, b) {
  if (a < b) return -1;
  if (a > b) return 1;
  return 0;
}

And indeed, by adding a return 0 statement, the sorting algorithm often needs less iterations and runs faster in total (JSPerf).

So I was wondering if there is any advantage on omitting a return 0 statement.

I realized that on MDN, it also says:

Note: the ECMAscript standard does not guarantee this behaviour, and thus not all browsers (e.g. Mozilla versions dating back to at least 2003) respect this.

referring to the behavior, that a and b should remain unchanged if 0 is returned. So maybe, by returning 0, we get a slightly different sorted array in different browsers? Could that be a reason? And are there any other good reasons for not returning zero at all?

Answers:

Answer

A compare method should obey the contract

Math.sign(compare(a, b)) = -Math.sign(compare(b, a))

If you return a nonzero answer when a==b, that contract is violated.

Answer

So I was wondering if there is any advantage on omitting a return 0 statement.

Less letters to type. And it might be a tiny bit faster due to the one omitted comparison. All other effects are disadvantages.

I realized that on MDN, it also says:

Note: the ECMAscript standard does not guarantee this behaviour, and thus not all browsers (e.g. Mozilla versions dating back to at least 2003) respect this.

referring to the behavior, that a and b should remain unchanged if 0 is returned.

That the position of a and b may remain unchanged is only the requirement for a stable sort. This is not a specified behaviour, and some browsers have implemented a non-stable sort algorithm.

However, the actual purpose of returning zero is that neither a is sorted before b (as if less than 0) nor that b is sorted before a (as if greater than 0) - basically when a equals b. This is a must-have for a comparison, and all sorting algorithms obey it.

To produce a valid, satisfiable ordering (mathematically: divide the items into totally ordered equivalence classes), a comparison must have certain properties. They are listed in the spec for sort as requirements for a "consistent comparison function".

The most prominent one is reflexivity, demanding that an item a is equal to a (itself). Another way to say this is:

compare(a, a) must always return 0

What happens when a comparison function does not satisfy this (like the one you stumbled upon obviously does)?

The spec says

If comparefn […] is not a consistent comparison function for the elements of this array, the behaviour of sort is implementation-defined.

which basically means: If you provide an invalid comparison function, the array will probably not be sorted correctly. It might get randomly permuted, or the sort call might not even terminate.

So maybe, by returning 0, we get a slightly different sorted array in different browsers? Could that be a reason?

No, by returning 0 you get a correctly sorted array across browsers (which might be different due to the unstable sort). The reason is that by not returning 0 you get slightly different permuted arrays (if at all), maybe even producing the expected result but usually in a more complicated manner.

So what could happen if you don't return 0 for equivalent items? Some implementations have no problems with this, as they never compare an item to itself (even if apparent at multiple positions in the array) - one can optimise this and omit the costly call to the compare function when it is already known that the result must be 0.

The other extreme would be a never-terminating loop. Assuming you had two equivalent items after each other, you would compare the latter with the former and realise that you had to swap them. Testing again, the latter would still be smaller than the former and you'd have to swap them again. And so on…

However, an efficient algorithm mostly does not test already compared items again, and so usually the implementation will terminate. Still, it might do more or less swaps that actually had been unnecessary, and will therefore take longer than with a consistent comparison function.

And are there any other good reasons for not returning zero at all?

Being lazy and hoping that the array does not contain duplicates.

Tags

Recent Questions

Top Questions

Home Tags Terms of Service Privacy Policy DMCA Contact Us Javascript

©2020 All rights reserved.