Kalan's Blog

Kalan 頭像照片,在淡水拍攝,淺藍背景

四零二曜日電子報上線啦!訂閱訂起來

Software Engineer / Taiwanese / Life in Fukuoka
This blog supports RSS feed (all content), you can click RSS icon or setup through third-party service. If there are special styles such as code syntax in the technical article, it is still recommended to browse to the original website for the best experience.

Current Theme light

我會把一些不成文的筆記或是最近的生活雜感放在短筆記,如果有興趣的話可以來看看唷!

Please notice that currenly most of posts are translated by AI automatically and might contain lots of confusion. I'll gradually translate the post ASAP

Analysis of Array.sort

Analysis of Array.sort

This article does not discuss the things to be aware of in the native sort of JavaScript. For example:

;[1, 2, 3, 8, 20, 30, 11].sort()
// [1, 11, 2, 20, 3, 30, 8]

Because the default sort method converts values to strings and sorts them based on their char code, the above result occurs.

Today, we will explore the implementation behind the sort in JavaScript.

From the implementation in V8, we can see a few facts:

  • Array's sort uses quicksort for sorting.
  • When the array size is less than or equal to 10, insertion sort is used.

To simplify the code in V8, here is a simple implementation of quicksort: (for reference only)

function quickSort(arr, p, r) {
  if (p < r) {
    var q = partition(arr, p, r)
    quickSort(arr, q - 1, p)
    quickSort(arr, q + 1, r)
  }
}

function partition(arr, p, r) {
  var x = arr[r]
  var i = p - 1
  for (var j = p; j < r - 1; j++) {
    if (arr[j] <= x) {
      i += 1
      var tmp = a[j]
      arr[j] = a[i]
      arr[i] = tmp
    }
  }
  var tmp = arr[r]
  arr[r] = a[i + 1]
  arr[i + 1] = tmp
}

In-depth Analysis: Why Quicksort?

The key point in implementing quicksort is how to choose a relatively good pivot to avoid the worst-case scenario. In practical situations, the input data is not necessarily random, so random selection is often used to choose the pivot.

The first question is, why does V8 use quicksort? Although the average time complexity of quicksort can reach O(nlogn), the worst-case scenario can be O(n^2). Additionally, quicksort is not a stable algorithm, meaning that two data with the same value may have different positions after sorting.

Why not use mergesort?

Mergesort can be simply divided into two steps: first, divide the array and then continuously merge them using the merge function. Not only can it achieve average, worst, and best-case time complexity of O(nlogn), but the algorithm itself is also stable. So, why not use it?

In-place

In quicksort, we do not need to perform the merge operation on the array, which means the entire algorithm can be completed without additional space. On the other hand, mergesort requires O(n) space. Therefore, despite the aforementioned situations that can occur with quicksort, it is still a good choice in practice.

We can avoid the occurrence of the O(n^2) scenario by randomizing the selection of the pivot (how to randomly select the pivot can actually be explained in a separate article).

Stable

However, even so, we still cannot solve the stability issue. Although it may not be important in most scenarios (after all, data sorting is usually implemented by the backend), it becomes a significant consideration when it does occur.

Not every browser's implementation uses Quicksort

Insertion Sort

If you carefully examine the V8 source code, you will notice this section:

 while (true) {
  // Insertion sort is faster for short arrays.
  if (to - from <= 10) {
    InsertionSort(a, from, to);
    return;
  }

Hmm, why use insertion sort when the array size is less than or equal to 10?

To understand the reason, let's recall the principle of insertion sort. Insertion sort is similar to arranging cards in a card game. Each time you pick up a card, you find the most suitable position to insert it. Each time, the cards to be inserted are already sorted, allowing in-place sorting.

function insertionSort(arr) {
  for (var j = 1; j < arr.length; j++) {
    var key = arr[j]
    var i = j - 1
    while (i >= 0 && arr[i] > key) {
      arr[i + 1] = arr[i]
      i = i - 1
    }
    arr[i + 1] = key
  }
  return arr
}

Although insertion sort has the same time complexity as bubble sort, there is a significant difference in the number of swaps. Bubble sort has O(n^2) swaps, while insertion sort requires at most O(n).

Going back to the initial question, why use insertion sort when the array size is less than or equal to 10?

For small arrays, if they are already sorted or very close to being sorted, insertion sort is the only algorithm that can achieve a time complexity of O(n). This is quite efficient.

Conclusion

Because sorting is needed in work, I began to delve into what the native sort does behind the scenes. In addition to being aware that JavaScript defaults to comparing strings, understanding the implementation behind it is crucial when dealing with large arrays.

At the same time, we also understand that different sorting algorithms have their suitable scenarios. When using sort, remember:

  • Quicksort usually yields the best results in practice, but be aware that it is not a stable algorithm.
  • Although mergesort can achieve a time complexity of O(nlogn), it requires an additional O(n) space for merging.
  • Insertion sort performs well on small arrays and can achieve the best-case time complexity of O(n).

Prev

PostgreSQL Notes — INDEX

Next

How to Code Review

If you found this article helpful, please consider buy me a drink ☕️ It'll make my ordinary day shine✨

Buy me a coffee