Every decade or so, I become fascinated with sorting algorithms and spend some time implementing various approaches in my language of choice. The first time around, it was C++. Last time I took a stab at it, it was C#. This time around, I wanted to implement some popular sorting algorithms in Go. I think I enjoy it so much because sorting is easy to verify and there are so many varied approaches. A lot of people have spent a lot of time and energy coming up with creative methods for organizing array elements. I also wanted to use this as a test bed for some relatively new language features in Go, namely generics and fuzzing. Finally, I wanted to code with some assistance from ChatGPT, to see what it could do for me as a programming tool.

Sorting algorithms generally fall into two broad categories: comparison sorts and integer sorts. Comparison sorts compare pairs of elements to determine their relative order and then move elements around, usually by swapping. Most sorting algorithms fall into this category, and they are typically the most useful for real-world scenarios. Integer sorts operate on non-negative integer values and usually rely on number-related tricks, rather than comparisons, to sort data. They are typically faster than comparison sorts, but not always practically useful. It really depends on your use case.

There’s also another defining characteristic that sorting algorithms either have or do not have: stability. An algorithm is considered stable if the relative order of equal elements is maintained during the sorting operation. Meaning, if you sort an array of structs based on some field, elements with equal values for that field will remain in their original order. This can be useful for applying multiple sorting operations to a data set, like sorting cars by both make and model. Some algorithms are generally considered stable, while others are not, but frequently this can also be affected by how the algorithm is implemented.

For my experiments, I decided to prefer blazing speed with a secondary emphasis on low memory usage and completely ignore stability. Some of the algorithms I created are most likely stable, but I haven’t validated them with any kind of testing.

Before creating my first implementation of a sorting algorithm, I had to decide on a common function signature to use. I wanted it to be generic, so I ended up with this:

```go
func Sort[T constraints.Ordered](s []T) []T {
```

This allows the sorting function to accept a slice of any primitive data type that can be compared and therefore ordered. Since this was mostly for fun and I didn’t care about stability, I wasn’t concerned with the ability to sort structs. The function accepts a slice, with no guarantee that the slice will remain untouched, and returns a sorted slice. In most implementations, slices are sorted in place and the returned slice is merely a formality.

In order to ensure that the sorting algorithms were implemented correctly, I created a suite of unit tests and benchmark tests. Test cases cover arrays in random order, already sorted, and sorted in reverse order, which covers the typical average-case, best-case, and worst-case scenarios. I also played around with fuzzing to test with random inputs:

```go
func Fuzz_Sort(f *testing.F) {
	f.Fuzz(func(t *testing.T, seed int64) {
		// seed the random number generator with the fuzz input, so the test case can be reproduced
		r := rand.New(rand.NewSource(seed))

		sliceLength := randInt(r, 1, 100000)
		numLowerBound := randInt(r, 0, 999999)
		numUpperBound := randInt(r, numLowerBound+1, 1000000)

		// create a slice of random length with random values
		length := randInt(r, 0, sliceLength)
		s := make([]int, length)
		for i := range s {
			s[i] = randInt(r, numLowerBound, numUpperBound)
		}

		result := quicksort.Sort(s)

		assert.True(t, sort.IntsAreSorted(result))
	})
}
```

I started with the simple sorts, namely Bubble, Selection, and Insertion. These are the types of sorting algorithms that you learn in school and are easy to understand, but are generally poor performers. Insertion sort is actually pretty quick on small arrays, so it is sometimes used to increase performance in more complicated algorithms.

Insertion sort works by inserting each element, one at a time, into its sorted position in a sorted array. One major optimization is to use a binary search to determine where the element should be inserted. Some other minor optimizations included using a bit shift for division and a single copy function call for the insertion. Here’s what the optimized Insertion sort looks like:

```go
// Insertion sort
func Sort[T constraints.Ordered](s []T) []T {
	for i := 1; i < len(s); i++ {
		if s[i] > s[i-1] {
			// if the value to insert is greater than the largest value in the
			// sorted section, it is already in the correct place
			continue
		}

		// use binary search to find the correct position to insert value
		pos := 0
		end := i
		for pos < end {
			j := int(uint(pos+end) >> 1) // use bit shift to divide by 2 since it is faster
			if s[i] < s[j] {
				end = j
			} else {
				pos = j + 1
			}
		}

		// shift elements to the right and insert value
		v := s[i]
		copy(s[pos+1:], s[pos:i])
		s[pos] = v
	}
	return s
}
```

I then moved on to some of the more complicated sorts, like Merge and Quick. Merge sort is a divide-and-conquer algorithm that sorts an array by dividing it into two halves, recursively sorting each half, and then merging the two sorted halves. I started with a more traditional recursive implementation:

```go
// Merge sort
func Sort[T constraints.Ordered](s []T) []T {
	length := len(s)
	if length <= 1 {
		return s
	}

	// sort left and right sides
	mid := length >> 1
	left := Sort(s[:mid])
	right := Sort(s[mid:])

	// recombine left and right sides in sorted order
	sorted := make([]T, length)
	l, r := 0, 0
	for i := 0; i < length; i++ {
		if l >= len(left) {
			sorted[i] = right[r]
			r++
		} else if r >= len(right) {
			sorted[i] = left[l]
			l++
		} else if left[l] < right[r] {
			sorted[i] = left[l]
			l++
		} else {
			sorted[i] = right[r]
			r++
		}
	}

	return sorted
}
```

The above implementation is actually pretty fast right off the bat, but I wanted to make it faster. Some optimizations I tried were to not allocate an array with each recursive call and then to eliminate the recursion completely and sort in place. These optimizations weren’t actually faster by themselves, because I was having to call `copy` more frequently. Eventually, I settled on a non-recursive version that allocates a single buffer array:

```go
// Merge sort
func Sort[T constraints.Ordered](s []T) []T {
	length := len(s)
	buf := make([]T, length)

	// merge progressively larger segments
	for size := 1; size < length; size <<= 1 {
		for segment := 0; segment < length; segment += size << 1 {
			// make sure the end of the right segment doesn't extend past the end of the slice
			end := segment + size<<1
			if end > length {
				end = length
			}

			// recombine left and right segments in sorted order
			l, r := segment, segment+size
			for i := segment; i < end; i++ {
				if l >= segment+size {
					buf[i] = s[r]
					r++
				} else if r >= end {
					buf[i] = s[l]
					l++
				} else if s[l] < s[r] {
					buf[i] = s[l]
					l++
				} else {
					buf[i] = s[r]
					r++
				}
			}
		}

		// swap the source and buffer slices for the next iteration
		s, buf = buf, s
	}

	return s
}
```

Finally, I had a blazingly fast Merge sort. For randomly sorted arrays, this was my benchmark to beat until I implemented a Quick sort. At a high level, Quick sort works by partitioning the array around a pivot element, with smaller elements on the left side and larger elements on the right side. It continues to recursively sort the resulting subarrays on either side of the pivot until the entire array is sorted. There are multiple methods for determining the pivot, including Lomuto (pivot at the end) and Hoare (pivot in the middle). After implementing both, I found Hoare to be more performant, which agrees with public consensus. Here is an optimized Quick sort using a Hoare partition scheme and Insertion sort if the number of elements is small:

```go
// Quick sort
func Sort[T constraints.Ordered](s []T) []T {
	if len(s) < 10 {
		return insertionsort.Sort(s)
	}

	// set pivot to middle of slice
	pivot := s[(len(s)-1)/2]

	start := 0
	end := len(s) - 1

	for start <= end {
		// move start index to the right while the element is less than the pivot
		for s[start] < pivot {
			start++
		}

		// move end index to the left while the element is greater than the pivot
		for s[end] > pivot {
			end--
		}

		if start <= end {
			// swap elements, since left element is greater than pivot and right element is less than pivot
			s[start], s[end] = s[end], s[start]
			start++
			end--
		}
	}

	// sort left and right sides, not including the pivot
	if end > 0 {
		Sort(s[:end+1])
	}
	if start < len(s)-1 {
		Sort(s[start:])
	}

	return s
}
```

Finally, we can look at some results for the various sorting algorithms. Benchmarked on a 2022 M1 MacBook Pro, here are the numbers for the comparison sorts. As expected, Quicksort takes home the gold:

| Algorithm | Random Order (ns/op) | Sorted (ns/op) | Reverse Sorted (ns/op) |
| --------- | -------------------: | -------------: | ---------------------: |
| Bubble    | 88513901             | 6059           | 70483268               |
| Selection | 31884152             | 34853398       | 31465277               |
| Insertion | 3096925              | 5307           | 5339183                |
| Shell     | 622258               | 56295          | 112767                 |
| Heap      | 569913               | 420920         | 438415                 |
| Merge     | 493800               | 143152         | 173888                 |
| Quick     | 418391               | 46351          | 49302                  |

And here are the numbers for the integer sorts. Counting sort blazes past the competition, but it could be considered cheating, since the data is actually reconstituted instead of swapped:

| Algorithm  | Random Order (ns/op) | Sorted (ns/op) | Reverse Sorted (ns/op) |
| ---------- | -------------------: | -------------: | ---------------------: |
| Pigeonhole | 168380               | 169163         | 167443                 |
| Counting   | 19592                | 19778          | 19182                  |
| Bucket     | 331089               | 56711          | 167627                 |