This tutorial will show the implementation of the heap sort algorithm in Python.
Heap Sort Algorithm in Python
Heap sort is a robust algorithm for sorting arrays and lists in Python. It is popular because it is very fast and doesn’t take any extra space like merge sort and quick sort.
The time complexity of heap sort is
The heap sort is an in-place algorithm that doesn’t create any more data structures to save the intermediary states of data. Instead, it makes changes in our original array.
Hence, this saves us a lot of space when the data is very large.
The only disadvantage of this algorithm is that it is very unstable. If we have multiple elements in our array with the same value at different indices, their locations will change while sorting.
The heap sort algorithm works by recursively creating either a min or a max heap, taking out the root node, placing it at the first unsorted index in our array, and converting the last heap element into the root node.
This process is recursively repeated until we are left with a single node inside our heap. In the end, the last heap element is placed at the last index of our array.
If we think about it for a second, this process resembles the selection sort algorithm as we take either the largest or the smallest values and place them at the top of our sorted array.
Implement the Heap Sort Algorithm in Python
We will first look at implementing the
build_heap() function that takes the original array, the array’s length, and the index of our parent node. Here, if we look at an array, the index for the last parent node is located at
(n//2 - 1) inside our array.
Similarly, the index for the left child of that specific parent is
2*parent_index + 1, and the index for the right child is
2*parent_index + 2.
In this example, we are trying to create a max-heap. That means every parent node needs to be greater than its child nodes.
For this, we will start from the last parent node and move upwards to the root node of our heap. If we wanted to create a min-heap, we would want all our parent nodes to be smaller than their child nodes.
build_heap() function will check whether the left or the right child is greater than the current parent node and swap the greatest node with the parent node.
This function recursively calls itself because we want to repeat the previous process incrementally for all the parent nodes in our heap.
The following code snippet demonstrates a working implementation of the
built_heap() function mentioned above in Python.
def build_heap(arr, length, parent_index): largest_index = parent_index left_index = 2 * parent_index + 1 right_index = 2 * parent_index + 2 if left_index < length and arr[parent_index] < arr[left_index]: largest_index = left_index if right_index < length and arr[largest_index] < arr[right_index]: largest_index = right_index if largest_index != parent_index: arr[parent_index], arr[largest_index] = arr[largest_index], arr[parent_index] build_heap(arr, length, largest_index)
Now, we have a function that takes the maximum value inside our array and places it on the root of our heap. We need a function that takes the unsorted array, calls the
build_heap() function and extracts elements from the heap.
The following code snippet demonstrates the implementation of the
heapSort() function in Python.
def heapSort(arr): length = len(arr) for parent_index in range(length // 2 - 1, -1, -1): build_heap(arr, length, parent_index) for element_index in range(length - 1, 0, -1): arr[element_index], arr = arr, arr[element_index] build_heap(arr, element_index, 0)
We incrementally call each parent node’s
build_heap() function inside our array. Notice that we are giving
length//2-1 as the starting index and
-1 as the ending index, with a step of
This means that we are starting from the last parent node and incrementally decreasing our index by 1 until we reach the root node.
for loop extracts elements from our heap. It also starts from the last index and stops at the first index of our array.
We swap our array’s first and last elements in this loop and execute the
build_heap() function on the newly sorted array by passing 0 as the root index.
Now, we have written our program to implement heap sort in Python. It is time to sort an array and test the code written above.
arr = [5, 3, 4, 2, 1, 6] heapSort(arr) print("Sorted array :", arr)
Sorted array : [1, 2, 3, 4, 5, 6]
As we can see, our array is completely sorted. This means that our code works just fine.
If we want to sort in descending order, we can create a min-heap instead of the max-heap implemented above.
This article won’t explain min-heap because it already discussed what a min-heap is at the start of this tutorial.
Our program works in the following fashion. The following block shows the state of our array at each stage of code execution.
Original Array [5, 3, 4, 2, 1, 6] # input array Building Heap [5, 3, 6, 2, 1, 4] # after build_heap() pass 1 Building Heap [5, 3, 6, 2, 1, 4] # after build_heap() pass 2 Building Heap [6, 3, 5, 2, 1, 4] # after build_heap() pass 3 Extracting Elements [6, 3, 5, 2, 1, 4] # before swapping and build_heap pass 1 Extracting Elements [5, 3, 4, 2, 1, 6] # before swapping and build_heap pass 2 Extracting Elements [4, 3, 1, 2, 5, 6] # before swapping and build_heap pass 3 Extracting Elements [3, 2, 1, 4, 5, 6] # before swapping and build_heap pass 4 Extracting Elements [2, 1, 3, 4, 5, 6] # before swapping and build_heap pass 5 Sorted array : [1, 2, 3, 4, 5, 6] # after swapping and build_heap pass 5
build_heap() function gets executed 3 times because there are only 3 parent nodes in our heap.
After that, our element extraction phase takes the first element, swaps it with the last element, and executes the
build_heap() function again. This process is repeated for
length - 1, and our array gets sorted.