Union-Find Algorithm in Python

Manav Narula Apr 01, 2022
  1. The Union-Find Algorithm
  2. Implement the Union-Find Algorithm in Python
Union-Find Algorithm in Python

This tutorial will discuss how to implement the union-find algorithm in Python.

The Union-Find Algorithm

In programming, selecting the optimal data structure is very important to ensure the efficient performance of our code.

The union-find data structure is used to note and keep track of a given set of values partitioned into many disjoint subsets that are not overlapping. This data structure is also known as a disjoint subset.

It supports two operations on the subsets, Find and Union. Let us discuss them below.

The Find operation finds the subset of a given element. It provides the subset representative, typically an item from this set.

The Union operation merges two subsets. It combines the subsets only if they belong to the same set, and the two subsets then share the representative.

We use two Find operations to compare the two elements and check if they belong to the same subset. If they have the same representative, they do, and then we perform the Union operation to merge the two subsets to which the two elements belonged.

The Union-Find algorithm has different applications like finding the minimum spanning tree, detecting cycles in an undirected graph, etc.

Implement the Union-Find Algorithm in Python

To implement the Union-Find in Python, we use the concept of trees. The tree’s root can act as a representative, and each node will hold the reference to its parent node.

The Union-Find algorithm will traverse the parent nodes to reach the root and combine two trees by attaching their roots.

Example:

class uf_ds:
    parent_node = {}

    def make_set(self, u):
        for i in u:
            self.parent_node[i] = i

    def op_find(self, k):
        if self.parent_node[k] == k:
            return k
        return self.op_find(self.parent_node[k])

    def op_union(self, a, b):
        x = self.op_find(a)
        y = self.op_find(b)
        self.parent_node[x] = y


def display(u, data):
    print([data.op_find(i) for i in u])

u = [1, 3, 5, 7]
data = uf_ds()
data.make_set(u)
display(u, data)
data.op_union(1, 5)
display(u, data)
data.op_union(7, 1)
display(u, data)

Output:

[1, 3, 5, 7]
[5, 3, 5, 7]
[5, 3, 5, 5]

In the above example, we implemented the Union-Find Algorithm in Python. We create a class to represent a disjoint set.

This class defines the Union and Find operations methods and displays the set. After comparing the values using the Find operation, we define an object of this class and combine the trees using the Union operation.

Notice the result after each operation.

However, the time complexity becomes linear in the worst-case scenario for the above implementation. The trees get skewered and are no better than linked lists.

Thus, two optimization techniques have been introduced to improve the Union-Find algorithm.

The first involves the ranking of the tree to improve the Union operation. The depth of the tree can increase the time complexity of the naïve approach, so this technique ensures that we attach the root of the smaller tree to the bigger tree.

This improves the worst-case time complexity of the Union-Find algorithm in Python to almost O(Log(n)).

Path compression is another technique used to improve the efficiency of the Union-Find Algorithm in Python. This approach intends to flatten the given tree and improve the Find operation.

This intends to make the discovered root node the parent of node n. This removes the need to traverse through the immediate nodes.

All the elements below this compress when node n is the root of a given subtree.

These two techniques are efficient in improving the time-complexity of the Union-Find Algorithm, making it even less than (OLog(n)). They both complement each other.

We can implement them in our code, as shown below.

class ufds:
    parent_node = {}
    rank = {}

    def make_set(self, u):
        for i in u:
            self.parent_node[i] = i
            self.rank[i] = 0

    def op_find(self, k):
        if self.parent_node[k] != k:
            self.parent_node[k] = self.op_find(self.parent_node[k])
        return self.parent_node[k]

    def op_union(self, a, b):
        x = self.op_find(a)
        y = self.op_find(b)

        if x == y:
            return
        if self.rank[x] > self.rank[y]:
            self.parent_node[y] = x
        elif self.rank[x] < self.rank[y]:
            self.parent_node[x] = y
        else:
            self.parent_node[x] = y
            self.rank[y] = self.rank[y] + 1


def display(u, data):
    print([data.op_find(i) for i in u])


u = [1, 3, 5, 7]
data = ufds()
data.make_set(u)
display(u, data)
data.op_union(1, 5)
display(u, data)
data.op_union(7, 1)
display(u, data)

Output:

[1, 3, 5, 7]
[5, 3, 5, 7]
[5, 3, 5, 5]

Notice the changes in the op_union() and op_find() functions to implement the ranking of the union and path compression techniques, respectively.

Author: Manav Narula
Manav Narula avatar Manav Narula avatar

Manav is a IT Professional who has a lot of experience as a core developer in many live projects. He is an avid learner who enjoys learning new things and sharing his findings whenever possible.

LinkedIn

Related Article - Python Algorithm