How to Factorize Data Values in Pandas

Preet Sanghavi Feb 02, 2024

In this tutorial, we will learn how to factorize data values in Pandas using the pandas.factorize() function.

The pandas.factorize() function encodes an array as integers. It identifies the distinct values in the array and replaces each element with the integer code of its value, which gives a numeric representation of the data.

First, we import Pandas, NumPy, and the other libraries we need.

import numpy as np
import pandas as pd
from pandas.api.types import CategoricalDtype

Use the pandas.factorize() Function in Pandas

Now we pass a list of characters to the factorize() function, which returns two arrays: the integer labels and the unique values. We store them in separate variables.

labels, uniques = pd.factorize(["b", "d", "d", "c", "a", "c", "a", "b"])

The above code returns the numeric representation of each character (labels) and the unique values in order of first appearance (uniques).

Let us print both results using the code below.

print("Numeric Representation : \n", labels)
print("Unique Values : \n", uniques)

Output:

Numeric Representation :
 [0 1 1 2 3 2 3 0]
Unique Values :
 ['b' 'd' 'c' 'a']
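
The relationship between the two results can be checked directly: each label is a position in the unique values, so indexing the unique values with the labels should rebuild the original data. The following is a minimal sketch of that check; the variable names values and restored are illustrative and not part of the original tutorial.

```python
import pandas as pd

values = ["b", "d", "d", "c", "a", "c", "a", "b"]
labels, uniques = pd.factorize(values)

# Each label is an index into uniques, so indexing uniques
# with the labels recovers the original sequence of values.
restored = uniques[labels]

print(list(restored))
```

If the printed list matches the input list, the labels and unique values together carry all the information of the original data.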

We can also sort the unique values by passing sort=True, as in the code below.

labels, uniques = pd.factorize(["b", "d", "d", "c", "a", "c", "a", "b"], sort=True)

With this change, printing the labels and unique values gives the following output.

Numeric Representation :
 [1 3 3 2 0 2 0 1]
Unique Values :
 ['a' 'b' 'c' 'd']
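
For readers familiar with NumPy, the sorted result can be compared with np.unique() called with return_inverse=True, which also returns sorted unique values plus the position of each element among them. The sketch below assumes plain string data without missing values; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

values = ["b", "d", "d", "c", "a", "c", "a", "b"]

# Factorize with sorted unique values.
codes_sorted, uniques_sorted = pd.factorize(values, sort=True)

# np.unique returns the sorted unique values and, with return_inverse=True,
# the index of each original element within those unique values.
np_uniques, np_inverse = np.unique(values, return_inverse=True)

print(list(codes_sorted) == list(np_inverse))
print(list(uniques_sorted) == list(np_uniques))
```

Sorting changes only how the codes are numbered. Each code still points at the same value as before.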

We can also factorize categorical data. In this case, the unique values are returned as a Categorical rather than a plain array. To demonstrate, we create the data with the pd.Categorical() function and declare three categories, one of which ("b") does not occur in the data.

a = pd.Categorical(["a", "a", "c"], categories=["a", "b", "c"])

label3, unique3 = pd.factorize(a)

Printing label3 and unique3 in the same way as before gives the following output.

Numeric Representation :
 [0 0 1]
Unique Values :
 ['a', 'c']
Categories (3, object): ['a', 'b', 'c']

The output shows that the unique values contain only the values that actually occur in the data ("a" and "c"), while the unused category "b" is still listed among the categories.

Therefore, we can factorize data values in Pandas using the approaches shown above.
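
The difference between the declared categories and the observed unique values can be made explicit by comparing the codes stored on the Categorical itself with the codes returned by factorize(). This is a small sketch of that comparison; the variable names cat, codes, and uniques are illustrative.

```python
import pandas as pd

cat = pd.Categorical(["a", "a", "c"], categories=["a", "b", "c"])
codes, uniques = pd.factorize(cat)

# Codes on the Categorical refer to positions in the declared categories.
print(list(cat.codes))

# Codes from factorize() refer to positions in the observed unique values.
print(list(codes))

# The unique values hold only what occurs in the data,
# but the full set of categories is carried along.
print(list(uniques))
print(list(uniques.categories))
```

Here "c" has code 2 within the categories but code 1 after factorizing, because the unused category "b" is skipped when the unique values are collected.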
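
The same function can be applied to a DataFrame column, for example to add an integer-coded version of a text column. The sketch below uses a made-up DataFrame; the column names grade and grade_code are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame({"grade": ["b", "d", "d", "c", "a"]})

# Factorizing a Series returns the integer codes and the unique values.
codes, uniques = pd.factorize(df["grade"])

# Store the codes next to the original column.
df["grade_code"] = codes

print(df)
print(list(uniques))
```

Keeping the unique values alongside the codes makes it possible to map the integers back to the original labels later.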

