How to Filter Data in a Pandas DataFrame

Fariba Laiq Feb 15, 2024
  1. Filter Data in a Pandas DataFrame Based on Single Condition
  2. Filter Data in a Pandas DataFrame Based on Multiple Conditions
  3. Filter Data in a Pandas DataFrame Based on Values in Multiple Columns

This tutorial will demonstrate filtering data in a Pandas dataframe based on single or multiple conditions.

Boolean indexing means choosing subsets of data or filtering data based on some conditions. We deal with the actual values of the data in the dataframe rather than their row or column labels or integer positions.

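A boolean vector, holding one True or False value per row, is used to filter data in boolean indexing. Comparisons such as == (EQUAL) produce these vectors, and the operators | (OR), & (AND), and ~ (NOT) combine or invert them. Each condition must be wrapped in parentheses when several are combined, because |, &, and ~ take precedence over the comparison operators.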
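To make the boolean vector concrete, here is a minimal sketch. The small in-memory table and its Name, Department, and Marks values are invented for illustration; they are not the contents of Student.csv.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration
df = pd.DataFrame(
    {
        "Name": ["Ali", "Sara", "Omar", "Hina"],
        "Department": ["CS", "EE", "CS", "EE"],
        "Marks": [55, 85, 72, 91],
    }
)

# A comparison on a column yields a boolean Series (the "boolean vector")
mask = df["Department"] == "CS"
print(mask.tolist())  # [True, False, True, False]

# Indexing with the mask keeps only the rows where it is True
cs_rows = df[mask]
print(cs_rows)

# ~ inverts the mask, selecting every row that is not in CS
non_cs_rows = df[~mask]
print(non_cs_rows)
```

The mask is an ordinary Series, so it can be stored in a variable, inspected, and reused across several filters.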

Filter Data in a Pandas DataFrame Based on Single Condition

We can filter the data using a single column’s value by applying a single condition.

In the following code, we load the students’ data and filter the records by applying a single condition to the Department column. Only the records of students whose department is CS are displayed.

Example code:

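# Python 3.x
import pandas as pd

# Load the student records; the file must contain a "Department" column
df = pd.read_csv("Student.csv")
display(df)  # display() is provided by Jupyter/IPython; use print(df) in a plain script
# Keep only the rows whose Department is "CS"
df_filtered = df[df["Department"] == "CS"]
display(df_filtered)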

Output:

[Figure: Filter Data Based On Single Condition]
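Since Student.csv is not included here, the sketch below uses a small made-up table as a stand-in (the names and marks are assumptions). It shows two things that are easy to miss about a single-condition filter: the result keeps the original row labels, and the original dataframe is left unchanged.

```python
import pandas as pd

# Hypothetical stand-in for Student.csv
df = pd.DataFrame(
    {
        "Name": ["Ali", "Sara", "Omar", "Hina"],
        "Department": ["CS", "EE", "CS", "EE"],
        "Marks": [55, 85, 72, 91],
    }
)

df_filtered = df[df["Department"] == "CS"]

# The filtered rows keep their original index labels (0 and 2 here)
print(df_filtered.index.tolist())  # [0, 2]

# The original dataframe still has all four rows
print(len(df))  # 4

# Optionally renumber the rows of the result starting from zero
df_renumbered = df_filtered.reset_index(drop=True)
print(df_renumbered)
```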

Filter Data in a Pandas DataFrame Based on Multiple Conditions

We can also apply multiple conditions to the same column, for example to select values that fall within a range.

If we want to display only those students’ records whose marks are greater than 60 but less than 90, we will use multiple conditions joined by the & operator.

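An important thing to remember is to use the operators &, |, and ~ instead of the Python keywords and, or, and not, respectively. The keywords try to reduce a whole column to a single True or False value, so pandas raises a ValueError.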
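The difference between the keyword and the operator can be seen in a short sketch. The marks below are made-up values used only for this demonstration.

```python
import pandas as pd

marks = pd.Series([55, 85, 72, 91])  # hypothetical marks

# The and keyword needs a single True/False from each Series, which pandas refuses to give
error_message = None
try:
    result = (marks > 60) and (marks < 90)
except ValueError as err:
    error_message = str(err)
    print("and failed:", error_message)

# & compares element by element; the parentheses are required because
# & binds more tightly than > and <
in_range = (marks > 60) & (marks < 90)
print(in_range.tolist())  # [False, True, True, False]
```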

Example code:

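# Python 3.x
import pandas as pd

# Load the student records; the file must contain a numeric "Marks" column
df = pd.read_csv("Student.csv")
display(df)  # display() is provided by Jupyter/IPython; use print(df) in a plain script
# Keep rows whose Marks are strictly between 60 and 90
df_filtered = df[(df["Marks"] > 60) & (df["Marks"] < 90)]
display(df_filtered)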

Output:

[Figure: Filter Data Based on Multiple Conditions on the Same Column]
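For a range check on one column, pandas also offers Series.between as a shorthand. This is an alternative to the approach above, not part of the original example; the sketch uses made-up data and assumes pandas 1.3 or later, where the inclusive parameter accepts string values such as "neither".

```python
import pandas as pd

# Hypothetical marks chosen to include both boundary values
df = pd.DataFrame(
    {
        "Name": ["Ali", "Sara", "Omar", "Hina"],
        "Marks": [60, 75, 89, 90],
    }
)

# Two explicit conditions joined by &
explicit = df[(df["Marks"] > 60) & (df["Marks"] < 90)]

# The same range check with between(); inclusive="neither" excludes 60 and 90
shorthand = df[df["Marks"].between(60, 90, inclusive="neither")]

print(explicit["Name"].tolist())   # ['Sara', 'Omar']
print(explicit.equals(shorthand))  # True
```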

Filter Data in a Pandas DataFrame Based on Values in Multiple Columns

We can also filter the data using conditions based on the values in multiple columns.

In the following code, we filter the records so that only those whose department is EE and whose marks are greater than or equal to 80 are displayed. We use parentheses to group the individual conditions.

Filtering on more than one column always requires multiple conditions, one per column, joined by & or |.

Example code:

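# Python 3.x
import pandas as pd

# Load the student records; the file must contain "Department" and "Marks" columns
df = pd.read_csv("Student.csv")
display(df)  # display() is provided by Jupyter/IPython; use print(df) in a plain script
# Keep rows where Department is "EE" and Marks are at least 80
df_filtered = df[(df["Department"] == "EE") & (df["Marks"] >= 80)]
display(df_filtered)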

Output:

[Figure: Filter Data Based on Multiple Conditions]
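The example above uses only &. As a rough sketch of how | and ~ behave across two columns, the code below names each condition and then combines them in three ways. The table is made up for illustration and does not reflect the actual contents of Student.csv.

```python
import pandas as pd

# Hypothetical sample data, invented for this illustration
df = pd.DataFrame(
    {
        "Name": ["Ali", "Sara", "Omar", "Hina", "Zain"],
        "Department": ["CS", "EE", "CS", "EE", "EE"],
        "Marks": [55, 85, 72, 91, 64],
    }
)

# Naming each condition keeps the combined filters readable
is_ee = df["Department"] == "EE"
high_marks = df["Marks"] >= 80

both = df[is_ee & high_marks]          # in EE and marks of 80 or more
either = df[is_ee | high_marks]        # in EE, or marks of 80 or more, or both
ee_not_high = df[is_ee & ~high_marks]  # in EE with marks below 80

print(both["Name"].tolist())         # ['Sara', 'Hina']
print(either["Name"].tolist())       # ['Sara', 'Hina', 'Zain']
print(ee_not_high["Name"].tolist())  # ['Zain']
```

Storing each condition in a variable also removes the need for extra parentheses when the conditions are combined, since each variable already holds a finished boolean vector.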
