Pandas DataFrame isin Function

  1. DataFrame.isin() Method
  2. Filter Rows of Pandas DataFrame Using the DataFrame.isin() Method
  3. Filter Rows of Pandas DataFrame With Specified Values for a Column Using the DataFrame.isin() Method
  4. Filter Rows of Pandas DataFrame Based on Values of Multiple Columns Using the DataFrame.isin() Method

The Pandas DataFrame.isin() method checks whether each element of a DataFrame is contained in a given collection of values.

import pandas as pd

students_df = pd.DataFrame({
    'Name': ["Jonathan", "Will", "Michael", "Liva", "Sia", "Alice"],
    'Age': [10, 11, 9, 10, 10, 11],
    'Group': ["A", "B", "A", "A", "B", "B"],
    'GPA': [3.2, 3.5, 4.0, 2.9, 4.0, 3.6]
})

print(students_df)

Output:

       Name  Age Group  GPA
0  Jonathan   10     A  3.2
1      Will   11     B  3.5
2   Michael    9     A  4.0
3      Liva   10     A  2.9
4       Sia   10     B  4.0
5     Alice   11     B  3.6

We will use this example DataFrame to explain how to filter the rows of a Pandas DataFrame in Python using the DataFrame.isin() method.

DataFrame.isin() Method

Syntax

DataFrame.isin(values)

Parameters

values: the values to test against. It can be an iterable (list, tuple, set, etc.), a dictionary, a Series, or a DataFrame.

Return

It returns a DataFrame of Booleans with the same shape as the caller DataFrame, indicating whether each element is contained in the input values.
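
As a minimal sketch of the return value, the snippet below calls isin() on a small two-column DataFrame, once with a list and once with a dictionary. The DataFrame df and the names mask_list and mask_dict are made up for illustration and do not appear in the examples of this article. With a dictionary, the keys are treated as column names, and columns that are not listed come back as all False.

```python
import pandas as pd

# Small illustrative DataFrame (hypothetical, numeric columns only).
df = pd.DataFrame({"Age": [10, 11, 9], "GPA": [3.2, 3.5, 4.0]})

# List: every cell of every column is tested against the same values.
mask_list = df.isin([10, 4.0])

# Dictionary: each key names a column, each value lists what to look for there.
# The GPA column is not in the dictionary, so it is all False.
mask_dict = df.isin({"Age": [9, 10]})

print(mask_list)
print(mask_dict)
print(mask_list.shape == df.shape)  # the mask has the same shape as df
```

Both results are Boolean DataFrames with the same index and columns as df.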

Filter Rows of Pandas DataFrame Using the DataFrame.isin() Method

import pandas as pd

students_df = pd.DataFrame({
    'Name': ["Jonathan", "Will", "Michael", "Liva", "Sia", "Alice"],
    'Age': [10, 11, 9, 10, 10, 11],
    'Group': ["A", "B", "A", "A", "B", "B"],
    'GPA': [3.2, 3.5, 4.0, 2.9, 4.0, 3.6]
})
print("The initial DataFrame is:")
print(students_df, "\n")

boolean_indicies = students_df["Group"].isin(["A"])
filtered_df = students_df[boolean_indicies]

print("The DataFrame of students from Group A is:")
print(filtered_df, "\n")

Output:

The initial DataFrame is:
       Name  Age Group  GPA
0  Jonathan   10     A  3.2
1      Will   11     B  3.5
2   Michael    9     A  4.0
3      Liva   10     A  2.9
4       Sia   10     B  4.0
5     Alice   11     B  3.6

The DataFrame of students from Group A is:
       Name  Age Group  GPA
0  Jonathan   10     A  3.2
2   Michael    9     A  4.0
3      Liva   10     A  2.9

The code applies the isin() method to the Group column of the students_df DataFrame. Because a single column is a Series, the call returns a Series of Boolean values: True for each row whose Group value is A, and False otherwise.

We then use the boolean_indicies Series to index the students_df DataFrame. Only the rows for which boolean_indicies is True are selected.
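
To make the intermediate step visible, the following sketch prints the Boolean mask itself before using it for selection. It uses a shortened, three-row version of the student data, and the names mask, in_group_a, and not_in_group_a are chosen here for illustration. The ~ operator, which inverts a Boolean Series, is shown as one possible way to select the complementary rows.

```python
import pandas as pd

# Shortened version of the student data, for illustration only.
students_df = pd.DataFrame({
    "Name": ["Jonathan", "Will", "Michael"],
    "Group": ["A", "B", "A"],
})

# One Boolean per row: is the Group value in the list?
mask = students_df["Group"].isin(["A"])
print(mask.tolist())

# Rows where the mask is True.
in_group_a = students_df[mask]

# Rows where the mask is False, obtained by inverting the mask with ~.
not_in_group_a = students_df[~mask]

print(in_group_a)
print(not_in_group_a)
```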

Filter Rows of Pandas DataFrame With Specified Values for a Column Using the DataFrame.isin() Method

import pandas as pd

students_df = pd.DataFrame({
    'Name': ["Jonathan", "Will", "Michael", "Liva", "Sia", "Alice"],
    'Age': [10, 11, 9, 10, 10, 11],
    'Group': ["A", "B", "A", "A", "B", "B"],
    'GPA': [3.2, 3.5, 4.0, 2.9, 4.0, 3.6]
})
print("The initial DataFrame is:")
print(students_df, "\n")

boolean_indicies = students_df["Age"].isin([10, 11])
filtered_df = students_df[boolean_indicies]

print("The DataFrame of students aged 10 or 11 years is:")
print(filtered_df, "\n")

Output:

The initial DataFrame is:
       Name  Age Group  GPA
0  Jonathan   10     A  3.2
1      Will   11     B  3.5
2   Michael    9     A  4.0
3      Liva   10     A  2.9
4       Sia   10     B  4.0
5     Alice   11     B  3.6

The DataFrame of students aged 10 or 11 years is:
       Name  Age Group  GPA
0  Jonathan   10     A  3.2
1      Will   11     B  3.5
3      Liva   10     A  2.9
4       Sia   10     B  4.0
5     Alice   11     B  3.6

It selects every row of the students_df DataFrame whose Age value is either 10 or 11.
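
One way to read a multi-value isin() call is as a chain of equality tests joined with the | (or) operator. The sketch below checks this on the Age values from the example. The names ages, mask_isin, and mask_or are introduced here for illustration only.

```python
import pandas as pd

# The Age column from the example, as a standalone Series.
ages = pd.Series([10, 11, 9, 10, 10, 11], name="Age")

# Membership test against a list of accepted values.
mask_isin = ages.isin([10, 11])

# The same condition written as explicit comparisons joined with "or".
mask_or = (ages == 10) | (ages == 11)

# Compare the two masks element by element.
print(mask_isin.equals(mask_or))
```

The isin() form stays short as the list of accepted values grows, whereas the explicit form needs one comparison per value.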

Filter Rows of Pandas DataFrame Based on Values of Multiple Columns Using the DataFrame.isin() Method

import pandas as pd

students_df = pd.DataFrame({
    'Name': ["Jonathan", "Will", "Michael", "Liva", "Sia", "Alice"],
    'Age': [10, 11, 9, 10, 10, 11],
    'Group': ["A", "B", "A", "A", "B", "B"],
    'GPA': [3.2, 3.5, 4.0, 2.9, 4.0, 3.6]
})
print("The initial DataFrame is:")
print(students_df, "\n")

boolean_indicies_group = students_df["Group"].isin(["B"])
boolean_indicies_gpa = students_df["GPA"].isin([4.0])
filtered_df = students_df[boolean_indicies_group & boolean_indicies_gpa]

print("The DataFrame of students in Group B with GPA 4.0:")
print(filtered_df, "\n")

Output:

The initial DataFrame is:
       Name  Age Group  GPA
0  Jonathan   10     A  3.2
1      Will   11     B  3.5
2   Michael    9     A  4.0
3      Liva   10     A  2.9
4       Sia   10     B  4.0
5     Alice   11     B  3.6

The DataFrame of students in Group B with GPA 4.0:
   Name  Age Group  GPA
4   Sia   10     B  4.0

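It selects every row of the students_df DataFrame that has the value B in the Group column and the value 4.0 in the GPA column. The & operator combines the two Boolean Series element-wise, so a row is kept only when both conditions are True.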
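
As an alternative sketch, the same filter can be written with a single DataFrame.isin() call that takes a dictionary, followed by all(axis=1) to require a match in every listed column. This is not the approach used in the example above. The names criteria and mask are assumptions made here for illustration.

```python
import pandas as pd

students_df = pd.DataFrame({
    "Name": ["Jonathan", "Will", "Michael", "Liva", "Sia", "Alice"],
    "Age": [10, 11, 9, 10, 10, 11],
    "Group": ["A", "B", "A", "A", "B", "B"],
    "GPA": [3.2, 3.5, 4.0, 2.9, 4.0, 3.6],
})

# Hypothetical criteria: column name -> list of accepted values.
criteria = {"Group": ["B"], "GPA": [4.0]}

# Test only the columns named in criteria, then keep the rows
# where every one of those columns matched.
mask = students_df[list(criteria)].isin(criteria).all(axis=1)
filtered_df = students_df[mask]

print(filtered_df)
```

Restricting the DataFrame to the columns named in criteria matters here: columns missing from the dictionary would come back as all False and make all(axis=1) reject every row.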
