Using the isin() Function in Pandas DataFrame

Using the isin() Function in Pandas DataFrame

  1. Use the isin() Function to Create DataFrame From Dictionary Objects in Pandas
  2. Use the isin() Function to Filter Pandas DataFrame
  3. Use the isin() Method to Filter Multiple Columns in Pandas Dataframe
  4. Use the isin() Method With Not (~) Matching Condition to Filter Pandas Dataframe

We will discuss in this tutorial how to use the like SQL IN and Not IN operators to filter pandas DataFrame. Moreover, we will also show you how to filter a single row/column, filter multiple columns, filter pandas DataFrame based on conditions using the isin() function and unary operator (~) with the help of various examples.

Use the isin() Function to Create DataFrame From Dictionary Objects in Pandas

The below example DataFrame contains the columns Student Name, Subject, Semester, Marks. Import pandas library and create a DataFrame.

import pandas as pd

student_record = {
    'Student Name':["Samreena","Affan","Mirha","Asif"],
    'Subject' :["SDA","Ethics","Web Design","Web Development"],
    'Semester':['6th','7th','5th','8th'],
    'Marks':[100,90,80,70]
              }
index_labels=[0,1,2,3]
df = pd.DataFrame(student_record,index=index_labels)
print(df)

Output:

   Student Name         Subject  Semester  Marks
0     Samreena              SDA      6th    100
1        Affan           Ethics      7th     90
2        Mirha       Web Design      5th     80
3         Asif  Web Development      8th     70

Use the isin() Function to Filter Pandas DataFrame

We can filter pandas DataFrame rows using the isin() method similar to the IN operator in SQL.

To filter rows, will check the desired elements in a single column. Using the pd.series.isin() function, we can check whether the search elements are present in the series.

If the element will match in the series, it returns true otherwise false.

For example, we want to return rows that contain Web Design and Web Development subjects in the Subject column.

import pandas as pd

student_record = {
    'Name':["Samreena","Affan","Mirha","Asif"],
    'Subject' :["SDA","Ethics","Web Design","Web Development"],
    'Semester':['6th','7th','5th','8th'],
    'Marks':[100,90,80,70]
              }
index_labels=[0,1,2,3]
dataframe = pd.DataFrame(student_record,index=index_labels)
# Find elements in a Column to return rows
subjects_list=["Web Design", "Web Development"]
dataframe1=dataframe[dataframe.Subject.isin(subjects_list)]
print(dataframe1)

Output:

    Name          Subject   Semester  Marks
2   Mirha       Web Design      5th     80
3   Asif   Web Development      8th     70

Notice that only those student names are displayed Web Development and Web Design subjects are returned.

We can return a Boolean array by displaying true and false with Pandas DataFrame row indices.

import pandas as pd

student_record = {
    'Name':["Samreena","Affan","Mirha","Asif"],
    'Subject' :["SDA","Ethics","Web Design","Web Development"],
    'Semester':['6th','7th','5th','8th'],
    'Marks':[100,90,80,70]
              }
index_labels=[0,1,2,3]

dataframe = pd.DataFrame(student_record,index=index_labels)
subjects_list=["Web Design", "Web Development"]
dataframe1=dataframe.Subject.isin(subjects_list)
print(dataframe1)

Output:

0    False
1    False
2    True
3    True
Name: Subject, dtype: bool

Use the isin() Method to Filter Multiple Columns in Pandas Dataframe

We can also apply a filter on multiple columns using the isin() method. For example, we want to retrieve all those rows having the SDA subject or the fifth semester.

import pandas as pd

student_record = {
    'Name':["Samreena","Affan","Mirha","Asif"],
    'Subject' :["SDA","Ethics","Web Design","Web Development"],
    'Semester':['6th','7th','5th','8th'],
    'Marks':[100,90,80,70]
              }
index_labels=[0,1,2,3]
dataframe = pd.DataFrame(student_record,index=index_labels)
dataframe1=dataframe[dataframe[['Subject', 'Semester']].isin(["SDA", '7th']).any(axis=1)]
print(dataframe1)

Output:

       Name  Subject  Semester  Marks
0   Samreena    SDA      6th    100
1     Affan  Ethics      7th     90

Use the isin() Method With Not (~) Matching Condition to Filter Pandas Dataframe

The isin() method behaves like the IN operator in SQL. We will use the unary operator (~) to implement the Not IN operator.

For example, we want to display only those rows that do not contain the Web Design and Ethics subjects.

import pandas as pd

student_record = {
    'Name':["Samreena","Affan","Mirha","Asif"],
    'Subject' :["SDA","Ethics","Web Design","Web Development"],
    'Semester':['6th','7th','5th','8th'],
    'Marks':[100,90,80,70]
              }
index_labels=[0,1,2,3]
dataframe = pd.DataFrame(student_record,index=index_labels)
subjects_list=["Web Design", "Ethics"]

# Applying Not operator
dataframe1=dataframe[~dataframe.Subject.isin(subjects_list)]
print(dataframe1)

Output:

       Name          Subject  Semester  Marks
0  Samreena              SDA      6th    100
3      Asif  Web Development      8th     70

Related Article - Pandas DataFrame

  • Get Pandas DataFrame Column Headers as a List
  • Delete Pandas DataFrame Column
  • Convert Pandas Column to Datetime
  • Convert a Float to an Integer in Pandas DataFrame
  • Sort Pandas DataFrame by One Column's Values
  • Get the Aggregate of Pandas Group-By and Sum