Pandas DataFrame query Function

  1. DataFrame.query() Method
  2. Filter Rows of Pandas DataFrame Using the DataFrame.query() Method
  3. Filter Rows of Pandas DataFrame With Specified Values for a Column Using the DataFrame.query() Method
  4. Filter Rows of Pandas DataFrame Based on Values of Multiple Columns Using the DataFrame.query() Method

This tutorial explains how to filter the rows of a Pandas DataFrame in Python using the DataFrame.query() method.

We will use the following DataFrame as the running example throughout the tutorial.

import pandas as pd

students_df = pd.DataFrame({
    'Name': ["Jonathan", "Will", "Michael", "Liva", "Sia", "Alice"],
    'Age': [10, 11, 9, 10, 10, 11],
    'Group': ["A", "B", "A", "A", "B", "B"],
    'GPA': [3.2, 3.5, 4.0, 2.9, 4.0, 3.6]
})

print(students_df)

Output:

       Name  Age Group  GPA
0  Jonathan   10     A  3.2
1      Will   11     B  3.5
2   Michael    9     A  4.0
3      Liva   10     A  2.9
4       Sia   10     B  4.0
5     Alice   11     B  3.6

DataFrame.query() Method

Syntax

DataFrame.query(expr,
                inplace=False,
                **kwargs)

Parameters

expr      String. The query expression used to filter rows of the DataFrame
inplace   Boolean. If True, modify the caller DataFrame in place instead of returning a new one
**kwargs  Keyword arguments passed on to pandas.eval()

Return

It returns a DataFrame containing the rows that match the expr query, or None if inplace=True.
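
The effect of the inplace parameter can be illustrated with a minimal sketch. The small DataFrame and the variable names older_df and result are made up for this illustration and are not part of the examples that follow.

```python
import pandas as pd

# A small illustrative DataFrame (hypothetical data for this sketch)
students_df = pd.DataFrame({
    'Name': ["Jonathan", "Will", "Michael"],
    'Age': [10, 11, 9],
})

# Default (inplace=False): a new DataFrame is returned,
# and students_df keeps all three rows.
older_df = students_df.query('Age >= 10')
print(len(students_df), len(older_df))

# inplace=True: students_df itself is filtered and the call returns None.
result = students_df.query('Age >= 10', inplace=True)
print(result)
print(students_df)
```

Assigning the result of a call with inplace=True is therefore a common source of bugs, since the assigned variable ends up holding None.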

Filter Rows of Pandas DataFrame Using the DataFrame.query() Method

import pandas as pd

students_df = pd.DataFrame({
    'Name': ["Jonathan", "Will", "Michael", "Liva", "Sia", "Alice"],
    'Age': [10, 11, 9, 10, 10, 11],
    'Group': ["A", "B", "A", "A", "B", "B"],
    'GPA': [3.2, 3.5, 4.0, 2.9, 4.0, 3.6]
})
print("The initial DataFrame is:")
print(students_df, "\n")

filtered_df = students_df.query('Group == "A"')

print("The DataFrame of students in Group A:")
print(filtered_df, "\n")

Output:

The initial DataFrame is:
       Name  Age Group  GPA
0  Jonathan   10     A  3.2
1      Will   11     B  3.5
2   Michael    9     A  4.0
3      Liva   10     A  2.9
4       Sia   10     B  4.0
5     Alice   11     B  3.6

The DataFrame of students in Group A:
       Name  Age Group  GPA
0  Jonathan   10     A  3.2
2   Michael    9     A  4.0
3      Liva   10     A  2.9

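The query keeps only the rows whose Group column equals A. The original DataFrame is left unchanged, and the filtered rows keep their original index labels (0, 2 and 3).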
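
For comparison, the same filter can also be written with a boolean mask. The sketch below assumes the same students_df columns as above and checks that both approaches select identical rows. The names by_query and by_mask are illustrative only.

```python
import pandas as pd

students_df = pd.DataFrame({
    'Name': ["Jonathan", "Will", "Michael", "Liva", "Sia", "Alice"],
    'Age': [10, 11, 9, 10, 10, 11],
    'Group': ["A", "B", "A", "A", "B", "B"],
    'GPA': [3.2, 3.5, 4.0, 2.9, 4.0, 3.6]
})

# Filter with a query string
by_query = students_df.query('Group == "A"')

# Filter with an equivalent boolean mask
by_mask = students_df[students_df['Group'] == "A"]

# Both approaches select the same rows
print(by_query.equals(by_mask))
```

The query string is often easier to read when several conditions are combined, while the boolean mask avoids parsing an expression from a string.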

Filter Rows of Pandas DataFrame With Specified Values for a Column Using the DataFrame.query() Method

import pandas as pd

students_df = pd.DataFrame({
    'Name': ["Jonathan", "Will", "Michael", "Liva", "Sia", "Alice"],
    'Age': [10, 11, 9, 10, 10, 11],
    'Group': ["A", "B", "A", "A", "B", "B"],
    'GPA': [3.2, 3.5, 4.0, 2.9, 4.0, 3.6]
})
print("The initial DataFrame is:")
print(students_df, "\n")

filtered_df = students_df.query('Age in [10, 11]')

print("The DataFrame of students aged 10 or 11 is:")
print(filtered_df, "\n")

Output:

The initial DataFrame is:
       Name  Age Group  GPA
0  Jonathan   10     A  3.2
1      Will   11     B  3.5
2   Michael    9     A  4.0
3      Liva   10     A  2.9
4       Sia   10     B  4.0
5     Alice   11     B  3.6

The DataFrame of students aged 10 or 11 is:
       Name  Age Group  GPA
0  Jonathan   10     A  3.2
1      Will   11     B  3.5
3      Liva   10     A  2.9
4       Sia   10     B  4.0
5     Alice   11     B  3.6

The in operator keeps every row of students_df whose Age value appears in the given list, here 10 or 11. Only Michael, who is 9, is dropped.
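
The list of values does not have to be written inside the query string. As a sketch, a Python variable can be referenced by prefixing its name with @, and not in selects the complementary rows. The variable names allowed_ages, matching_df and other_df are hypothetical and chosen only for this illustration.

```python
import pandas as pd

students_df = pd.DataFrame({
    'Name': ["Jonathan", "Will", "Michael", "Liva", "Sia", "Alice"],
    'Age': [10, 11, 9, 10, 10, 11],
    'Group': ["A", "B", "A", "A", "B", "B"],
    'GPA': [3.2, 3.5, 4.0, 2.9, 4.0, 3.6]
})

# Values to match, kept in an ordinary Python list
allowed_ages = [10, 11]

# @allowed_ages refers to the Python variable defined above
matching_df = students_df.query('Age in @allowed_ages')

# not in keeps the rows whose Age is absent from the list
other_df = students_df.query('Age not in @allowed_ages')

print(matching_df)
print(other_df)
```

Keeping the values in a variable makes it possible to reuse the same query string with different lists.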

Filter Rows of Pandas DataFrame Based on Values of Multiple Columns Using the DataFrame.query() Method

import pandas as pd

students_df = pd.DataFrame({
    'Name': ["Jonathan", "Will", "Michael", "Liva", "Sia", "Alice"],
    'Age': [10, 11, 9, 10, 10, 11],
    'Group': ["A", "B", "A", "A", "B", "B"],
    'GPA': [3.2, 3.5, 4.0, 2.9, 4.0, 3.6]
})
print("The initial DataFrame is:")
print(students_df, "\n")

filtered_df = students_df.query('Group == "B" and GPA == 4.0')

print("The DataFrame of students in Group B with GPA 4.0:")
print(filtered_df, "\n")

Output:

The initial DataFrame is:
       Name  Age Group  GPA
0  Jonathan   10     A  3.2
1      Will   11     B  3.5
2   Michael    9     A  4.0
3      Liva   10     A  2.9
4       Sia   10     B  4.0
5     Alice   11     B  3.6

The DataFrame of students in Group B with GPA 4.0:
  Name  Age Group  GPA
4  Sia   10     B  4.0

The and operator combines the two conditions, so the query selects only the rows of students_df whose Group is B and whose GPA is 4.0. Sia is the only student who satisfies both.
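
Conditions can be combined in other ways as well. The following sketch, which assumes the same students_df, uses or to match either condition and parentheses to group conditions explicitly. The names either_df and grouped_df are illustrative only.

```python
import pandas as pd

students_df = pd.DataFrame({
    'Name': ["Jonathan", "Will", "Michael", "Liva", "Sia", "Alice"],
    'Age': [10, 11, 9, 10, 10, 11],
    'Group': ["A", "B", "A", "A", "B", "B"],
    'GPA': [3.2, 3.5, 4.0, 2.9, 4.0, 3.6]
})

# Rows where at least one of the two conditions holds
either_df = students_df.query('Group == "B" or GPA == 4.0')

# Parentheses make the grouping of mixed and/or conditions explicit
grouped_df = students_df.query('(Group == "A" and Age == 10) or GPA > 3.5')

print(either_df)
print(grouped_df)
```

Writing the parentheses out is a cautious choice: it keeps the intended grouping obvious to readers without relying on operator precedence inside the query string.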
