Pandas DataFrame DataFrame.query() Function

  1. Syntax of pandas.DataFrame.query():
  2. Example Codes: DataFrame.query() Method With Single Condition
  3. Example Codes: DataFrame.query() Method When Column Name Has Whitespace
  4. Example Codes: DataFrame.query() Method With Multiple Conditions

pandas.DataFrame.query() method filters the rows of the caller DataFrame using the given query expression.

Syntax of pandas.DataFrame.query():

DataFrame.query( expr, 
                 inplace=False, 
                 **kwargs)

Parameters

expr Query expression based upon which rows are filtered
inplace Boolean. If True, modify the caller DataFrame in-place
**kwargs Keyword arguments for the method

Return

If inplace is True, it returns the filtered DataFrame; otherwise None.

Example Codes: DataFrame.query() Method With Single Condition

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3,],
                   'Y': [4, 1, 8]})
print("Original DataFrame:")
print(df)

filtered_df=df.query('X>1')
print("Filtered DataFrame:")
print(filtered_df)

Output:

Original DataFrame:
   X  Y
0  1  4
1  2  1
2  3  8
Filtered DataFrame:
   X  Y
1  2  1
2  3  8

It returns the DataFrame with only the rows that satisfy the given query expression i.e. only the rows whose value in column X is greater than 1.

Example Codes: DataFrame.query() Method When Column Name Has Whitespace

We must make sure column names to be queried do not have any white spaces before applying this method to DataFrame.

If we have column names with spaces in them, we could use backtick quoting (`).

import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3,],
                   'Y': [4, 1, 8],
                    'A B':[3,5,7]})
print("Original DataFrame:")
print(df)
filtered_df=df.query('`A B`>5')
print("Filtered DataFrame:")
print(filtered_df)

Output:

Original DataFrame:
   X  Y  A B
0  1  4    3
1  2  1    5
2  3  8    7
Filtered DataFrame:
   X  Y  A B
2  3  8    7

Here, the column A B has space in its name. To make query expression for the column, we enclose the column name in backticks; otherwise, it will raise an error.

Example Codes: DataFrame.query() Method With Multiple Conditions

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3,],
                   'Y': [4, 1, 8]})
print("Original DataFrame:")
print(df)

filtered_df=df.query('X>1' and 'Y==1')
print("Filtered DataFrame:")
print(filtered_df)

Output:

Original DataFrame:
   X  Y
0  1  4
1  2  1
2  3  8
Filtered DataFrame:
   X  Y
1  2  1

If we wish to filter DataFrame based on multiple conditions, we combine the multiple query expressions using and operator to make a single composite query expression.

It gives the DataFrame with rows whose value of column X is greater than 1, and value of column Y equals 1.

We can modify the original DataFrame after calling query() method by setting inplace=True.

import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3,],
                   'Y': [4, 1, 8]})
filtered_df=df.query('X>1' and 'Y==1',inplace=True)
print(df)

Output:

   X  Y
1  2  1

Related Article - Pandas DataFrame

  • Pandas DataFrame DataFrame.to_csv() Function
  • Pandas DataFrame DataFrame.to_excel() Function
  • comments powered by Disqus