Filter Rows After groupby() in Pandas Python

  1. Pandas groupby() Method
  2. Filter Rows After groupby() in Pandas Python

Pandas is an open-source Python library for analyzing and manipulating data. With Pandas, we can create data frames and perform a wide range of operations on them to extract the data we need.

Pandas also lets us filter and sort that data. This article explores how to filter rows in a data set after performing the groupby() operation.

Pandas groupby() Method

The groupby() method in Pandas splits the data into groups based on some criteria so that we can apply operations to each group separately. It is also an efficient way to aggregate data.

We call groupby() on a dataframe and pass it the grouping criteria, such as a column name. Let’s apply it to a small example dataframe of student marks.

The following code creates this dataframe.

Example Code:

import pandas as pd
data = {'Student_Name': ['Anil', 'Suharwardy', 'Fatina', 'John', 'Karen'],
        'Country': ['India', 'India', 'Pakistan', 'America', 'America'],
        'Biology': [68, 73, 87, 58, 78],
        'Chemistry': [78, 98, 89, 73, 87]}
data_frame = pd.DataFrame(data=data)
print(data_frame)

Output:

Example Pandas Dataframe

Now, let’s group this data by country with the help of groupby().

In the following snippet, we pass the Country column to groupby(), which places each row into the group for its country. We then print each group with the help of a loop.

Example Code:

import pandas as pd
data = {'Student_Name': ['Anil', 'Suharwardy', 'Fatina', 'John', 'Karen'],
        'Country': ['India', 'India', 'Pakistan', 'America','America'],
        'Biology': [68, 73, 87, 58, 78],
        'Chemistry': [78, 98, 89, 73, 87]}
data_frame = pd.DataFrame(data=data)
grouped = data_frame.groupby('Country')
# Each item is a (group key, sub-dataframe) pair
for country, group in grouped:
    print(country)
    print(group, '\n')

Output:

Using Pandas groupby() Method
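
The aggregation use mentioned earlier can be sketched with the same dataframe. This is a minimal example, not the only way to aggregate; the variable name country_means is chosen here purely for illustration.

```python
import pandas as pd

data = {'Student_Name': ['Anil', 'Suharwardy', 'Fatina', 'John', 'Karen'],
        'Country': ['India', 'India', 'Pakistan', 'America', 'America'],
        'Biology': [68, 73, 87, 58, 78],
        'Chemistry': [78, 98, 89, 73, 87]}
data_frame = pd.DataFrame(data=data)

# Average Biology and Chemistry marks for each country.
# The result has one row per country, indexed by the Country values.
country_means = data_frame.groupby('Country')[['Biology', 'Chemistry']].mean()
print(country_means)
```

Selecting the numeric columns before calling mean() keeps the Student_Name strings out of the calculation.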

Filter Rows After groupby() in Pandas Python

Now that we understand how groupby() works, we can look at how to filter the grouped data. Continuing with the example dataframe above, suppose we want to retrieve the country and other details of the student who scored 73 in Biology.

We can do this by calling the apply() method right after groupby() and passing it a function that selects the matching rows within each group.

Example Code:

import pandas as pd
data = {'Student_Name': ['Anil', 'Suharwardy', 'Fatina', 'John', 'Karen'],
        'Country': ['India', 'India', 'Pakistan', 'America', 'America'],
        'Biology': [68, 73, 87, 58, 78],
        'Chemistry': [78, 98, 89, 73, 87]}
data_frame = pd.DataFrame(data=data)
grouped = data_frame.groupby('Country').apply(lambda x: x[x['Biology'] == 73])
print(grouped)

Output:

Pandas Filter Rows After groupby() Method - Output 1
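
Because the condition above does not depend on the group a row belongs to, a plain boolean mask on the original dataframe should select the same row, without the extra Country index level that groupby().apply() typically adds. The following is a small sketch for comparison; the name matches is illustrative only.

```python
import pandas as pd

data = {'Student_Name': ['Anil', 'Suharwardy', 'Fatina', 'John', 'Karen'],
        'Country': ['India', 'India', 'Pakistan', 'America', 'America'],
        'Biology': [68, 73, 87, 58, 78],
        'Chemistry': [78, 98, 89, 73, 87]}
data_frame = pd.DataFrame(data=data)

# Boolean mask: True for rows whose Biology mark equals 73.
matches = data_frame[data_frame['Biology'] == 73]
print(matches)
```

The groupby() version becomes more useful when the condition needs per-group information, such as a group average.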

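We can also print the details of all students who belong to a specific country by testing the Country column inside the function passed to apply(). This code returns two rows from our dataframe. Note that the function here reads the grouping column itself; pandas 2.2 deprecated passing grouping columns to the function given to apply(), so on recent versions this snippet may emit a warning or fail.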

Example Code:

import pandas as pd
data = {'Student_Name': ['Anil', 'Suharwardy', 'Fatina', 'John', 'Karen'],
        'Country': ['India', 'India', 'Pakistan', 'America', 'America'],
        'Biology': [68, 73, 87, 58, 78],
        'Chemistry': [78, 98, 89, 73, 87]}
data_frame = pd.DataFrame(data=data)
grouped = data_frame.groupby('Country').apply(lambda x: x[x['Country'] == 'America'])
print(grouped)

Output:

Pandas Filter Rows After groupby() Method - Output 2
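
When the goal is simply to pull out one group, the get_group() method of the grouped object is another option, and it avoids reading the grouping column inside a lambda. A minimal sketch, with america as an illustrative variable name:

```python
import pandas as pd

data = {'Student_Name': ['Anil', 'Suharwardy', 'Fatina', 'John', 'Karen'],
        'Country': ['India', 'India', 'Pakistan', 'America', 'America'],
        'Biology': [68, 73, 87, 58, 78],
        'Chemistry': [78, 98, 89, 73, 87]}
data_frame = pd.DataFrame(data=data)

grouped = data_frame.groupby('Country')

# Retrieve only the rows that fall into the 'America' group.
america = grouped.get_group('America')
print(america)
```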

In conclusion, Pandas makes manipulating data considerably easier and more efficient with its comprehensive tools. Data frames can be grouped and filtered using the groupby() and apply() methods described above.

It is, however, important to note that Pandas is a very extensive library and offers more than one way to filter and sort data. We can always choose the approach most suitable for our development requirements.
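
As an illustration of such alternatives, the sketch below uses two other groupby tools: filter(), which keeps or drops whole groups, and transform(), which produces a per-row value that can be used in a boolean mask. The threshold of 70 and the variable names are assumptions made for this example.

```python
import pandas as pd

data = {'Student_Name': ['Anil', 'Suharwardy', 'Fatina', 'John', 'Karen'],
        'Country': ['India', 'India', 'Pakistan', 'America', 'America'],
        'Biology': [68, 73, 87, 58, 78],
        'Chemistry': [78, 98, 89, 73, 87]}
data_frame = pd.DataFrame(data=data)

# filter(): keep every row of the countries whose average Biology mark is above 70.
strong_countries = data_frame.groupby('Country').filter(
    lambda group: group['Biology'].mean() > 70)
print(strong_countries)

# transform(): compute each row's country average, then keep the rows
# where the student's own Biology mark is above that average.
country_avg = data_frame.groupby('Country')['Biology'].transform('mean')
above_avg = data_frame[data_frame['Biology'] > country_avg]
print(above_avg)
```

Both results keep the original row index and columns, which can be easier to work with than the nested index produced by apply().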

Author: Fariba Laiq