Pandas DataFrame DataFrame.aggregate() Function

  1. Syntax of pandas.DataFrame.aggregate()
  2. Example Codes: DataFrame.aggregate()
  3. Example Codes: DataFrame.aggregate() With the Multiple Functions
  4. Example Codes: DataFrame.aggregate() With a Specified Column

pandas.DataFrame.aggregate() applies one or more aggregation functions along the columns or rows of a DataFrame. Commonly used aggregation functions include min, max, and sum. Because each function reduces a column or row to a single value, the result is smaller than the original DataFrame.

Syntax of pandas.DataFrame.aggregate()

DataFrame.aggregate(func=None,
                    axis=0,
                    *args,
                    **kwargs)

Parameters

func The aggregation function to apply. It can be a callable, a string naming a function, a list of callables or strings, or a dictionary that maps column labels to functions.
axis 0 by default. If it is 0 or 'index', the function is applied to each column. If it is 1 or 'columns', the function is applied to each row.
*args Positional arguments passed on to func.
**kwargs Keyword arguments passed on to func.
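
As a rough illustration of the axis and **kwargs parameters, the sketch below aggregates a small numeric DataFrame along both axes and forwards a keyword argument to a custom function. The helper scaled_range and its factor parameter are hypothetical names invented for this example; they are not part of pandas.

```python
import pandas as pd

# Small numeric frame used only for illustration.
scores = pd.DataFrame({"Attendance": [60, 100, 80],
                       "Obtained Marks": [90, 75, 82]})

# axis=0 (the default): one result per column.
per_column = scores.agg("sum")

# axis=1: one result per row.
per_row = scores.agg("sum", axis=1)


def scaled_range(values, factor=1):
    """Hypothetical helper: (max - min) of a Series, multiplied by factor."""
    return (values.max() - values.min()) * factor


# Extra keyword arguments are forwarded to the aggregation function.
ranges = scores.agg(scaled_range, factor=2)

print(per_column)
print(per_row)
print(ranges)
```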

Return

This function returns a scalar, Series, or a DataFrame.

  • It returns a scalar if a single function is called with Series.agg().
  • It returns a Series if a single function is called with DataFrame.agg().
  • It returns a DataFrame if multiple functions are called with DataFrame.agg().
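
The three cases listed above can be checked with a short sketch. The DataFrame here is a reduced, assumed version of the example data, used only to inspect the type of each result.

```python
import pandas as pd

marks = pd.DataFrame({"Attendance": [60, 100, 80],
                      "Obtained Marks": [90, 75, 82]})

# A single function on a Series gives a scalar.
scalar_result = marks["Attendance"].agg("max")

# A single function on a DataFrame gives a Series (one value per column).
series_result = marks.agg("max")

# Several functions on a DataFrame give a DataFrame (one row per function).
frame_result = marks.agg(["min", "max"])

print(scalar_result)
print(isinstance(series_result, pd.Series))
print(isinstance(frame_result, pd.DataFrame))
```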

Example Codes: DataFrame.aggregate()

DataFrame.agg() is an alias for DataFrame.aggregate(). Because the alias is shorter, the example codes below use DataFrame.agg().

import pandas as pd

dataframe = pd.DataFrame({'Attendance': {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
                          'Name': {0: 'Olivia', 1: 'John', 2: 'Laura', 3: 'Ben', 4: 'Kevin'},
                          'Obtained Marks': {0: 90, 1: 75, 2: 82, 3: 64, 4: 45}})
print(dataframe)

The example DataFrame is below.

   Attendance    Name  Obtained Marks
0          60  Olivia              90
1         100    John              75
2          80   Laura              82
3          78     Ben              64
4          95   Kevin              45

We will first check the DataFrame.agg() function using only a single aggregation function.

import pandas as pd

dataframe = pd.DataFrame({'Attendance': {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
                          'Name': {0: 'Olivia', 1: 'John', 2: 'Laura', 3: 'Ben', 4: 'Kevin'},
                          'Obtained Marks': {0: 90, 1: 75, 2: 82, 3: 64, 4: 45}})

dataframe1 = dataframe.agg('sum')
print(dataframe1)

Output:

Attendance                            413
Name              OliviaJohnLauraBenKevin
Obtained Marks                        356
dtype: object

The aggregate function sum is applied to the individual columns.

For the integer columns, sum returns the total; for the string column, it concatenates the strings. The result is a Series, and its dtype is object because it holds both integers and a string.
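
If the concatenated names are not wanted, one option is to restrict the aggregation to numeric columns first. The sketch below assumes the same example data and uses select_dtypes() to drop the Name column before calling agg().

```python
import pandas as pd

dataframe = pd.DataFrame({
    'Attendance': [60, 100, 80, 78, 95],
    'Name': ['Olivia', 'John', 'Laura', 'Ben', 'Kevin'],
    'Obtained Marks': [90, 75, 82, 64, 45],
})

# Keep only the numeric columns, then sum each of them.
numeric_sum = dataframe.select_dtypes(include='number').agg('sum')
print(numeric_sum)
```

With only numeric columns left, the resulting Series no longer needs the object dtype.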

Example Codes: DataFrame.aggregate() With the Multiple Functions

import pandas as pd

dataframe = pd.DataFrame({'Attendance': {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
                          'Name': {0: 'Olivia', 1: 'John', 2: 'Laura', 3: 'Ben', 4: 'Kevin'},
                          'Obtained Marks': {0: 90, 1: 75, 2: 82, 3: 64, 4: 45}})

dataframe1 = dataframe.agg(['sum', 'min'])
print(dataframe1)

Output:

     Attendance                     Name  Obtained Marks
sum         413  OliviaJohnLauraBenKevin             356
min          60                      Ben              45

The aggregation functions sum and min are applied to the individual columns.

For the integer columns, min returns the smallest value. For the string column, it returns the string that comes first in lexicographic order (Ben here), not the shortest string.
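
To see how min treats a string column, the sketch below uses a few hypothetical names in which the alphabetically first name is also the longest one. It compares the result of agg('min') with an explicit search for the shortest string.

```python
import pandas as pd

# Hypothetical names chosen so that alphabetical order and length disagree.
names = pd.Series(["Zed", "Alexander", "Mia"])

# agg('min') compares strings lexicographically.
alphabetical_min = names.agg("min")

# Finding the shortest string needs an explicit key on the length.
shortest = min(names, key=len)

print(alphabetical_min)  # Alexander
print(shortest)          # Zed
```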

Example Codes: DataFrame.aggregate() With a Specified Column

import pandas as pd

dataframe = pd.DataFrame({'Attendance': {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
                          'Name': {0: 'Olivia', 1: 'John', 2: 'Laura', 3: 'Ben', 4: 'Kevin'},
                          'Obtained Marks': {0: 90, 1: 75, 2: 82, 3: 64, 4: 45}})

dataframe1 = dataframe.agg({"Obtained Marks": 'sum'})
print(dataframe1)

Output:

Obtained Marks    356
dtype: int64

Only the sum of the selected column is returned. Because the dictionary maps the column to a single function, the result is a Series indexed by column name, here with dtype int64.

We can also apply multiple functions to a single column by passing them as a list.

import pandas as pd

dataframe = pd.DataFrame({'Attendance': {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
                          'Name': {0: 'Olivia', 1: 'John', 2: 'Laura', 3: 'Ben', 4: 'Kevin'},
                          'Obtained Marks': {0: 90, 1: 75, 2: 82, 3: 64, 4: 45}})

dataframe1 = dataframe.agg({"Obtained Marks": ['sum', 'max']})
print(dataframe1)

Output:

     Obtained Marks
sum             356
max              90
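
The dictionary form can also name several columns, each with its own list of functions. The sketch below assumes the same example data; the particular choice of functions per column is only illustrative. Where a function was not requested for a column, the corresponding cell is NaN.

```python
import pandas as pd

dataframe = pd.DataFrame({
    'Attendance': [60, 100, 80, 78, 95],
    'Name': ['Olivia', 'John', 'Laura', 'Ben', 'Kevin'],
    'Obtained Marks': [90, 75, 82, 64, 45],
})

# Different functions for different columns.
summary = dataframe.agg({
    'Attendance': ['min', 'max'],
    'Obtained Marks': ['sum', 'max'],
})
print(summary)
```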
