How to get the aggregate of Pandas groupby and sum

  1. Cumulative sum with groupby
  2. pivot() to rearrange the data in a nice table
  3. Apply function to groupby in pandas
  4. agg() to get aggregate sum of the column

We will demonstrate how to get the aggregate of Pandas groupby and sum. We will also look at pivot() to arrange the data in a table, show how to define a custom function and apply it to the grouped DataFrame, and get the aggregate sum of a column with agg().

Cumulative sum with groupby

We can get the cumulative sum by combining the groupby() method with cumsum(). Consider the following DataFrame, which holds the date, the fruit name and the sale on that date:

import pandas as pd
df = pd.DataFrame({'Date': ['08/09/2018', '10/09/2018', '08/09/2018', '10/09/2018'],
                   'Fruit': ['Apple', 'Apple', 'Banana', 'Banana',],
                   'Sale': [34, 12, 22, 27]})

To calculate the cumulative sum of Sale for each Fruit across the dates, we first sum the sales per Fruit and Date, then take a running total within each Fruit (index level 0):

import pandas as pd
df = pd.DataFrame({'Date': ['08/09/2018', '10/09/2018', '08/09/2018', '10/09/2018'],
                   'Fruit': ['Apple', 'Apple', 'Banana', 'Banana',],
                   'Sale': [34, 12, 22, 27]})
# Sum per (Fruit, Date), then accumulate within each Fruit (index level 0)
print(df.groupby(by=['Fruit', 'Date']).sum().groupby(level=[0]).cumsum())

Output:

                   Sale
Fruit  Date
Apple  08/09/2018    34
       10/09/2018    46
Banana 08/09/2018    22
       10/09/2018    49
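
When the DataFrame already has one row per Fruit and Date and the rows are in date order, as in this example, the running total can also be kept as an extra column instead of a new table. The following is a minimal sketch under that assumption; the column name CumulativeSale is chosen here only for illustration:

```python
import pandas as pd

df = pd.DataFrame({'Date': ['08/09/2018', '10/09/2018', '08/09/2018', '10/09/2018'],
                   'Fruit': ['Apple', 'Apple', 'Banana', 'Banana'],
                   'Sale': [34, 12, 22, 27]})

# Running total of Sale within each Fruit, following the existing row order.
# 'CumulativeSale' is an illustrative column name, not part of the original data.
df['CumulativeSale'] = df.groupby('Fruit')['Sale'].cumsum()
print(df)
```

If a fruit had several rows for the same date, this version would accumulate row by row rather than per date, so summing per Fruit and Date first, as above, would still be needed.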

pivot() to rearrange the data in a nice table

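The pivot() method reshapes a DataFrame by choosing which column supplies the row labels (index) and which supplies the column labels (columns). Let’s change the above code and apply pivot() to rearrange the data in a nice table. In recent pandas versions these arguments must be passed by keyword: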

import pandas as pd
df = pd.DataFrame({'Date': ['08/09/2018', '10/09/2018', '08/09/2018', '10/09/2018'],
                   'Fruit': ['Apple', 'Apple', 'Banana', 'Banana',],
                   'Sale': [34, 12, 22, 27]})
# Sum per (Fruit, Date), then spread the dates across the columns
print(df.groupby(['Fruit', 'Date'], as_index=False).sum().pivot(index='Fruit', columns='Date').fillna(0))

Output:

             Sale           
Date   08/09/2018 10/09/2018
Fruit                       
Apple          34         12
Banana         22         27
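
The grouping and pivoting steps can also be combined. As a hedged alternative to the code above, pivot_table() aggregates duplicate Fruit/Date pairs itself and fills missing combinations through its fill_value argument. A minimal sketch using the same data:

```python
import pandas as pd

df = pd.DataFrame({'Date': ['08/09/2018', '10/09/2018', '08/09/2018', '10/09/2018'],
                   'Fruit': ['Apple', 'Apple', 'Banana', 'Banana'],
                   'Sale': [34, 12, 22, 27]})

# One row per Fruit, one column per Date, cells hold the summed Sale.
# Missing Fruit/Date combinations would be filled with 0.
table = df.pivot_table(index='Fruit', columns='Date', values='Sale',
                       aggfunc='sum', fill_value=0)
print(table)
```

Because values='Sale' is given as a single column name, the resulting columns are the dates only, without the extra Sale level that the pivot() version shows.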

Apply function to groupby in pandas

We will create a simple function that returns the number of values in a Series or 1-D array, and pass it to agg() on a groupby object to count how often each value occurs:

import pandas as pd
df = pd.DataFrame({"series": ['1', '2', '1', '1', '4', '4', '5']})
def get_count(values):
    # Return the number of rows in the group
    return len(values)
# Run get_count once for each group of equal values
grouped_count = df.groupby("series")["series"].agg(get_count)
print(grouped_count)

Running the code gives the following output, which shows how many times each value occurs in the series.

Output:

series
1    3
2    1
4    2
5    1
Name: series, dtype: int64
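
The same custom function can also be passed to apply() on the grouped column, and for a plain count pandas provides the built-in size(). The sketch below compares the two under the assumption that only the per-group count is needed; the variable names applied_count and size_count are illustrative:

```python
import pandas as pd

df = pd.DataFrame({"series": ['1', '2', '1', '1', '4', '4', '5']})

def get_count(values):
    # Return the number of rows in the group
    return len(values)

# apply() calls get_count once per group and collects the results in a Series
applied_count = df.groupby("series")["series"].apply(get_count)

# size() is the built-in equivalent for counting rows per group
size_count = df.groupby("series").size()

print(applied_count)
print(size_count)
```

A custom function is mainly worthwhile when the per-group logic has no built-in counterpart; for simple counts the built-in is usually shorter and faster.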

agg() to get aggregate sum of the column

We can use agg() to apply the sum operation to a single column of each group. Here we group by Fruit and sum the Sale column:

import pandas as pd
df = pd.DataFrame({'Date': ['08/09/2018', '10/09/2018', '08/09/2018', '10/09/2018'],
                   'Fruit': ['Apple', 'Apple', 'Banana', 'Banana',],
                   'Sale': [34, 12, 22, 27]})
print(df.groupby(['Fruit'])['Sale'].agg('sum'))

Output:

Fruit
Apple     46
Banana    49
Name: Sale, dtype: int64
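
agg() is not limited to a single operation. As a minimal sketch, named aggregation (available since pandas 0.25) computes several statistics of Sale per Fruit in one call; the output column names total_sale, average_sale and largest_sale are chosen here for illustration:

```python
import pandas as pd

df = pd.DataFrame({'Date': ['08/09/2018', '10/09/2018', '08/09/2018', '10/09/2018'],
                   'Fruit': ['Apple', 'Apple', 'Banana', 'Banana'],
                   'Sale': [34, 12, 22, 27]})

# Each keyword names an output column; the tuple is (input column, function).
summary = df.groupby('Fruit').agg(
    total_sale=('Sale', 'sum'),
    average_sale=('Sale', 'mean'),
    largest_sale=('Sale', 'max'),
)
print(summary)
```

The total_sale column holds the same values as the agg('sum') result above: 46 for Apple and 49 for Banana.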
