How to Get the Sum of a Pandas Column

  1. Method to get sum of Pandas DataFrame column
  2. Cumulative sum with groupby
  3. Method to get sum of columns based on a condition on other column values

This article shows how to get the sum of a pandas DataFrame column, how to calculate a cumulative sum with groupby, and how to sum columns based on a condition on another column's values.

Method to get sum of Pandas DataFrame column

First, we create a DataFrame of random integers using the NumPy library, and then get the sum of each column with the sum() method.

import numpy as np
import pandas as pd

# 10 rows x 4 columns of random integers in the range [0, 10)
df = pd.DataFrame(
    np.random.randint(0, 10, size=(10, 4)),
    columns=list('1234'))
print(df)

# Sum each column individually
total = df['1'].sum()
print("Column 1 sum:", total)
total = df['2'].sum()
print("Column 2 sum:", total)
total = df['3'].sum()
print("Column 3 sum:", total)
total = df['4'].sum()
print("Column 4 sum:", total)

If you run this code, you will get the following output (the values may differ in your case):

   1  2  3  4
0  2  2  3  8
1  9  4  3  1
2  8  5  6  0
3  9  5  7  4
4  2  7  3  7
5  9  4  1  3
6  6  7  7  3
7  0  4  2  8
8  0  6  6  4
9  5  8  7  2
Column 1 sum: 50
Column 2 sum: 52
Column 3 sum: 45
Column 4 sum: 40
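
Calling sum() on each column separately gets repetitive as the number of columns grows. As a minimal sketch, assuming a small integer DataFrame like the one above (the fixed values here are made up so the results are reproducible), DataFrame.sum() can return all column totals at once, and axis=1 sums across each row instead:

```python
import pandas as pd

# Small fixed DataFrame (illustrative values, not the random data above)
df = pd.DataFrame(
    {'1': [2, 9, 8], '2': [2, 4, 5], '3': [3, 3, 6], '4': [8, 1, 0]})

# Sum every column at once; the result is a Series indexed by column name
column_totals = df.sum()
print(column_totals)

# Sum across the columns of each row instead
row_totals = df.sum(axis=1)
print(row_totals)

# Sum only a subset of columns
subset_totals = df[['1', '2']].sum()
print(subset_totals)
```

Selecting a list of columns before calling sum() is usually enough when only some of the totals are needed.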

Cumulative sum with groupby

We can get the cumulative sum by using the groupby method. Consider the following DataFrame with Date, Fruit, and Sale columns:

import pandas as pd

df = pd.DataFrame(
    {
        'Date': 
             ['08/09/2018', 
              '10/09/2018', 
              '08/09/2018', 
              '10/09/2018'],
        'Fruit': 
             ['Apple', 
              'Apple', 
              'Banana', 
              'Banana'],
        'Sale':
             [34,
              12,
              22,
              27]
    })

If we want to calculate the cumulative sum of Sale per Fruit across dates, we can do:

import pandas as pd

df = pd.DataFrame(
    {
        'Date': 
             ['08/09/2018', 
              '10/09/2018', 
              '08/09/2018', 
              '10/09/2018'],
        'Fruit': 
             ['Apple', 
              'Apple', 
              'Banana', 
              'Banana'],
        'Sale':
             [34,
              12,
              22,
              27]
    })

print(df.groupby(by=['Fruit','Date']).sum().groupby(level=[0]).cumsum())

After running the above code, we get the following output, which shows the cumulative sum of Sale for each Fruit across dates:

                   Sale
Fruit  Date
Apple  08/09/2018    34
       10/09/2018    46
Banana 08/09/2018    22
       10/09/2018    49
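
The chained call above can be read as two steps: aggregate per (Fruit, Date), then accumulate within each Fruit. The sketch below splits them apart using the same data; the intermediate names daily and running and the CumulativeSale column are only for illustration. Note that the dates are plain strings here, so the order within each fruit is the string sort order produced by groupby, which happens to match chronological order for this data.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['08/09/2018', '10/09/2018', '08/09/2018', '10/09/2018'],
    'Fruit': ['Apple', 'Apple', 'Banana', 'Banana'],
    'Sale': [34, 12, 22, 27],
})

# Step 1: total Sale for each (Fruit, Date) pair
daily = df.groupby(by=['Fruit', 'Date']).sum()

# Step 2: regroup on the first index level (Fruit) and
# accumulate the totals within each fruit
running = daily.groupby(level=0).cumsum()
print(running)

# Alternative: keep the original rows and add the running total as a new column
# (this accumulates in row order and does not merge duplicate dates)
df['CumulativeSale'] = df.groupby('Fruit')['Sale'].cumsum()
print(df)
```

The second form may be more convenient when the running total should sit next to the original rows rather than in a separate aggregated table.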
        

Method to get sum of columns based on a condition on other column values

This approach computes the sum for rows where a given condition is True and substitutes a given value where the condition is False. Consider the following code:

import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.random.randn(5,3), 
    columns=list('xyz'))

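# Sum x and y only for the rows where x > 0; the other rows become NaN
df['sum'] = df.loc[df['x'] > 0, ['x', 'y']].sum(axis=1)

# Replace the NaN values with 0 (plain assignment avoids chained in-place modification)
df['sum'] = df['sum'].fillna(0)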
print(df)

In the above code, we add a new column sum to the DataFrame. It holds the sum of columns ['x','y'] where ['x'] is greater than 0; otherwise, the value is replaced with 0.

After running the code, we get the following output (the values may differ in your case):

          x         y         z       sum
0 -1.067619  1.053494  0.179490  0.000000
1 -0.349935  0.531465 -1.350914  0.000000
2 -1.650904  1.534314  1.773287  0.000000
3  2.486195  0.800890 -0.132991  3.287085
4  1.581747 -0.667217 -0.182038  0.914530
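
If a single total is needed rather than a per-row column, the same boolean mask can be used to filter rows before summing. The following is a minimal sketch with fixed, made-up values so the results are reproducible; the name y_total is only for illustration. It also shows Series.where as one alternative way to build the per-row column without a separate fillna step.

```python
import pandas as pd

df = pd.DataFrame({
    'x': [-1.0, 2.0, 3.0],
    'y': [0.5, 1.5, -1.0],
    'z': [4.0, 5.0, 6.0],
})

# Scalar: sum of column y over the rows where x > 0
y_total = df.loc[df['x'] > 0, 'y'].sum()
print(y_total)

# Per-row: x + y where x > 0, otherwise 0
df['sum'] = (df['x'] + df['y']).where(df['x'] > 0, 0)
print(df)
```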