How to Get the Sum of a Pandas Column

Method to get the sum of a Pandas DataFrame column
Cumulative sum with groupby
Method to get the sum of columns based on a condition on another column's values
We will introduce how to get the sum of a Pandas DataFrame column, how to calculate a cumulative sum with groupby, and how to sum columns based on a condition on another column's values.
Method to get the sum of a Pandas DataFrame column
First, we create a DataFrame of random integers using the numpy library, and then get the sum of each column using the sum() function.
import numpy as np
import pandas as pd

# 10 rows x 4 columns of random integers from 0 to 9
df = pd.DataFrame(
    np.random.randint(0, 10, size=(10, 4)),
    columns=list('1234'))
print(df)

# Sum each column individually
Total = df['1'].sum()
print("Column 1 sum:", Total)
Total = df['2'].sum()
print("Column 2 sum:", Total)
Total = df['3'].sum()
print("Column 3 sum:", Total)
Total = df['4'].sum()
print("Column 4 sum:", Total)
Selecting a single column, such as df['1'], returns a Pandas Series, and its sum() method adds up all the values in that column.
If you run this code, you will get output like the following (the values are random, so yours will differ).
1 2 3 4
0 2 2 3 8
1 9 4 3 1
2 8 5 6 0
3 9 5 7 4
4 2 7 3 7
5 9 4 1 3
6 6 7 7 3
7 0 4 2 8
8 0 6 6 4
9 5 8 7 2
Column 1 sum: 50
Column 2 sum: 52
Column 3 sum: 45
Column 4 sum: 40
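Calling sum() once per column works, but the same totals can be obtained in a single call. The following is a minimal sketch, assuming a small fixed DataFrame (the values are made up for illustration so that the totals are reproducible). It uses DataFrame.sum(), which by default adds up each column, and axis=1, which adds across the columns of each row instead.

```python
import pandas as pd

# Small fixed DataFrame (hypothetical values) so the totals are reproducible.
df = pd.DataFrame({"1": [2, 9, 8], "2": [2, 4, 5], "3": [3, 3, 6]})

# With the default axis=0, DataFrame.sum() adds up each column and
# returns a Series indexed by column name.
column_totals = df.sum()
print(column_totals)

# With axis=1, the values are added across the columns of each row.
row_totals = df.sum(axis=1)
print(row_totals)
```

Here column_totals holds 19, 11, and 12 for columns '1', '2', and '3', and row_totals holds 7, 16, and 19 for the three rows.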
Cumulative sum with groupby
We can get a cumulative sum within each group by combining the groupby method with cumsum(). Consider the following DataFrame with Date, Fruit, and Sale columns:
import pandas as pd

df = pd.DataFrame(
    {
        'Date': ['08/09/2018', '10/09/2018', '08/09/2018', '10/09/2018'],
        'Fruit': ['Apple', 'Apple', 'Banana', 'Banana'],
        'Sale': [34, 12, 22, 27]
    })
If we want to calculate the cumulative sum of Sale for each Fruit across dates, we can do:
import pandas as pd

df = pd.DataFrame(
    {
        'Date': ['08/09/2018', '10/09/2018', '08/09/2018', '10/09/2018'],
        'Fruit': ['Apple', 'Apple', 'Banana', 'Banana'],
        'Sale': [34, 12, 22, 27]
    })

# Total Sale per (Fruit, Date), then a running total within each Fruit
print(df.groupby(by=['Fruit', 'Date']).sum().groupby(level=[0]).cumsum())
After running the above code, we will get the following output, which shows the cumulative sum of Sale for each Fruit, date by date:
                   Sale
Fruit  Date
Apple  08/09/2018    34
       10/09/2018    46
Banana 08/09/2018    22
       10/09/2018    49
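If the running total should sit next to the original rows rather than in a grouped result, one possible variant is to call cumsum() on the grouped Sale column directly. This is a sketch under two assumptions: the rows are already in date order within each fruit, and each Fruit and Date pair appears only once. The column name CumulativeSale is a hypothetical name chosen for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["08/09/2018", "10/09/2018", "08/09/2018", "10/09/2018"],
    "Fruit": ["Apple", "Apple", "Banana", "Banana"],
    "Sale": [34, 12, 22, 27],
})

# Running total of Sale within each Fruit, following the existing row order.
# Assumes the rows are already sorted by date within each fruit.
# "CumulativeSale" is a hypothetical column name used for this sketch.
df["CumulativeSale"] = df.groupby("Fruit")["Sale"].cumsum()
print(df)
```

For this data the new column contains 34, 46, 22, and 49, matching the grouped output above while keeping the Date and Fruit columns as ordinary columns.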
Method to get the sum of columns based on a condition on another column's values
This method computes the sum for the rows where a given condition is True and substitutes a given value for the rows where the condition is False. Consider the following code:
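import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.random.randn(5, 3),
    columns=list('xyz'))

# Sum x and y only for rows where x > 0; the other rows become NaN
df['sum'] = df.loc[df['x'] > 0, ['x', 'y']].sum(axis=1)
# Replace the NaN values with 0 (assign back instead of using inplace=True)
df['sum'] = df['sum'].fillna(0)
print(df)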
In the above code, we add a new column sum to the DataFrame. Each element of sum is the sum of the first two columns ['x','y'] if the value in ['x'] is greater than 0; otherwise, we replace it with 0.
After running the code, we will get output like the following (the values will differ in your case).
          x         y         z       sum
0 -1.067619  1.053494  0.179490  0.000000
1 -0.349935  0.531465  1.350914  0.000000
2 -1.650904  1.534314  1.773287  0.000000
3  2.486195  0.800890  0.132991  3.287085
4  1.581747 -0.667217  0.182038  0.914530
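Because random values change on every run, a fixed example can make the behavior easier to check. The following is a minimal sketch with made-up values. It swaps in Series.where() as an alternative to the loc and fillna approach used above, and it also shows how to get a single total of one column restricted to the rows that satisfy a condition on another column. The variable name y_total is a hypothetical name chosen for this illustration.

```python
import pandas as pd

# Fixed, made-up values so the result can be checked by hand.
df = pd.DataFrame({
    "x": [-1.0, 2.0, 0.5, -3.0],
    "y": [4.0, 1.0, -2.0, 5.0],
})

# Row-wise x + y, kept only where x > 0; all other rows are set to 0.
df["sum"] = (df["x"] + df["y"]).where(df["x"] > 0, 0)

# Single total of column y over the rows where x > 0.
y_total = df.loc[df["x"] > 0, "y"].sum()

print(df)
print(y_total)
```

Only the second and third rows have x greater than 0, so the sum column is 0, 3.0, -1.5, and 0, and y_total is 1.0 + (-2.0) = -1.0.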