Pandas DataFrame.sum() Function
The pandas DataFrame.sum() function calculates the sum of the values of a DataFrame object over the specified axis.
Syntax of pandas.DataFrame.sum():
DataFrame.sum(
    axis=None, skipna=None, level=None, numeric_only=None, min_count=0, **kwargs
)
Parameters
axis | Axis to sum along: axis=0 sums down the rows, giving one sum per column; axis=1 sums across the columns, giving one sum per row
skipna | Boolean. Exclude NaN values (skipna=True) or include NaN values (skipna=False)
level | Sum along a particular level if the axis is a MultiIndex (this parameter was removed in pandas 2.0; use groupby(level=...).sum() there instead)
numeric_only | Boolean. For numeric_only=True, include only float, int, and boolean columns
min_count | Integer. Minimum number of non-NaN values required to calculate the sum. If fewer are present, the sum is NaN
**kwargs | Additional keyword arguments to the function
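The numeric_only parameter is not exercised by the examples in this article, so here is a minimal sketch of it. The column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical data: one numeric column and one text column.
df = pd.DataFrame({"X": [1, 2, 3], "label": ["a", "b", "c"]})

# Restrict the sum to numeric columns; the text column is left out.
numeric_sums = df.sum(numeric_only=True)
print(numeric_sums)
```

Only the X column appears in the result because label holds strings.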
Return
If level is not specified, it returns a Series holding the sum of the values along the requested axis; otherwise, it returns a DataFrame of sums.
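As a quick check of the return type, the sketch below uses a small made-up DataFrame and inspects what sum() hands back for each axis. It assumes level is not used.

```python
import pandas as pd

# Small illustrative DataFrame (values are arbitrary).
df = pd.DataFrame({"X": [1, 2], "Y": [3, 4]})

col_sums = df.sum()        # one entry per column, indexed by column label
row_sums = df.sum(axis=1)  # one entry per row, indexed by row label

print(type(col_sums).__name__)
print(list(col_sums.index))
print(list(row_sums.index))
```

The result is a Series in both cases; only its index differs.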
Example Codes: DataFrame.sum() Method to Calculate Sum Along Column Axis
import pandas as pd

# Build a DataFrame with three integer columns.
df = pd.DataFrame({'X': [1, 2, 3, 4, 5],
                   'Y': [1, 2, 3, 4, 5],
                   'Z': [3, 4, 5, 6, 3]})
print("DataFrame:")
print(df)

# With the default axis, sum() adds up the values in each column.
sums = df.sum()
print("Column-wise Sum:")
print(sums)

Output:

DataFrame:
   X  Y  Z
0  1  1  3
1  2  2  4
2  3  3  5
3  4  4  6
4  5  5  3
Column-wise Sum:
X    15
Y    15
Z    21
dtype: int64
It calculates the sum for all the columns X, Y, and Z and finally returns a Series object with the sum of each column.
To find the sum of a particular column of a DataFrame, select that column and call sum() on it.
import pandas as pd

# Build a DataFrame with three integer columns.
df = pd.DataFrame({'X': [1, 2, 3, 4, 5],
                   'Y': [1, 2, 3, 4, 5],
                   'Z': [3, 4, 5, 6, 3]})
print("DataFrame:")
print(df)

# Select column Z as a Series and sum its values.
sums = df["Z"].sum()
print("Sum of values of Z-column:")
print(sums)

Output:

DataFrame:
   X  Y  Z
0  1  1  3
1  2  2  4
2  3  3  5
3  4  4  6
4  5  5  3
Sum of values of Z-column:
21
This returns only the sum of the values in column Z, as a single number rather than a Series.
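The same idea extends to several columns at once. The sketch below reuses the example DataFrame and assumes you want the sums of X and Z only; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "X": [1, 2, 3, 4, 5],
    "Y": [1, 2, 3, 4, 5],
    "Z": [3, 4, 5, 6, 3],
})

# Select two columns with a list of labels, then sum each of them.
subset_sums = df[["X", "Z"]].sum()
print(subset_sums)

# Summing the resulting Series collapses it to one grand total.
grand_total = subset_sums.sum()
print(grand_total)
```

Passing a list of labels keeps the result a DataFrame, so sum() still returns one value per selected column.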
Example Codes: DataFrame.sum() Method to Find Sum Along Row Axis
import pandas as pd

# Build a DataFrame with three integer columns.
df = pd.DataFrame({'X': [1, 2, 3, 4, 5],
                   'Y': [1, 2, 3, 4, 5],
                   'Z': [3, 4, 5, 6, 3]})
print("DataFrame:")
print(df)

# axis=1 adds up the values across the columns of each row.
sums = df.sum(axis=1)
print("Row-wise sum:")
print(sums)

Output:

DataFrame:
   X  Y  Z
0  1  1  3
1  2  2  4
2  3  3  5
3  4  4  6
4  5  5  3
Row-wise sum:
0     5
1     8
2    11
3    14
4    13
dtype: int64
It calculates the sum for all the rows and finally returns a Series object with the sum of each row.
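A common use of the row-wise sum is to store it as a new column. The sketch below does that with the example DataFrame; the column name Total is an arbitrary choice for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "X": [1, 2, 3, 4, 5],
    "Y": [1, 2, 3, 4, 5],
    "Z": [3, 4, 5, 6, 3],
})

# Compute the row-wise sums first, then attach them as a new column.
# "Total" is a hypothetical column name.
df["Total"] = df.sum(axis=1)
print(df)
```

The right-hand side is evaluated before the assignment, so the new column does not feed into its own sum.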
To find the sum of a particular row of a DataFrame, select that row and call sum() on it.
import pandas as pd

# Build a DataFrame with three integer columns.
df = pd.DataFrame({'X': [1, 2, 3, 4, 5],
                   'Y': [1, 2, 3, 4, 5],
                   'Z': [3, 4, 5, 6, 3]})
print("DataFrame:")
print(df)

# iloc[[2]] keeps the third row as a one-row DataFrame; sum across its columns.
sum_3 = df.iloc[[2]].sum(axis=1)
print("Sum of values of 3rd Row:")
print(sum_3)

Output:

DataFrame:
   X  Y  Z
0  1  1  3
1  2  2  4
2  3  3  5
3  4  4  6
4  5  5  3
Sum of values of 3rd Row:
2    11
dtype: int64
This gives only the sum of the values in the 3rd row of the DataFrame.
The iloc indexer selects rows by integer position, so iloc[[2]] picks the third row.
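If a plain number is preferred over a one-element Series, one option is to index with a single integer instead of a list. This is a minimal sketch reusing the example DataFrame; the variable name is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "X": [1, 2, 3, 4, 5],
    "Y": [1, 2, 3, 4, 5],
    "Z": [3, 4, 5, 6, 3],
})

# iloc[2] (single brackets) returns the third row as a Series,
# so sum() yields one scalar instead of a one-element Series.
third_row_total = df.iloc[2].sum()
print(third_row_total)
```

With iloc[[2]] the selection stays a DataFrame, which is why the example above needs axis=1 and returns a Series.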
Example Codes: DataFrame.sum() Method to Find the Sum Ignoring NaN Values
Use the default value of the skipna parameter, skipna=True, to find the sum of the DataFrame along the specified axis while ignoring NaN values.
import pandas as pd

# None entries become NaN, so columns X and Y are stored as floats.
df = pd.DataFrame({'X': [1, None, 3, 4, 5],
                   'Y': [1, None, 3, None, 5],
                   'Z': [3, 4, 5, 6, 3]})
print("DataFrame:")
print(df)

# By default (skipna=True), NaN values are skipped when summing.
sums = df.sum()
print("Column-wise Sum:")
print(sums)

Output:

DataFrame:
     X    Y  Z
0  1.0  1.0  3
1  NaN  NaN  4
2  3.0  3.0  5
3  4.0  NaN  6
4  5.0  5.0  3
Column-wise Sum:
X    13.0
Y     9.0
Z    21.0
dtype: float64
If you set skipna=False, the sum is NaN for every column that contains NaN values.
import pandas as pd

# None entries become NaN, so columns X and Y are stored as floats.
df = pd.DataFrame({'X': [1, None, 3, 4, 5],
                   'Y': [1, None, 3, None, 5],
                   'Z': [3, 4, 5, 6, 3]})
print("DataFrame:")
print(df)

# skipna=False keeps NaN in the calculation, so it propagates to the sum.
sums = df.sum(skipna=False)
print("Column-wise Sum:")
print(sums)

Output:

DataFrame:
     X    Y  Z
0  1.0  1.0  3
1  NaN  NaN  4
2  3.0  3.0  5
3  4.0  NaN  6
4  5.0  5.0  3
Column-wise Sum:
X     NaN
Y     NaN
Z    21.0
dtype: float64
Here, the sums of columns X and Y are NaN because both columns contain NaN values.
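To see in advance which columns will produce NaN with skipna=False, one option is to count the missing values per column. This is a sketch using the same example data; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "X": [1, None, 3, 4, 5],
    "Y": [1, None, 3, None, 5],
    "Z": [3, 4, 5, 6, 3],
})

# isna() marks missing cells as True; summing the booleans counts them.
nan_counts = df.isna().sum()
print(nan_counts)

# Columns with at least one NaN are the ones whose strict sum is NaN.
strict_sums = df.sum(skipna=False)
print(strict_sums.isna())
```

Here sum() is applied to a Boolean DataFrame, where True counts as 1 and False as 0.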
Example Codes: Set min_count in DataFrame.sum() Method
import pandas as pd

# None entries become NaN, so columns X and Y are stored as floats.
df = pd.DataFrame({'X': [1, None, 3, 4, 5],
                   'Y': [1, None, 3, None, 5],
                   'Z': [3, 4, 5, 6, 3]})
print("DataFrame:")
print(df)

# Require at least 4 non-NaN values per column; otherwise the sum is NaN.
sums = df.sum(min_count=4)
print("Column-wise Sum:")
print(sums)

Output:

DataFrame:
     X    Y  Z
0  1.0  1.0  3
1  NaN  NaN  4
2  3.0  3.0  5
3  4.0  NaN  6
4  5.0  5.0  3
Column-wise Sum:
X    13.0
Y     NaN
Z    21.0
dtype: float64
Here, the sum of column Y is NaN because column Y has only 3 non-NaN values, which is fewer than the value of the min_count parameter.
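The threshold can be checked against the number of non-NaN values in each column, which count() reports. The sketch below also shows a related case that is easy to miss: with the default min_count=0, a column containing only NaN sums to 0. The all-NaN DataFrame and its column name A are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "X": [1, None, 3, 4, 5],
    "Y": [1, None, 3, None, 5],
    "Z": [3, 4, 5, 6, 3],
})

# count() returns the number of non-NaN values per column,
# which is the quantity min_count is compared against.
non_nan_counts = df.count()
print(non_nan_counts)

# Hypothetical all-NaN column to show the effect of the default.
empty = pd.DataFrame({"A": [float("nan"), float("nan")]})
default_sum = empty.sum()            # min_count=0: sum of nothing is 0.0
strict_sum = empty.sum(min_count=1)  # at least one value required: NaN
print(default_sum)
print(strict_sum)
```

Setting min_count=1 is one way to make an all-NaN column report NaN instead of 0.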
Suraj Joshi is a backend software engineer at Matrice.ai.