How to Count the NaN Occurrences in a Column in a pandas DataFrame

isna() method to count NaN in one or multiple columns
Subtracting the count of non-NaN values from the total length to count NaN occurrences
df.isnull().sum() method to count NaN occurrences
Count NaN occurrences in the whole pandas DataFrame
This article introduces several ways to count the NaN occurrences in a column of a pandas DataFrame. The options include the isna() method for one or multiple columns, subtracting the count of non-NaN values from the total length, the value_counts method, and the df.isnull().sum() method.
It also shows how to calculate the total number of NaN occurrences in the whole pandas DataFrame.
isna() method to count NaN in one or multiple columns
We can use the isna() method (available in pandas 0.21.0 and later) and then call sum() to count the NaN occurrences. For one column, we do as follows:
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, np.nan, np.nan])
print(s.isna().sum())
# or s.isnull().sum() for older pandas versions
Output:
2
For several columns, it also works:
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [1, 2, np.nan],
    'b': [np.nan, 1, np.nan]})
print(df.isna().sum())
Output:
a    1
b    2
dtype: int64
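When only one column of a DataFrame, or a chosen subset of columns, is of interest, the same isna().sum() pattern can be applied after selecting those columns. The following is a minimal sketch; the extra column 'c' and the variable names are illustrative and not part of the original example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'a': [1, 2, np.nan],
    'b': [np.nan, 1, np.nan],
    'c': [1, 2, 3],  # illustrative extra column with no NaN
})

# Selecting one column gives a Series, so the result is a single number.
nan_in_b = df['b'].isna().sum()

# Selecting with a list of names gives a DataFrame, so the result is a
# Series of counts indexed by column name.
nan_in_subset = df[['a', 'c']].isna().sum()

print(nan_in_b)
print(nan_in_subset)
```

Selecting the columns first avoids computing counts for columns that are not needed.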
Subtracting the count of non-NaN values from the total length to count NaN occurrences
We can get the number of NaN occurrences in each column by subtracting the count of non-NaN occurrences from the length of the DataFrame:
import pandas as pd

df = pd.DataFrame(
    [(1, 2, None),
     (None, 4, None),
     (5, None, 7),
     (5, None, None)],
    columns=['a', 'b', 'd'],
    index=['A', 'B', 'C', 'D'])
print(df)
# Number of rows minus the non-NaN count of each column
print(len(df) - df.count())
Output:
     a    b    d
A  1.0  2.0  NaN
B  NaN  4.0  NaN
C  5.0  NaN  7.0
D  5.0  NaN  NaN
a    1
b    2
d    3
dtype: int64
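The introduction also lists value_counts as an option. A minimal sketch of that approach, assuming a single column (a Series) is being inspected: by default value_counts leaves NaN out of its result, so dropna=False is passed to keep it as its own entry. The variable names here are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, np.nan, np.nan])

# dropna=False keeps NaN as a separate entry in the result.
counts = s.value_counts(dropna=False)

# Keep only the entry whose index label is NaN and read its count.
# If the column has no NaN, the selection is empty and the sum is 0.
nan_count = int(counts[counts.index.isna()].sum())

print(counts)
print(nan_count)
```

This route is mostly useful when the counts of the other values are wanted as well; for the NaN count alone, isna().sum() is shorter.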
df.isnull().sum() method to count NaN occurrences
We can get the number of NaN occurrences in each column by using the df.isnull().sum() method. Passing axis=0 (the default) to sum gives the number of NaN occurrences in every column. To get the NaN occurrences in every row, set axis=1.
Consider the following code:
import pandas as pd
df = pd.DataFrame(
[(1,2,None),
(None,4,None),
(5,None,7),
(5,None,None)],
columns=['a','b','d'],
index = ['A', 'B','C','D'])
print('NaN occurrences in Columns:')
print(df.isnull().sum(axis = 0))
print('NaN occurrences in Rows:')
print(df.isnull().sum(axis = 1))
Output:
NaN occurrences in Columns:
a    1
b    2
d    3
dtype: int64
NaN occurrences in Rows:
A    1
B    2
C    1
D    2
dtype: int64
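Because axis=0 and axis=1 are easy to mix up, sum also accepts the string aliases 'index' and 'columns'. The sketch below rebuilds the same DataFrame and checks that the aliases give the same counts as the numeric values; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [(1, 2, None),
     (None, 4, None),
     (5, None, 7),
     (5, None, None)],
    columns=['a', 'b', 'd'],
    index=['A', 'B', 'C', 'D'])

# axis='index' is the same as axis=0: the rows are collapsed,
# leaving one count per column.
per_column = df.isnull().sum(axis='index')

# axis='columns' is the same as axis=1: the columns are collapsed,
# leaving one count per row.
per_row = df.isnull().sum(axis='columns')

print(per_column.equals(df.isnull().sum(axis=0)))
print(per_row.equals(df.isnull().sum(axis=1)))
```

The axis argument names the dimension that is summed over, not the one that remains in the result.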
Count NaN occurrences in the whole pandas DataFrame
To get the total number of NaN occurrences in the DataFrame, we chain two .sum() calls together. The first gives the count per column, and the second adds those counts up:
import pandas as pd
df = pd.DataFrame(
[(1,2,None),
(None,4,None),
(5,None,7),
(5,None,None)],
columns=['a','b','d'],
index = ['A', 'B','C','D'])
print('NaN occurrences in DataFrame:')
print(df.isnull().sum().sum())
Output:
NaN occurrences in DataFrame:
6
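The total can be cross-checked in other ways. A minimal sketch, assuming the same DataFrame as above: df.size is the total number of cells and df.count().sum() is the number of non-NaN cells, so their difference is the number of NaN cells. Converting the boolean mask to a NumPy array and summing it gives the same figure with a single sum. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [(1, 2, None),
     (None, 4, None),
     (5, None, 7),
     (5, None, None)],
    columns=['a', 'b', 'd'],
    index=['A', 'B', 'C', 'D'])

# Chained sums: per-column counts first, then their total.
total_chained = int(df.isnull().sum().sum())

# All cells minus the non-NaN cells.
total_by_difference = int(df.size - df.count().sum())

# Sum over the whole boolean mask as a NumPy array.
total_numpy = int(df.isnull().to_numpy().sum())

print(total_chained, total_by_difference, total_numpy)
```

All three expressions should agree; which one to use is largely a matter of readability.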