How to Count the NaN Occurrences in a Column in a Pandas DataFrame
- isna() Method to Count NaN in One or Multiple Columns
- Subtract the Count of non-NaN From the Total Length to Count NaN Occurrences
- df.isnull().sum() Method to Count NaN Occurrences
- Count NaN Occurrences in the Whole Pandas DataFrame
We will introduce methods to count the NaN occurrences in a column of a Pandas DataFrame. The options include the isna() method for one or multiple columns, subtracting the count of non-NaN values from the total length, the value_counts method, and the df.isnull().sum() method.
We will also introduce the method to calculate the total number of NaN occurrences in the whole Pandas DataFrame.
isna() Method to Count NaN in One or Multiple Columns
We can use the isna() method (available in pandas 0.21.0 and later) and then sum the result to count the NaN occurrences. For one column, we do as follows:
import numpy as np
import pandas as pd
s = pd.Series([1, 2, 3, np.nan, np.nan])
# isna() marks each missing value as True; sum() counts the True values
print(s.isna().sum())
# or s.isnull().sum() for older pandas versions
Output:
2
For several columns, it also works:
import numpy as np
import pandas as pd
df = pd.DataFrame({"a": [1, 2, np.nan], "b": [np.nan, 1, np.nan]})
# sum() adds up the True values column by column
print(df.isna().sum())
Output:
a    1
b    2
dtype: int64
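When only one named column of a DataFrame is of interest, the same call can be applied after selecting that column. The following is a minimal sketch that reuses the DataFrame above; the variable names nan_in_b and nan_in_subset are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, np.nan], "b": [np.nan, 1, np.nan]})

# Select one column (a Series) and count its missing values
nan_in_b = df["b"].isna().sum()
print(nan_in_b)  # 2

# Select a subset of columns (a DataFrame) to get one count per column
nan_in_subset = df[["a", "b"]].isna().sum()
print(nan_in_subset)
```

Selecting with a list of column names keeps the result as a Series indexed by column name, so individual counts can be read with, for example, nan_in_subset["a"].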
Subtract the Count of non-NaN From the Total Length to Count NaN Occurrences
We can get the number of NaN occurrences in each column by subtracting the count of non-NaN values, which df.count() returns, from the length of the DataFrame:
import pandas as pd
df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)
print(df)
print(len(df) - df.count())
Output:
     a    b    d
A  1.0  2.0  NaN
B  NaN  4.0  NaN
C  5.0  NaN  7.0
D  5.0  NaN  NaN
a    1
b    2
d    3
dtype: int64
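The introduction also mentions value_counts, which can report NaN as its own category when dropna=False is passed. The following is a minimal sketch of that approach for a single column; the variable names counts and nan_count are illustrative, and filtering on counts.index.isna() is just one way to pick out the NaN entry.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, np.nan, np.nan])

# dropna=False keeps NaN as its own entry in the result
counts = s.value_counts(dropna=False)
print(counts)

# Keep only the entry whose index label is NaN; the sum is 0 if there is none
nan_count = int(counts[counts.index.isna()].sum())
print(nan_count)  # 2
```

This route does more work than isna().sum() because it tallies every distinct value, so it is mostly useful when the full frequency table is wanted anyway.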
df.isnull().sum() Method to Count NaN Occurrences
We can get the number of NaN occurrences in each column by using the df.isnull().sum() method. With axis=0, which is the default, sum() returns the number of NaN occurrences in every column. To get the NaN occurrences in every row instead, set axis=1.
Example code:
import pandas as pd
df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)
print("NaN occurrences in Columns:")
print(df.isnull().sum(axis=0))
print("NaN occurrences in Rows:")
print(df.isnull().sum(axis=1))
Output:
NaN occurrences in Columns:
a    1
b    2
d    3
dtype: int64
NaN occurrences in Rows:
A    1
B    2
C    1
D    2
dtype: int64
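The per-column approaches covered so far should agree with one another. The following is a small sketch that checks this on the same DataFrame; the names by_isnull, by_isna, and by_count are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)

# Three ways to count NaN values per column
by_isnull = df.isnull().sum()
by_isna = df.isna().sum()
by_count = len(df) - df.count()

# Element-wise comparison, reduced to a single boolean
print((by_isnull == by_isna).all())  # True
print((by_isna == by_count).all())  # True
```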
Count NaN Occurrences in the Whole Pandas DataFrame
To get the total number of NaN occurrences in the DataFrame, we chain two .sum() methods together. The first .sum() returns the per-column counts, and the second adds those counts into a single number:
import pandas as pd
df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)
print("NaN occurrences in DataFrame:")
print(df.isnull().sum().sum())
Output:
NaN occurrences in DataFrame:
6
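The total can also be reached in other ways. As a hedged sketch rather than a recommended method, the example below compares the chained sums with two alternatives: summing the underlying boolean array, and subtracting the number of non-NaN cells from the total number of cells. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    [(1, 2, None), (None, 4, None), (5, None, 7), (5, None, None)],
    columns=["a", "b", "d"],
    index=["A", "B", "C", "D"],
)

# Chained sums: per-column counts first, then their total
total_chained = df.isna().sum().sum()

# Sum the boolean mask directly as a NumPy array
total_numpy = df.isna().to_numpy().sum()

# Total number of cells minus the number of non-NaN cells
total_by_size = df.size - df.count().sum()

print(total_chained, total_numpy, total_by_size)  # 6 6 6
```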