Pandas DataFrame.fillna() Function
- Syntax of pandas.DataFrame.fillna():
- Example Codes: Fill All NaN Values in DataFrame With DataFrame.fillna() Method
- Example Codes: DataFrame.fillna() Method With the method Parameter
- Example Codes: DataFrame.fillna() Method With limit Parameter
The pandas.DataFrame.fillna() function replaces NaN values in a DataFrame with a specified value.
Syntax of pandas.DataFrame.fillna():
DataFrame.fillna(
value=None, method=None, axis=None, inplace=False, limit=None, downcast=None
)
Parameters
| Parameter | Description |
|---|---|
| value | scalar, dict, Series, or DataFrame. Value used to replace NaN values |
| method | backfill, bfill, pad, ffill or None. Method used for filling NaN values. |
| axis | Fill missing values along the row (axis=0) or column (axis=1) |
| inplace | Boolean. If True, modify the caller DataFrame in-place |
| limit | Integer. If the method is specified, it is the maximum number of consecutive NaN values to forward/backward fill. If the method is not given, it is the maximum number of NaN in axis to be filled. |
| downcast | Dictionary. Specifies downcast of datatypes |
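Since value may be a dict, each column can get its own replacement. The snippet below is a minimal sketch of that usage; the fill values 0 and -1 and the name per_column are chosen purely for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, np.nan, 3],
                   "Y": [4, np.nan, 8, np.nan, 3]})

# A dict maps column labels to the value used for that column's NaN entries.
per_column = df.fillna({"X": 0, "Y": -1})
print(per_column)
```

Columns that are not listed in the dict are left unchanged.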
Return
If inplace is False (the default), a new DataFrame with all the NaN values replaced by the given value. If inplace is True, the caller DataFrame is modified in place and None is returned.
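The difference between the two modes is easiest to see side by side. This is a small sketch with a made-up two-row DataFrame; the names new_df and remaining_nans are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1, np.nan], "Y": [np.nan, 4]})

# Default (inplace=False): the result is a new DataFrame and df keeps its NaN values.
new_df = df.fillna(0)
remaining_nans = int(df.isna().sum().sum())
print("NaN values left in df after the default call:", remaining_nans)

# inplace=True: df itself is modified, so no assignment is needed.
df.fillna(0, inplace=True)
print(df)
```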
Example Codes: Fill All NaN Values in DataFrame With DataFrame.fillna() Method
import pandas as pd
import numpy as np
df = pd.DataFrame({'X': [1, 2, 3, np.nan, 3],
'Y': [4, np.nan, 8, np.nan, 3]})
print("DataFrame:")
print(df)
filled_df = df.fillna(5)
print("Filled DataFrame:")
print(filled_df)
Output:
DataFrame:
X Y
0 1.0 4.0
1 2.0 NaN
2 3.0 8.0
3 NaN NaN
4 3.0 3.0
Filled DataFrame:
X Y
0 1.0 4.0
1 2.0 5.0
2 3.0 8.0
3 5.0 5.0
4 3.0 3.0
It fills every NaN value in the DataFrame with 5, the value passed as an argument to the pandas.DataFrame.fillna() method.
DataFrame.fillna() With Mean
It is often a good idea to replace the NaN values of a column with the mean of that column.
import pandas as pd
import numpy as np
df = pd.DataFrame({'X': [1, 2, 3, np.nan, 3],
'Y': [4, np.nan, 8, np.nan, 3]})
print("DataFrame:")
print(df)
df.fillna(df.mean(), inplace=True)
print("Filled DataFrame:")
print(df)
Output:
DataFrame:
X Y
0 1.0 4.0
1 2.0 NaN
2 3.0 8.0
3 NaN NaN
4 3.0 3.0
Filled DataFrame:
X Y
0 1.00 4.0
1 2.00 5.0
2 3.00 8.0
3 2.25 5.0
4 3.00 3.0
It fills the NaN values of column X with the mean of column X, and the NaN values of column Y with the mean of column Y.
Because inplace=True is set, the original DataFrame itself is modified by the fillna() call.
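The per-column behavior comes from label alignment: df.mean() returns a Series indexed by column name, and fillna() matches each entry to the column with the same label. A brief sketch, using the illustrative names means and filled:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, np.nan, 3],
                   "Y": [4, np.nan, 8, np.nan, 3]})

# mean() skips NaN by default and returns one value per column label.
means = df.mean()
print(means)

# Each column's NaN values are filled with the entry carrying the same label.
filled = df.fillna(means)
print(filled)
```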
DataFrame.fillna() With 0
import pandas as pd
import numpy as np
df = pd.DataFrame({'X': [1, 2, 3, np.nan, 3],
'Y': [4, np.nan, 8, np.nan, 3]})
print("DataFrame:")
print(df)
df.fillna(0, inplace=True)
print("Filled DataFrame:")
print(df)
Output:
DataFrame:
X Y
0 1.0 4.0
1 2.0 NaN
2 3.0 8.0
3 NaN NaN
4 3.0 3.0
Filled DataFrame:
X Y
0 1.0 4.0
1 2.0 0.0
2 3.0 8.0
3 0.0 0.0
4 3.0 3.0
It fills all NaN values in the DataFrame with 0.
Example Codes: DataFrame.fillna() Method With the method Parameter
We can also fill NaN values in DataFrame using different choices of the method parameter.
import pandas as pd
import numpy as np
df = pd.DataFrame({'X': [1, 2, 3, np.nan, 3],
'Y': [4, np.nan, 8, np.nan, 3]})
print("DataFrame:")
print(df)
filled_df = df.fillna(method="backfill")
print("Filled DataFrame:")
print(filled_df)
Output:
DataFrame:
X Y
0 1.0 4.0
1 2.0 NaN
2 3.0 8.0
3 NaN NaN
4 3.0 3.0
Filled DataFrame:
X Y
0 1.0 4.0
1 2.0 8.0
2 3.0 8.0
3 3.0 3.0
4 3.0 3.0
Setting method="backfill" fills each NaN value in the DataFrame with the next valid value after it in the same column.
The method parameter also accepts bfill, pad, and ffill.
| method | Description |
|---|---|
| backfill / bfill | Fill each NaN value with the next valid value after it in the same column. |
| ffill / pad | Fill each NaN value with the last valid value before it in the same column. |
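Recent pandas releases deprecate the method argument of fillna(). The dedicated DataFrame.ffill() and DataFrame.bfill() methods perform the same fills, so the following sketch may be the safer form on newer versions. The names forward and backward are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, np.nan, 3],
                   "Y": [4, np.nan, 8, np.nan, 3]})

# Forward fill: carry the last valid value down into each NaN.
forward = df.ffill()
print(forward)

# Backward fill: pull the next valid value up into each NaN.
backward = df.bfill()
print(backward)
```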
Example Codes: DataFrame.fillna() Method With limit Parameter
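The limit parameter in the DataFrame.fillna() method caps how many NaN values are filled. When a fill value is given, as in the example below, at most limit NaN values are filled in each column; when method is given, at most limit consecutive NaN values are filled.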
import pandas as pd
import numpy as np
df = pd.DataFrame({'X': [1, 2, np.nan, 3, 3],
'Y': [4, np.nan, 8, np.nan, 3]})
print("DataFrame:")
print(df)
filled_df = df.fillna(3, limit=1)
print("Filled DataFrame:")
print(filled_df)
Output:
DataFrame:
X Y
0 1.0 4.0
1 2.0 NaN
2 NaN 8.0
3 3.0 NaN
4 3.0 3.0
Filled DataFrame:
X Y
0 1.0 4.0
1 2.0 3.0
2 3.0 8.0
3 3.0 NaN
4 3.0 3.0
Here, only the first NaN in each column is filled; any other NaN value in the same column is left as it is.
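With a forward or backward fill, limit instead counts consecutive NaN values within each gap. The sketch below contrasts the two behaviors on a made-up single-column DataFrame; it uses DataFrame.ffill() for the directional fill, and the names per_gap and per_column are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1, np.nan, np.nan, 4, np.nan]})

# Directional fill: at most one NaN is filled in each run of consecutive NaN values.
per_gap = df.ffill(limit=1)
print(per_gap)

# Constant fill: at most one NaN is filled in the whole column.
per_column = df.fillna(0, limit=1)
print(per_column)
```

In the first result, rows 1 and 4 are filled because each starts a separate gap; in the second, only row 1 is filled.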
Suraj Joshi is a backend software engineer at Matrice.ai.