How to Replace NA Values using Pandas fillna()
-
DataFrame.fillna()Method -
Fill Entire DataFrame With Specified Value Using the
DataFrame.fillna()Method -
Fill
NaNValues of the Specified Column With a Specified Value
This tutorial explains how we can fill NaN values with specified values using the DataFrame.fillna() method.
We will use the below DataFrame in this article.
import numpy as np
import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
{
"Roll No": [501, 502, np.nan, 504, 505, 506],
"Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
"Income(in $)": [200, 400, np.nan, 30, np.nan, np.nan],
"Age": [17, 18, np.nan, 16, 18, np.nan],
}
)
print(student_df)
Output:
Roll No Name Income(in $) Age
0 501.0 Jennifer 200.0 17.0
1 502.0 Travis 400.0 18.0
2 NaN Bob NaN NaN
3 504.0 Emma 30.0 16.0
4 505.0 Luna NaN 18.0
5 506.0 Anish NaN NaN
DataFrame.fillna() Method
Syntax
DataFrame.fillna(
value=None, method=None, axis=None, inplace=False, limit=None, downcast=None
)
The DataFrame.fillna() method enables us to fill the NaN values in the DataFrame with the specified value or method.
Fill Entire DataFrame With Specified Value Using the DataFrame.fillna() Method
import numpy as np
import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
{
"Roll No": [501, 502, np.nan, 504, 505, 506],
"Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
"Income(in $)": [200, 400, np.nan, 30, np.nan, np.nan],
"Age": [17, 18, np.nan, 16, 18, np.nan],
}
)
filled_df = student_df.fillna(0)
print("DataFrame with NaN values")
print(student_df, "\n")
print("After applying fillna() to the DataFrame:")
print(filled_df, "\n")
Output:
DataFrame with NaN values
Roll No Name Income(in $) Age
0 501.0 Jennifer 200.0 17.0
1 502.0 Travis 400.0 18.0
2 NaN Bob NaN NaN
3 504.0 Emma 30.0 16.0
4 505.0 Luna NaN 18.0
5 506.0 Anish NaN NaN
After applying fillna() to the DataFrame:
Roll No Name Income(in $) Age
0 501.0 Jennifer 200.0 17.0
1 502.0 Travis 400.0 18.0
2 0.0 Bob 0.0 0.0
3 504.0 Emma 30.0 16.0
4 505.0 Luna 0.0 18.0
5 506.0 Anish 0.0 0.0
It replaces all the NaN values in the DataFrame student_df by 0 which is passed as an argument to the DataFrame.fillna() method.
import numpy as np
import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
{
"Roll No": [501, 502, np.nan, 504, 505, 506],
"Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
"Income(in $)": [200, 400, np.nan, 30, np.nan, np.nan],
"Age": [17, 18, np.nan, 16, 18, np.nan],
}
)
filled_df = student_df.fillna(method="ffill")
print("DataFrame with NaN values")
print(student_df, "\n")
print("After applying fillna() to the DataFrame:")
print(filled_df, "\n")
Output:
DataFrame with NaN values
Roll No Name Income(in $) Age
0 501.0 Jennifer 200.0 17.0
1 502.0 Travis 400.0 18.0
2 NaN Bob NaN NaN
3 504.0 Emma 30.0 16.0
4 505.0 Luna NaN 18.0
5 506.0 Anish NaN NaN
After applying fillna() to the DataFrame:
Roll No Name Income(in $) Age
0 501.0 Jennifer 200.0 17.0
1 502.0 Travis 400.0 18.0
2 502.0 Bob 400.0 18.0
3 504.0 Emma 30.0 16.0
4 505.0 Luna 30.0 18.0
5 506.0 Anish 30.0 18.0
It fills all the NaN values in the student_df by the value that comes before the NaN value in the same column as of NaN value.
Fill NaN Values of the Specified Column With a Specified Value
To fill particular values with specified values, we pass a dictionary to the fillna() method with column name as a key and value to be used for NaN values of that column as a value.
import numpy as np
import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
{
"Roll No": [501, 502, np.nan, 504, 505, 506],
"Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
"Income(in $)": [200, 400, np.nan, 300, np.nan, np.nan],
"Age": [17, 18, np.nan, 16, 18, np.nan],
}
)
filled_df = student_df.fillna({"Age": 17, "Income(in $)": 300})
print("DataFrame with NaN values")
print(student_df, "\n")
print("After applying fillna() to the DataFrame:")
print(filled_df, "\n")
Output:
DataFrame with NaN values
Roll No Name Income(in $) Age
0 501.0 Jennifer 200.0 17.0
1 502.0 Travis 400.0 18.0
2 NaN Bob NaN NaN
3 504.0 Emma 300.0 16.0
4 505.0 Luna NaN 18.0
5 506.0 Anish NaN NaN
After applying fillna() to the DataFrame:
Roll No Name Income(in $) Age
0 501.0 Jennifer 200.0 17.0
1 502.0 Travis 400.0 18.0
2 NaN Bob 300.0 17.0
3 504.0 Emma 300.0 16.0
4 505.0 Luna 300.0 18.0
5 506.0 Anish 300.0 17.0
It fills all the NaN values in the Age column with value 17 and all the NaN values in the Income(in $) column with 300. The NaN values in the Roll No column are left as they are.
Suraj Joshi is a backend software engineer at Matrice.ai.
LinkedIn