Pandas 填充 NaN 值
Suraj Joshi
2023年1月30日
Pandas
Pandas NaN
本教程解釋了我們如何使用 DataFrame.fillna() 方法用指定的值填充 NaN 值。
我們將在本文中使用下面的 DataFrame。
import numpy as np
import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
{
"Roll No": [501, 502, np.nan, 504, 505, 506],
"Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
"Income(in $)": [200, 400, np.nan, 30, np.nan, np.nan],
"Age": [17, 18, np.nan, 16, 18, np.nan],
}
)
print(student_df)
輸出:
Roll No Name Income(in $) Age
0 501.0 Jennifer 200.0 17.0
1 502.0 Travis 400.0 18.0
2 NaN Bob NaN NaN
3 504.0 Emma 30.0 16.0
4 505.0 Luna NaN 18.0
5 506.0 Anish NaN NaN
DataFrame.fillna() 方法
語法
DataFrame.fillna(
value=None, method=None, axis=None, inplace=False, limit=None, downcast=None
)
DataFrame.fillna() 方法使我們能夠用指定的值或方法來填充 DataFrame 中的 NaN 值。
使用 DataFrame.fillna() 方法用指定的值填充整個 DataFrame
import numpy as np
import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
{
"Roll No": [501, 502, np.nan, 504, 505, 506],
"Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
"Income(in $)": [200, 400, np.nan, 30, np.nan, np.nan],
"Age": [17, 18, np.nan, 16, 18, np.nan],
}
)
filled_df = student_df.fillna(0)
print("DataFrame with NaN values")
print(student_df, "\n")
print("After applying fillna() to the DataFrame:")
print(filled_df, "\n")
輸出:
DataFrame with NaN values
Roll No Name Income(in $) Age
0 501.0 Jennifer 200.0 17.0
1 502.0 Travis 400.0 18.0
2 NaN Bob NaN NaN
3 504.0 Emma 30.0 16.0
4 505.0 Luna NaN 18.0
5 506.0 Anish NaN NaN
After applying fillna() to the DataFrame:
Roll No Name Income(in $) Age
0 501.0 Jennifer 200.0 17.0
1 502.0 Travis 400.0 18.0
2 0.0 Bob 0.0 0.0
3 504.0 Emma 30.0 16.0
4 505.0 Luna 0.0 18.0
5 506.0 Anish 0.0 0.0
它將 DataFrame student_df 中的所有 NaN 值替換為 0,該值作為引數傳遞給 DataFrame.fillna() 方法。
import numpy as np
import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
{
"Roll No": [501, 502, np.nan, 504, 505, 506],
"Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
"Income(in $)": [200, 400, np.nan, 30, np.nan, np.nan],
"Age": [17, 18, np.nan, 16, 18, np.nan],
}
)
filled_df = student_df.fillna(method="ffill")
print("DataFrame with NaN values")
print(student_df, "\n")
print("After applying fillna() to the DataFrame:")
print(filled_df, "\n")
輸出:
DataFrame with NaN values
Roll No Name Income(in $) Age
0 501.0 Jennifer 200.0 17.0
1 502.0 Travis 400.0 18.0
2 NaN Bob NaN NaN
3 504.0 Emma 30.0 16.0
4 505.0 Luna NaN 18.0
5 506.0 Anish NaN NaN
After applying fillna() to the DataFrame:
Roll No Name Income(in $) Age
0 501.0 Jennifer 200.0 17.0
1 502.0 Travis 400.0 18.0
2 502.0 Bob 400.0 18.0
3 504.0 Emma 30.0 16.0
4 505.0 Luna 30.0 18.0
5 506.0 Anish 30.0 18.0
它將所有 student_df 中的 NaN 值填入與 NaN 值相同列的 NaN 值之前的值。
用指定的值填充指定列的 NaN 值
為了用指定的值來填充特定的值,我們向 fillna() 方法傳遞一個字典,以列名作為鍵,以該列的 NaN 值作為值。
import numpy as np
import pandas as pd
roll_no = [501, 502, 503, 504, 505]
student_df = pd.DataFrame(
{
"Roll No": [501, 502, np.nan, 504, 505, 506],
"Name": ["Jennifer", "Travis", "Bob", "Emma", "Luna", "Anish"],
"Income(in $)": [200, 400, np.nan, 300, np.nan, np.nan],
"Age": [17, 18, np.nan, 16, 18, np.nan],
}
)
filled_df = student_df.fillna({"Age": 17, "Income(in $)": 300})
print("DataFrame with NaN values")
print(student_df, "\n")
print("After applying fillna() to the DataFrame:")
print(filled_df, "\n")
輸出:
DataFrame with NaN values
Roll No Name Income(in $) Age
0 501.0 Jennifer 200.0 17.0
1 502.0 Travis 400.0 18.0
2 NaN Bob NaN NaN
3 504.0 Emma 300.0 16.0
4 505.0 Luna NaN 18.0
5 506.0 Anish NaN NaN
After applying fillna() to the DataFrame:
Roll No Name Income(in $) Age
0 501.0 Jennifer 200.0 17.0
1 502.0 Travis 400.0 18.0
2 NaN Bob 300.0 17.0
3 504.0 Emma 300.0 16.0
4 505.0 Luna 300.0 18.0
5 506.0 Anish 300.0 17.0
它將 Age 列中的所有 NaN 值填充為 17,將 Income(in $) 列中的所有 NaN 值填充為 300。Roll No 欄中的 NaN 值保持不變。
Enjoying our tutorials? Subscribe to DelftStack on YouTube to support us in creating more high-quality video guides. Subscribe
作者: Suraj Joshi
Suraj Joshi is a backend software engineer at Matrice.ai.
LinkedIn