Pandas DataFrame DataFrame.fillna() Function

  1. Syntax of pandas.DataFrame.fillna():
  2. Example Codes: Fill all NaN values in DataFrame with DataFrame.fillna() method
  3. Example Codes: DataFrame.fillna() method with the method parameter
  4. Example Codes: DataFrame.fillna() method with limit parameter

The pandas.DataFrame.fillna() function replaces NaN values in a DataFrame with a specified value.

Syntax of pandas.DataFrame.fillna():

DataFrame.fillna(value=None,
                 method=None, 
                 axis=None, 
                 inplace=False, 
                 limit=None, 
                 downcast=None) 

Parameters

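value scalar, dict, Series, or DataFrame. Value used to replace NaN values; a dict or Series maps column names to per-column fill values.
method backfill, bfill, pad, ffill, or None. Method used to propagate existing values into NaN positions. Deprecated since pandas 2.1 in favor of DataFrame.ffill() and DataFrame.bfill().
axis 0 or 'index', 1 or 'columns'. Axis along which missing values are filled: axis=0 fills down each column, axis=1 fills across each row.
inplace Boolean. If True, modify the caller DataFrame in place.
limit Integer.
If the method is specified, it is the maximum number of consecutive NaN values to forward/backward fill.
If the method is not given, it is the maximum number of NaN values along the axis to be filled.
downcast Dictionary. Specifies how to downcast data types where possible. Deprecated since pandas 2.2.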
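The effect of the axis parameter is easiest to see with a directional fill. The following is a minimal sketch that uses DataFrame.ffill(), the dedicated equivalent of method="ffill"; the small DataFrame is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1.0, np.nan, 3.0],
                   "Y": [np.nan, 5.0, np.nan]})

# axis=0: each NaN takes the previous valid value in its column
down = df.ffill(axis=0)

# axis=1: each NaN takes the previous valid value in its row
across = df.ffill(axis=1)

print(down)
print(across)
```

A NaN with no earlier valid value along the chosen axis (for example, the first entry of column Y when filling down) stays NaN.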

Return

If inplace is False (the default), a new DataFrame with the NaN values replaced by the given value; if inplace is True, None, because the caller DataFrame is modified in place.
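As a quick check of the two modes, the sketch below fills a small made-up DataFrame once with the default inplace=False and once with inplace=True, and counts the NaN values left in the original each time.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1.0, np.nan], "Y": [np.nan, 4.0]})

# Default (inplace=False): the filled data comes back as a new DataFrame
result = df.fillna(0)
nan_left_in_df = int(df.isna().sum().sum())
print(result is df)        # False
print(nan_left_in_df)      # 2, df itself is unchanged

# inplace=True: df itself is modified, so there is nothing to reassign
df.fillna(0, inplace=True)
nan_after_inplace = int(df.isna().sum().sum())
print(nan_after_inplace)   # 0
```

With the default mode, the result has to be assigned to a name, otherwise the filled data is discarded.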

Example Codes: Fill all NaN values in DataFrame with DataFrame.fillna() method

import pandas as pd
import numpy as np

df = pd.DataFrame({'X': [1, 2, 3, np.nan, 3],
                   'Y': [4, np.nan, 8, np.nan, 3]})
print("DataFrame:")
print(df)

filled_df = df.fillna(5)

print("Filled DataFrame:")
print(filled_df)

Output:

DataFrame:
     X    Y
0  1.0  4.0
1  2.0  NaN
2  3.0  8.0
3  NaN  NaN
4  3.0  3.0
Filled DataFrame:
     X    Y
0  1.0  4.0
1  2.0  5.0
2  3.0  8.0
3  5.0  5.0
4  3.0  3.0

It fills all NaN values in the DataFrame with 5, the value passed as an argument to the pandas.DataFrame.fillna() method.
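Since value also accepts a dict, each column can receive its own replacement. This is a small sketch on the same DataFrame; the fill values 0 and 5 are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, np.nan, 3],
                   "Y": [4, np.nan, 8, np.nan, 3]})

# Keys are column names; each column gets its own replacement value
filled_df = df.fillna({"X": 0, "Y": 5})

print(filled_df)
```

Columns that are not listed in the dict are left untouched.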

DataFrame.fillna() with mean

It can also be a good idea to replace the NaN values of a column with the mean of that column.

import pandas as pd
import numpy as np

df = pd.DataFrame({'X': [1, 2, 3, np.nan, 3],
                   'Y': [4, np.nan, 8, np.nan, 3]})
print("DataFrame:")
print(df)

df.fillna(df.mean(), inplace=True)

print("Filled DataFrame:")
print(df)

Output:

DataFrame:
     X    Y
0  1.0  4.0
1  2.0  NaN
2  3.0  8.0
3  NaN  NaN
4  3.0  3.0
Filled DataFrame:
      X    Y
0  1.00  4.0
1  2.00  5.0
2  3.00  8.0
3  2.25  5.0
4  3.00  3.0

It fills the NaN values of column X with the mean of column X, and the NaN values of column Y with the mean of column Y.

Because inplace=True is set, the fillna() call modifies the original DataFrame.
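The per-column behavior follows from what df.mean() returns: a Series indexed by column name, which fillna() matches against the columns of the DataFrame. The sketch below makes that intermediate step visible and uses the default inplace=False instead of modifying df.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, np.nan, 3],
                   "Y": [4, np.nan, 8, np.nan, 3]})

# df.mean() skips NaN values and returns one mean per column
means = df.mean()
print(means)   # X -> 2.25, Y -> 5.0

# The Series is aligned on column labels, so each column gets its own mean
filled_df = df.fillna(means)
print(filled_df)
```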

DataFrame.fillna() with 0

import pandas as pd
import numpy as np

df = pd.DataFrame({'X': [1, 2, 3, np.nan, 3],
                   'Y': [4, np.nan, 8, np.nan, 3]})
print("DataFrame:")
print(df)

df.fillna(0, inplace=True)

print("Filled DataFrame:")
print(df)

Output:

DataFrame:
     X    Y
0  1.0  4.0
1  2.0  NaN
2  3.0  8.0
3  NaN  NaN
4  3.0  3.0
Filled DataFrame:
     X    Y
0  1.0  4.0
1  2.0  0.0
2  3.0  8.0
3  0.0  0.0
4  3.0  3.0

It fills all NaN values with 0.

Example Codes: DataFrame.fillna() method with the method parameter

We can also fill NaN values in DataFrame using different choices of the method parameter.

import pandas as pd
import numpy as np

df = pd.DataFrame({'X': [1, 2, 3, np.nan, 3],
                   'Y': [4, np.nan, 8, np.nan, 3]})
print("DataFrame:")
print(df)

filled_df = df.fillna(method="backfill")

print("Filled DataFrame:")
print(filled_df)

Output:

DataFrame:
     X    Y
0  1.0  4.0
1  2.0  NaN
2  3.0  8.0
3  NaN  NaN
4  3.0  3.0
Filled DataFrame:
     X    Y
0  1.0  4.0
1  2.0  8.0
2  3.0  8.0
3  3.0  3.0
4  3.0  3.0

Setting method="backfill" fills each NaN value in the DataFrame with the next valid value below it in the same column.

We can also pass bfill, pad, or ffill as the method to fill NaN values in a DataFrame.

method            Description
backfill / bfill  Fill each NaN value with the next valid value below it in the same column.
ffill / pad       Fill each NaN value with the last valid value above it in the same column.
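pandas also exposes both directions as the dedicated methods DataFrame.ffill() and DataFrame.bfill(), which newer pandas releases recommend over the method parameter. A minimal sketch on the same data, comparing the two:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, np.nan, 3],
                   "Y": [4, np.nan, 8, np.nan, 3]})

# Forward fill: carry the last valid value down into each NaN
forward = df.ffill()

# Backward fill: pull the next valid value up into each NaN
backward = df.bfill()

print("Forward filled:")
print(forward)
print("Backward filled:")
print(backward)
```

In column Y, the forward fill produces 4, 4, 8, 8, 3, while the backward fill produces 4, 8, 8, 3, 3.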

Example Codes: DataFrame.fillna() method with limit parameter

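The limit parameter in the DataFrame.fillna() method caps how many NaN values are filled. When method is given, it is the maximum number of consecutive NaN values filled in each gap; when only a value is given, as in the example here, it is the maximum number of NaN values filled in each column.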

import pandas as pd
import numpy as np

df = pd.DataFrame({'X': [1, 2, np.nan, 3, 3],
                   'Y': [4, np.nan, 8, np.nan, 3]})
print("DataFrame:")
print(df)

filled_df = df.fillna(3, limit=1)

print("Filled DataFrame:")
print(filled_df)

Output:

DataFrame:
     X    Y
0  1.0  4.0
1  2.0  NaN
2  NaN  8.0
3  3.0  NaN
4  3.0  3.0
Filled DataFrame:
     X    Y
0  1.0  4.0
1  2.0  3.0
2  3.0  8.0
3  3.0  NaN
4  3.0  3.0

Here, once one NaN value is filled in a column, the remaining NaN values in that column are left unchanged.
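With a directional fill, limit counts consecutive NaN values instead. The sketch below uses DataFrame.ffill(), the dedicated equivalent of method="ffill", on a made-up column with a gap of three NaN values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1.0, np.nan, np.nan, np.nan, 5.0]})

# Only the first NaN of the gap is filled; the other two stay NaN
filled_df = df.ffill(limit=1)

print(filled_df)
```

Raising limit to 3 would fill the whole gap with 1.0.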
