How to Fill Missing Values in Pandas DataFrame

Fariba Laiq Feb 02, 2024
  1. Syntax of the ffill() Method in Pandas
  2. Fill Missing Values in the DataFrame Using the ffill() Method in Pandas

Datasets often contain missing values, and pandas offers several methods for replacing them.

The ffill() (forward fill) method is one of them. It replaces each NaN with the last valid value that precedes it, taken from the previous row or the previous column depending on the axis.
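As a minimal sketch of the idea, assume a small made-up Series named readings (it is not part of the examples in this article). Forward fill copies the last valid value into each gap that follows it:

```python
import pandas as pd

# Hypothetical one-column data; the values are made up for illustration.
readings = pd.Series([1.0, None, None, 4.0, None])

# Each NaN takes the most recent non-missing value that precedes it.
filled = readings.ffill()
print(filled.tolist())  # [1.0, 1.0, 1.0, 4.0, 4.0]
```

The same rule applies to every column (or every row) of a dataframe.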

Syntax of the ffill() Method in Pandas

# Python 3.x
dataframe.ffill(axis, inplace, limit, downcast)

The ffill() method takes four optional arguments:

  • axis specifies the direction of the fill. With 0 (the default), each NaN is filled from the previous row of the same column. With 1, it is filled from the previous column of the same row.
  • inplace can be either True or False (the default). True modifies the current dataframe, whereas False returns a new dataframe with the filled values and leaves the original unchanged.
  • limit sets the maximum number of consecutive missing values to fill along the axis.
  • downcast takes a dictionary that maps items to the data type they should be downcast to, if possible. This argument is deprecated as of pandas 2.2.
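The effect of axis and limit can be sketched on a tiny frame. The column names a and b below are hypothetical and chosen only for illustration. downcast is left out because it is deprecated:

```python
import pandas as pd

# Illustrative frame; the column names "a" and "b" are made up.
df = pd.DataFrame({"a": [1.0, None, None], "b": [None, 5.0, None]})

down = df.ffill(axis=0)             # fill each column from the row above
across = df.ffill(axis=1)           # fill each row from the column to the left
capped = df.ffill(axis=0, limit=1)  # fill at most one consecutive NaN per gap

print(down)
print(across)
print(capped)
```

Because inplace is left at its default, df itself is not modified by any of these calls.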

Fill Missing Values in the DataFrame Using the ffill() Method in Pandas

Fill Missing Values Along the Row Axis

The following code creates a dataframe with missing values, written as None and stored as NaN. It displays the original dataframe, applies the ffill() method, and displays the result.

By default, the ffill() method fills along the row (index) axis: each NaN is replaced with the value from the previous row in the same column.

The first row of C2 still contains NaN in the output because there is no preceding row to copy from.

Example code:

# Python 3.x
import pandas as pd

df = pd.DataFrame(
    {
        "C1": [2, 7, None, 4],
        "C2": [None, 2, None, 3],
        "C3": [2, None, 6, 5],
        "C4": [5, 2, 8, None],
    }
)
print(df)  # original dataframe
df2 = df.ffill()  # fill each NaN from the previous row
print(df2)

Output:

ffill Missing Values Along Row Axis
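To confirm which cells stay empty after a row-axis fill, one option is to inspect the result with isna(). The sketch below reuses the dataframe from the example above; the variable name remaining is an assumption made for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "C1": [2, 7, None, 4],
        "C2": [None, 2, None, 3],
        "C3": [2, None, 6, 5],
        "C4": [5, 2, 8, None],
    }
)
df2 = df.ffill()

# Count the NaN values left in each column after the fill.
remaining = df2.isna().sum()
print(remaining)

# C2 is the only column that starts with NaN, so its first cell stays empty.
print(pd.isna(df2.loc[0, "C2"]))
```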

Fill Missing Values Along the Column Axis

Here, we specify axis=1, which fills each missing value from the previous column in the same row.

In the output, all values are filled except two, both in the third row. The value in C1 stays NaN because C1 has no preceding column.

The value in C2 stays NaN because the corresponding cell in the preceding column, C1, is also NaN.

Example code:

# Python 3.x
import pandas as pd

df = pd.DataFrame(
    {
        "C1": [2, 7, None, 4],
        "C2": [None, 2, None, 3],
        "C3": [2, None, 6, 5],
        "C4": [5, 2, 8, None],
    }
)
print(df)  # original dataframe
df2 = df.ffill(axis=1)  # fill each NaN from the previous column
print(df2)

Output:

ffill Missing Values Along Column Axis
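One way to think about axis=1 is as a row-axis fill on the transposed dataframe. The sketch below checks that equivalence for the example data. The names across and via_transpose are assumptions made for illustration, and the comparison is only shown for this all-numeric frame:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "C1": [2, 7, None, 4],
        "C2": [None, 2, None, 3],
        "C3": [2, None, 6, 5],
        "C4": [5, 2, 8, None],
    }
)

across = df.ffill(axis=1)

# Transpose, fill down the rows, then transpose back.
via_transpose = df.T.ffill().T

# Raises AssertionError if the two results differ; silent otherwise.
pd.testing.assert_frame_equal(across, via_transpose)
print(across)
```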

Use limit to Limit the Number of Consecutive NaN to Fill

We can use the limit parameter to cap the number of consecutive missing values that are filled along the row or column axis.

In the following code, column C2 has NaN in its last three rows. With limit=2, at most two consecutive NaN values are filled along the row axis.

That is why the NaN in the last row of C2 is still not filled. The NaN in the first row of C4 also remains, because no row precedes it.

Example code:

# Python 3.x
import pandas as pd

df = pd.DataFrame(
    {
        "C1": [2, 7, None, 4],
        "C2": [4, None, None, None],
        "C3": [6, 6, 6, 5],
        "C4": [None, 2, 8, None],
    }
)
print(df)  # original dataframe
df2 = df.ffill(axis=0, limit=2)  # fill at most two consecutive NaN values
print(df2)

Output:

ffill With Limit Parameter
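To see how the limit value changes the result, the following sketch counts the NaN values left in the same dataframe for a few settings. The dictionary name remaining is an assumption made for illustration, and limit=None stands for the default of no cap:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "C1": [2, 7, None, 4],
        "C2": [4, None, None, None],
        "C3": [6, 6, 6, 5],
        "C4": [None, 2, 8, None],
    }
)

# Map each limit setting to the total number of NaN values left after the fill.
remaining = {}
for limit in (1, 2, None):
    remaining[limit] = int(df.ffill(limit=limit).isna().sum().sum())
    print(f"limit={limit}: {remaining[limit]} NaN left")
```

Even without a cap, one NaN remains: the first row of C4 has no earlier value to copy.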

Use inplace to Fill Values in the Original DataFrame

To modify the original dataframe instead of storing the filled values in a new dataframe, pass inplace=True.

Example code:

# Python 3.x
import pandas as pd

df = pd.DataFrame(
    {
        "C1": [2, 7, None, 4],
        "C2": [4, None, None, None],
        "C3": [6, 6, 6, 5],
        "C4": [None, 2, 8, None],
    }
)
print(df)  # original dataframe
df.ffill(inplace=True)  # modifies df itself
print(df)

Output:

ffill With Inplace Parameter
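The difference between the two modes can be sketched side by side. The names original, copy_filled, and work are assumptions made for illustration:

```python
import pandas as pd

original = pd.DataFrame(
    {
        "C1": [2, 7, None, 4],
        "C2": [4, None, None, None],
        "C3": [6, 6, 6, 5],
        "C4": [None, 2, 8, None],
    }
)

# Default (inplace=False): a new dataframe is returned, original is untouched.
copy_filled = original.ffill()
print(int(original.isna().sum().sum()))  # NaN count in original is unchanged

# inplace=True: the dataframe the method is called on is modified directly.
work = original.copy()
work.ffill(inplace=True)
print(int(work.isna().sum().sum()))
```

Both approaches produce the same filled values, so the choice mainly depends on whether the unfilled data is still needed afterwards.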
