Pandas DataFrame.interpolate() Function
The pandas DataFrame.interpolate() method fills NaN values in a DataFrame by interpolating between the surrounding values.
Syntax of pandas.DataFrame.interpolate():
DataFrame.interpolate(
    method="linear",
    axis=0,
    limit=None,
    inplace=False,
    limit_direction="forward",
    limit_area=None,
    downcast=None,
    **kwargs
)
Parameters
Parameter | Description
method | String. One of linear, time, index, values, nearest, zero, slinear, quadratic, cubic, barycentric, krogh, polynomial, spline, piecewise_polynomial, from_derivatives, pchip, or akima. The interpolation technique used to fill NaN values.
axis | 0 or 1. The axis to interpolate along: axis=0 fills each column using the values above and below the gap; axis=1 fills each row using the values to the left and right of the gap.
limit | Integer greater than 0. Maximum number of consecutive NaNs to fill.
inplace | Boolean. If True, modify the caller DataFrame in place.
limit_direction | forward, backward, or both. Direction in which consecutive NaNs are filled.
limit_area | None, inside, or outside. With inside, only NaNs surrounded by valid values are filled; with outside, only NaNs outside the valid values are filled; with None, there is no restriction.
downcast | infer or None. Downcast dtypes if possible. Deprecated in recent pandas versions.
**kwargs | Keyword arguments passed on to the interpolating function.
Return
If inplace is False (the default), it returns a new DataFrame in which the NaN values are filled using the given method. If inplace is True, it modifies the caller DataFrame and returns None.
Example Codes: Interpolate All NaN Values in DataFrame With DataFrame.interpolate() Method
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, None, 3],
                   'Y': [4, None, 8, None, 3]})
print("DataFrame:")
print(df)

# Fill every NaN with the default linear method
filled_df = df.interpolate()
print("Interpolated DataFrame:")
print(filled_df)
Output:
DataFrame:
     X    Y
0  1.0  4.0
1  2.0  NaN
2  3.0  8.0
3  NaN  NaN
4  3.0  3.0
Interpolated DataFrame:
     X    Y
0  1.0  4.0
1  2.0  6.0
2  3.0  8.0
3  3.0  5.5
4  3.0  3.0
It fills all the NaN values in the DataFrame using linear interpolation, the default method. For example, the NaN in column Y between 4.0 and 8.0 becomes 6.0.
Unlike pandas.DataFrame.fillna() called with a scalar, which replaces every NaN with the same fixed value, interpolate() estimates each missing value from its neighbors.
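To make the contrast with fillna() concrete, the following minimal sketch fills the same DataFrame both ways. The fill value 0 is an arbitrary choice made for illustration only.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, None, 3],
                   "Y": [4, None, 8, None, 3]})

# fillna() with a scalar writes the same constant into every gap.
# The constant 0 is an arbitrary value chosen for this illustration.
constant_filled = df.fillna(0)

# interpolate() estimates each gap from its neighbours in the same column.
interpolated = df.interpolate()

print("Filled with a constant:")
print(constant_filled)
print("Interpolated:")
print(interpolated)
```

In column Y, fillna(0) puts 0.0 in both gaps, while interpolate() produces 6.0 and 5.5, the midpoints of the neighboring values.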
Example Codes: DataFrame.interpolate() Method With the method Parameter
We can also interpolate the NaN values in a DataFrame with a different interpolation technique by setting the method parameter of DataFrame.interpolate().
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, None, 3],
                   'Y': [4, None, 8, None, 3]})
print("DataFrame:")
print(df)

# order is forwarded to the polynomial interpolator via **kwargs
filled_df = df.interpolate(method='polynomial', order=2)
print("Interpolated DataFrame:")
print(filled_df)
Output:
DataFrame:
     X    Y
0  1.0  4.0
1  2.0  NaN
2  3.0  8.0
3  NaN  NaN
4  3.0  3.0
Interpolated DataFrame:
          X      Y
0  1.000000  4.000
1  2.000000  7.125
2  3.000000  8.000
3  3.368421  6.625
4  3.000000  3.000
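This call fills all the NaN values in the DataFrame using polynomial interpolation of order 2.
Here, order=2 is a keyword argument passed on to the polynomial interpolator, which requires an order. The polynomial method, like several others in the list of methods, relies on SciPy, so SciPy must be installed for this example to run.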
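The choice of method also matters when the index is not evenly spaced. The sketch below, which uses a made-up one-column DataFrame with the index values 0, 1, and 10, compares method='linear', which treats the rows as equally spaced, with method='index', which uses the numeric index values as the x-coordinates. Neither of these two methods needs SciPy.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the index values are deliberately unevenly spaced.
df = pd.DataFrame({"value": [0.0, np.nan, 10.0]}, index=[0, 1, 10])

# 'linear' ignores the index and treats the rows as equally spaced,
# so the gap lands halfway between 0.0 and 10.0.
by_position = df.interpolate(method="linear")

# 'index' uses the actual index values, so the gap at index 1 lands
# one tenth of the way from index 0 to index 10.
by_index = df.interpolate(method="index")

print(by_position)
print(by_index)
```

With this data, the linear result for the middle row is 5.0, while the index-based result is 1.0.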
Example Codes: Pandas DataFrame.interpolate() Method With the axis Parameter to Interpolate Along the Row Axis
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, None, 3],
                   'Y': [4, None, 8, None, 3]})
print("DataFrame:")
print(df)

# axis=1 interpolates across the columns within each row
filled_df = df.interpolate(axis=1)
print("Interpolated DataFrame:")
print(filled_df)
Output:
DataFrame:
     X    Y
0  1.0  4.0
1  2.0  NaN
2  3.0  8.0
3  NaN  NaN
4  3.0  3.0
Interpolated DataFrame:
     X    Y
0  1.0  4.0
1  2.0  2.0
2  3.0  8.0
3  NaN  NaN
4  3.0  3.0
Here, we set axis=1 to interpolate the NaN values along each row. In the 2nd row, the NaN in column Y has no valid value to its right, so it is filled with the last valid value in that row, 2.0.
However, in the 4th row, the NaN values remain even after interpolation because both values in that row are NaN and there is nothing to interpolate from.
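With only two columns, a row never has a gap that sits between two valid values. The following sketch uses a hypothetical three-column DataFrame (the column names A, B, and C are made up for illustration) to show a value that is genuinely interpolated along the row.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column B is missing and sits between A and C.
df = pd.DataFrame({"A": [1.0, 2.0],
                   "B": [np.nan, np.nan],
                   "C": [3.0, 6.0]})

# axis=1 fills each gap from the values to its left and right in the same row.
filled = df.interpolate(axis=1)

print(filled)
```

Each value in column B becomes the midpoint of A and C in the same row: 2.0 in the first row and 4.0 in the second.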
Example Codes: DataFrame.interpolate() Method With the limit Parameter
The limit parameter of the DataFrame.interpolate() method sets the maximum number of consecutive NaN values that the method fills.
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, None, 3],
                   'Y': [4, None, None, None, 3]})
print("DataFrame:")
print(df)

# Fill at most one NaN in each run of consecutive NaNs
filled_df = df.interpolate(limit=1)
print("Interpolated DataFrame:")
print(filled_df)
Output:
DataFrame:
     X    Y
0  1.0  4.0
1  2.0  NaN
2  3.0  NaN
3  NaN  NaN
4  3.0  3.0
Interpolated DataFrame:
     X     Y
0  1.0  4.00
1  2.0  3.75
2  3.0   NaN
3  3.0   NaN
4  3.0  3.00
Here, with limit=1, only the first NaN in each run of consecutive NaN values, counting from the top of the column, is filled. The remaining NaN values in the same run are left unchanged.
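A related parameter, limit_area, restricts which gaps are eligible for filling at all. The sketch below uses a made-up single-column DataFrame with NaN values at the start, in the middle, and at the end, and it relies on the default forward fill direction.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one leading, one interior, and one trailing NaN.
df = pd.DataFrame({"v": [np.nan, 1.0, np.nan, 3.0, np.nan]})

# 'inside' fills only NaNs that have valid values on both sides.
inside = df.interpolate(limit_area="inside")

# 'outside' fills only NaNs beyond the valid values. With the default
# forward direction, that means the trailing NaN but not the leading one.
outside = df.interpolate(limit_area="outside")

print(inside)
print(outside)
```

With limit_area='inside', only the middle gap is filled (with 2.0). With limit_area='outside', the middle gap is left alone and the trailing NaN takes the last valid value, 3.0.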
Example Codes: DataFrame.interpolate() Method With the limit_direction Parameter
The limit_direction parameter of the DataFrame.interpolate() method controls the direction along the chosen axis in which consecutive NaN values are filled.
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3, None, 3],
                   'Y': [4, None, None, None, 3]})
print("DataFrame:")
print(df)

# Fill at most one NaN per run, starting from the bottom of each column
filled_df = df.interpolate(limit_direction='backward', limit=1)
print("Interpolated DataFrame:")
print(filled_df)
Output:
DataFrame:
     X    Y
0  1.0  4.0
1  2.0  NaN
2  3.0  NaN
3  NaN  NaN
4  3.0  3.0
Interpolated DataFrame:
     X     Y
0  1.0  4.00
1  2.0   NaN
2  3.0   NaN
3  3.0  3.25
4  3.0  3.00
Here, with limit_direction='backward' and limit=1, only the last NaN in each run of consecutive NaN values, counting from the bottom of the column, is filled. The remaining NaN values in the same run are left unchanged.
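The third option, limit_direction='both', fills from both ends of each gap. As a minimal sketch using the same DataFrame as in this section:

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3, None, 3],
                   "Y": [4, None, None, None, 3]})

# With 'both' and limit=1, one NaN is filled from the top of each gap
# and one from the bottom; anything left in the middle stays NaN.
filled_both = df.interpolate(limit_direction="both", limit=1)

print(filled_both)
```

In column Y, the first and last NaN of the three-value gap are filled (with 3.75 and 3.25), and the middle one remains NaN.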
Interpolate Time-Series Data With DataFrame.interpolate() Method
import pandas as pd

dates = ['April-10', 'April-11', 'April-12', 'April-13']
fruits = ['Apple', 'Papaya', 'Banana', 'Mango']
prices = [3, None, 2, 4]
df = pd.DataFrame({'Date': dates,
                   'Fruit': fruits,
                   'Price': prices})
print(df)

# inplace=True modifies df itself instead of returning a new DataFrame
df.interpolate(inplace=True)
print("Interpolated DataFrame:")
print(df)
Output:
       Date   Fruit  Price
0  April-10   Apple    3.0
1  April-11  Papaya    NaN
2  April-12  Banana    2.0
3  April-13   Mango    4.0
Interpolated DataFrame:
       Date   Fruit  Price
0  April-10   Apple    3.0
1  April-11  Papaya    2.5
2  April-12  Banana    2.0
3  April-13   Mango    4.0
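Because of inplace=True, the original DataFrame itself is modified by the call to interpolate(), and the missing price for April-11 becomes 2.5, the midpoint of 3.0 and 2.0.
Only the numeric Price column is affected. Recent pandas versions deprecate calling interpolate() on object-dtype columns such as Date and Fruit, so on those versions it may be necessary to interpolate the Price column on its own.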
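In the example above, the dates are plain strings, so the interpolation is linear in row position and ignores the gaps between dates. When the index is a real DatetimeIndex, method='time' weights the estimate by the actual time gaps. The following is a minimal sketch with made-up dates and prices (the year 2024 and the missing April 12 row are assumptions chosen to create an uneven gap).

```python
import numpy as np
import pandas as pd

# Hypothetical data: April 12 is absent, so the gaps are 1 day and 2 days.
index = pd.to_datetime(["2024-04-10", "2024-04-11", "2024-04-13"])
prices = pd.DataFrame({"Price": [3.0, np.nan, 6.0]}, index=index)

# 'linear' treats the three rows as equally spaced.
by_row = prices.interpolate(method="linear")

# 'time' uses the timestamps: April 11 is one third of the way
# from April 10 to April 13.
by_time = prices.interpolate(method="time")

print(by_row)
print(by_time)
```

Here the linear result for April 11 is 4.5, the midpoint of 3.0 and 6.0, while the time-based result is about 4.0, because April 11 lies one third of the way between the two known dates.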
Suraj Joshi is a backend software engineer at Matrice.ai.