Pandas DataFrame DataFrame.replace() Function

Suraj Joshi Jan 30, 2023
  1. Syntax of pandas.DataFrame.replace():
  2. Example Codes: Replace Values in DataFrame Using pandas.DataFrame.replace()
  3. Example Codes: Replace Multiple Values in DataFrame Using pandas.DataFrame.replace()

pandas.DataFrame.replace() replaces values in a DataFrame with other values. The values to match can be given as a string, regex, list, dictionary, Series, or number.

Syntax of pandas.DataFrame.replace():

DataFrame.replace(
                  to_replace=None,
                  value=None,
                  inplace=False,
                  limit=None,
                  regex=False,
                  method='pad')

Parameters

to_replace: string, regex, list, dictionary, Series, numeric, or None. Values in the DataFrame that need to be replaced.
value: scalar, dict, list, string, regex, or None. Value that replaces any values matching to_replace.
inplace: Boolean. If True, modify the caller DataFrame instead of returning a new one.
limit: Integer. Maximum size gap to forward or backward fill.
regex: Boolean. Set regex to True if to_replace and/or value is a regex.
method: 'pad', 'ffill', or 'bfill'. Fill method used for replacement when to_replace is a scalar, list, or tuple and value is None. Both limit and method are deprecated as of pandas 2.1.

Return

It returns a new DataFrame in which every value matching to_replace is replaced by the given value. With inplace=True, the caller DataFrame is modified directly instead.
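The difference between the default call and inplace=True can be sketched as follows. This is a minimal illustration that reuses the sample data from the examples in this article; the variable names are only for demonstration.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 1, 8]})

# Default call: the result is a new DataFrame and df keeps its values.
replaced_df = df.replace(1, 5)
print(df["X"].tolist())           # [1, 2, 3]
print(replaced_df["X"].tolist())  # [5, 2, 3]

# inplace=True: df itself is modified, so the return value is not used.
df.replace(1, 5, inplace=True)
print(df["X"].tolist())           # [5, 2, 3]
```

Keeping the default is usually the safer choice when the original data is still needed later.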

Example Codes: Replace Values in DataFrame Using pandas.DataFrame.replace()

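import pandas as pd

# Sample DataFrame with the value 1 in both columns
df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [4, 1, 8]})
print("Before Replacement")
print(df)
# Replace every 1 with 5; the result is a new DataFrame
replaced_df = df.replace(1, 5)
print("After Replacement")
print(replaced_df)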

Output:

Before Replacement
   X  Y
0  1  4
1  2  1
2  3  8
After Replacement
   X  Y
0  5  4
1  2  5
2  3  8

Here, 1 is the to_replace argument and 5 is the value argument of the replace() method. Hence every entry equal to 1, in any column, is replaced by 5 in replaced_df, while df itself is left unchanged.

Example Codes: Replace Multiple Values in DataFrame Using pandas.DataFrame.replace()

Replace Using Lists

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [4, 1, 8]})
print("Before Replacement")
print(df)
# Pair the lists by position: 1 -> 1, 2 -> 4, 3 -> 9
replaced_df = df.replace([1, 2, 3], [1, 4, 9])
print("After Replacement")
print(replaced_df)

Output:

Before Replacement
   X  Y
0  1  4
1  2  1
2  3  8
After Replacement
   X  Y
0  1  4
1  4  1
2  9  8

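Here, [1,2,3] is the to_replace argument and [1,4,9] is the value argument of the replace() method. The two lists are paired by position, so every 1 is replaced by 1, every 2 by 4, and every 3 by 9, in all columns. Values that are not in the to_replace list, such as 4 and 8 in column Y, remain unchanged.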
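To make the positional pairing concrete, the sketch below builds the equivalent dictionary from the two lists and compares both results. The names to_replace, value, by_lists, and by_dict are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 1, 8]})

to_replace = [1, 2, 3]
value = [1, 4, 9]

# Two lists of equal length are matched element by element.
by_lists = df.replace(to_replace, value)

# The same mapping written out as a dictionary: {1: 1, 2: 4, 3: 9}
by_dict = df.replace(dict(zip(to_replace, value)))

print(by_lists)
print(by_lists.equals(by_dict))
```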

Replace Using Dictionaries

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [3, 1, 8]})
print("Before Replacement")
print(df)
# Keys are the values to find, dictionary values are their replacements
replaced_df = df.replace({1: 10, 3: 30})
print("After Replacement")
print(replaced_df)

Output:

Before Replacement
   X  Y
0  1  3
1  2  1
2  3  8
After Replacement
    X   Y
0  10  30
1   2  10
2  30   8

Each dictionary key is a value to find and the corresponding dictionary value is its replacement, so every 1 is replaced by 10 and every 3 by 30.
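A flat dictionary applies to every column. If the replacement should be limited to particular columns, replace() also accepts a nested dictionary whose outer keys are column labels. The following is a small sketch using the same sample data; the chosen mappings are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [3, 1, 8]})

# Outer keys are column labels; inner dictionaries map old value -> new value.
replaced_df = df.replace({"X": {1: 10}, "Y": {3: 30}})
print(replaced_df)
```

Only the 1 in column X and the 3 in column Y are changed here; the 3 in X and the 1 in Y are left alone.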

Replace Using Regex

import pandas as pd

df = pd.DataFrame({'X': ["zeppy", "amid", "amily"],
                   'Y': ["xar", "abc", "among"]})
print("Before Replacement")
print(df)
# Match "ami" plus exactly one more character and modify df in place
df.replace(to_replace=r'^ami.$', value='song', regex=True, inplace=True)
print("After Replacement")
print(df)

Output:

Before Replacement
       X      Y
0  zeppy    xar
1   amid    abc
2  amily  among
After Replacement
       X      Y
0  zeppy    xar
1   song    abc
2  amily  among

The pattern ^ami.$ matches strings that start with ami followed by exactly one more character. Only amid satisfies it, so only amid is replaced by song. amily also starts with ami, but two characters follow, so it does not match and is left unchanged. When to_replace is a regular expression, set regex=True; otherwise it is compared as a literal string. Here inplace=True makes replace() modify the original DataFrame instead of returning a new one.
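The anchors in the pattern matter because, with regex=True, only the part of the string that matches is substituted. The sketch below contrasts the anchored pattern from the example with a hypothetical prefix-only pattern, ^ami, chosen here purely for comparison.

```python
import pandas as pd

df = pd.DataFrame({"X": ["zeppy", "amid", "amily"],
                   "Y": ["xar", "abc", "among"]})

# ^ami.$ has to match the whole string, so the whole value is replaced.
anchored = df.replace(to_replace=r"^ami.$", value="song", regex=True)

# ^ami matches only the prefix, so just those three characters are replaced.
prefix_only = df.replace(to_replace=r"^ami", value="song", regex=True)

print(anchored["X"].tolist())     # ['zeppy', 'song', 'amily']
print(prefix_only["X"].tolist())  # ['zeppy', 'songd', 'songly']
```

In both cases among is untouched, since it starts with amo rather than ami.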

Author: Suraj Joshi

Suraj Joshi is a backend software engineer at Matrice.ai.
