Pandas DataFrame.replace() Function

  1. Syntax of pandas.DataFrame.replace():
  2. Example Codes: Replace Values in DataFrame Using pandas.DataFrame.replace()
  3. Example Codes: Replace Multiple Values in DataFrame Using pandas.DataFrame.replace()

pandas.DataFrame.replace() replaces values in a DataFrame with other values. The values to match and their replacements can be given as a string, regex, list, dictionary, Series, or number.

Syntax of pandas.DataFrame.replace():

DataFrame.replace(to_replace=None,
                  value=None,
                  inplace=False,
                  limit=None,
                  regex=False,
                  method='pad')

Parameters

to_replace: string, regex, list, dictionary, Series, numeric, or None. Values in the DataFrame that need to be replaced.
value: scalar, dict, list, str, regex, or None. Value that replaces any value matching to_replace.
inplace: Boolean. If True, modify the caller DataFrame instead of creating a new one.
limit: Integer. Maximum size of the gap to forward or backward fill.
regex: Boolean. Set regex to True if to_replace and/or value is a regex.
method: 'pad', 'ffill', or 'bfill'. Fill method used for replacement when to_replace is a scalar, list, or tuple and value is None.

Return

It returns a new DataFrame in which every value matching to_replace is replaced by value. If inplace=True, the caller DataFrame is modified instead.
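The difference between the default behaviour and inplace=True can be checked with a short sketch. The variable names original_x and replaced_df are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [4, 1, 8]})

# Default (inplace=False): the result is a new DataFrame and df is untouched.
replaced_df = df.replace(1, 5)
original_x = df['X'].tolist()
print(original_x)                 # values still in df
print(replaced_df['X'].tolist())  # values in the returned DataFrame

# inplace=True: df itself is modified, so no new variable is needed.
df.replace(1, 5, inplace=True)
print(df['X'].tolist())
```

Keeping the default is usually the safer choice when the original data is still needed later in the script.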

Example Codes: Replace Values in DataFrame Using pandas.DataFrame.replace()

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [4, 1, 8]})
print("Before Replacement")
print(df)
# Replace every 1 in the DataFrame with 5
replaced_df = df.replace(1, 5)
print("After Replacement")
print(replaced_df)

Output:

Before Replacement
   X  Y
0  1  4
1  2  1
2  3  8
After Replacement
   X  Y
0  5  4
1  2  5
2  3  8

Here, 1 is the to_replace argument and 5 is the value argument of the replace() method. Hence, every entry with the value 1 is replaced by 5. The result is stored in replaced_df, while df itself stays unchanged.
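As a small sketch of the same call, the arguments can also be passed by keyword, and Series.replace() can be used when only one column should change. The names by_position, by_keyword, and only_y are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [4, 1, 8]})

# Positional and keyword forms describe the same replacement.
by_position = df.replace(1, 5)
by_keyword = df.replace(to_replace=1, value=5)
print(by_position.equals(by_keyword))

# Series.replace() works on a single column; here only column Y is changed.
only_y = df.copy()
only_y['Y'] = df['Y'].replace(1, 5)
print(only_y)
```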

Example Codes: Replace Multiple Values in DataFrame Using pandas.DataFrame.replace()

Replace Using Lists

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [4, 1, 8]})
print("Before Replacement")
print(df)
# Replace 1 with 1, 2 with 4, and 3 with 9 (lists are matched by position)
replaced_df = df.replace([1, 2, 3], [1, 4, 9])
print("After Replacement")
print(replaced_df)

Output:

Before Replacement
   X  Y
0  1  4
1  2  1
2  3  8
After Replacement
   X  Y
0  1  4
1  4  1
2  9  8

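Here, [1,2,3] is the to_replace argument and [1,4,9] is the value argument of the replace() method. The two lists are matched position by position, so every 1 is replaced by 1, every 2 by 4, and every 3 by 9 throughout df. Column Y looks unchanged because its only listed value is 1, which maps to itself, while 4 and 8 do not appear in to_replace.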
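The position-by-position pairing can be made visible with a minimal sketch. It also shows a second form, a list of values replaced by a single scalar. The names pairwise and many_to_one are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [4, 1, 8]})

# List to list: the i-th item of to_replace is replaced by the i-th item of value.
pairwise = df.replace([1, 2, 3], [1, 4, 9])
print(pairwise)

# List to scalar: every listed value is replaced by the same value.
many_to_one = df.replace([1, 2], 0)
print(many_to_one)
```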

Replace Using Dictionaries

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [3, 1, 8]})
print("Before Replacement")
print(df)
# Dictionary keys are the values to find; dictionary values are the replacements
replaced_df = df.replace({1: 10, 3: 30})
print("After Replacement")
print(replaced_df)

Output:

Before Replacement
   X  Y
0  1  3
1  2  1
2  3  8
After Replacement
    X   Y
0  10  30
1   2  10
2  30   8

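Each dictionary key is a value to look for, and the corresponding dictionary value is its replacement. Here every entry with the value 1 is replaced by 10 and every entry with the value 3 is replaced by 30.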
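A dictionary can also be nested to restrict each replacement to one column: the outer keys are column names and the inner dictionaries map old values to new ones. The following is a minimal sketch on the same data, with column_specific as an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3],
                   'Y': [3, 1, 8]})

# In column X replace 1 with 10; in column Y replace 3 with 30.
# The 3 in column X and the 1 in column Y are left as they are.
column_specific = df.replace({'X': {1: 10}, 'Y': {3: 30}})
print(column_specific)
```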

Replace Using Regex

import pandas as pd

df = pd.DataFrame({'X': ["zeppy", "amid", "amily"],
                   'Y': ["xar", "abc", "among"]})
print("Before Replacement")
print(df)
# Replace strings matching the regex with 'song', modifying df in place
df.replace(to_replace=r'^ami.$', value='song', regex=True, inplace=True)
print("After Replacement")
print(df)

Output:

Before Replacement
       X      Y
0  zeppy    xar
1   amid    abc
2  amily  among
After Replacement
       X      Y
0  zeppy    xar
1   song    abc
2  amily  among

The regex ^ami.$ matches strings that consist of ami followed by exactly one more character. Only amid satisfies it, so only amid is replaced by song. amily also starts with ami, but two characters follow, so it does not match and remains the same; among does not start with ami at all.

When using a regex, make sure regex is set to True. Because inplace=True is passed here, replace() modifies the original DataFrame df directly.
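To see which strings the pattern accepts, it can be tried with Python's re module first. The sketch below then applies the same pattern through the regex keyword of replace(), which is an alternative to passing regex=True. The names pattern, words, and matches are illustrative.

```python
import re
import pandas as pd

pattern = r'^ami.$'
words = ["zeppy", "amid", "amily", "among"]

# Only strings made of "ami" plus exactly one more character match.
matches = [w for w in words if re.search(pattern, w)]
print(matches)

df = pd.DataFrame({'X': ["zeppy", "amid", "amily"],
                   'Y': ["xar", "abc", "among"]})

# Passing the pattern via regex= instead of to_replace with regex=True.
replaced_df = df.replace(regex=pattern, value='song')
print(replaced_df)
```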
