How to Replace Column Values in Pandas DataFrame

Mohamed Ayman Feb 02, 2024
  1. Use the map() Method to Replace Column Values in Pandas
  2. Use the loc Method to Replace Column’s Value in Pandas
  3. Replace Column Values With Conditions in Pandas DataFrame
  4. Use the replace() Method to Modify Values

This tutorial shows how to replace column values in a Pandas DataFrame. We will cover three approaches: the Series.map() method, the DataFrame.loc indexer, and the replace() method.

Use the map() Method to Replace Column Values in Pandas

Each column of a DataFrame is a Pandas Series, so we can use the Series.map() method to replace each value in a column with another value.

Series.map() Syntax

Series.map(arg, na_action=None)
  • Parameters:
  1. arg: the mapping to apply to the Series. It can be a function, a dictionary, or another Series.
  2. na_action: controls how NaN (Not a Number) values are handled. It accepts None or "ignore". With None, the default, map() passes NaN values to the mapping like any other value; with "ignore", NaN values are left as they are in the column and are not passed to the mapping.

It returns a Series with the same index.

Now let's look at an example that uses the map() method. We will use the following DataFrame in the examples below.

import pandas as pd
import numpy as np

data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "city": ["berlin", "paris", "roma", np.nan],
}
df = pd.DataFrame(data, columns=["name", "city"])

print(df)

Output:

      name    city
0  michael  berlin
1    louis   paris
2     jack    roma
3  jasmine     NaN

Replace Column Values With Collection in Pandas DataFrame

import pandas as pd
import numpy as np

data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "city": ["berlin", "paris", "roma", np.nan],
}
df = pd.DataFrame(data, columns=["name", "city"])

# replace column values with collection

df["city"] = df["city"].map(
    {"berlin": "dubai", "paris": "moscow", "roma": "milan", np.nan: "NY"},
    na_action=None,
)

print(df)

Output:

      name    city
0  michael   dubai
1    louis  moscow
2     jack   milan
3  jasmine      NY

The values in the city column are replaced according to the dictionary passed as the first argument to map(): each key is an original value, and the matching dictionary value is its replacement.
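One detail worth checking when mapping with a dictionary is what happens to values that have no matching key. The sketch below assumes a deliberately incomplete dictionary (the name partial_mapping is only illustrative): unmatched values come back as NaN, and fillna() is one possible way to fall back to the original values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["michael", "louis", "jack", "jasmine"],
        "city": ["berlin", "paris", "roma", np.nan],
    }
)

# Hypothetical partial mapping: "roma" is deliberately left out.
partial_mapping = {"berlin": "dubai", "paris": "moscow"}

# Values with no matching key in the dictionary become NaN.
mapped = df["city"].map(partial_mapping)

# Fall back to the original value wherever the mapping produced NaN.
mapped_with_fallback = mapped.fillna(df["city"])

print(mapped.tolist())
print(mapped_with_fallback.tolist())
```

With this fallback, roma is kept as is, while the original NaN stays NaN because there is nothing to fall back to.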

Replace Column Values With Function in Pandas DataFrame

import pandas as pd
import numpy as np

data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "city": ["berlin", "paris", "roma", np.nan],
}
df = pd.DataFrame(data, columns=["name", "city"])

# replace column values with function

df["city"] = df["city"].map("I am from {}".format)

print(df)

Output:

      name              city
0  michael  I am from berlin
1    louis   I am from paris
2     jack    I am from roma
3  jasmine     I am from nan

Because na_action is None by default, the NaN in the original column is also passed to the function and becomes the string I am from nan.

If you prefer to keep NaN values unchanged, set na_action to "ignore".

import pandas as pd
import numpy as np

data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "city": ["berlin", "paris", "roma", np.nan],
}
df = pd.DataFrame(data, columns=["name", "city"])

# replace column values excluding NaN

df["city"] = df["city"].map("I am from {}".format, na_action="ignore")

print(df)

Output:

      name              city
0  michael  I am from berlin
1    louis   I am from paris
2     jack    I am from roma
3  jasmine               NaN

Use the loc Method to Replace Column’s Value in Pandas

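Another way to replace a column's values in a Pandas DataFrame is the loc indexer of the DataFrame. Note that loc is used with square brackets rather than called like a method; it accesses values through their labels or through a boolean condition.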

DataFrame.loc[] Syntax

pandas.DataFrame.loc[condition, column_label] = new_value
  • Parameters:
  1. condition: a boolean expression that selects the rows for which it is True
  2. column_label: the label of the column to update

After selecting the target cells with these two parameters, we assign new_value to them.

Now let's look at an example that uses loc. We will use the DataFrame below as the example.

import pandas as pd

data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "grades": [30, 70, 40, 80],
    "result": ["N/A", "N/A", "N/A", "N/A"],
}

df = pd.DataFrame(data, columns=["name", "grades", "result"])

print(df)

Output:

      name  grades result
0  michael      30    N/A
1    louis      70    N/A
2     jack      40    N/A
3  jasmine      80    N/A

Replace Column Values With Conditions in Pandas DataFrame

We can use boolean conditions to specify the targeted elements.

import pandas as pd

data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "grades": [30, 70, 40, 80],
    "result": ["N/A", "N/A", "N/A", "N/A"],
}

df = pd.DataFrame(data, columns=["name", "grades", "result"])

df.loc[df.grades > 50, "result"] = "success"

df.loc[df.grades < 50, "result"] = "fail"

print(df)

Output:

      name  grades   result
0  michael      30     fail
1    louis      70  success
2     jack      40     fail
3  jasmine      80  success

df.loc[df.grades > 50, "result"] = "success" sets the result column to success in every row where the grades value is greater than 50.

df.loc[df.grades < 50, "result"] = "fail" sets the result column to fail in every row where the grades value is smaller than 50.
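Note that the two conditions above leave a grade of exactly 50 untouched, so such a row would keep N/A in the result column. The sketch below is one way to cover every row; it assumes that a grade of 50 counts as a pass and changes one grade to 50 purely for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["michael", "louis", "jack", "jasmine"],
        "grades": [30, 50, 40, 80],  # 50 added to show the boundary case
        "result": ["N/A", "N/A", "N/A", "N/A"],
    }
)

# Assumption for this sketch: a grade of exactly 50 is a pass.
passed = df["grades"] >= 50

# Use the mask and its negation so that every row gets a result.
df.loc[passed, "result"] = "success"
df.loc[~passed, "result"] = "fail"

print(df)
```

Storing the condition in a variable and negating it with ~ guarantees the two assignments are complementary.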

Use the replace() Method to Modify Values

Another way to replace column values in Pandas DataFrame is the Series.replace() method.

Series.replace() Syntax

  • Replace one single value
df[column_name].replace([old_value], new_value)
  • Replace multiple values with the same value
df[column_name].replace([old_value1, old_value2, old_value3], new_value)
  • Replace multiple values with multiple values
df[column_name].replace(
    [old_value1, old_value2, old_value3], [new_value1, new_value2, new_value3]
)
  • Replace a value with a new value for the entire DataFrame
df.replace([old_value], new_value)

We will use the DataFrame below for the rest of the examples.

import pandas as pd

data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "salary": [700, 800, 1000, 1200],
}

df = pd.DataFrame(data, columns=["name", "salary"])

print(df)

Output:

      name  salary
0  michael     700
1    louis     800
2     jack    1000
3  jasmine    1200

Replace Column Values With Multiple Values in Pandas DataFrame

import pandas as pd

data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "salary": [700, 800, 1000, 1200],
}

df = pd.DataFrame(data, columns=["name", "salary"])

df["name"] = df["name"].replace(["michael", "louis"], ["karl", "lionel"])

print(df)

Output:

      name  salary
0     karl     700
1   lionel     800
2     jack    1000
3  jasmine    1200
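replace() also accepts a dictionary, which keeps each old value next to its replacement instead of relying on the positions in two parallel lists. A minimal sketch of the same replacement written that way:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["michael", "louis", "jack", "jasmine"],
        "salary": [700, 800, 1000, 1200],
    }
)

# Each key is an old value and each dictionary value is its replacement.
df["name"] = df["name"].replace({"michael": "karl", "louis": "lionel"})

print(df)
```

Unlike map() with a dictionary, replace() leaves values that are not listed (here jack and jasmine) unchanged.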

Replace Multiple Column Values With the Same Value in Pandas DataFrame

import pandas as pd

data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "salary": [700, 800, 1000, 1200],
}

df = pd.DataFrame(data, columns=["name", "salary"])

df["salary"] = df["salary"].replace([1000, 1200], 1500)

print(df)

Output:

      name  salary
0  michael     700
1    louis     800
2     jack    1500
3  jasmine    1500

Replace Column Value With One Value in Pandas DataFrame

import pandas as pd

data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "salary": [700, 800, 1000, 1200],
}

df = pd.DataFrame(data, columns=["name", "salary"])

df["salary"] = df["salary"].replace([700], 750)

print(df)

Output:

      name  salary
0  michael     750
1    louis     800
2     jack    1000
3  jasmine    1200

Replace Values in the Entire Pandas DataFrame

import pandas as pd

data = {
    "name": ["michael", "louis", "jack", "jasmine"],
    "salary": [700, 800, 1000, 1000],
}

df = pd.DataFrame(data, columns=["name", "salary"])


df = df.replace([1000], 1400)

print(df)

Output:

      name  salary
0  michael     700
1    louis     800
2     jack    1400
3  jasmine    1400
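Calling replace() on the whole DataFrame changes the value in every column where it appears. If only some columns should be affected, a nested dictionary of the form {column: {old_value: new_value}} can limit the replacement. The sketch below adds a hypothetical bonus column to show that it is left alone.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["michael", "louis", "jack", "jasmine"],
        "salary": [700, 800, 1000, 1000],
        "bonus": [1000, 0, 0, 0],  # hypothetical extra column
    }
)

# Replace 1000 with 1400 in the salary column only.
df = df.replace({"salary": {1000: 1400}})

print(df)
```

The 1000 in the bonus column is preserved because only the salary column is named in the outer dictionary.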
