How to Add New Column to Existing DataFrame in Python Pandas
-
[]Operator Method to Add a New Column in Pandas -
df.insert()Method to Add New Column in Pandas -
df.assign()Method to Add New Column in Pandas -
df.loc()Method to Add New Column in Pandas
Adding a new column to existing DataFrame is used very frequently when working with large data sets. For example, the existing DataFrame has First, Last, and Age columns, and we need to add a new column city to it. Listed below are the different ways to achieve this task.
[]operator Methoddf.insert()Methoddf.assign()Methoddf.loc()Method
We will use the same DataFrame in the next sections as follows,
import pandas as pd
data = [["Ali", "Azmat", "30"], ["Sharukh", "Khan", "40"], ["Linus", "Torvalds", "70"]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])
print(df)
Output:
First Last Age
0 Ali Azmat 30
1 Sharukh Khan 40
2 Linus Torvalds 70
[] Operator Method to Add a New Column in Pandas
We could use the [] operator to add a new column to the existing DataFrame.
import pandas as pd
data = [["Ali", "Azmat", "30"], ["Sharukh", "Khan", "40"], ["Linus", "Torvalds", "70"]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])
city = ["Lahore", "Dehli", "New York"]
df["city"] = city
print(df)
Output:
First Last Age city
0 Ali Azmat 30 Lahore
1 Sharukh Khan 40 Dehli
2 Linus Torvalds 70 New York
df.insert() Method to Add New Column in Pandas
You can use df.insert() function if you want to add the new column at a specific index. The first parameter of df.insert() function is the insertion index starting from zero.
import pandas as pd
data = [["Ali", "Azmat", "30"], ["Sharukh", "Khan", "40"], ["Linus", "Torvalds", "70"]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])
df.insert(3, "city", ["Lahore", "Dehli", "New York"], True)
print(df)
Output:
First Last Age city
0 Ali Azmat 30 Lahore
1 Sharukh Khan 40 Dehli
2 Linus Torvalds 70 New York
df.assign() Method to Add New Column in Pandas
df.assign() can also be used to add a new column to an existing DataFrame.
import pandas as pd
data = [["Ali", "Azmat", "30"], ["Sharukh", "Khan", "40"], ["Linus", "Torvalds", "70"]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])
df = df.assign(city=["Lahore", "Dehli", "New York"])
print(df)
Output:
First Last Age city
0 Ali Azmat 30 Lahore
1 Sharukh Khan 40 Dehli
2 Linus Torvalds 70 New York
Let’s see how to add multiple columns using df.assign(). The below example will add city and score columns.
import pandas as pd
data = [["Ali", "Azmat", "30"], ["Sharukh", "Khan", "40"], ["Linus", "Torvalds", "70"]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])
df = df.assign(city=["Lahore", "Dehli", "New York"], score=[20, 30, 40])
print(df)
Output:
First Last Age city score
0 Ali Azmat 30 Lahore 20
1 Sharukh Khan 40 Dehli 30
2 Linus Torvalds 70 New York 40
df.loc() Method to Add New Column in Pandas
df.loc() method can also add a new column into an existing DataFrame.
import pandas as pd
data = [["Ali", "Azmat", "30"], ["Sharukh", "Khan", "40"], ["Linus", "Torvalds", "70"]]
df = pd.DataFrame(data, columns=["First", "Last", "Age"])
df.loc[:, "city"] = ["Lahore", "Dehli", "New York"]
print(df)
Output:
First Last Age city
0 Ali Azmat 30 Lahore
1 Sharukh Khan 40 Dehli
2 Linus Torvalds 70 New York