Pandas Insert Method

Suraj Joshi Jan 30, 2023
  1. pandas.DataFrame.insert() Method in Python
  2. Set allow_duplicates = True in the insert() Method to Add Already Existing Column
Pandas Insert Method

This tutorial explains how we can use the insert() method for a Pandas DataFrame to insert a column in the DataFrame.

import pandas as pd

countries_df = pd.DataFrame(
    {
        "Country": ["Nepal", "Switzerland", "Germany", "Canada"],
        "Continent": ["Asia", "Europe", "Europe", "North America"],
        "Primary Language": ["Nepali", "French", "German", "English"],
    }
)
print("Countries DataFrame:")
print(countries_df, "\n")

Output:

Countries DataFrame:
       Country      Continent Primary Language
0        Nepal           Asia           Nepali
1  Switzerland         Europe           French
2      Germany         Europe           German
3       Canada  North America          English

We will use the countries_df DataFrame shown in the above example to explain how we can use the insert() method for a Pandas DataFrame to insert a column in the DataFrame.

pandas.DataFrame.insert() Method in Python

Syntax

DataFrame.insert(loc, column, value, allow_duplicates=False)

It inserts the column named column into the DataFrame with values specified by value at location loc.

Insert a Column Having the Same Value for All Rows Using the insert() Method

import pandas as pd

countries_df = pd.DataFrame(
    {
        "Country": ["Nepal", "Switzerland", "Germany", "Canada"],
        "Continent": ["Asia", "Europe", "Europe", "North America"],
        "Primary Language": ["Nepali", "French", "German", "English"],
    }
)
print("Countries DataFrame:")
print(countries_df, "\n")

countries_df.insert(3, "Capital", "Unknown")

print("Countries DataFrame after inserting Capital column:")
print(countries_df)

Output:

Countries DataFrame:
       Country      Continent Primary Language
0        Nepal           Asia           Nepali
1  Switzerland         Europe           French
2      Germany         Europe           German
3       Canada  North America          English

Countries DataFrame after inserting Capital column:
       Country      Continent Primary Language  Capital
0        Nepal           Asia           Nepali  Unknown
1  Switzerland         Europe           French  Unknown
2      Germany         Europe           German  Unknown
3       Canada  North America          English  Unknown

It inserts the column Capital in the countries_df DataFrame at position 3 with the same value of the column for all rows set to Unknown.

The position starts from 0 and hence position 3 refers to the 4th column in the DataFrame.

Insert a Column in a DataFrame Specifying Value for Each Row

If we want to specify the values of each row for the column to be inserted using the insert() method, we can pass a list of values as a value argument in the insert() method.

import pandas as pd

countries_df = pd.DataFrame(
    {
        "Country": ["Nepal", "Switzerland", "Germany", "Canada"],
        "Continent": ["Asia", "Europe", "Europe", "North America"],
        "Primary Language": ["Nepali", "French", "German", "English"],
    }
)
print("Countries DataFrame:")
print(countries_df, "\n")

capitals = ["Kathmandu", "Zurich", "Berlin", "Ottawa"]

countries_df.insert(2, "Capital", capitals)

print("Countries DataFrame after inserting Capital column:")
print(countries_df)

Output:

Countries DataFrame:
       Country      Continent Primary Language
0        Nepal           Asia           Nepali
1  Switzerland         Europe           French
2      Germany         Europe           German
3       Canada  North America          English

Countries DataFrame after inserting Capital column:
       Country      Continent    Capital Primary Language
0        Nepal           Asia  Kathmandu           Nepali
1  Switzerland         Europe     Zurich           French
2      Germany         Europe     Berlin           German
3       Canada  North America     Ottawa          English

It inserts the column Capital in the DataFrame countries_df at position 2 with specified values of each row for the Capital column in the DataFrame.

Set allow_duplicates = True in the insert() Method to Add Already Existing Column

import pandas as pd

countries_df = pd.DataFrame(
    {
        "Country": ["Nepal", "Switzerland", "Germany", "Canada"],
        "Continent": ["Asia", "Europe", "Europe", "North America"],
        "Primary Language": ["Nepali", "French", "German", "English"],
        "Capital": ["Kathmandu", "Zurich", "Berlin", "Ottawa"],
    }
)
print("Countries DataFrame:")
print(countries_df, "\n")

capitals = ["Kathmandu", "Zurich", "Berlin", "Ottawa"]

countries_df.insert(4, "Capital", capitals, allow_duplicates=True)

print("Countries DataFrame after inserting Capital column:")
print(countries_df)

Output:

Countries DataFrame:
       Country      Continent Primary Language    Capital
0        Nepal           Asia           Nepali  Kathmandu
1  Switzerland         Europe           French     Zurich
2      Germany         Europe           German     Berlin
3       Canada  North America          English     Ottawa

Countries DataFrame after inserting Capital column:
       Country      Continent Primary Language    Capital    Capital
0        Nepal           Asia           Nepali  Kathmandu  Kathmandu
1  Switzerland         Europe           French     Zurich     Zurich
2      Germany         Europe           German     Berlin     Berlin
3       Canada  North America          English     Ottawa     Ottawa

It adds the column Capital to the countries_df DataFrame even though the column Capital already exists in the countries_df DataFrame.

If we try to insert the column that already exists in the DataFrame without setting allow_duplicates = True in the insert() method, it will throw us an error with the message: ValueError: cannot insert column, already exists.

Author: Suraj Joshi
Suraj Joshi avatar Suraj Joshi avatar

Suraj Joshi is a backend software engineer at Matrice.ai.

LinkedIn

Related Article - Pandas DataFrame Column