How to Iterate Through Rows of a DataFrame in Pandas

Suraj Joshi Feb 02, 2024
  1. index Attribute to Iterate Through Rows in Pandas DataFrame
  2. loc[] Method to Iterate Through Rows of DataFrame in Python
  3. iloc[] Method to Iterate Through Rows of DataFrame in Python
  4. pandas.DataFrame.iterrows() to Iterate Over Rows in Pandas
  5. pandas.DataFrame.itertuples to Iterate Over Rows in Pandas
  6. pandas.DataFrame.apply to Iterate Over Rows in Pandas

We can loop through the rows of a Pandas DataFrame using the index attribute of the DataFrame. We can also iterate through rows using the loc[] and iloc[] indexers, or the iterrows(), itertuples() and apply() methods of DataFrame objects.

We will use the following DataFrame as an example in the sections below.

import pandas as pd

dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]

df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})

print(df)

Output:

       Date  Income_1  Income_2
0  April-10        10        20
1  April-11        20        30
2  April-12        10        10
3  April-13        15         5
4  April-14        10        40
5  April-16        12        13

index Attribute to Iterate Through Rows in Pandas DataFrame

The index attribute of a Pandas DataFrame holds the row labels, ordered from the top row to the bottom row. When no index is specified, it is a RangeIndex (0, 1, 2, ...). We can loop over these labels to iterate over rows in Pandas.

import pandas as pd

dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]

df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})

for i in df.index:
    print(
        "Total income in "
        + df["Date"][i]
        + " is:"
        + str(df["Income_1"][i] + df["Income_2"][i])
    )

Output:

Total income in April-10 is:30
Total income in April-11 is:50
Total income in April-12 is:20
Total income in April-13 is:20
Total income in April-14 is:50
Total income in April-16 is:25

The loop adds Income_1 and Income_2 for each row and prints the total income.
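
The row labels in df.index do not have to be integers. The following is a minimal sketch, assuming a hypothetical variant of the example called df_labeled that uses the Date column as its index (this variant and the totals dictionary are illustrative names, not part of the original example). It shows that looping over df.index yields whatever labels the DataFrame has.

```python
import pandas as pd

dates = ["April-10", "April-11", "April-12"]
df = pd.DataFrame({"Date": dates, "Income_1": [10, 20, 10], "Income_2": [20, 30, 10]})

# With no explicit index, pandas builds a RangeIndex (0, 1, 2, ...).
index_type = type(df.index).__name__
print(index_type)

# Hypothetical variant: use the dates as row labels instead of integers.
df_labeled = df.set_index("Date")

totals = {}
for label in df_labeled.index:
    # Each label is a date string here, so it can be used directly as a key.
    total = df_labeled.loc[label, "Income_1"] + df_labeled.loc[label, "Income_2"]
    totals[label] = int(total)

print(totals)
```

Because the loop takes its keys from df_labeled.index itself, it does not depend on the labels being consecutive integers.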

loc[] Method to Iterate Through Rows of DataFrame in Python

The loc[] indexer accesses rows and columns by label. When we use loc[] inside a loop over the row labels, we can read one row at a time and so iterate through the rows of the DataFrame.

import pandas as pd

dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]

df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})

for i in range(len(df)):
    print(
        "Total income in "
        + df.loc[i, "Date"]
        + " is:"
        + str(df.loc[i, "Income_1"] + df.loc[i, "Income_2"])
    )

Output:

Total income in April-10 is:30
Total income in April-11 is:50
Total income in April-12 is:20
Total income in April-13 is:20
Total income in April-14 is:50
Total income in April-16 is:25

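Here, range(len(df)) generates a range object that covers every row in the DataFrame. This works with loc[] only because the DataFrame has the default integer index, so the row labels are 0 through 5.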
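
Since loc[] looks rows up by label, range(len(df)) stops working once the labels are not 0, 1, 2, and so on. The sketch below assumes a small DataFrame with hypothetical string labels "a" and "b" to illustrate this, and loops over df.index instead. The names positional_lookup_failed and lines are introduced only for this illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"Date": ["April-10", "April-11"], "Income_1": [10, 20], "Income_2": [20, 30]},
    index=["a", "b"],  # hypothetical string labels
)

# The integer 0 is not one of the labels, so loc raises KeyError.
try:
    df.loc[0, "Date"]
    positional_lookup_failed = False
except KeyError:
    positional_lookup_failed = True

# Looping over the labels themselves works for any index.
lines = []
for label in df.index:
    total = df.loc[label, "Income_1"] + df.loc[label, "Income_2"]
    lines.append("Total income in " + df.loc[label, "Date"] + " is:" + str(total))

print(positional_lookup_failed)
print("\n".join(lines))
```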

iloc[] Method to Iterate Through Rows of DataFrame in Python

The iloc[] indexer of a Pandas DataFrame is very similar to loc[]. The difference is that loc[] selects rows and columns by label, while iloc[] selects them by integer position, counting from 0.

import pandas as pd

dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]

df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})

for i in range(len(df)):
    print(
        "Total income in " + df.iloc[i, 0] + " is:" + str(df.iloc[i, 1] + df.iloc[i, 2])
    )

Output:

Total income in April-10 is:30
Total income in April-11 is:50
Total income in April-12 is:20
Total income in April-13 is:20
Total income in April-14 is:50
Total income in April-16 is:25

Here the index 0 represents the 1st column of DataFrame i.e. Date, the index 1 represents the Income_1 column and index 2 represents the Income_2 column.
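
Hard-coded column positions break silently if the column order changes. One possible way around this, shown as a sketch rather than the only approach, is to look up each position by name with df.columns.get_loc and then pass those positions to iloc[]. The variable names date_pos, income1_pos, income2_pos and totals are illustrative.

```python
import pandas as pd

dates = ["April-10", "April-11", "April-12"]
df = pd.DataFrame({"Date": dates, "Income_1": [10, 20, 10], "Income_2": [20, 30, 10]})

# Look up each column's integer position once, by name.
date_pos = df.columns.get_loc("Date")
income1_pos = df.columns.get_loc("Income_1")
income2_pos = df.columns.get_loc("Income_2")

totals = []
for i in range(len(df)):
    # iloc takes (row position, column position).
    totals.append(int(df.iloc[i, income1_pos] + df.iloc[i, income2_pos]))
    print("Total income in " + df.iloc[i, date_pos] + " is:" + str(totals[-1]))
```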

pandas.DataFrame.iterrows() to Iterate Over Rows in Pandas

pandas.DataFrame.iterrows() yields one (index, row) pair per row, where index is the row label and row holds the data of that row as a Series. Hence, we can use this method to iterate over rows in a Pandas DataFrame.

import pandas as pd

dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]

df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})


for index, row in df.iterrows():
    print(
        "Total income in "
        + row["Date"]
        + " is:"
        + str(row["Income_1"] + row["Income_2"])
    )

Output:

Total income in April-10 is:30
Total income in April-11 is:50
Total income in April-12 is:20
Total income in April-13 is:20
Total income in April-14 is:50
Total income in April-16 is:25
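
Because iterrows() packs each row into a single Series, the values in a row share one dtype, and the column dtypes are not necessarily preserved. The sketch below uses a hypothetical two-column DataFrame called mixed (not part of the example above) with one integer and one float column to illustrate the effect.

```python
import pandas as pd

# Hypothetical frame: one integer column and one float column.
mixed = pd.DataFrame({"count": [1, 2], "ratio": [0.5, 1.5]})

# Take the first (index, row) pair from the iterator.
first_index, first_row = next(mixed.iterrows())

print(mixed["count"].dtype)  # integer dtype in the DataFrame
print(first_row.dtype)       # the row Series is upcast to a float dtype
print(first_row["count"])    # so the integer 1 comes back as a float
```

When the original dtypes matter, itertuples() is usually the safer choice, since it returns the values column by column rather than as one Series.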

pandas.DataFrame.itertuples to Iterate Over Rows in Pandas

pandas.DataFrame.itertuples returns an iterator that yields a named tuple for each row, with the index as the first field and the column values as the remaining fields. Hence, we can also use this method to iterate over rows in a Pandas DataFrame.

import pandas as pd

dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]

df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})


for row in df.itertuples():
    print("Total income in " + row.Date + " is:" + str(row.Income_1 + row.Income_2))

Output:

Total income in April-10 is:30
Total income in April-11 is:50
Total income in April-12 is:20
Total income in April-13 is:20
Total income in April-14 is:50
Total income in April-16 is:25
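
itertuples() also accepts the index and name parameters. As a short sketch of how they change the result: with the defaults, the index is available as the Index field of the named tuple, while index=False drops it and name=None returns plain tuples that can be unpacked directly. The variable names below are illustrative.

```python
import pandas as pd

dates = ["April-10", "April-11", "April-12"]
df = pd.DataFrame({"Date": dates, "Income_1": [10, 20, 10], "Income_2": [20, 30, 10]})

# Default: named tuples whose first field is called Index.
first = next(df.itertuples())
print(first.Index, first.Date)

# index=False drops the index; name=None yields plain tuples.
plain_rows = list(df.itertuples(index=False, name=None))

totals = []
for date, income_1, income_2 in plain_rows:
    totals.append(income_1 + income_2)
    print("Total income in " + date + " is:" + str(totals[-1]))
```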

pandas.DataFrame.apply to Iterate Over Rows in Pandas

pandas.DataFrame.apply applies the given function along the given axis of the DataFrame and returns the results as a Series or a DataFrame.

Syntax:

DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)

Here, func is the function to be applied and axis is the axis along which it is applied. The default, axis=0, passes each column to the function. We can use axis=1 or axis='columns' to apply the function to each row instead.

import pandas as pd

dates = ["April-10", "April-11", "April-12", "April-13", "April-14", "April-16"]
income1 = [10, 20, 10, 15, 10, 12]
income2 = [20, 30, 10, 5, 40, 13]

df = pd.DataFrame({"Date": dates, "Income_1": income1, "Income_2": income2})


print(
    df.apply(
        lambda row: "Total income in "
        + row["Date"]
        + " is:"
        + str(row["Income_1"] + row["Income_2"]),
        axis=1,
    )
)

Output:

0    Total income in April-10 is:30
1    Total income in April-11 is:50
2    Total income in April-12 is:20
3    Total income in April-13 is:20
4    Total income in April-14 is:50
5    Total income in April-16 is:25
dtype: object

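Here, the lambda keyword defines an inline function that is applied to each row. Because the function returns one string per row, the result is a Series with the same index as the DataFrame.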
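
For anything longer than a one-liner, a named function can be easier to read than a lambda. The sketch below assumes a hypothetical helper describe_income with an extra currency argument, passed through the args parameter of apply, and contrasts axis=1 (one call per row) with axis=0 (one call per column). The helper and the variable names are illustrative only.

```python
import pandas as pd

dates = ["April-10", "April-11", "April-12"]
df = pd.DataFrame({"Date": dates, "Income_1": [10, 20, 10], "Income_2": [20, 30, 10]})


def describe_income(row, currency):
    """Format one row; `currency` is a hypothetical extra argument."""
    total = row["Income_1"] + row["Income_2"]
    return f"{row['Date']}: {total} {currency}"


# axis=1: the function receives each row as a Series.
# Extra positional arguments are forwarded through args.
summary = df.apply(describe_income, axis=1, args=("USD",))
print(summary.tolist())

# axis=0 (the default): the function receives each column as a Series.
column_totals = df[["Income_1", "Income_2"]].apply(lambda col: col.sum(), axis=0)
print(column_totals.to_dict())
```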

Author: Suraj Joshi

Suraj Joshi is a backend software engineer at Matrice.ai.