Pandas loc vs iloc

Suraj Joshi Jan 30, 2023
  1. Select Particular Value From DataFrame Specifying Index and Column Label Using .loc() Method
  2. Select Particular Columns From the DataFrame Using the .loc() Method
  3. Filter Rows by Applying Condition to Columns Using .loc() Method
  4. Filter Rows With Indices Using iloc
  5. Filter Particular Rows and Columns From the DataFrame
  6. Filter Range of Rows and Columns From DataFrame Using iloc
  7. Pandas loc vs iloc

This tutorial explains how to select data from a Pandas DataFrame using loc and iloc in Python. With iloc, we select rows and columns by their integer positions; with loc, we select them by their row and column labels.

To demonstrate data filtering using loc and iloc, we will use the DataFrame defined in the following example.

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

print(student_df)

Output:

        Name  Age      City Grade
501    Alice   17  New York     A
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+
504    Chris   21   Seattle    A-
505    Alice   15    Austin     A

Select Particular Value From DataFrame Specifying Index and Column Label Using .loc() Method

We can pass an index label and a column label to the .loc indexer, inside square brackets, to extract the value stored at that row and column.

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)
print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("The Grade of student with Roll No. 504 is:")
value = student_df.loc[504, "Grade"]
print(value)

Output:

The DataFrame of students with marks is:
        Name  Age      City Grade
501    Alice   17  New York     A
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+
504    Chris   21   Seattle    A-
505    Alice   15    Austin     A

The Grade of student with Roll No. 504 is:
A-

It selects the value in the DataFrame at index label 504 and column label Grade. The first argument to .loc is the index label, and the second argument is the column label.
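Because loc works on labels, a row position is not a valid substitute for a roll number. The following is a minimal sketch of that behaviour on a reduced version of the same DataFrame; the variable names grade_504 and lookup_failed are only illustrative.

```python
import pandas as pd

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=[501, 502, 503, 504, 505],
)

# 504 is an index label, so this lookup succeeds.
grade_504 = student_df.loc[504, "Grade"]
print(grade_504)

# 3 is the position of that row, not one of its labels, so loc raises KeyError.
try:
    student_df.loc[3, "Grade"]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)
```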

Select Particular Columns From the DataFrame Using the .loc() Method

We can also select specific columns from the DataFrame using .loc. We pass the list of required column labels as the second argument to select only those columns.

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("The name and age of students in the DataFrame are:")
value = student_df.loc[:, ["Name", "Age"]]
print(value)

Output:

The DataFrame of students with marks is:
        Name  Age      City Grade
501    Alice   17  New York     A
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+
504    Chris   21   Seattle    A-
505    Alice   15    Austin     A

The name and age of students in the DataFrame are:
        Name  Age
501    Alice   17
502   Steven   20
503  Neesham   18
504    Chris   21
505    Alice   15

The first argument to .loc is :, which denotes all the rows in the DataFrame. The second argument, ["Name", "Age"], selects only the Name and Age columns.
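Besides a list of labels, loc also accepts a single label or a label slice for the columns. The sketch below, using the same student data, shows both; note that a label slice includes its end label. The names age_column and name_to_city are illustrative only.

```python
import pandas as pd

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=[501, 502, 503, 504, 505],
)

# A single column label returns a Series rather than a DataFrame.
age_column = student_df.loc[:, "Age"]
print(type(age_column).__name__)

# A label slice selects every column from Name through City, City included.
name_to_city = student_df.loc[:, "Name":"City"]
print(list(name_to_city.columns))
```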

Filter Rows by Applying Condition to Columns Using .loc() Method

We can also use .loc to filter the rows whose column values satisfy a given condition.

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("Students with Grade A are:")
value = student_df.loc[student_df.Grade == "A"]
print(value)

Output:

The DataFrame of students with marks is:
        Name  Age      City Grade
501    Alice   17  New York     A
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+
504    Chris   21   Seattle    A-
505    Alice   15    Austin     A

Students with Grade A are:
      Name  Age      City Grade
501  Alice   17  New York     A
505  Alice   15    Austin     A

It selects all the students in the DataFrame with grade A.
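Conditions can be combined, and a column selection can be added as the second argument. The following sketch assumes we want the name and city of students with grade A who are older than 16; the age threshold and the names mask and result are made up for illustration.

```python
import pandas as pd

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=[501, 502, 503, 504, 505],
)

# Each comparison needs its own parentheses; & combines them element-wise.
mask = (student_df["Grade"] == "A") & (student_df["Age"] > 16)

# Rows come from the boolean mask, columns from the list of labels.
result = student_df.loc[mask, ["Name", "City"]]
print(result)
```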

Filter Rows With Indices Using iloc

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("2nd and 3rd rows in the DataFrame:")
filtered_rows = student_df.iloc[[1, 2]]
print(filtered_rows)

Output:

The DataFrame of students with marks is:
        Name  Age      City Grade
501    Alice   17  New York     A
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+
504    Chris   21   Seattle    A-
505    Alice   15    Austin     A

2nd and 3rd rows in the DataFrame:
        Name  Age      City Grade
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+

It filters the second and third rows from the DataFrame.

We pass the integer positions of the rows to iloc to select rows from the DataFrame. Here, the integer positions of the second and third rows are 1 and 2 respectively, because positions start from 0.
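Positions in iloc follow ordinary Python indexing, so negative positions count from the end and a position past the last row is an error. A minimal sketch on a reduced version of the same data follows; the names last_row, last_two and out_of_bounds are illustrative.

```python
import pandas as pd

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
    },
    index=[501, 502, 503, 504, 505],
)

# A single integer returns that row as a Series named after its index label.
last_row = student_df.iloc[-1]
print(last_row.name)

# A list of positions returns a DataFrame, here the last two rows.
last_two = student_df.iloc[[-2, -1]]
print(list(last_two.index))

# There are only five rows, so position 5 does not exist.
try:
    student_df.iloc[5]
    out_of_bounds = False
except IndexError:
    out_of_bounds = True
print(out_of_bounds)
```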

Filter Particular Rows and Columns From the DataFrame

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("Filtered values from the DataFrame:")
filtered_values = student_df.iloc[[1, 2, 3], [0, 3]]
print(filtered_values)

Output:

The DataFrame of students with marks is:
        Name  Age      City Grade
501    Alice   17  New York     A
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+
504    Chris   21   Seattle    A-
505    Alice   15    Austin     A

Filtered values from the DataFrame:
        Name Grade
502   Steven    B-
503  Neesham    B+
504    Chris    A-

It selects the first and last columns, i.e. Name and Grade, of the second, third, and fourth rows of the DataFrame. We pass a list of integer row positions as the first argument and a list of integer column positions as the second argument to iloc.
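When only the labels are known, their integer positions can be looked up with the get_loc method of the index and columns objects and then passed to iloc. This is a sketch of one possible approach, not the only one; the variable names are illustrative.

```python
import pandas as pd

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=[501, 502, 503, 504, 505],
)

# Translate labels into integer positions (the labels here are unique).
row_pos = student_df.index.get_loc(503)
grade_pos = student_df.columns.get_loc("Grade")
print(row_pos, grade_pos)

# Use the positions with iloc to read a single cell.
cell = student_df.iloc[row_pos, grade_pos]
print(cell)
```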

Filter Range of Rows and Columns From DataFrame Using iloc

To select a range of rows and columns, we can use slice notation and pass one slice for the rows and one for the columns to iloc.

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("Filtered values from the DataFrame:")
filtered_values = student_df.iloc[1:4, 0:2]
print(filtered_values)

Output:

The DataFrame of students with marks is:
        Name  Age      City Grade
501    Alice   17  New York     A
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+
504    Chris   21   Seattle    A-
505    Alice   15    Austin     A

Filtered values from the DataFrame:
        Name  Age
502   Steven   20
503  Neesham   18
504    Chris   21

It selects the second, third and fourth rows and the first and second columns from the DataFrame. 1:4 represents the rows at positions 1 to 3; the end position 4 is excluded. Similarly, 0:2 represents the columns at positions 0 and 1.
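Slices passed to iloc behave like Python list slices, so open-ended bounds, steps, and negative positions also work. The sketch below shows a few such variants on the same data; the variable names are illustrative.

```python
import pandas as pd

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=[501, 502, 503, 504, 505],
)

# Omitting the start means "from the first row".
first_two = student_df.iloc[:2]
print(list(first_two.index))

# A step of 2 keeps every other row; : keeps all columns.
every_other = student_df.iloc[::2, :]
print(list(every_other.index))

# A negative start counts from the end: the last two columns.
last_two_cols = student_df.iloc[:, -2:]
print(list(last_two_cols.columns))

# Unlike a single out-of-range position, a slice past the end is simply cut short.
tail = student_df.iloc[3:10]
print(list(tail.index))
```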

Pandas loc vs iloc

To select rows and columns from the DataFrame using loc, we pass the labels of the rows and columns we want. To select them using iloc, we pass their integer positions instead.

import pandas as pd

roll_no = [501, 502, 503, 504, 505]

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=roll_no,
)

print("The DataFrame of students with marks is:")
print(student_df)
print("")
print("Filtered values from the DataFrame using loc:")
loc_filtered_values = student_df.loc[[502, 503, 504], ["Name", "Age"]]
print(loc_filtered_values)
print("")
print("Filtered values from the DataFrame using iloc:")
iloc_filtered_values = student_df.iloc[[1, 2, 3], [0, 3]]
print(iloc_filtered_values)

Output:

The DataFrame of students with marks is:
        Name  Age      City Grade
501    Alice   17  New York     A
502   Steven   20  Portland    B-
503  Neesham   18    Boston    B+
504    Chris   21   Seattle    A-
505    Alice   15    Austin     A

Filtered values from the DataFrame using loc:
        Name  Age
502   Steven   20
503  Neesham   18
504    Chris   21

Filtered values from the DataFrame using iloc:
        Name Grade
502   Steven    B-
503  Neesham    B+
504    Chris    A-

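Both calls select the same three rows, the students with roll numbers 502 to 504. The loc call picks the Name and Age columns by label, while the iloc call picks the first and fourth columns (Name and Grade) by position.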
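To retrieve exactly the same values with both indexers, the positions given to iloc must correspond to the labels given to loc. The following sketch checks this for lists and for slices; note that a loc slice includes its end label, whereas an iloc slice excludes its end position. The variable names are illustrative.

```python
import pandas as pd

student_df = pd.DataFrame(
    {
        "Name": ["Alice", "Steven", "Neesham", "Chris", "Alice"],
        "Age": [17, 20, 18, 21, 15],
        "City": ["New York", "Portland", "Boston", "Seattle", "Austin"],
        "Grade": ["A", "B-", "B+", "A-", "A"],
    },
    index=[501, 502, 503, 504, 505],
)

# Name and Age sit at column positions 0 and 1.
by_label = student_df.loc[[502, 503, 504], ["Name", "Age"]]
by_position = student_df.iloc[[1, 2, 3], [0, 1]]
lists_match = by_label.equals(by_position)
print(lists_match)

# loc slices include the end label; iloc slices stop before the end position.
label_slice = student_df.loc[502:504, "Name":"Age"]
position_slice = student_df.iloc[1:4, 0:2]
slices_match = label_slice.equals(position_slice)
print(slices_match)
```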

Author: Suraj Joshi

Suraj Joshi is a backend software engineer at Matrice.ai.
