Pandas DataFrame DataFrame.apply() Function

Suraj Joshi Jan 30, 2023
  1. Syntax of pandas.DataFrame.apply():
  2. Example Codes: DataFrame.apply() Method
  3. Example Codes: Apply Function to Each Column With DataFrame.apply()
  4. Example Codes: Apply Function to Each Row With DataFrame.apply() Method
  5. Example Codes: DataFrame.apply() Method With result_type Parameter

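The pandas.DataFrame.apply() function applies the input function along an axis of the caller Pandas DataFrame: each column (or each row) is passed to the function in turn, and the results are combined into a Series or a DataFrame.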

Syntax of pandas.DataFrame.apply():

DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwds)

Parameters

func The function to apply to each column or row.
axis Axis along which the function is applied: 0 or 'index' applies it to each column, 1 or 'columns' applies it to each row.
raw Boolean. Each row or column is passed as a Series object (raw=False) or as an ndarray object (raw=True).
result_type {'expand', 'reduce', 'broadcast', None}
Controls the type of the output; it only takes effect when axis=1 (columns).
New in version 0.23.0
args Positional arguments to pass to func in addition to the row or column.
**kwds Additional keyword arguments to pass to func.
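The args and **kwds parameters are not used in the examples later in this article, so here is a minimal sketch of how they are forwarded. The helper scale_and_shift and its parameters factor and offset are hypothetical names chosen only for illustration.

```python
import pandas as pd


def scale_and_shift(col, factor, offset=0):
    # Hypothetical helper: col is the column passed in by apply(),
    # factor arrives through args, offset through a keyword argument.
    return col * factor + offset


df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 1, 8]})

# args supplies extra positional arguments; offset=1 is forwarded via **kwds.
result = df.apply(scale_and_shift, args=(10,), offset=1)
print(result)
```

Each column is multiplied by 10 and then increased by 1, so the X column becomes 11, 21, 31.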

Return

It returns a Series or a DataFrame holding the result of applying the input function along the specified axis.
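Whether the result is a Series or a DataFrame depends on what the function returns. The sketch below, which uses the same sample data as the rest of this article, contrasts a function that reduces each column to one value with a function that returns a value for every element.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 1, 8]})

# One value per column: the result is a Series indexed by column name.
reduced = df.apply(lambda col: col.max())
print(type(reduced).__name__)

# One value per element: the result is a DataFrame with the original shape.
transformed = df.apply(lambda col: col - col.min())
print(type(transformed).__name__)
```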

Example Codes: DataFrame.apply() Method

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3,],
                   'Y': [4, 1, 8]})
print(df)
modified_df=df.apply(lambda x: x**2)
print(modified_df)

Output:

   X  Y
0  1  4
1  2  1
2  3  8
   X   Y
0  1  16
1  4   1
2  9  64

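We apply the lambda function lambda x: x**2 to each column of the DataFrame using the DataFrame.apply() method. Because each column is passed to the function as a Series, x**2 squares every element of that column.

Lambda functions are a concise way to define small anonymous functions in Python.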

lambda x: x**2 represents the function that takes x as an input and returns x**2 as output.
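As a small illustration of that equivalence, the lambda can be replaced by a regular named function. The name square is an assumption made for this sketch.

```python
import pandas as pd


def square(x):
    # Same behavior as lambda x: x**2; x is one column (a Series).
    return x ** 2


df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 1, 8]})

with_def = df.apply(square)
with_lambda = df.apply(lambda x: x ** 2)

# Both calls produce the same DataFrame.
print(with_def.equals(with_lambda))
```

A named function is usually easier to read and test once the logic grows beyond a single expression.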

Example Codes: Apply Function to Each Column With DataFrame.apply()

import pandas as pd
import numpy as np

df = pd.DataFrame({'X': [1, 2, 3,],
                   'Y': [4, 1, 8]})
print("Original DataFrame")
print(df)
modified_df=df.apply(np.sum)
print("Modified DataFrame")
print(modified_df)

Output:

Original DataFrame
   X  Y
0  1  4
1  2  1
2  3  8
Modified DataFrame
X     6
Y    13
dtype: int64

Here, np.sum is applied to each column because axis=0 (default value) in this case.

So, we get the sum of elements in each column after using the df.apply() method.
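The raw parameter from the syntax section also affects column-wise calls like this one. The following sketch assumes a reducing function that only needs NumPy operations; the helper name value_range is hypothetical. With raw=True, each column reaches the function as an ndarray instead of a Series.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 1, 8]})

seen_types = []


def value_range(arr):
    # Record what apply() passed in, then reduce the column to max - min.
    seen_types.append(type(arr))
    return arr.max() - arr.min()


ranges = df.apply(value_range, raw=True)
print(ranges)
print(all(t is np.ndarray for t in seen_types))
```

Since an ndarray carries no labels, attributes such as name are not available inside the function when raw=True.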

import pandas as pd
df = pd.DataFrame({'X': [1, 2, 3,],
                   'Y': [4, 1, 8]})
print(df)
modified_df=df.apply(lambda x: (x**2)  if x.name == 'X' else x)
print(modified_df)

Output:

   X  Y
0  1  4
1  2  1
2  3  8
   X  Y
0  1  4
1  4  1
2  9  8

If we wish to apply the function only to certain columns, we modify our function definition using an if condition to filter columns. When the function is applied column-wise, x.name holds the label of the column being processed, so in the example only the column named X is squared and the other column is returned unchanged.
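An alternative to filtering inside the function, shown here only as a sketch, is to select the column first and call apply() on the resulting Series. Note that this is Series.apply(), which calls the function on one value at a time, not DataFrame.apply().

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 1, 8]})

# Work on a copy so the original DataFrame stays untouched.
modified_df = df.copy()

# Series.apply() passes each element of column X to the function.
modified_df["X"] = modified_df["X"].apply(lambda v: v ** 2)
print(modified_df)
```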

Example Codes: Apply Function to Each Row With DataFrame.apply() Method

import pandas as pd
import numpy as np

df = pd.DataFrame({'X': [1, 2, 3,],
                   'Y': [4, 1, 8]})
print("Original DataFrame")
print(df)
modified_df=df.apply(np.sum, axis=1)
print("Modified DataFrame")
print(modified_df)

Output:

Original DataFrame
   X  Y
0  1  4
1  2  1
2  3  8
Modified DataFrame
0     5
1     3
2    11
dtype: int64

Here, np.sum is applied to one row at a time because we have set axis=1.

So, we get the sum of the elements in each row after using the df.apply() method.
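The axis parameter also accepts the string names 'index' (same as 0) and 'columns' (same as 1), which some readers find easier to remember. A minimal sketch, using lambda functions instead of np.sum:

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 1, 8]})

# axis="columns" is equivalent to axis=1: the function receives each row.
row_sums = df.apply(lambda row: row.sum(), axis="columns")

# axis="index" is equivalent to axis=0: the function receives each column.
col_sums = df.apply(lambda col: col.sum(), axis="index")

print(row_sums)
print(col_sums)
```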

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3,],
                   'Y': [4, 1, 8]})
print(df)
modified_df=df.apply(lambda x: (x**2)  if x.name in [0,1] else x,
                     axis=1)
print(modified_df)

Output:

   X  Y
0  1  4
1  2  1
2  3  8
   X   Y
0  1  16
1  4   1
2  3   8

If we wish to apply the function only to certain rows, we modify our function definition using an if condition to filter rows. With axis=1, x.name holds the index label of the row being processed, so in the example the function modifies only the rows with index 0 and 1, i.e. the first and second rows.

Example Codes: DataFrame.apply() Method With result_type Parameter

If we use the default value of the result_type parameter, i.e. None, the type of the result depends on what the applied function returns: list-like return values are kept as they are and collected into a Series.

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3,],
                   'Y': [4, 1, 8]})
print("Original DataFrame")
print(df)

modified_df=df.apply(lambda x:[1,1],axis=1)
print("Modified DataFrame")
print(modified_df)

Output:

Original DataFrame
   X  Y
0  1  4
1  2  1
2  3  8
Modified DataFrame
0    [1, 1]
1    [1, 1]
2    [1, 1]
dtype: object

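In the above example, each row is passed into the function one at a time, and the function returns the list [1, 1] for every row. With the default result_type, these lists become the values of the resulting Series.

If we wish to change the type of the result returned after the function operates on the DataFrame, we can set result_type according to our needs.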

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 3,],
                   'Y': [4, 1, 8]})
print("Original DataFrame")
print(df)

modified_df=df.apply(lambda x:[1,1],
                     axis=1,
                     result_type='expand')
print("Modified DataFrame")
print(modified_df)

Output:

Original DataFrame
   X  Y
0  1  4
1  2  1
2  3  8
Modified DataFrame
   0  1
0  1  1
1  1  1
2  1  1

Setting result_type='expand' expands each list-like return value into the columns of a DataFrame.
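The other result_type values follow the same pattern. As a sketch on the same sample data, result_type='broadcast' keeps the original index and column labels, and returning a Series from the function behaves like 'expand' while letting us name the new columns. The labels foo and bar are placeholders chosen for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 3], "Y": [4, 1, 8]})

# 'broadcast' spreads the returned values over the original shape,
# so the result keeps the column labels X and Y.
broadcasted = df.apply(lambda x: [1, 1], axis=1, result_type="broadcast")
print(broadcasted)

# Returning a Series acts like 'expand'; its index becomes the column names.
named = df.apply(lambda x: pd.Series([1, 1], index=["foo", "bar"]), axis=1)
print(named)
```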

Author: Suraj Joshi

Suraj Joshi is a backend software engineer at Matrice.ai.
