Apply Square Root Function on a Column of Pandas Data Frame

  1. Introduction to Square Root
  2. Apply Square Root Function on a Column of Pandas Data Frame
  3. Use .astype(int) to Determine Integer Square Roots in Pandas

This tutorial shows how to apply the square root function to a column of a Pandas data frame using the exponentiation operator, np.sqrt(), a lambda expression, and the apply() method. We will also learn how to use .astype(int) to get integer square roots.

Introduction to Square Root

Before moving toward the square root, one must understand what a square is and how we can calculate it. Let’s start with that.

In mathematics, the square of a number is calculated by multiplying the number by itself; for instance, the square of 3 is 3 × 3 = 9.

The square of any number n is represented by a superscript 2, which we can write as n^2; it has the following two properties:

  1. The square of a number can be a floating point number or an integer.
  2. The square of a real number is never negative: the product of two negative numbers is positive, and the square of 0 is 0.
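
As a quick check of the two properties above, the following minimal sketch squares a few sample values (the values are chosen here only for illustration):

```python
# Sample values chosen for illustration: a positive integer, its negative,
# zero, and a floating point number.
values = [3, -3, 0, 1.5]

# Multiply each value by itself to get its square.
squares = [v * v for v in values]

print(squares)  # [9, 9, 0, 2.25]
```

Both 3 and -3 give the same square, and no result is below zero.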

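Now, we are ready to learn square roots. The square root of a number n is the value that, multiplied by itself, gives n; it is written as √n or, equivalently, n^(1/2). For instance, the square root of 9 is 3 because 3^2 = 9. It follows that the square root of n^2 is n for any non-negative n.

Square roots appear in many scientific and mathematical calculations, such as distances and standard deviations.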
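
Before moving to Pandas, here is a small plain-Python sketch of the two notations mentioned above. The variable names are illustrative only; math.sqrt() comes from Python's standard library.

```python
import math

n = 9

# Exponent form: n^(1/2) is written n ** 0.5 in Python.
root_from_power = n ** 0.5

# Function form: math.sqrt() from the standard library.
root_from_math = math.sqrt(n)

print(root_from_power, root_from_math)  # 3.0 3.0
```

Both forms return a float, even when the input is a perfect square.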

Now that we have a solid understanding of square roots, let's learn how to calculate them in Python and, specifically, how to apply the square root function to a column of a Pandas data frame.

Apply Square Root Function on a Column of Pandas Data Frame

We can apply the square root function in several ways; some of them are shown below. All of them need a data frame to work on, so we will use the following one:

import pandas as pd

data = {'years': [2020, 2021, 2022],
         'teams': ['Bears', 'Packers', 'Lions'],
         'wins': [25, 10, 6],
         'losses': [5, 5, 16]}

df = pd.DataFrame(data, columns=['years', 'teams', 'wins', 'losses'])
df['wins+losses'] = df[['wins', 'losses']].sum(axis=1)

df

Here we have a dictionary of key-value pairs that is converted to a Pandas data frame using pd.DataFrame(), which takes the data and a list of column names as parameters.

Then, we add a new column to the data frame, wins+losses, containing the sum of the wins and losses columns. To understand it better, observe the following output.

   years    teams  wins  losses  wins+losses
0   2020    Bears    25       5           30
1   2021  Packers    10       5           15
2   2022    Lions     6      16           22

This data frame will be used in the following methods, where we will find the square root of the wins, losses, and wins+losses columns.
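
Since a square root only makes sense for numeric data, it can help to check which columns qualify before applying any of the methods. The sketch below rebuilds the same data frame and uses select_dtypes() to list its numeric columns; the variable name numeric_cols is just an illustrative choice.

```python
import pandas as pd

data = {'years': [2020, 2021, 2022],
        'teams': ['Bears', 'Packers', 'Lions'],
        'wins': [25, 10, 6],
        'losses': [5, 5, 16]}

df = pd.DataFrame(data, columns=['years', 'teams', 'wins', 'losses'])
df['wins+losses'] = df[['wins', 'losses']].sum(axis=1)

# Keep only the columns whose dtype is numeric; 'teams' holds strings.
numeric_cols = df.select_dtypes(include='number').columns.tolist()

print(numeric_cols)  # ['years', 'wins', 'losses', 'wins+losses']
```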

Method 1: Use Exponentiation Operator to Calculate Square Root

Example Code:

import pandas as pd

data = {'years': [2020, 2021, 2022],
         'teams': ['Bears', 'Packers', 'Lions'],
         'wins': [25, 10, 6],
         'losses': [5, 5, 16]}

df = pd.DataFrame(data, columns=['years', 'teams', 'wins', 'losses'])
df['wins+losses'] = df[['wins', 'losses']].sum(axis=1)


# Raise each selected column to the power 0.5 (the square root)
df['sqrt(wins)'] = df[['wins']] ** 0.5
df['sqrt(losses)'] = df[['losses']] ** 0.5
df['sqrt(wins+losses)'] = df[['wins+losses']] ** 0.5

df

Output:

   years    teams  wins  losses  wins+losses  sqrt(wins)  sqrt(losses)  sqrt(wins+losses)
0   2020    Bears    25       5           30    5.000000      2.236068           5.477226
1   2021  Packers    10       5           15    3.162278      2.236068           3.872983
2   2022    Lions     6      16           22    2.449490      4.000000           4.690416

The above code uses the exponentiation operator (**), an arithmetic operator also known as the power operator. Pandas applies it element-wise, so every value in the selected column is raised to the given power without an explicit loop.

We have already learned that the square root of the number n is represented as √n, which is equal to n^(1/2), also written as n**0.5 in Python. Here, n is being replaced by each value of the specified column of a Pandas data frame.
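
As a small illustration of the element-wise behavior, the sketch below applies ** 0.5 to a single column (a Series) and compares it with the equivalent Series.pow() method. The standalone wins Series is an assumption made to keep the example short.

```python
import pandas as pd

# A standalone column with the same values as 'wins' in the tutorial.
wins = pd.Series([25, 10, 6], name='wins')

# The operator form and the method form perform the same element-wise power.
roots_operator = wins ** 0.5
roots_method = wins.pow(0.5)

print(roots_operator.equals(roots_method))  # True
```

The method form can be handy inside a chain of other Pandas calls.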

Method 2: Use np.sqrt() to Calculate Square Root

Example Code:

import pandas as pd
import numpy as np

data = {'years': [2020, 2021, 2022],
         'teams': ['Bears', 'Packers', 'Lions'],
         'wins': [25, 10, 6],
         'losses': [5, 5, 16]}

df = pd.DataFrame(data, columns=['years', 'teams', 'wins', 'losses'])
df['wins+losses'] = df[['wins', 'losses']].sum(axis=1)

df['sqrt(wins)'] = np.sqrt(df[['wins']])
df['sqrt(losses)'] = np.sqrt(df[['losses']])
df['sqrt(wins+losses)'] = np.sqrt(df[['wins+losses']])

df

Output:

   years    teams  wins  losses  wins+losses  sqrt(wins)  sqrt(losses)  sqrt(wins+losses)
0   2020    Bears    25       5           30    5.000000      2.236068           5.477226
1   2021  Packers    10       5           15    3.162278      2.236068           3.872983
2   2022    Lions     6      16           22    2.449490      4.000000           4.690416

This code snippet uses the sqrt() function from the NumPy library, which takes array-like input (here, a data frame column) and returns the square root of each value.
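
Because np.sqrt() works on whole arrays, it can also process several columns in one call. The following is a minimal sketch under that assumption; the sqrt_ prefix and the smaller two-column frame are illustrative choices, not part of the example above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'wins': [25, 10, 6], 'losses': [5, 5, 16]})

# np.sqrt() on a multi-column selection returns a DataFrame of the same shape.
roots = np.sqrt(df[['wins', 'losses']])

# Rename the result columns so they do not clash with the originals,
# then attach them to the original frame by index.
roots = roots.add_prefix('sqrt_')
result = df.join(roots)

print(result.columns.tolist())  # ['wins', 'losses', 'sqrt_wins', 'sqrt_losses']
```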

Method 3: Use the lambda Expression to Calculate Square Root

Example Code:

import pandas as pd

data = {'years': [2020, 2021, 2022],
         'teams': ['Bears', 'Packers', 'Lions'],
         'wins': [25, 10, 6],
         'losses': [5, 5, 16]}

df = pd.DataFrame(data, columns=['years', 'teams', 'wins', 'losses'])
df['wins+losses'] = df[['wins', 'losses']].sum(axis=1)

# transform() passes each selected column to the lambda as x
df['sqrt(wins)'] = df[['wins']].transform(lambda x: x ** 0.5)
df['sqrt(losses)'] = df[['losses']].transform(lambda x: x ** 0.5)
df['sqrt(wins+losses)'] = df[['wins+losses']].transform(lambda x: x ** 0.5)

df

Output:

   years    teams  wins  losses  wins+losses  sqrt(wins)  sqrt(losses)  sqrt(wins+losses)
0   2020    Bears    25       5           30    5.000000      2.236068           5.477226
1   2021  Packers    10       5           15    3.162278      2.236068           3.872983
2   2022    Lions     6      16           22    2.449490      4.000000           4.690416

Here, we use a lambda expression (a small anonymous function) with the exponentiation operator (**) to determine the square roots of the specified columns. Lambda expressions are convenient when the operation is short enough to write inline.

We also use the transform() method, which calls the given function on the selected data and returns a result with the same shape and index as the input.
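
The shape-preserving behavior of transform() can be checked directly. This is a minimal sketch on a smaller, all-numeric frame (an assumption made so that the lambda can run on every column):

```python
import pandas as pd

df = pd.DataFrame({'wins': [25, 10, 6], 'losses': [5, 5, 16]})

# The lambda receives one column at a time and returns its square roots.
roots = df.transform(lambda col: col ** 0.5)

# The result lines up with the input: same shape, same index, same columns.
print(roots.shape == df.shape)            # True
print(roots.index.equals(df.index))       # True
print(roots.columns.equals(df.columns))   # True
```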

Method 4: Use apply() to Calculate Square Root

Example Code:

import pandas as pd
import numpy as np

data = {'years': [2020, 2021, 2022],
          'teams': ['Bears', 'Packers', 'Lions'],
          'wins': [25, 10, 6],
          'losses': [5, 5, 16]}

df = pd.DataFrame(data, columns=['years', 'teams', 'wins', 'losses'])
df['wins+losses'] = df[['wins', 'losses']].sum(axis=1)


df['sqrt(wins)'] = df[['wins']].apply(np.sqrt)
df['sqrt(losses)'] = df[['losses']].apply(np.sqrt)
df['sqrt(wins+losses)'] = df[['wins+losses']].apply(np.sqrt)

df

Output:

   years    teams  wins  losses  wins+losses  sqrt(wins)  sqrt(losses)  sqrt(wins+losses)
0   2020    Bears    25       5           30    5.000000      2.236068           5.477226
1   2021  Packers    10       5           15    3.162278      2.236068           3.872983
2   2022    Lions     6      16           22    2.449490      4.000000           4.690416

This code uses the apply() method from the Pandas library, which takes np.sqrt as a parameter, applies it to the selected column, and returns a DataFrame of square-root values.
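
apply() accepts any callable, not only NumPy functions. As a hedged illustration, the sketch below passes Python's math.sqrt to Series.apply(), which calls it once per value, and compares the result with the vectorized np.sqrt(). The standalone wins Series is assumed for brevity.

```python
import math

import numpy as np
import pandas as pd

wins = pd.Series([25, 10, 6], name='wins')

# math.sqrt works on one number at a time, so apply() calls it per element.
per_element = wins.apply(math.sqrt)

# np.sqrt handles the whole column in a single vectorized call.
vectorized = np.sqrt(wins)

print(np.allclose(per_element, vectorized))  # True
```

For large columns the vectorized call is generally the faster of the two, since it avoids a Python-level function call per value.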

You may have noticed that all approaches above return square roots as float values. What if we want them as integer values?

Use .astype(int) to Determine Integer Square Roots in Pandas

Example Code:

import pandas as pd
import numpy as np

data = {'years': [2020, 2021, 2022],
          'teams': ['Bears', 'Packers', 'Lions'],
          'wins': [25, 10, 6],
          'losses': [5, 5, 16]}

df = pd.DataFrame(data, columns=['years', 'teams', 'wins', 'losses'])
df['wins+losses'] = df[['wins', 'losses']].sum(axis=1)


df['sqrt(wins)'] = df[['wins']].apply(np.sqrt).astype(int)
df['sqrt(losses)'] = df[['losses']].apply(np.sqrt).astype(int)
df['sqrt(wins+losses)'] = df[['wins+losses']].apply(np.sqrt).astype(int)

df

Output:

   years    teams  wins  losses  wins+losses  sqrt(wins)  sqrt(losses)  sqrt(wins+losses)
0   2020    Bears    25       5           30           5             2                  5
1   2021  Packers    10       5           15           3             2                  3
2   2022    Lions     6      16           22           2             4                  4
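
Note that .astype(int) drops the fractional part instead of rounding, which is why 5.477226 becomes 5 in the output above. If the nearest whole number is wanted, one option is to call round() first. The sketch below compares the two on the wins+losses values; the standalone totals Series is an illustrative assumption.

```python
import numpy as np
import pandas as pd

totals = pd.Series([30, 15, 22], name='wins+losses')

roots = np.sqrt(totals)  # roughly 5.477, 3.873, 4.690

# astype(int) truncates toward zero.
truncated = roots.astype(int)

# round() first, then convert, to get the nearest whole number.
rounded = roots.round().astype(int)

print(truncated.tolist())  # [5, 3, 4]
print(rounded.tolist())    # [5, 4, 5]
```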

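Similarly, we can use .astype(int) with the other approaches. Remember, finding the square root of 0 will not cause any error; the result is simply 0. The square root of a negative number, however, is not a real number: in a Pandas column, np.sqrt() and ** 0.5 return NaN for such values (np.sqrt() typically also emits a RuntimeWarning), while Python's math.sqrt() raises a ValueError. A column that contains NaN cannot be converted with .astype(int) until the missing values are dropped or filled.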
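
One way to handle negative inputs, sketched here under the assumption that they should simply be treated as missing, is to mask them before taking the root. The sample values are made up for illustration.

```python
import math

import numpy as np
import pandas as pd

values = pd.Series([25, 0, -4])

# Replace negative entries with NaN first, so np.sqrt() never sees them.
roots = np.sqrt(values.where(values >= 0))
print(roots.isna().tolist())  # [False, False, True]

# Python's math.sqrt() raises ValueError for a negative argument.
try:
    math.sqrt(-4)
    raised = False
except ValueError:
    raised = True
print(raised)  # True

# Drop the missing result before converting to integers.
whole = roots.dropna().astype(int)
print(whole.tolist())  # [5, 0]
```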

Mehvish Ashiq is a former Java Programmer and a Data Science enthusiast who leverages her expertise to help others to learn and grow by creating interesting, useful, and reader-friendly content in Computer Programming, Data Science, and Technology.
