Pandas DataFrame.corr() Function

Minahil Noor Jan 30, 2023
  1. Syntax of pandas.DataFrame.corr():
  2. Example Codes: DataFrame.corr() Method to Find Correlation Matrix Using Pearson Method
  3. Example Codes: DataFrame.corr() Method to Find Correlation Matrix Using the kendall Method
  4. Example Codes: DataFrame.corr() Method to Find Correlation Matrix Using the spearman Method With More Column Value Pairs

The pandas DataFrame.corr() method computes the pairwise correlation between the columns of a DataFrame, excluding null values.

Syntax of pandas.DataFrame.corr():

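DataFrame.corr(method="pearson", min_periods=1, numeric_only=False)

Parameters

method The correlation method: pearson, kendall, or spearman. It can also be a callable that takes two 1-D arrays and returns a float. pearson is the default.
min_periods The minimum number of observations required per pair of columns to produce a valid result. According to the pandas documentation, it is currently only available for pearson and spearman correlation.
numeric_only If True, only float, int, and boolean columns are included. It was added in pandas 1.5.0 and defaults to False since pandas 2.0.0, so non-numeric columns must then be excluded explicitly.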

Return

It returns a DataFrame containing the correlation matrix: a square, symmetric table with one row and one column for each column included in the computation.
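The shape of the returned object can be checked with a small sketch. The column names and values below are hypothetical and chosen only for illustration, and the numeric_only keyword assumes pandas 1.5 or later.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two numeric columns and one string column.
df = pd.DataFrame(
    {
        "height": [150.0, 160.0, 170.0, 180.0],
        "weight": [52.0, 58.0, 69.0, 80.0],
        "label": ["a", "b", "c", "d"],
    }
)

# Assumes pandas >= 1.5, where the numeric_only keyword exists.
corr = df.corr(numeric_only=True)
values = corr.to_numpy()

print(corr.shape)                               # one row/column per numeric column
print(list(corr.index) == list(corr.columns))   # same labels on both axes
print(np.allclose(values, values.T))            # the matrix is symmetric
print(np.allclose(np.diag(values), 1.0))        # each column vs. itself gives 1
```

Because the result is an ordinary DataFrame, a single coefficient can be read with corr.loc["height", "weight"].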

Example Codes: DataFrame.corr() Method to Find Correlation Matrix Using Pearson Method

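import pandas as pd

# Build a DataFrame with two numeric columns and one string column.
dataframe = pd.DataFrame({'Attendance': {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
                          'Name': {0: 'Olivia', 1: 'John', 2: 'Laura', 3: 'Ben', 4: 'Kevin'},
                          'Obtained Marks': {0: 90, 1: 75, 2: 82, 3: 64, 4: 45}})
print("The Original Data frame is: \n")
print(dataframe)

# numeric_only=True (pandas 1.5+) skips the string column 'Name';
# pandas 2.0+ raises a ValueError without it.
dataframe1 = dataframe.corr(numeric_only=True)
print("The Correlation Matrix is: \n")
print(dataframe1)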

Output:

The Original Data frame is: 

   Attendance    Name  Obtained Marks
0          60  Olivia              90
1         100    John              75
2          80   Laura              82
3          78     Ben              64
4          95   Kevin              45
The Correlation Matrix is: 

                Attendance  Obtained Marks
Attendance         1.00000        -0.61515
Obtained Marks    -0.61515         1.00000

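The function returns the correlation matrix, computed with the default Pearson method and the default min_periods=1, which requires at least one pair of non-null observations for each pair of columns. The non-numeric Name column is excluded from the result; from pandas 2.0 onward this requires passing numeric_only=True, whereas earlier versions dropped non-numeric columns by default.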
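As a sanity check on the value above, the Pearson coefficient can be recomputed directly from its definition, the sum of products of deviations divided by the square root of the product of the summed squared deviations. This is a minimal sketch using NumPy on the same two numeric columns; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

attendance = np.array([60, 100, 80, 78, 95], dtype=float)
marks = np.array([90, 75, 82, 64, 45], dtype=float)

# Deviations of each observation from its column mean.
dx = attendance - attendance.mean()
dy = marks - marks.mean()

# Pearson r = sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# The same coefficient as computed by pandas (both columns are numeric).
frame = pd.DataFrame({"Attendance": attendance, "Obtained Marks": marks})
r_pandas = frame.corr().loc["Attendance", "Obtained Marks"]

print(round(float(r_manual), 5))  # -0.61515
print(round(float(r_pandas), 5))  # -0.61515
```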

Example Codes: DataFrame.corr() Method to Find Correlation Matrix Using the kendall Method

To compute the correlation using the Kendall method, we call corr() with method="kendall".

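import pandas as pd

# Build a DataFrame with two numeric columns and one string column.
dataframe = pd.DataFrame({'Attendance': {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
                          'Name': {0: 'Olivia', 1: 'John', 2: 'Laura', 3: 'Ben', 4: 'Kevin'},
                          'Obtained Marks': {0: 90, 1: 75, 2: 82, 3: 64, 4: 45}})
print("The Original Data frame is: \n")
print(dataframe)

# numeric_only=True (pandas 1.5+) skips the string column 'Name';
# pandas 2.0+ raises a ValueError without it.
dataframe1 = dataframe.corr(method="kendall", numeric_only=True)
print("The Correlation Matrix is: \n")
print(dataframe1)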

Output:

The Original Data frame is: 

   Attendance    Name  Obtained Marks
0          60  Olivia              90
1         100    John              75
2          80   Laura              82
3          78     Ben              64
4          95   Kevin              45
The Correlation Matrix is: 

                Attendance  Obtained Marks
Attendance             1.0            -0.4
Obtained Marks        -0.4             1.0

The function returns the correlation matrix, computed with the Kendall method and the default min_periods=1.
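The value -0.4 can be reproduced by counting concordant and discordant pairs. The sketch below is plain Python and uses the simple formula (concordant - discordant) / total pairs. That formula matches here only because the data contain no ties; it is not meant to mirror how pandas computes the statistic internally.

```python
from itertools import combinations

attendance = [60, 100, 80, 78, 95]
marks = [90, 75, 82, 64, 45]

concordant = 0
discordant = 0
for i, j in combinations(range(len(attendance)), 2):
    # A pair is concordant when both columns move in the same direction.
    direction = (attendance[i] - attendance[j]) * (marks[i] - marks[j])
    if direction > 0:
        concordant += 1
    elif direction < 0:
        discordant += 1

n = len(attendance)
total_pairs = n * (n - 1) // 2
tau = (concordant - discordant) / total_pairs

print(concordant, discordant, tau)  # 3 7 -0.4
```

Seven of the ten pairs are discordant, which is why the coefficient is negative.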

Example Codes: DataFrame.corr() Method to Find Correlation Matrix Using the spearman Method With More Column Value Pairs

Now we will use the spearman method and set min_periods to 2. The pandas documentation lists min_periods as available only for the pearson and spearman methods.

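import pandas as pd

# Build a DataFrame with two numeric columns and one string column.
dataframe = pd.DataFrame({'Attendance': {0: 60, 1: 100, 2: 80, 3: 78, 4: 95},
                          'Name': {0: 'Olivia', 1: 'John', 2: 'Laura', 3: 'Ben', 4: 'Kevin'},
                          'Obtained Marks': {0: 90, 1: 75, 2: 82, 3: 64, 4: 45}})
print("The Original Data frame is: \n")
print(dataframe)

# Require at least 2 non-null observations per column pair.
# numeric_only=True (pandas 1.5+) skips the string column 'Name'.
dataframe1 = dataframe.corr(method="spearman", min_periods=2, numeric_only=True)
print("The Correlation Matrix is: \n")
print(dataframe1)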

Output:

The Original Data frame is: 

   Attendance    Name  Obtained Marks
0          60  Olivia              90
1         100    John              75
2          80   Laura              82
3          78     Ben              64
4          95   Kevin              45
The Correlation Matrix is: 

                Attendance  Obtained Marks
Attendance             1.0            -0.5
Obtained Marks        -0.5             1.0

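The function has now computed the Spearman correlation while requiring at least 2 non-null observations for each pair of columns. Every column here has 5 observations, so the requirement is met and no entry of the matrix is NaN.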
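The effect of min_periods only becomes visible when some values are missing. The following sketch uses a hypothetical DataFrame (columns a, b, and c are invented for illustration) in which columns a and b share just two complete observations, and compares min_periods=2 with min_periods=3.

```python
import numpy as np
import pandas as pd

# Hypothetical data: column "b" has only two non-null values.
df = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0, 4.0, 5.0],
        "b": [2.0, 1.0, np.nan, np.nan, np.nan],
        "c": [5.0, 3.0, 4.0, 1.0, 2.0],
    }
)

# Two complete (a, b) observations are enough when min_periods=2.
loose = df.corr(method="spearman", min_periods=2)

# With min_periods=3 the (a, b) entry cannot be computed and becomes NaN.
strict = df.corr(method="spearman", min_periods=3)

print(loose.loc["a", "b"])
print(strict.loc["a", "b"])
print(strict.loc["a", "c"])  # unaffected: a and c share five observations
```

Raising min_periods is a way to avoid reporting coefficients that rest on very few overlapping observations.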
