Pandas DataFrame.corr() Function

  1. Syntax of pandas.DataFrame.corr():
  2. Example Codes: DataFrame.corr() Method to Find Correlation Matrix Using Pearson Method
  3. Example Codes: DataFrame.corr() Method to Find Correlation Matrix Using the kendall Method
  4. Example Codes: DataFrame.corr() Method to Find Correlation Matrix Using Spearman Method With More Column Value Pairs

The pandas DataFrame.corr() function computes the pairwise correlation between the columns of a DataFrame, excluding NA/null values.

Syntax of pandas.DataFrame.corr():

DataFrame.corr(method='pearson', 
               min_periods=1,
               numeric_only=False)

Parameters

method The correlation method. It can be pearson, kendall, or spearman; pearson is the default.
min_periods The minimum number of observations required per pair of columns to produce a valid result. It is currently only available for the pearson and spearman methods.
numeric_only Whether to include only float, int, or boolean columns. It was added in pandas 1.5 and defaults to False since pandas 2.0.

Return

It returns a DataFrame containing the correlation matrix of the columns.
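
To illustrate the min_periods parameter described above, here is a minimal sketch with missing values. The frame and its column names a and b are hypothetical and chosen only for illustration; they are not part of the examples in this article.

```python
import numpy as np
import pandas as pd

# Hypothetical data: only rows 0, 1 and 2 have values in both columns.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, np.nan],
    "b": [2.0, 4.0, 6.0, np.nan, 10.0],
})

# Three shared observations satisfy min_periods=3 ...
loose = df.corr(method="spearman", min_periods=3)
# ... but not min_periods=4, so the a/b entry becomes NaN.
strict = df.corr(method="spearman", min_periods=4)

print(loose)
print(strict)
```

Only three rows have a value in both columns, so the a/b entry is computed when min_periods is 3 and is reported as NaN when min_periods is 4.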

Example Codes: DataFrame.corr() Method to Find Correlation Matrix Using Pearson Method

import pandas as pd

dataframe=pd.DataFrame({'Attendance': {0: 60, 1: 100, 2: 80,3: 78,4: 95},
                        'Name': {0: 'Olivia', 1: 'John', 2: 'Laura',3: 'Ben',4: 'Kevin'},
                        'Obtained Marks': {0: 90, 1: 75, 2: 82, 3: 64, 4: 45}})
print("The Original Data frame is: \n")
print(dataframe)

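# numeric_only=True skips the non-numeric Name column (needed in pandas 2.0 and later)
dataframe1 = dataframe.corr(numeric_only=True)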
print("The Correlation Matrix is: \n")
print(dataframe1)

Output:

The Original Data frame is: 

   Attendance    Name  Obtained Marks
0          60  Olivia              90
1         100    John              75
2          80   Laura              82
3          78     Ben              64
4          95   Kevin              45
The Correlation Matrix is: 

                Attendance  Obtained Marks
Attendance         1.00000        -0.61515
Obtained Marks    -0.61515         1.00000

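The function returns the correlation matrix, computed with the default Pearson method. The non-numeric Name column is left out of the result; in pandas 2.0 and later this requires numeric_only=True, while older versions dropped such columns automatically. With the default min_periods=1, each pair of columns needs at least one pair of non-null values to produce a result.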
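
As a rough check of the Pearson value above, the coefficient can be recomputed by hand as the sum of products of deviations divided by the square root of the product of the summed squared deviations. This is only a sketch using NumPy on the same two numeric columns; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

scores = pd.DataFrame({
    "Attendance": [60, 100, 80, 78, 95],
    "Obtained Marks": [90, 75, 82, 64, 45],
})

x = scores["Attendance"].to_numpy(dtype=float)
y = scores["Obtained Marks"].to_numpy(dtype=float)

# Deviations of each value from its column mean.
dx = x - x.mean()
dy = y - y.mean()

# Pearson r computed directly from the deviations.
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# The same entry taken from the matrix returned by corr().
r_pandas = scores.corr().loc["Attendance", "Obtained Marks"]

print(round(r_manual, 5), round(r_pandas, 5))
```

Both values should agree with the off-diagonal entry of the matrix shown in the output.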

Example Codes: DataFrame.corr() Method to Find Correlation Matrix Using the kendall Method

To compute the correlation with the Kendall method, call corr() with method="kendall".

import pandas as pd
dataframe=pd.DataFrame({'Attendance': {0: 60, 1: 100, 2: 80,3: 78,4: 95},
                        'Name': {0: 'Olivia', 1: 'John', 2: 'Laura',3: 'Ben',4: 'Kevin'},
                        'Obtained Marks': {0: 90, 1: 75, 2: 82, 3: 64, 4: 45}})
print("The Original Data frame is: \n")
print(dataframe)

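# numeric_only=True skips the non-numeric Name column (needed in pandas 2.0 and later)
dataframe1 = dataframe.corr(method="kendall", numeric_only=True)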
print("The Correlation Matrix is: \n")
print(dataframe1)

Output:

The Original Data frame is: 

   Attendance    Name  Obtained Marks
0          60  Olivia              90
1         100    John              75
2          80   Laura              82
3          78     Ben              64
4          95   Kevin              45
The Correlation Matrix is: 

                Attendance  Obtained Marks
Attendance             1.0            -0.4
Obtained Marks        -0.4             1.0

The function returns the correlation matrix, this time computed with the Kendall method.
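
The value -0.4 can be reproduced by counting concordant and discordant pairs of rows. The following is a plain-Python sketch, not the routine pandas uses internally. It assumes there are no tied values, which holds for this data, so the simple formula (concordant - discordant) / number of pairs applies.

```python
from itertools import combinations

attendance = [60, 100, 80, 78, 95]
marks = [90, 75, 82, 64, 45]

concordant = 0
discordant = 0
# Compare every pair of rows: the pair is concordant when both
# columns move in the same direction, discordant otherwise.
for (x1, y1), (x2, y2) in combinations(zip(attendance, marks), 2):
    direction = (x1 - x2) * (y1 - y2)
    if direction > 0:
        concordant += 1
    elif direction < 0:
        discordant += 1

n_pairs = len(attendance) * (len(attendance) - 1) // 2
tau = (concordant - discordant) / n_pairs

print(concordant, discordant, tau)
```

Of the 10 possible row pairs, 3 are concordant and 7 are discordant, which gives (3 - 7) / 10 = -0.4.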

Example Codes: DataFrame.corr() Method to Find Correlation Matrix Using Spearman Method With More Column Value Pairs

Now we will use the spearman method with min_periods set to 2. The min_periods parameter is currently only available for the pearson and spearman methods.

import pandas as pd
dataframe=pd.DataFrame({'Attendance': {0: 60, 1: 100, 2: 80,3: 78,4: 95},
                        'Name': {0: 'Olivia', 1: 'John', 2: 'Laura',3: 'Ben',4: 'Kevin'},
                        'Obtained Marks': {0: 90, 1: 75, 2: 82, 3: 64, 4: 45}})
print("The Original Data frame is: \n")
print(dataframe)

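# numeric_only=True skips the non-numeric Name column (needed in pandas 2.0 and later)
dataframe1 = dataframe.corr(method="spearman", min_periods=2, numeric_only=True)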
print("The Correlation Matrix is: \n")
print(dataframe1)

Output:

The Original Data frame is: 

   Attendance    Name  Obtained Marks
0          60  Olivia              90
1         100    John              75
2          80   Laura              82
3          78     Ben              64
4          95   Kevin              45
The Correlation Matrix is: 

                Attendance  Obtained Marks
Attendance             1.0            -0.5
Obtained Marks        -0.5             1.0

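Here each pair of columns must have at least 2 non-null observations for its correlation to be computed. All five rows are complete, so the requirement is met and every entry is computed from all five value pairs; a pair with fewer valid observations than min_periods would be reported as NaN.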
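
The Spearman coefficient is the Pearson coefficient applied to the ranks of the values. As a small sketch of that relationship on the same two numeric columns (variable names are illustrative), ranking the data first and then using the default method should give the same number as method="spearman":

```python
import pandas as pd

scores = pd.DataFrame({
    "Attendance": [60, 100, 80, 78, 95],
    "Obtained Marks": [90, 75, 82, 64, 45],
})

# Replace each value with its rank within its column (1 = smallest).
ranks = scores.rank()

# Pearson correlation of the ranks ...
via_ranks = ranks["Attendance"].corr(ranks["Obtained Marks"])
# ... compared with the Spearman entry from DataFrame.corr().
direct = scores.corr(method="spearman").loc["Attendance", "Obtained Marks"]

print(round(via_ranks, 4), round(direct, 4))
```

Both routes should match the -0.5 shown in the output above.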
