Pandas DataFrame DataFrame.median() Function

  1. Syntax of pandas.DataFrame.median():
  2. Example Codes: DataFrame.median() Method to Find Median Along Column Axis
  3. Example Codes: DataFrame.median() Method to Find Median Along Row Axis
  4. Example Codes: DataFrame.median() Method to Find Median Ignoring NaN Values

The Pandas DataFrame.median() method calculates the median of the elements of a DataFrame along the specified axis.

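The median is not the mean; it is the middle value of the numbers once they are sorted. When there is an even number of values, the median is the average of the two middle values.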
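As a quick illustration of the difference, the sketch below compares the mean and the median of a small Series. The sample values and variable names are made up for this example.

```python
import pandas as pd

# One large outlier pulls the mean up but leaves the median unchanged.
values = pd.Series([1, 2, 3, 4, 100])
print(values.mean())    # 22.0
print(values.median())  # 3.0

# With an even number of values, the median is the average
# of the two middle values (2 and 7 here).
even = pd.Series([1, 2, 7, 10])
print(even.median())    # 4.5
```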


Syntax of pandas.DataFrame.median():

DataFrame.median( axis=None, 
                skipna=None, 
                level=None, 
                numeric_only=None, 
                **kwargs)

Parameters

axis Axis to reduce. axis=0 computes the median of each column; axis=1 computes the median of each row
skipna Boolean. Exclude NaN values (skipna=True, the default behavior) or include NaN values (skipna=False)
level For a MultiIndex axis, compute the median within the given level. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0
numeric_only Boolean. For numeric_only=True, include only float, int, and boolean columns
**kwargs Additional keyword arguments passed to the function
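The effect of numeric_only can be sketched with a DataFrame that mixes text and numbers. The column names and values below are assumptions chosen only for illustration.

```python
import pandas as pd

# A DataFrame with one text column and two numeric columns.
df = pd.DataFrame({"name": ["a", "b", "c"],
                   "X": [1, 2, 7],
                   "Y": [4.0, 3.0, 8.0]})

# numeric_only=True skips the text column instead of trying to reduce it.
medians = df.median(numeric_only=True)
print(medians)
```

Only the X and Y columns appear in the result; the name column is left out.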

Return

If level is not specified, the method returns a Series holding the median of the values along the requested axis; otherwise, it returns a DataFrame of median values.
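In pandas versions where the level parameter is no longer available, a per-level median is usually obtained with groupby. The sketch below contrasts the two return types; the MultiIndex labels ("group", "item") and the data are hypothetical.

```python
import pandas as pd

# Hypothetical two-level index: a group label and an item number.
index = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("a", 3), ("b", 1), ("b", 2)],
    names=["group", "item"],
)
df = pd.DataFrame({"X": [1, 2, 7, 5, 10],
                   "Y": [4, 3, 8, 2, 9]}, index=index)

# Without a level: one median per column, returned as a Series.
per_column = df.median()

# Per-level median via groupby: one row per group, returned as a DataFrame.
per_group = df.groupby(level="group").median()

print(type(per_column).__name__)
print(type(per_group).__name__)
print(per_group)
```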

Example Codes: DataFrame.median() Method to Find Median Along Column Axis

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 7, 5, 10],
                   'Y': [4, 3, 8, 2, 9]})
print("DataFrame:")
print(df)

medians=df.median()
print("medians of Each Column:")
print(medians)

Output:

DataFrame:
    X  Y
0   1  4
1   2  3
2   7  8
3   5  2
4  10  9
medians of Each Column:
X    5.0
Y    4.0
dtype: float64

It calculates the median of both columns, X and Y, and returns a Series object containing the median of each column.

To find the median of a particular column of a DataFrame in Pandas, we call the median() function on that column only.

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 7, 5, 10],
                   'Y': [4, 3, 8, 2, 9]})
print("DataFrame:")
print(df)

median=df["X"].median()
print("Median of Column X:")
print(median)

Output:

DataFrame:
    X  Y
0   1  4
1   2  3
2   7  8
3   5  2
4  10  9
Median of Column X:
5.0

It gives only the median of the values in column X of the DataFrame, as a single number rather than a Series.
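The same idea extends to several columns at once: selecting with a list of column names keeps a DataFrame, so median() again returns a Series. This is a minimal sketch; the extra Z column is an assumption added for the example.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 7, 5, 10],
                   'Y': [4, 3, 8, 2, 9],
                   'Z': [2, 7, 6, 10, 5]})

# A list of column names selects a sub-DataFrame,
# so the result has one median per selected column.
subset_medians = df[["X", "Z"]].median()
print(subset_medians)
```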

Example Codes: DataFrame.median() Method to Find Median Along Row Axis

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 7, 5, 10],
                   'Y': [4, 3, 8, 2, 9],
                   'Z': [2, 7, 6, 10, 5]})
print("DataFrame:")
print(df)

medians=df.median(axis=1)
print("medians of Each Row:")
print(medians)

Output:

DataFrame:
    X  Y   Z
0   1  4   2
1   2  3   7
2   7  8   6
3   5  2  10
4  10  9   5
medians of Each Row:
0    2.0
1    3.0
2    7.0
3    5.0
4    9.0
dtype: float64

It calculates the median for all the rows and finally returns a Series object with the median of each row.
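Because the row medians share the DataFrame's index, they can be attached as a new column. The sketch below uses assign for that; the column name row_median is an arbitrary choice for this example.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 7, 5, 10],
                   'Y': [4, 3, 8, 2, 9],
                   'Z': [2, 7, 6, 10, 5]})

# assign returns a new DataFrame and leaves df itself unchanged.
with_median = df.assign(row_median=df.median(axis=1))
print(with_median)
```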

To find the median of a particular row of DataFrame in Pandas, we call the median() function for that row only.

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 7, 5, 10],
                   'Y': [4, 3, 8, 2, 9],
                   'Z': [2, 7, 6, 10, 5]})

print("DataFrame:")
print(df)

median=df.iloc[[0]].median(axis=1)
print("median of 1st Row:")
print(median)

Output:

DataFrame:
    X  Y   Z
0   1  4   2
1   2  3   7
2   7  8   6
3   5  2  10
4  10  9   5
median of 1st Row:
0    2.0
dtype: float64

It gives only the median of the values in the first row of the DataFrame.

We use iloc to select rows by their integer position.
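The shape of the selection decides the shape of the result. The sketch below, which assumes a DataFrame with hypothetical string labels as its index, compares selecting a row as a one-row DataFrame, as a Series by position, and as a Series by label.

```python
import pandas as pd

# String labels make the difference between position and label visible.
df = pd.DataFrame({'X': [1, 2, 7, 5, 10],
                   'Y': [4, 3, 8, 2, 9],
                   'Z': [2, 7, 6, 10, 5]},
                  index=["a", "b", "c", "d", "e"])

# iloc[[0]] keeps a one-row DataFrame, so median(axis=1) returns a Series.
as_frame = df.iloc[[0]].median(axis=1)

# iloc[0] returns the row as a Series, so median() returns a single number.
as_scalar = df.iloc[0].median()

# loc selects the same row by its label instead of its position.
by_label = df.loc["a"].median()

print(as_frame)
print(as_scalar)
print(by_label)
```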

Example Codes: DataFrame.median() Method to Find Median Ignoring NaN Values

We use the default behavior of the skipna parameter, i.e. skipna=True, to find the median of a DataFrame along the specified axis while ignoring NaN values.

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 7, None, 10, 8],
                   'Y': [None, 3, 8, 2, 9, 6],
                   'Z': [2, 7, 6, 10, None, 5]})

print("DataFrame:")
print(df)

median=df.median(skipna=True)
print("Medians of Each Column:")
print(median)

Output:

DataFrame:
      X    Y     Z
0   1.0  NaN   2.0
1   2.0  3.0   7.0
2   7.0  8.0   6.0
3   NaN  2.0  10.0
4  10.0  9.0   NaN
5   8.0  6.0   5.0
Medians of Each Column:
X    7.0
Y    6.0
Z    6.0
dtype: float64

With skipna=True, the NaN values in the DataFrame are ignored, so each column's median is computed from its remaining values. With skipna=False, any column that contains a NaN value yields NaN as its median:

import pandas as pd

df = pd.DataFrame({'X': [1, 2, 7, None, 10],
                   'Y': [5, 3, 8, 2, 9],
                   'Z': [2, 7, 6, 10, 4]})

print("DataFrame:")
print(df)

median=df.median(skipna=False)
print("Medians of Each Column:")
print(median)

Output:

DataFrame:
      X  Y   Z
0   1.0  5   2
1   2.0  3   7
2   7.0  8   6
3   NaN  2  10
4  10.0  9   4
Medians of Each Column:
X    NaN
Y    5.0
Z    6.0
dtype: float64

Here, the median of column X is NaN because column X contains a NaN value.
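One way to see the relationship is to compare the columns that contain NaN with the columns whose median is NaN under skipna=False. The sketch below reuses the data from the example above; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'X': [1, 2, 7, None, 10],
                   'Y': [5, 3, 8, 2, 9],
                   'Z': [2, 7, 6, 10, 4]})

# True for every column that holds at least one NaN.
has_nan = df.isna().any()

# skipna=False propagates NaN; the default skips it.
strict = df.median(skipna=False)
lenient = df.median()

print(has_nan)
print(strict)
print(lenient)
```

With the default behavior, the median of column X is computed from the four remaining values 1, 2, 7, and 10, which gives 4.5.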
