How to Create a Boxplot in Matplotlib

Suraj Joshi Feb 02, 2024
This tutorial explains how we can create a boxplot using the matplotlib.pyplot.boxplot() function in Python.

A boxplot summarizes the distribution of a dataset by showing the positions of its minimum, first quartile (Q1), median, third quartile (Q3), and maximum.

Boxplot in Python Matplotlib

import matplotlib.pyplot as plt

x = [4, 5, 6, 8, 9, 10, 10, 11, 11, 12, 13, 14, 15, 15, 15, 17, 18, 19, 22, 23, 25]

plt.boxplot(x)
plt.title("Boxplot Using Matplotlib")
plt.show()

Output:

Boxplot in Python using Matplotlib

It plots a boxplot of the given data x. The box extends from the first quartile (Q1) to the third quartile (Q3), and the horizontal line inside the box marks the median. The whiskers extend from the edges of the box to the lowest and highest data points that are not outliers.

By default, the lower whisker cannot reach below Q1-1.5(Q3-Q1) and the upper whisker cannot reach above Q3+1.5(Q3-Q1). Each whisker ends at the most extreme data point that lies within these limits. When no value falls outside them, as in this example, the whiskers end at the minimum and maximum of the data.
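To make these limits concrete, the following sketch recomputes them by hand for the list x used above. It assumes that NumPy's default percentile method (linear interpolation) gives the same quartiles that the boxplot draws; the names lower_limit, upper_limit, lower_whisker, and upper_whisker are only illustrative.

```python
import numpy as np

x = [4, 5, 6, 8, 9, 10, 10, 11, 11, 12, 13, 14, 15, 15, 15, 17, 18, 19, 22, 23, 25]
data = np.asarray(x)

# Quartiles and median (assumption: NumPy's default linear interpolation)
q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

# The whiskers may not reach beyond these limits
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Each whisker ends at the most extreme data point inside the limits
lower_whisker = data[data >= lower_limit].min()
upper_whisker = data[data <= upper_limit].max()

print("Q1:", q1, "median:", median, "Q3:", q3)
print("limits:", lower_limit, upper_limit)
print("whisker ends:", lower_whisker, upper_whisker)
```

Here every value lies inside the limits, so the whisker ends coincide with the smallest and largest values in x.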

import matplotlib.pyplot as plt

x = [
    1, 4, 5, 6, 8, 9, 10, 10, 11, 11, 12, 12, 13, 14,
    15, 15, 15, 17, 18, 18, 19, 22, 23, 25, 30, 33, 35,
]

plt.boxplot(x)
plt.title("Boxplot Using Matplotlib")
plt.show()

Output:

Boxplot in Python using Matplotlib with outliers

It plots the boxplot of the given data x. This time, two outliers appear above the upper whisker, drawn as circles.

A data point is plotted as an outlier if its value is either smaller than Q1-1.5(Q3-Q1) or greater than Q3+1.5(Q3-Q1).
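As a rough check of this rule, the sketch below applies it directly to the data from the example above. It again assumes that NumPy's default percentile method matches the quartiles used for the plot; the variable names are hypothetical.

```python
import numpy as np

x = [
    1, 4, 5, 6, 8, 9, 10, 10, 11, 11, 12, 12, 13, 14,
    15, 15, 15, 17, 18, 18, 19, 22, 23, 25, 30, 33, 35,
]
data = np.asarray(x)

# Quartiles (assumption: NumPy's default linear interpolation)
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1

lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Keep only the values that fall outside the limits
outliers = data[(data < lower_limit) | (data > upper_limit)]

print("limits:", lower_limit, upper_limit)
print("outliers:", outliers.tolist())
```

With these values the upper limit is 31.25, so 33 and 35 are the two points drawn as circles, while 30 still lies within the whisker range.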

If we pass a list of arrays to the matplotlib.pyplot.boxplot() function, it draws one boxplot for each array in the list. If we pass a 2D NumPy array instead, it draws one boxplot for each column of the array.

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(100)

data_a = np.random.randint(2, 15, size=15)
data_b = np.random.randint(5, 18, size=20)
data_c = np.random.randint(2, 20, size=30)
data_d = np.random.randint(1, 30, size=40)

data_2d = [data_a, data_b, data_c, data_d]

plt.boxplot(data_2d)
plt.title("Boxplot Using Matplotlib")
plt.show()

Output:

Multiple Boxplots in Python using Matplotlib

It creates a boxplot for each NumPy array in the list data_2d, so we get four boxplots in a single figure that share common axes.
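With several boxes in one figure, it can help to label them. The sketch below is one possible way to do that, using the object-oriented interface (plt.subplots() and Axes.boxplot()) rather than plt.boxplot(). The tick label strings are chosen only for illustration, and the code relies on the boxes being drawn at x positions 1 to 4 by default.

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(100)

data_a = np.random.randint(2, 15, size=15)
data_b = np.random.randint(5, 18, size=20)
data_c = np.random.randint(2, 20, size=30)
data_d = np.random.randint(1, 30, size=40)

data_2d = [data_a, data_b, data_c, data_d]

fig, ax = plt.subplots()

# boxplot() returns a dictionary of the artists it has drawn
result = ax.boxplot(data_2d)

# Name each box on the x-axis (labels are illustrative)
ax.set_xticklabels(["data_a", "data_b", "data_c", "data_d"])
ax.set_title("Boxplot Using Matplotlib")

# One box and one median line per dataset, two whiskers per box
print(len(result["boxes"]), len(result["medians"]), len(result["whiskers"]))
```

Calling plt.show() afterwards displays the figure as in the earlier examples.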

Author: Suraj Joshi
Suraj Joshi is a backend software engineer at Matrice.ai.