How to Manually Set the Size of the Bins in Matplotlib Histogram

Suraj Joshi Feb 02, 2024
  1. Bin Boundaries as a Parameter to hist() Function
  2. Compute the Number of Bins From Desired Width
To draw a histogram, we use the hist() function, whose bins parameter accepts either the number of bins or a sequence of bin boundaries. To set the bin size, we can either pass the boundaries directly or calculate the number of bins that gives the required width.
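As a minimal sketch of what an integer bin count means, the snippet below uses np.histogram, which applies the same binning rules as plt.hist. With an integer, the data range is split into that many equal-width bins, so the width follows from the count. The seed and the value n_bins = 8 are illustrative choices, not part of the original example.

```python
import numpy as np

# Fixed seed chosen only so the sketch is reproducible.
rng = np.random.default_rng(0)
data = rng.random(100) * 100.0

n_bins = 8  # illustrative value
counts, edges = np.histogram(data, bins=n_bins)

# An integer bin count yields n_bins + 1 equally spaced edges
# between the smallest and largest observation.
widths = np.diff(edges)
expected_width = (data.max() - data.min()) / n_bins

print("number of edges:", len(edges))
print("bin width:", widths[0], "expected:", expected_width)
```

Because the width depends on the data range, a fixed bin count does not guarantee a particular bin size, which is why the sections below set the boundaries explicitly.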

Bin Boundaries as a Parameter to hist() Function

Syntax of the hist() function:

hist(x,
     bins=None,
     range=None,
     density=False,
     weights=None,
     cumulative=False,
     bottom=None,
     histtype='bar',
     align='mid',
     orientation='vertical',
     rwidth=None,
     log=False,
     color=None,
     label=None,
     stacked=False,
     *,
     data=None,
     **kwargs)

To set the size of the bins in Matplotlib, we pass a list of bin boundaries, instead of the number of bins, as the bins parameter.

import numpy as np
import matplotlib.pyplot as plt

# 100 random values in the range [0, 100)
data = np.random.random_sample(100) * 100.0

# Bin boundaries: six bins of width 10, then two bins of width 20
plt.hist(data, bins=[0, 10, 20, 30, 40, 50, 60, 80, 100])
plt.xlabel("Value")
plt.ylabel("Counts")
plt.title("Histogram Plot of Data")
plt.grid(True)
plt.show()

Figure: Histogram with bin boundaries passed as a list

In the example above, we set the bin boundaries manually, which indirectly sets the width of each bin.
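To see the widths that a boundary list implies, a small sketch with np.histogram can help. It reuses the boundary list from the hist() example; the seed and the variable names are assumptions made for illustration.

```python
import numpy as np

# Fixed seed chosen only so the sketch is reproducible.
rng = np.random.default_rng(1)
data = rng.random(100) * 100.0

# Same boundaries as in the hist() example.
boundaries = [0, 10, 20, 30, 40, 50, 60, 80, 100]
counts, edges = np.histogram(data, bins=boundaries)

# The width of each bin is the gap between consecutive boundaries.
widths = np.diff(edges)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"[{left}, {right}): {count} values")
```

The last two bins are twice as wide as the others, so their raw counts are not directly comparable with the counts of the narrower bins.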

To make the bins equally spaced, we can use np.arange to generate equally spaced boundaries:

import numpy as np
import matplotlib.pyplot as plt

binwidth = 10
data = np.random.random_sample(100) * 100.0

# Boundaries start at min(data) and advance in steps of binwidth;
# the stop value is extended by one binwidth so that max(data) is covered.
plt.hist(data, bins=np.arange(min(data), max(data) + binwidth, binwidth))
plt.xlabel("Data")
plt.ylabel("Counts")
plt.title("Histogram Plot of Data")
plt.grid(True)
plt.show()

Figure: Histogram with equally spaced bins generated by np.arange

Warning
The second argument of np.arange must be max(data) + binwidth rather than max(data), because np.arange(start, stop, step) includes start but excludes stop. Adding binwidth to max(data) ensures that the last boundary lies at or above max(data), so the largest values are not left out of the histogram.
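The effect described in the warning can be checked on a small, hand-picked data set. This is only a sketch: the three values and the variable names are made up for illustration, and np.histogram stands in for plt.hist because both bin the data in the same way.

```python
import numpy as np

# Hand-picked values so the result is easy to follow.
data = np.array([3.0, 47.0, 95.0])
binwidth = 10

# Stop at max(data): the last boundary falls short of the largest value.
short_edges = np.arange(data.min(), data.max(), binwidth)

# Stop at max(data) + binwidth: the last boundary covers the largest value.
full_edges = np.arange(data.min(), data.max() + binwidth, binwidth)

short_counts, _ = np.histogram(data, bins=short_edges)
full_counts, _ = np.histogram(data, bins=full_edges)

print("last edge without extension:", short_edges[-1])
print("last edge with extension:", full_edges[-1])
print("values counted:", short_counts.sum(), "vs", full_counts.sum())
```

Values that fall outside the outermost boundaries are silently ignored, so the shorter boundary list drops the largest observation without raising an error.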

Compute the Number of Bins From Desired Width

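To find the number of bins, we divide the difference between the maximum and minimum values by the desired bin width. In the example below, the minimum is first rounded down to a multiple of the width and the maximum is moved up to the next multiple, so the boundaries fall on round values.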

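import numpy as np
import matplotlib.pyplot as plt


def find_bins(observations, width):
    """Return bin boundaries of the given width that cover all observations."""
    minimum = np.min(observations)
    maximum = np.max(observations)
    # Round the minimum down to a multiple of width.
    bound_min = minimum - minimum % width
    # Move the maximum up to the next multiple of width.
    bound_max = maximum - maximum % width + width
    # Number of boundaries = number of bins + 1; round() guards against
    # floating-point error in the division.
    n = int(round((bound_max - bound_min) / width)) + 1
    return np.linspace(bound_min, bound_max, n)


data = np.random.random_sample(120) * 100
bins = find_bins(data, 10.0)
plt.hist(data, bins=bins)
plt.xlabel("Data")
plt.ylabel("Counts")
plt.title("Histogram Plot")
plt.show()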

Figure: Histogram with bins computed from the desired width
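The calculation described in this section can also be written as a plain bin count passed together with the range parameter. The following is a sketch under stated assumptions: count_bins is a hypothetical helper, not a Matplotlib or NumPy function, and the sample values are made up. np.histogram is used here because it accepts the same bins and range arguments as plt.hist.

```python
import numpy as np


def count_bins(observations, width):
    """Hypothetical helper: bin count and range for a desired bin width."""
    lowest = np.min(observations)
    highest = np.max(observations)
    # Round up so that the last bin still covers the maximum value.
    n = max(1, int(np.ceil((highest - lowest) / width)))
    # Stretch the upper limit so that every bin is exactly `width` wide.
    return n, (lowest, lowest + n * width)


# Hand-picked values so the result is easy to follow.
data = np.array([2.0, 15.0, 38.0, 41.0, 77.0])
n_bins, value_range = count_bins(data, 10.0)

counts, edges = np.histogram(data, bins=n_bins, range=value_range)
print("number of bins:", n_bins)
print("edges:", edges)
print("counts:", counts)
```

Unlike find_bins, this variant starts the first bin at the smallest observation instead of at a multiple of the width, so the boundaries are not necessarily round numbers.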

Author: Suraj Joshi

Suraj Joshi is a backend software engineer at Matrice.ai.
