NumPy mean() vs average()

Manav Narula Nov 26, 2021

The mean is a measure of the central value of a set of observations. There are several kinds of mean, such as the arithmetic, geometric, and harmonic means. In statistics, the word "average" usually refers to the arithmetic mean, and the two terms are used interchangeably. Both are calculated with the same formula: the sum of the observations divided by the number of observations.
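As a quick illustration of that formula, the arithmetic mean can be computed in plain Python as the sum divided by the count. This is only a minimal sketch; the variable names are illustrative.

```python
# Arithmetic mean from its definition:
# sum of the observations / number of observations
observations = [12, 15, 18, 19, 20]

arithmetic_mean = sum(observations) / len(observations)
print("Arithmetic mean:", arithmetic_mean)
```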

In Python, the NumPy module provides two functions that calculate the arithmetic mean: numpy.mean() and numpy.average().

Called with only an array, both functions return the same result, as shown below:

import numpy as np

arr = np.array([12, 15, 18, 19, 20])

print("Average Function: ", np.average(arr))
print("Type returned: ", type(np.average(arr)))

print("Mean Function: ", np.mean(arr))
print("Type returned: ", type(np.mean(arr)))

Output:

Average Function:  16.8
Type returned:  <class 'numpy.float64'>
Mean Function:  16.8
Type returned:  <class 'numpy.float64'>

Note that both functions return a result of the same type, so it may seem that they are equivalent.

However, there are a few differences between them. The numpy.average() function can also calculate a weighted average of an array, which the numpy.mean() function does not support. To do this, we pass the weights through the weights parameter, as shown below:

import numpy as np

arr = np.array([12, 15, 18, 19, 20])
arr_w = np.array([0.1, 0.1, 0.1, 0.2, 0.5])

print("Weighted Average Function: ", np.average(arr, weights=arr_w))

Output:

Weighted Average Function:  18.3
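To make the weighted calculation concrete, the sketch below reproduces it by hand as the sum of each value times its weight, divided by the sum of the weights. Because np.average() divides by the sum of the weights, they do not have to add up to 1. Passing returned=True makes the function also return that sum. The raw_weights values are illustrative and were chosen to be proportional to the weights used above.

```python
import numpy as np

arr = np.array([12, 15, 18, 19, 20])
# Illustrative weights: they sum to 10, not 1
raw_weights = np.array([1, 1, 1, 2, 5])

# Weighted average from its definition
manual = np.sum(arr * raw_weights) / np.sum(raw_weights)

# returned=True makes np.average() also return the sum of the weights
weighted, weight_total = np.average(arr, weights=raw_weights, returned=True)

print("Manual calculation: ", manual)
print("np.average result:  ", weighted)
print("Sum of the weights: ", weight_total)
```

Both calculations should agree with the 18.3 obtained above, since the two sets of weights differ only by a constant factor.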

Another notable difference is that the np.mean() function accepts parameters such as dtype, out, and where, which are not available in the np.average() function. Both functions accept axis, and in recent NumPy releases both accept keepdims (np.average() gained it in version 1.23).
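For instance, the where parameter of np.mean() restricts the calculation to selected elements. The short sketch below assumes NumPy 1.20 or later, where this parameter is available, and also checks that np.average() rejects a dtype argument. The names mean_above_15 and average_accepts_dtype are illustrative.

```python
import numpy as np

arr = np.array([12, 15, 18, 19, 20])

# where: include only the elements for which the mask is True
mean_above_15 = np.mean(arr, where=arr > 15)
print("Mean of values above 15:", mean_above_15)

# np.average() has no dtype parameter, so passing one raises TypeError
try:
    np.average(arr, dtype=float)
    average_accepts_dtype = True
except TypeError:
    average_accepts_dtype = False
print("np.average accepts dtype:", average_accepts_dtype)
```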

Such additional parameters can be useful. For example, when the type of the input is unknown or ambiguous, the dtype parameter sets the type used to compute the mean. The out parameter names an existing array in which the result is stored. The axis parameter, which np.average() also accepts, sets the axis along which the mean is calculated. The following code shows the use of some of these parameters in the np.mean() function.

import numpy as np

arr = np.array([[12, 15, 18, 19, 20], [10, 16, 7, 18, 20], [20, 12, 24, 11, 14]])

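# Preallocated integer array that will receive the result
x = np.arange(3)
# Row-wise means (axis=1), computed as integers and written into x
print("Mean Function: ", np.mean(arr, dtype=int, axis=1, out=x))
print("Output array: ", x)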

Output:

Mean Function:  [16 14 16]
Output array:  [16 14 16]
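Note that the integer dtype in the example above discards the fractional part of each row mean. As a comparison, the sketch below reuses the same array with the default settings, which give floating-point results. It also shows that np.average() accepts axis, and what keepdims does to the shape of the result. The variable names are illustrative.

```python
import numpy as np

arr = np.array([[12, 15, 18, 19, 20], [10, 16, 7, 18, 20], [20, 12, 24, 11, 14]])

# Default behaviour: integer input yields float64 row means
row_means = np.mean(arr, axis=1)
print("np.mean along axis 1:   ", row_means)

# np.average() supports axis as well and gives the same values
row_averages = np.average(arr, axis=1)
print("np.average along axis 1:", row_averages)

# keepdims=True keeps the reduced axis with length 1
column_of_means = np.mean(arr, axis=1, keepdims=True)
print("Shape with keepdims:", column_of_means.shape)
```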
Author: Manav Narula

Manav is an IT professional with extensive experience as a core developer on many live projects. He is an avid learner who enjoys learning new things and sharing his findings whenever possible.
