This tutorial will explain how to implement the softmax function using the NumPy library in Python. The softmax function is a generalized multidimensional form of the logistic function. It is used in multinomial logistic regression and as an activation function in artificial neural networks.
The softmax function normalizes all the elements of the array into the interval (0, 1) so that they can be treated as probabilities. The softmax function is defined by the following formula:

softmax(z)_i = e^(z_i) / ∑_j e^(z_j)
We will look at how to implement the softmax function for one- and two-dimensional arrays in Python using the NumPy library.
NumPy Softmax Function for 1D Arrays in Python
Suppose we need to define a softmax function that takes a 1D array as input and returns the required normalized array.
A common problem when applying softmax is numerical instability: the sum ∑_j e^(z_j) may become very large due to the exponentials, and an overflow error may occur. This overflow can be avoided by subtracting the maximum value of the array from each element before exponentiating; since softmax(z) = softmax(z - c) for any constant c, the result is unchanged.
The code examples below demonstrate the softmax function’s original implementation and the implementation with max subtraction using the NumPy library in Python.
- Original softmax:
import numpy as np

def softmax(x):
    f_x = np.exp(x) / np.sum(np.exp(x))
    return f_x
- Numerically Stable softmax:
import numpy as np

def softmax(x):
    y = np.exp(x - np.max(x))  # shift by the max for numerical stability
    f_x = y / np.sum(y)        # normalize by the sum of the shifted exponentials
    return f_x
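As a quick check, the sketch below (the input values are illustrative) shows why the max subtraction matters: `np.exp(1000)` overflows to `inf`, so the original implementation produces `nan` for large inputs, while the stable version still returns valid probabilities.

```python
import numpy as np

def softmax_stable(x):
    # subtract the max before exponentiating to avoid overflow
    y = np.exp(x - np.max(x))
    return y / np.sum(y)

x = np.array([1000.0, 1001.0, 1002.0])
# The naive np.exp(x) / np.sum(np.exp(x)) overflows here;
# softmax(x) == softmax(x - c), so the shifted version is safe.
probs = softmax_stable(x)
print(probs)
print(np.sum(probs))  # the probabilities sum to 1
```

Because the shift by a constant cancels in the ratio, `softmax_stable` gives the same result as the naive formula would for small inputs.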
NumPy Softmax Function for 2D Arrays in Python
The softmax function for a 2D array will perform the softmax transformation along the rows, which means the max and sum will be calculated along the rows. In the case of the 1D array, we did not have to worry about these things; we just needed to apply all the operations on the complete array.
The code example below demonstrates how the softmax transformation can be applied to a 2D array input using the NumPy library in Python.
import numpy as np

def softmax(x):
    row_max = np.max(x, axis=1, keepdims=True)    # max of each row, keeping the same dims
    e_x = np.exp(x - row_max)                     # subtract each row's max before exponentiating
    row_sum = np.sum(e_x, axis=1, keepdims=True)  # sum of each row, keeping the same dims
    f_x = e_x / row_sum
    return f_x
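A short usage sketch (the input array is illustrative) confirms that the transformation is applied row by row, so each row of the output sums to 1:

```python
import numpy as np

def softmax(x):
    # row-wise softmax with the max-subtraction trick
    row_max = np.max(x, axis=1, keepdims=True)
    e_x = np.exp(x - row_max)
    return e_x / np.sum(e_x, axis=1, keepdims=True)

x = np.array([[1.0, 2.0, 3.0],
              [4.0, 4.0, 4.0]])
probs = softmax(x)
print(probs)
print(np.sum(probs, axis=1))  # each row sums to 1
```

Note that `keepdims=True` keeps the per-row max and sum as column vectors of shape (2, 1), so NumPy broadcasting divides each row by its own sum.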
Suppose we need to perform the softmax transformation along the columns of a 2D array; we can do this by simply taking the transpose of both the input and the output of the softmax() method described above.