How to Find Binary Cross Entropy Loss Value Using TensorFlow

Hafiz Muhammad Zohaib Feb 02, 2024

This short article explains two methods for calculating the Binary Cross-Entropy loss: the TensorFlow framework's built-in function and a custom implementation of the formula in raw Python. It then shows that both methods yield the same result.

Before calculating the cross-entropy loss, we first need to understand what binary cross entropy loss is and why it is computed. Then we will implement its formula in raw Python and with the TensorFlow framework.

Find Binary Cross Entropy Loss Value Using TensorFlow

A loss function is used in Machine Learning to measure a model's performance. If the loss is high, the model is performing poorly.

Conversely, if the loss is low, the model performs well and generates predictions close to the ground truth.

Cross Entropy, also known as log loss, is one such measure of loss. Its binary form is generally used to compute the loss in binary classification problems.

Binary Cross Entropy is the negative average of the log of the corrected predicted probabilities, i.e., the predicted probability assigned to the true class of each sample.

We calculate the binary cross entropy using the following formula.

$$
\text{Log loss} = -\frac{1}{N} \sum_{i=1}^{N} \left( y_{i} \log\left(p_{i}\right) + \left(1 - y_{i}\right) \log\left(1 - p_{i}\right) \right)
$$

Here, N is the batch size, y_i is the true label of the i-th sample, and p_i is the model's predicted probability for the positive class.
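As a quick hand check of the formula (with illustrative numbers, not the data used below), take a batch of N = 2 samples with true labels y = (1, 0) and predicted probabilities p = (0.9, 0.2), using the natural logarithm as in the implementations that follow:

$$
\text{Log loss} = -\frac{1}{2}\left( \log(0.9) + \log(1 - 0.2) \right) = -\frac{1}{2}\left( -0.105 - 0.223 \right) \approx 0.164
$$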

Let’s implement the above formula using Python.

import numpy as np


def BinaryCrossEntropy(y_true, y_pred):
    # Clip the predictions away from exactly 0 and 1 so the logs stay finite.
    y_pred = np.clip(y_pred, 1e-7, 1 - 1e-7)
    # Loss contribution of the positive class (where y_true == 1).
    term_0 = y_true * np.log(y_pred + 1e-7)
    # Loss contribution of the negative class (where y_true == 0).
    term_1 = (1 - y_true) * np.log(1 - y_pred + 1e-7)
    # Negative average over the batch.
    return -np.mean(term_0 + term_1, axis=0)


print(
    BinaryCrossEntropy(
        np.array([1, 0, 1]).reshape(-1, 1), np.array([0, 0, 1]).reshape(-1, 1)
    )
)

Let's understand the above code line by line. We define a function, BinaryCrossEntropy, which takes two arguments, y_true and y_pred.

In binary classification, these arguments are 1D arrays (reshaped here into column vectors). y_true holds the actual labels, and y_pred holds the ML model's predicted values.

The np.clip(array, min_val, max_val) call limits the values of the input array to the range [min_val, max_val]. For example, [0, 0, 1] will be clipped to [1e-7, 1e-7, 0.9999999].
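A minimal sketch of that clipping behavior on its own (illustrative values only):

import numpy as np

y_pred = np.array([0.0, 0.0, 1.0])
clipped = np.clip(y_pred, 1e-7, 1 - 1e-7)
print(clipped)  # every value now lies in [1e-7, 1 - 1e-7]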

The np.mean() call then averages the per-sample losses, i.e., it sums them and divides by the batch size N.
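For instance, with made-up per-sample losses, the mean is just the sum divided by N:

import numpy as np

per_sample_losses = np.array([15.42, 0.0, 0.0])  # hypothetical per-sample losses
print(np.mean(per_sample_losses))                        # 5.14
print(per_sample_losses.sum() / len(per_sample_losses))  # same value: sum / N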

Why do we use a minimal value like 1e-7 for clipping?

The above formula contains logarithmic terms, and log(0) (i.e., the natural logarithm of zero) is undefined; it tends to negative infinity.

An infinite term would make the averaged loss infinite (or NaN) rather than a useful number. Therefore, we clip the predictions with a minimal value of 1e-7.
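The sketch below shows the failure mode directly; NumPy returns -inf for log(0) (with a runtime warning), while the clipped version stays finite:

import numpy as np

raw = np.array([0.0, 0.5, 1.0])
print(np.log(raw))  # first entry is -inf, and NumPy emits a divide-by-zero warning

clipped = np.clip(raw, 1e-7, 1 - 1e-7)
print(np.log(clipped))  # all entries are finite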

Running the BinaryCrossEntropy code above gives the following output.

[5.14164949]

Now, we will use TensorFlow to find the binary cross entropy loss value. Let's look at the code below.

import tensorflow as tf
import numpy as np

# Column vectors of true labels and predicted probabilities.
y_true = np.array([1.0, 1.0, 1.0]).reshape(-1, 1)
y_pred = np.array([1.0, 1.0, 0.0]).reshape(-1, 1)

# from_logits=False: y_pred contains probabilities, not raw logits.
# SUM_OVER_BATCH_SIZE: average the per-sample losses over the batch.
bce = tf.keras.losses.BinaryCrossentropy(
    from_logits=False, reduction=tf.keras.losses.Reduction.SUM_OVER_BATCH_SIZE
)
loss = bce(y_true, y_pred)

print(loss.numpy())

The built-in function tf.keras.losses.BinaryCrossentropy(from_logits=False, reduction=tf.keras.losses.Reduction.SUM_OVER_BATCH_SIZE) computes the cross-entropy loss between true labels and predicted labels.

In the above code, bce(y_true, y_pred) takes two arguments.

  1. y_true (true label): This is either 0 or 1.
  2. y_pred (predicted value): This is the model's prediction, i.e., a single floating-point value that either represents a logit (i.e., a value in [-inf, inf] when from_logits=True) or a probability (i.e., a value in [0., 1.] when from_logits=False); a sketch illustrating this distinction follows the list.
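To illustrate the from_logits distinction, here is a minimal sketch with made-up values, where each logit is the inverse sigmoid, log(p / (1 - p)), of the corresponding probability; both calls should report (approximately) the same loss:

import tensorflow as tf
import numpy as np

y_true = np.array([1.0, 0.0]).reshape(-1, 1)
probs = np.array([0.8, 0.2]).reshape(-1, 1)  # probabilities in [0., 1.]
logits = np.log(probs / (1 - probs))         # equivalent logits in [-inf, inf]

bce_probs = tf.keras.losses.BinaryCrossentropy(from_logits=False)
bce_logits = tf.keras.losses.BinaryCrossentropy(from_logits=True)

print(bce_probs(y_true, probs).numpy())    # ~0.223
print(bce_logits(y_true, logits).numpy())  # ~0.223 as well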


Returning to our example, the TensorFlow code above gives the following binary cross entropy value.

5.1416497230529785

These results make it evident that the Binary Cross Entropy loss values obtained using TensorFlow and using the raw formula are equal (up to floating-point precision). Note that although the two examples use different input arrays, each contains exactly one completely wrong prediction out of three samples, which is why the two losses coincide.
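For a direct comparison on identical inputs, the following sketch reuses the BinaryCrossEntropy function defined earlier (assumed to be available in the same session) and checks both methods against each other:

import numpy as np
import tensorflow as tf

y_true = np.array([1.0, 0.0, 1.0]).reshape(-1, 1)
y_pred = np.array([0.0, 0.0, 1.0]).reshape(-1, 1)

manual_loss = BinaryCrossEntropy(y_true, y_pred)[0]  # raw Python implementation
tf_loss = tf.keras.losses.BinaryCrossentropy()(y_true, y_pred).numpy()

print(manual_loss, tf_loss)
print(np.isclose(manual_loss, tf_loss, rtol=1e-5))  # expected: True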