How to Normalize a List of Numbers in Python

Fariba Laiq Feb 02, 2024
  1. The Formula for Normalization
  2. Normalize a List of Numbers Using the MinMaxScaler() Function in Python sklearn
  3. Normalize a List of Numbers Manually in Python
  4. Conclusion

Normalization is a crucial data preprocessing technique that involves converting data into a standardized scale. The goal is to rescale the data to fit within a specified range, often [0, 1], for optimal performance in various applications, such as machine learning algorithms.

This article will delve into the concept of normalization, its formula, and methods for achieving it, both using built-in functions and manually.

The Formula for Normalization

Normalization is the process of transforming data into a specific scale, typically between two defined values, like 0 and 1. The motivation is that many machine learning algorithms, particularly those based on distances or gradient descent, tend to behave better when all features share a comparable, small numeric range.

For example, consider a simple list of numbers: {1, 2, 3}. After normalizing to a scale of 0 to 1, the list becomes {0, 0.5, 1}. We can also customize the normalization scale, like normalizing to a range between 2 and 6, resulting in {2, 4, 6}.

To understand how normalization works, let’s look at its formula. We subtract the minimum value from each number and divide the result by the range, i.e., the maximum minus the minimum. The output is the normalized value of that number.

$$ X_{norm} = {x-x_{min}\over x_{max}-x_{min}} $$

Where:

  • \(x\): The original value.
  • \(x_{min}\): The minimum value in the dataset.
  • \(x_{max}\): The maximum value in the dataset.

This formula is a fundamental representation of how min-max normalization is calculated, ensuring that the scaled values fall within the range [0, 1].
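
The custom-range example mentioned earlier ({1, 2, 3} mapped to {2, 4, 6}) follows from a generalization of this formula. As a sketch, writing the target range as [a, b] (symbols introduced here for illustration, not part of the formula above), the [0, 1] result is stretched by the width of the new range and shifted by its lower bound:

$$ X_{scaled} = a + {(x-x_{min})(b-a)\over x_{max}-x_{min}} $$

For x = 2 in {1, 2, 3} with a = 2 and b = 6, this gives:

$$ X_{scaled} = 2 + {(2-1)(6-2)\over 3-1} = 2 + {4\over 2} = 4 $$

which matches the middle value of {2, 4, 6}. Setting a = 0 and b = 1 reduces the expression to the original formula.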

We can normalize a list in two ways: use the built-in MinMaxScaler from the preprocessing module of the sklearn package, or write our own logic based on the formula above.

Normalize a List of Numbers Using the MinMaxScaler() Function in Python sklearn

The MinMaxScaler class in the preprocessing module of the scikit-learn library is a convenient tool for normalizing a list of numbers. It scales the values to a specified range, which is often 0 to 1, although the range can be customized to suit your needs.

The syntax for creating an instance of MinMaxScaler() is as follows:

sklearn.preprocessing.MinMaxScaler(feature_range=(min, max), copy=True)

Parameters:

  • feature_range: The desired range for the transformed features. By default, it’s set to (0, 1). You can specify a different range by providing a tuple of minimum and maximum values (e.g., (new_min, new_max)).
  • copy: A boolean (True by default). When True, the scaler works on a copy of the input; set it to False to scale in place and avoid the copy when the input is already a NumPy array.

Once you have created an instance of MinMaxScaler with the desired parameters, you can use its fit() and transform() methods to scale the data accordingly. The fit() method computes the minimum and maximum values needed for scaling, while the transform() method applies the scaling based on the computed minimum and maximum values.
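
To make the split between fit() and transform() concrete, here is a minimal sketch that calls them separately. The arrays train and new_values are hypothetical data invented for this illustration; the point is that the minimum and maximum are learned once and then reused on values the scaler has not seen.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data: values the scaler learns from, and later unseen values.
train = np.array([0.0, 5.0, 10.0]).reshape(-1, 1)
new_values = np.array([2.5, 20.0]).reshape(-1, 1)

scaler = MinMaxScaler()
scaler.fit(train)  # learns the minimum (0) and maximum (10) from train only

# The learned statistics are stored on the fitted scaler.
print("Learned min:", scaler.data_min_, "Learned max:", scaler.data_max_)

# transform() reuses the learned minimum and maximum; it does not refit.
scaled_new = scaler.transform(new_values)
print("Scaled new values:", scaled_new)
```

With the default settings, a value above the fitted maximum (20 here) is mapped above 1, because the scaling is based only on the data passed to fit().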

Let’s take a look at an example demonstrating the usage of MinMaxScaler():

import numpy as np
from sklearn import preprocessing

numbers = np.array([6, 1, 0, 2, 7, 3, 8, 1, 5]).reshape(-1, 1)
print("Original List:", numbers)

scaler = preprocessing.MinMaxScaler()

normalized_numbers = scaler.fit_transform(numbers)
print("Normalized List:", normalized_numbers)

The output will be:

Original List: [[6]
 [1]
 [0]
 [2]
 [7]
 [3]
 [8]
 [1]
 [5]]
Normalized List: [[0.75 ]
 [0.125]
 [0.   ]
 [0.25 ]
 [0.875]
 [0.375]
 [1.   ]
 [0.125]
 [0.625]]

In the example code, we first import the necessary libraries: numpy for numerical array handling and the preprocessing module from scikit-learn, which provides the MinMaxScaler class.

Next, we create a sample list of numbers as a NumPy array and reshape it into a column vector, because scikit-learn scalers expect two-dimensional input in which each column is a feature. The original list of numbers is printed to provide a reference.

We then create an instance of MinMaxScaler(). Using the fit_transform method of the scaler instance, which combines fit() and transform() in a single call, we normalize the original list of numbers, and the resulting normalized list is printed for examination.
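
Because the scaler returns a two-dimensional column, you may want a plain Python list back. One possible way, sketched here with variable names chosen for illustration, is to flatten the result with ravel() and convert it with tolist():

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

numbers = [6, 1, 0, 2, 7, 3, 8, 1, 5]

# Reshape the flat list into a single-feature column for the scaler.
column = np.array(numbers, dtype=float).reshape(-1, 1)
scaled = MinMaxScaler().fit_transform(column)

# Flatten the (9, 1) array back to one dimension, then to a Python list.
flat = scaled.ravel().tolist()
print("Normalized List:", flat)
```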

Customizing the Normalization Range

If you want to define a specific range for the normalization, you can achieve this by specifying the feature_range parameter in MinMaxScaler(). By default, the range is set to 0 and 1, but you have the flexibility to tailor it to your requirements.

Here’s an example where we set the range to 0 and 3:

import numpy as np
from sklearn import preprocessing

numbers = np.array([6, 1, 0, 2, 7, 3, 8, 1, 5]).reshape(-1, 1)
print("Original List:", numbers)

scaler = preprocessing.MinMaxScaler(feature_range=(0, 3))

normalized_numbers = scaler.fit_transform(numbers)
print("Normalized List:", normalized_numbers)

Output:

Original List: [[6]
 [1]
 [0]
 [2]
 [7]
 [3]
 [8]
 [1]
 [5]]
Normalized List: [[2.25 ]
 [0.375]
 [0.   ]
 [0.75 ]
 [2.625]
 [1.125]
 [3.   ]
 [0.375]
 [1.875]]

In the additional example provided, we import the necessary libraries and create a sample list of numbers, similar to the initial example. We then print the original list of numbers.

To customize the normalization range, we create a MinMaxScaler() instance with a custom range defined as 0 to 3 using the feature_range parameter.

We proceed to normalize the list using this custom range with the fit_transform method, and the resulting normalized list within the specified range is printed.
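
As a cross-check that needs no libraries, the same custom-range result can be reproduced by hand: scale to [0, 1] first, then stretch by the width of the target range and shift by its lower bound. The names new_min and new_max are placeholders chosen for this sketch.

```python
numbers = [6, 1, 0, 2, 7, 3, 8, 1, 5]
new_min, new_max = 0, 3  # same range as feature_range=(0, 3)

xmin, xmax = min(numbers), max(numbers)

# Step 1: scale every value to the [0, 1] range.
unit_scaled = [(x - xmin) / (xmax - xmin) for x in numbers]

# Step 2: stretch and shift the [0, 1] values to [new_min, new_max].
custom_scaled = [u * (new_max - new_min) + new_min for u in unit_scaled]

print("Normalized List:", custom_scaled)
```

The printed values should agree with the MinMaxScaler output for the range 0 to 3.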

Normalize a List of Numbers Manually in Python

Normalizing a list of numbers can also be achieved manually using a simple formula.

Recall the following formula for Min-Max scaling:

$$ X_{norm} = {x-x_{min}\over x_{max}-x_{min}} $$

Where \(x\) is the original value, \(x_{min}\) is the minimum value in the dataset, and \(x_{max}\) is the maximum value in the dataset.

Let’s demonstrate how to manually normalize a list of numbers in Python using the given formula.

numbers = [6, 1, 0, 2, 7, 3, 8, 1, 5]
print("Original List:", numbers)

xmin = min(numbers)
xmax = max(numbers)

normalized_numbers = [(x - xmin) / (xmax - xmin) for x in numbers]

print("Normalized List:", normalized_numbers)

Output:

Original List: [6, 1, 0, 2, 7, 3, 8, 1, 5]
Normalized List: [0.75, 0.125, 0.0, 0.25, 0.875, 0.375, 1.0, 0.125, 0.625]

Here, we define the original list of numbers, numbers. We then determine the minimum (xmin) and maximum (xmax) values in the list.

Using a list comprehension, we normalize each number with the formula above. The normalized values are stored in the list normalized_numbers, and we print both the original and normalized lists.
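
One caveat of the manual approach is that the formula divides by xmax - xmin, which is zero when every value in the list is the same. The sketch below wraps the logic in a hypothetical helper, normalize, that guards against this case. Returning 0.0 for every element is an assumed convention here, not the only reasonable choice.

```python
def normalize(values):
    """Min-max normalize a list of numbers (illustrative helper)."""
    if not values:
        return []

    xmin, xmax = min(values), max(values)
    span = xmax - xmin

    if span == 0:
        # Assumed convention: with no spread, map every value to 0.0
        # instead of raising ZeroDivisionError.
        return [0.0 for _ in values]

    return [(x - xmin) / span for x in values]


print(normalize([4, 4, 4]))
print(normalize([1, 2, 3]))
```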

We can also manually normalize a list of numbers in Python using NumPy. Let’s demonstrate this with an example:

import numpy as np

def min_max_normalization(data):
    xmin = np.min(data)
    xmax = np.max(data)

    # float() converts NumPy scalars to plain Python floats so the list prints cleanly
    normalized_data = [float((x - xmin) / (xmax - xmin)) for x in data]

    return normalized_data

numbers = [6, 1, 0, 2, 7, 3, 8, 1, 5]
print("Original List:", numbers)

normalized_numbers = min_max_normalization(numbers)
print("Normalized List:", normalized_numbers)

Output:

Original List: [6, 1, 0, 2, 7, 3, 8, 1, 5]
Normalized List: [0.75, 0.125, 0.0, 0.25, 0.875, 0.375, 1.0, 0.125, 0.625]

Here, we first import the NumPy library to utilize its functions for numerical operations. Then, we create a function, min_max_normalization, to perform the Min-Max scaling.

We then use the NumPy functions np.min and np.max to find the minimum and maximum values in the list. After this, we use a list comprehension to apply the Min-Max scaling formula to normalize each number based on the minimum and maximum values.

Lastly, we print both the original and normalized lists to showcase the effect of the normalization.
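
The function above still loops over the list element by element. As an alternative sketch, NumPy can apply the formula to the whole array at once. The function name min_max_vectorized is made up for this example.

```python
import numpy as np


def min_max_vectorized(data):
    """Apply min-max scaling to all elements in one array operation."""
    arr = np.asarray(data, dtype=float)
    xmin, xmax = arr.min(), arr.max()
    # Subtraction and division are applied element-wise to the whole array.
    return (arr - xmin) / (xmax - xmin)


numbers = [6, 1, 0, 2, 7, 3, 8, 1, 5]
result = min_max_vectorized(numbers)
print("Normalized List:", result.tolist())
```

This version returns a NumPy array, so tolist() is used only for printing it as a regular list.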

Conclusion

Normalization is a powerful tool in data preprocessing, aiding in achieving consistency and optimal performance in various data-driven applications. Whether using built-in functions like MinMaxScaler() or implementing custom normalization, understanding and applying this technique is crucial for effective data analysis and modeling.

Author: Fariba Laiq