How to Shuffle an Array in Python

Muhammad Waiz Khan Feb 02, 2024
  1. Shuffle an Array in Python Using the random.shuffle() Method
  2. Shuffle an Array in Python Using the shuffle() Method of sklearn Module

In this tutorial, we will look at two ways to shuffle an array in Python. Shuffling an array means randomly rearranging the positions of its elements. A common application is model training, where the dataset is shuffled so that the order of the samples does not influence what the model learns. Shuffling is also used in many statistical applications.

Shuffle an Array in Python Using the random.shuffle() Method

The random.shuffle() function takes a sequence as input and shuffles it. The important thing to note is that random.shuffle() does not return a new sequence; it shuffles the original sequence in place and returns None. The input must therefore be a mutable sequence, such as a list or an array.
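The in-place behavior described above can be checked with a short sketch. The variable names are illustrative only, and random.sample() is shown as one possible way to get a shuffled copy when the original order must be kept.

```python
import random

items = [1, 2, 3, 4, 5]
original = list(items)  # keep a copy so the two can be compared

# shuffle() rearranges `items` in place and returns None
result = random.shuffle(items)
print(result)                             # None
print(sorted(items) == sorted(original))  # True: same elements, new order

# An immutable sequence such as a tuple cannot be shuffled in place
try:
    random.shuffle((1, 2, 3))
    tuple_error = None
except TypeError as exc:
    tuple_error = exc
    print("TypeError:", exc)

# To leave the original untouched, draw a shuffled copy instead
shuffled_copy = random.sample(original, k=len(original))
print(original)  # [1, 2, 3, 4, 5]
```

Because the return value is None, writing items = random.shuffle(items) would discard the list.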

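The random.shuffle() function should only be used on 1D sequences. On a multidimensional NumPy array it can silently produce duplicated rows, because the elements it swaps are views into the same array. The example code below demonstrates how to use random.shuffle() to shuffle a list and a 1D array in Python.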

import random
import numpy as np

mylist = ["apple", "banana", "cherry"]
x = np.array((2, 3, 21, 312, 31, 31, 3123, 131))

print(x)
print(mylist)

random.shuffle(mylist)
random.shuffle(x)

print(x)
print(mylist)

Output (the shuffled order varies from run to run):

[   2    3   21  312   31   31 3123  131]
['apple', 'banana', 'cherry']
[3123   21  312    3    2  131   31   31]
['banana', 'apple', 'cherry']
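Because random.shuffle() is limited to 1D sequences, a multidimensional NumPy array needs a different tool. The following is a minimal sketch of one option that this tutorial does not otherwise cover: NumPy's own random Generator, whose shuffle() method reorders an array in place along its first axis. The seed value is arbitrary and only makes the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # arbitrary seed for repeatability

matrix = np.arange(12).reshape(4, 3)
rows_before = sorted(matrix.tolist())

# Shuffles the rows in place; the contents of each row stay intact
rng.shuffle(matrix)
rows_after = sorted(matrix.tolist())

print(matrix)
print(rows_before == rows_after)  # True: no row was lost or duplicated

# permutation() returns a shuffled copy and leaves its input unchanged
shuffled_copy = rng.permutation(matrix)
```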

Shuffle an Array in Python Using the shuffle() Method of sklearn Module

The sklearn.utils.shuffle(*arrays, random_state=None, n_samples=None) function takes one or more indexable sequences, such as arrays, lists, or dataframes, that share the same first dimension, and returns shuffled copies of them.

sklearn.utils.shuffle() does not change the original input; it returns shuffled copies. When several sequences are passed, they are all shuffled in the same order, so corresponding elements stay aligned. The random_state parameter controls the random number generation. If it is set to an integer, the function returns the same shuffled order on every call. The n_samples parameter sets the number of samples to return. By default it equals the first dimension of the input, and it should not be greater than the length of the input array(s).

Note
If the input is 2D, sklearn.utils.shuffle() shuffles only the rows; the order of the elements within each row is unchanged.

The example code below demonstrates how to use sklearn.utils.shuffle() to get shuffled copies of one or more arrays in Python.

from sklearn.utils import shuffle
import numpy as np

x = np.array([[1, 2, 3], [6, 7, 8], [9, 10, 12]])
y = ["one", "two", "three"]
z = [4, 5, 6]

print(x)
print(y)
print(z)

x, y, z = shuffle(x, y, z, random_state=0)

print(x)
print(y)
print(z)

Output:

[[ 1  2  3]
 [ 6  7  8]
 [ 9 10 12]]
['one', 'two', 'three']
[4, 5, 6]
[[ 9 10 12]
 [ 6  7  8]
 [ 1  2  3]]
['three', 'two', 'one']
[6, 5, 4]
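The effect of the random_state and n_samples parameters can be illustrated with a short sketch. The features and labels variables are made-up sample data, and the value 42 is an arbitrary seed.

```python
import numpy as np
from sklearn.utils import shuffle

features = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
labels = ["a", "b", "c", "d"]
pairs_before = {label: row.tolist() for label, row in zip(labels, features)}

# The same integer random_state gives the same order on every call
first_x, first_y = shuffle(features, labels, random_state=42)
second_x, second_y = shuffle(features, labels, random_state=42)
print(np.array_equal(first_x, second_x), first_y == second_y)  # True True

# The inputs themselves are left unchanged
print(labels)  # ['a', 'b', 'c', 'd']

# Rows and labels are shuffled together, so each pair stays aligned
pairs_after = {label: row.tolist() for label, row in zip(first_y, first_x)}
print(pairs_after == pairs_before)  # True

# n_samples limits how many shuffled samples are returned
subset_x, subset_y = shuffle(features, labels, random_state=42, n_samples=2)
print(len(subset_x), len(subset_y))  # 2 2
```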
