# Use Axis Argument to Manipulate a NumPy Array in Python

Salman Mehmood Oct 05, 2022

This article explains what an axis is in NumPy and how the `axis` argument works. We will also see how to use the `axis` argument to summarize a NumPy array along a chosen direction without writing loops.

## Use an `axis` Argument to Manipulate a NumPy Array in Python

To demonstrate, we need some data to work with, but nothing too large or complicated. Arrays are the first thing that comes up when learning NumPy, so we have created two small test arrays.

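``````
import numpy as np

# Gridded temperatures: 4 time steps x 2 rows x 3 columns
Temperature_Array = np.array([[[26, 25, 24],
                               [24, 25, 26]],
                              [[27, 25, 23],
                               [25, 28, 24]],
                              [[27, 24, 26],
                               [24, 27, 25]],
                              [[27, 26, 24],
                               [28, 25, 26]]])

# Hourly temperatures: 3 stations (rows) x 6 hours (columns)
Timeseries_Temperature = np.array([[23, 24, 25, 24, 23, 25],
                                   [25, 26, 27, 29, 25, 23],
                                   [20, 23, 21, 22, 25, 29]])
``````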

The first test array, `Temperature_Array`, represents gridded data, such as a forecast or a set of observations. It emulates four time steps of six stations arranged in two rows and three columns.

The following sub-array is the first time step, the next sub-array is the second time step, and so on.

``````
[[26, 25, 24], [24, 25, 26]]
``````

You will notice that each axis has a different number of elements: four time steps, two rows per time step, and three columns (stations) in each row.
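As a quick check of this layout, the `shape` attribute reports the length of each axis. The sketch below is an addition to the walkthrough; the name `first_step` is only illustrative.

```python
import numpy as np

Temperature_Array = np.array([[[26, 25, 24], [24, 25, 26]],
                              [[27, 25, 23], [25, 28, 24]],
                              [[27, 24, 26], [24, 27, 25]],
                              [[27, 26, 24], [28, 25, 26]]])

# shape lists the length of each axis: (time steps, rows, columns)
print(Temperature_Array.shape)   # (4, 2, 3)

# Indexing the 0th axis selects one time step, which is a 2x3 grid
first_step = Temperature_Array[0]   # illustrative name
print(first_step.shape)          # (2, 3)
```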

A common mistake when experimenting with NumPy is to build a 3x3 or a 3x3x3 test array. Everything seems to work, and you think you know what is going on, but the same code does not work on real data.

That is because real data usually has a different number of elements along each axis, and a square test array cannot reveal that a slice or a reduction was taken along the wrong one.
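A small sketch of why square test arrays can mislead. The arrays `square` and `rect` are hypothetical examples built with `np.arange`, not data from this article.

```python
import numpy as np

# Hypothetical square array: both reductions return 3 values,
# so the result shape cannot tell you which axis was collapsed.
square = np.arange(9).reshape(3, 3)
print(square.min(axis=0).shape)   # (3,)
print(square.min(axis=1).shape)   # (3,)

# Hypothetical non-square array (3 rows, 6 columns): the shapes differ,
# so a reduction along the wrong axis is easy to spot.
rect = np.arange(18).reshape(3, 6)
print(rect.min(axis=0).shape)     # (6,)
print(rect.min(axis=1).shape)     # (3,)
```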

The second array, `Timeseries_Temperature`, is simpler. It represents three stations that each record the temperature every hour for six hours.

The rows are stations, and the columns are times.

A two-dimensional array has rows and columns, and a three-dimensional array adds a depth, which might be time. Each of these dimensions is an axis of the array. An axis is simply a direction along which you can move through the array.

Let’s check `Timeseries_Temperature` using the `ndim` attribute, which gives the number of dimensions of an array.

``````
Timeseries_Temperature.ndim
``````

Output:

``````
2
``````
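As a side note, `ndim` is always the length of the `shape` tuple, so the two attributes can be checked against each other. This is a minimal sketch using only standard `ndarray` attributes.

```python
import numpy as np

Timeseries_Temperature = np.array([[23, 24, 25, 24, 23, 25],
                                   [25, 26, 27, 29, 25, 23],
                                   [20, 23, 21, 22, 25, 29]])

# Two axes: 3 stations (axis 0) and 6 hours (axis 1)
print(Timeseries_Temperature.shape)   # (3, 6)
print(Timeseries_Temperature.ndim)    # 2

# ndim equals the number of entries in shape
print(len(Timeseries_Temperature.shape) == Timeseries_Temperature.ndim)   # True
```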

Let’s say we want some information about minimum values. We could start with something like this:

``````
Timeseries_Temperature.min()
``````

We get 20 back, which is indeed the lowest value in the whole array, but that is probably not what we want. We may want to know which station experienced the lowest temperature, or the lowest temperature that each station experienced.

Or we may want the minimum temperature at each hour, that is, how cold it was at the coldest station in each of those six hours. This is where the `axis` argument helps a lot.

We do not have to write loops or slice the array manually.
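To illustrate the point about loops, the sketch below compares a manual loop over the rows with a single call that uses the `axis` argument. The names `looped` and `vectorized` are illustrative.

```python
import numpy as np

Timeseries_Temperature = np.array([[23, 24, 25, 24, 23, 25],
                                   [25, 26, 27, 29, 25, 23],
                                   [20, 23, 21, 22, 25, 29]])

# Manual approach: iterate over the rows (stations) and take each minimum
looped = np.array([row.min() for row in Timeseries_Temperature])

# Same result in one call: collapse axis 1 (the hours)
vectorized = Timeseries_Temperature.min(axis=1)

print(looped)       # [23 23 20]
print(vectorized)   # [23 23 20]
```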

But, to understand it, let’s make a couple of slices here.

``````
Timeseries_Temperature[0,:]
``````

This takes the 0th element along the 0th axis, which gives us the first row.

``````
array([23, 24, 25, 24, 23, 25])
``````

Now let’s ask for everything along the 0th axis, which is what the colon means, and only the 0th item along axis 1.

``````
Timeseries_Temperature[:,0]
``````

This gives us the 0th column and all rows.

``````
array([23, 25, 20])
``````

Now let’s work with `Timeseries_Temperature` again and call the `min()` method. If we press Shift + Tab in a Jupyter notebook, the signature shows an `axis` argument that defaults to `None`.

Now we are going to pass `axis=0`.

``````
Timeseries_Temperature.min(axis=0)
``````

This no longer returns a single minimum for the whole array. Instead, it returns one minimum per column.

``````
array([20, 23, 21, 22, 23, 23])
``````

The result has the same shape as the row slice we took earlier, six elements, but instead of slicing we used the `axis` argument. Collapsing axis 0 (the stations) gives the column-wise minimum, which is the lowest temperature at any station for every hour.
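The link between slices and the `axis` argument can be checked directly. In this sketch, reducing along an axis leaves a result shaped like a slice taken across that axis, and each entry matches the minimum of the corresponding slice.

```python
import numpy as np

Timeseries_Temperature = np.array([[23, 24, 25, 24, 23, 25],
                                   [25, 26, 27, 29, 25, 23],
                                   [20, 23, 21, 22, 25, 29]])

# Collapsing axis 0 leaves one value per column, like a row slice
print(Timeseries_Temperature[0, :].shape)        # (6,)
print(Timeseries_Temperature.min(axis=0).shape)  # (6,)

# Entry 0 of the reduction is the minimum of column 0
print(Timeseries_Temperature[:, 0].min())        # 20
print(Timeseries_Temperature.min(axis=0)[0])     # 20
```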

Now we will collapse axis 1, the columns, and get the minimum temperature of each station over all hours.

``````
Timeseries_Temperature.min(axis=1)
``````

Output:

``````
array([23, 23, 20])
``````
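The `min()` method returns the minimum values, not where they occurred. To find which station or which hour was coldest, one option is `argmin()`, which accepts the same `axis` argument and returns the index of the first minimum. The article itself does not use `argmin()`, so treat this as an optional extension; the variable names are illustrative.

```python
import numpy as np

Timeseries_Temperature = np.array([[23, 24, 25, 24, 23, 25],
                                   [25, 26, 27, 29, 25, 23],
                                   [20, 23, 21, 22, 25, 29]])

# For every hour (column), the row index of the coldest station
coldest_station_per_hour = Timeseries_Temperature.argmin(axis=0)
print(coldest_station_per_hour)   # [2 2 2 2 0 1]

# For every station (row), the column index of its coldest hour
coldest_hour_per_station = Timeseries_Temperature.argmin(axis=1)
print(coldest_hour_per_station)   # [0 5 0]
```

When several entries tie for the minimum, `argmin()` reports the first one, which is why station 0 shows hour 0 even though hour 4 is equally cold.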

Now let’s look at the more complicated case, so we will print out our `Temperature_Array` again to show you what it looks like.

``````
Temperature_Array
``````

Output:

``````
array([[[26, 25, 24],
        [24, 25, 26]],

       [[27, 25, 23],
        [25, 28, 24]],

       [[27, 24, 26],
        [24, 27, 25]],

       [[27, 26, 24],
        [28, 25, 26]]])
``````

`Temperature_Array` has three dimensions: time step, row, and column. If we type `Temperature_Array[0,:,:]`, we get the first block. The 0th axis represents the time steps in this case, and each level of square brackets effectively corresponds to an axis.

``````
array([[26, 25, 24],
       [24, 25, 26]])
``````

This time, instead of the minimum, we will take the mean of `Temperature_Array` using the `mean()` method.

``````
Temperature_Array.mean()
``````

Output:

``````
25.458333333333332
``````

Now we will pass `axis=0`, which collapses the 0th axis. That axis holds the time steps and corresponds to the outermost set of square brackets.

``````
Temperature_Array.mean(axis=0)
``````

We get an array with two rows and three columns, which holds the average over all time steps for each station in `Temperature_Array`.

``````
array([[26.75, 25.  , 24.25],
       [25.25, 26.25, 25.25]])
``````
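One way to convince yourself of what `axis=0` does here is to average the four time-step slices by hand. This is only a verification sketch; the name `manual_mean` is illustrative.

```python
import numpy as np

Temperature_Array = np.array([[[26, 25, 24], [24, 25, 26]],
                              [[27, 25, 23], [25, 28, 24]],
                              [[27, 24, 26], [24, 27, 25]],
                              [[27, 26, 24], [28, 25, 26]]])

# Add the four 2x3 time-step grids element by element, then divide by 4
manual_mean = (Temperature_Array[0] + Temperature_Array[1]
               + Temperature_Array[2] + Temperature_Array[3]) / 4

# Collapsing axis 0 performs the same element-wise average
axis_mean = Temperature_Array.mean(axis=0)

print(np.allclose(manual_mean, axis_mean))   # True
```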

If your data is arranged differently, you might have to use a different axis. Here we pass `axis=1`.

``````
Temperature_Array.mean(axis=1)
``````

Here we collapse the rows, so we get a 4x3 array that holds the mean of each column at each time step.

``````
array([[25. , 25. , 25. ],
       [26. , 26.5, 23.5],
       [25.5, 25.5, 25.5],
       [27.5, 25.5, 25. ]])
``````

Now we will pass 2 to the `axis` argument. With `axis=2`, we collapse the innermost dimension, the columns. The result is the row-wise average at each time step, a 4x2 array.

``````
Temperature_Array.mean(axis=2)
``````

Output:

``````
array([[25.        , 25.        ],
       [25.        , 25.66666667],
       [25.66666667, 25.33333333],
       [25.66666667, 26.33333333]])
``````
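The three results follow one rule: the output shape is the input shape with the collapsed axis removed. The sketch below checks that rule and, as an extension not covered above, passes a tuple of axes to collapse rows and columns together, which gives one mean per time step. The name `per_step` is illustrative.

```python
import numpy as np

Temperature_Array = np.array([[[26, 25, 24], [24, 25, 26]],
                              [[27, 25, 23], [25, 28, 24]],
                              [[27, 24, 26], [24, 27, 25]],
                              [[27, 26, 24], [28, 25, 26]]])

# Input shape is (4, 2, 3); each reduction drops exactly one axis
for axis in range(Temperature_Array.ndim):
    print(axis, Temperature_Array.mean(axis=axis).shape)
# 0 (2, 3)
# 1 (4, 3)
# 2 (4, 2)

# A tuple of axes collapses several axes at once:
# here rows and columns, leaving one mean per time step
per_step = Temperature_Array.mean(axis=(1, 2))
print(per_step.shape)   # (4,)
```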

Full Code:

``````
# In[1]:

import numpy as np

Temperature_Array = np.array([[[26, 25, 24],
                               [24, 25, 26]],
                              [[27, 25, 23],
                               [25, 28, 24]],
                              [[27, 24, 26],
                               [24, 27, 25]],
                              [[27, 26, 24],
                               [28, 25, 26]]])

Timeseries_Temperature = np.array([[23, 24, 25, 24, 23, 25],
                                   [25, 26, 27, 29, 25, 23],
                                   [20, 23, 21, 22, 25, 29]])

# In[2]:

Timeseries_Temperature.ndim

# In[3]:

Timeseries_Temperature.min()

# In[4]:

Timeseries_Temperature[0,:]

# In[5]:

Timeseries_Temperature[:,0]

# In[6]:

Timeseries_Temperature.min(axis=0)

# In[7]:

Timeseries_Temperature.min(axis=1)

# In[8]:

Temperature_Array

# In[9]:

Temperature_Array.ndim

# In[10]:

Temperature_Array[0,:,:]

# In[11]:

Temperature_Array.mean()

# In[12]:

Temperature_Array.mean(axis=0)

# In[13]:

Temperature_Array.mean(axis=1)

# In[14]:

Temperature_Array.mean(axis=2)
``````
