Matrix Multiplication in Pandas

  1. Multiplication of Matrices
  2. Check if DataFrames Are Aligned in Pandas
  3. Use the dot Function to Carry Out Matrix Multiplication in Pandas

Matrix multiplication is widely used for analyzing network relationships, transforming coordinate systems, numerical modeling, and inventory calculations, among other things. With numerical data arranged in rows and columns, we can perform matrix multiplication and use the result wherever it applies.

Pandas and NumPy provide functions for matrix operations such as multiplication and inversion. Matrix multiplication in Pandas can be a little confusing (and lead to errors) if you don’t know the underlying mathematics that powers it.

In this article, we will discuss how to do matrix multiplication in pandas and how to avoid errors.

Multiplication of Matrices

To multiply two matrices, we must ensure that they are aligned, meaning their shapes are compatible for the operation. A matrix has rows and columns, and whether two matrices can be multiplied depends on how many of each they have.

We describe matrices by their rows and columns, e.g., a 2 x 4 matrix has 2 rows and 4 columns. For matrix multiplication to be possible, the number of columns in the first (left) matrix must equal the number of rows in the second (right) matrix.

Therefore, a 2 x 3 matrix can be multiplied by a 3 x 2 matrix because there are 3 columns in the first matrix and 3 rows in the second matrix. Also, a 3 x 4 matrix can be multiplied by a 4 x 23 matrix because the number of columns in the first matrix equals the number of rows in the second matrix (4). The result has as many rows as the first matrix and as many columns as the second.

However, if we reverse the order of the matrices, the multiplication might no longer be possible. Using the same examples as earlier, the 3 x 2 matrix can still be multiplied by the 2 x 3 matrix because the number of columns of the first matrix (2) equals the number of rows of the second matrix (2).

For the second example, the 4 x 23 matrix cannot be multiplied by the 3 x 4 matrix because the number of columns of the first matrix (23) is not equal to the number of rows of the second matrix (3).
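The shape rule can be checked quickly with NumPy. The sketch below is illustrative only: it uses placeholder arrays of zeros (not data from this article) with the shapes discussed above, and shows the shape of each product and what happens when the order of the second pair is reversed.

```python
import numpy as np

# Placeholder matrices filled with zeros; only their shapes matter here.
a = np.zeros((2, 3))
b = np.zeros((3, 2))
c = np.zeros((3, 4))
d = np.zeros((4, 23))

# (n x m) times (m x p) gives an (n x p) result.
print(np.dot(a, b).shape)  # (2, 2)
print(np.dot(b, a).shape)  # (3, 3)
print(np.dot(c, d).shape)  # (3, 23)

# Reversing the second pair fails: 23 columns versus 3 rows.
try:
    np.dot(d, c)
    reversed_ok = True
except ValueError as err:
    reversed_ok = False
    print("Cannot multiply 4 x 23 by 3 x 4:", err)
```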

Check if DataFrames Are Aligned in Pandas

We can check whether two dataframes can be multiplied as matrices by testing their shapes against the rule above. Each dataframe has a shape property, a tuple of two elements: the number of rows, then the number of columns. We compare the column count of the first dataframe (the second value in its tuple) to the row count of the second dataframe (the first value in its tuple).

Let’s create two dataframes, df and other, check their shapes, and compare them.

Code:

import pandas as pd
import numpy as np

df = pd.DataFrame([[23, 33], [33, 41]])
other = pd.DataFrame([[31, 0], [20, 1]])

print(df)
print(other)

Output:

    0   1
0  23  33
1  33  41

    0  1
0  31  0
1  20  1

Now, let’s check the shape and compare to see if the dataframes can carry out matrix multiplication calculations.

print(df.shape)
print(other.shape)

# Columns of the left dataframe must equal rows of the right dataframe.
if df.shape[1] == other.shape[0]:
    print("DataFrames (matrices) align and therefore matrix multiplication is possible.")
else:
    print("DataFrames (matrices) don't align and therefore matrix multiplication is not possible.")

Output:

(2, 2)
(2, 2)
DataFrames (matrices) align and therefore matrix multiplication is possible.

As you can see, the dataframes align because the number of columns in df equals the number of rows in other. Now we can use the function designed for matrix multiplication, dot().
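The same check can be wrapped in a small helper so it is easy to reuse. This is a minimal sketch: shapes_align is a hypothetical name, not a pandas function, and wide is a made-up 2 x 3 dataframe added only to show a case that fails.

```python
import pandas as pd


def shapes_align(left, right):
    """Return True if left's column count equals right's row count."""
    return left.shape[1] == right.shape[0]


df = pd.DataFrame([[23, 33], [33, 41]])
other = pd.DataFrame([[31, 0], [20, 1]])
wide = pd.DataFrame([[1, 2, 3], [4, 5, 6]])  # made-up 2 x 3 example

print(shapes_align(df, other))  # True: 2 columns and 2 rows
print(shapes_align(df, wide))   # True: 2 columns and 2 rows
print(shapes_align(wide, df))   # False: 3 columns but only 2 rows
```

Note that this only compares shapes. It says nothing about row and column labels, which pandas also takes into account.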

Use the dot Function to Carry Out Matrix Multiplication in Pandas

Pandas and NumPy each provide a dot() function that we can use for matrix multiplication. We will use both to showcase how to carry out matrix multiplication.

Using the dataframes we created in the previous section, we can illustrate how to use the dot() function. Let’s get cracking on the matrix multiplication on df and other.

With the pandas dot() function, we call the function on the first matrix, df, and pass the second matrix, other, as its argument, as shown below.

print(df.dot(other))

Output:

      0   1
0  1373  33
1  1843  41
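To see where these numbers come from, each entry of the result is the sum of products of a row of df with a column of other. The sketch below recomputes the result by hand with plain Python loops; the variable name manual is introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame([[23, 33], [33, 41]])
other = pd.DataFrame([[31, 0], [20, 1]])

rows, inner, cols = df.shape[0], df.shape[1], other.shape[1]

# Entry (i, j) is the sum over k of df[i, k] * other[k, j].
manual = [
    [int(sum(df.iloc[i, k] * other.iloc[k, j] for k in range(inner)))
     for j in range(cols)]
    for i in range(rows)
]

print(manual)  # [[1373, 33], [1843, 41]]
```

For example, the top-left entry is 23 * 31 + 33 * 20 = 713 + 660 = 1373.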

If we use the NumPy dot() function instead, we pass both matrices as arguments, with the left matrix first because the order matters. Note that the result is a NumPy array rather than a dataframe.

print(np.dot(df, other))

Output:

[[1373   33]
 [1843   41]]
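As a side note, reasonably recent pandas versions also support Python's @ matrix-multiplication operator on dataframes, which should give the same result as df.dot(other). The sketch below assumes such a version is installed.

```python
import pandas as pd

df = pd.DataFrame([[23, 33], [33, 41]])
other = pd.DataFrame([[31, 0], [20, 1]])

# The @ operator is shorthand for matrix multiplication.
product = df @ other
print(product)
```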

Let’s work with another two dataframes, df1 and df2, filled with random values using the NumPy library, and carry out the matrix multiplication using the two dot() functions. Because the values are random, your numbers will differ from the ones shown here.

Code:

import pandas as pd
import numpy as np

df1 = pd.DataFrame(np.random.randn(3, 3), columns=list('ABC'), index=[1, 2, 3])
df2 = pd.DataFrame(np.random.randn(3, 3), columns=list('ABC'), index=[1, 2, 3])

print(np.dot(df1, df2))
print(df1.dot(df2))

Output:

[[ 1.28220783 -1.36789201  0.16335459]
 [-0.8039172   0.87851003 -0.32282877]
 [ 1.09767978 -0.71870817 -0.23485835]]

-----
...
ValueError: matrices are not aligned

The first dot() call, using the NumPy library, worked without errors, but the second dot() call, using the pandas library, raised a ValueError: matrices are not aligned error.

The reason for this error is that the pandas dot() function aligns the two dataframes by label: it matches the column labels of df1 (A, B, C) against the row (index) labels of df2 (1, 2, 3). Because these labels don’t match, pandas treats the matrices as misaligned even though their shapes are compatible. The NumPy dot() function ignores labels and works on the underlying values by position, so it raises no error.
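The difference between matching by label and matching by position can be seen on a tiny example. The dataframes left, right, and unrelated below are made up for illustration and are not part of the article's data.

```python
import numpy as np
import pandas as pd

left = pd.DataFrame([[1, 2, 3]], columns=list("ABC"))
# Same labels as left's columns, but stored in a different order.
right = pd.DataFrame({"x": [10, 20, 30]}, index=list("CAB"))

# pandas pairs A with A, B with B, and C with C.
by_label = left.dot(right)
# NumPy pairs the first value with the first value, and so on.
by_position = np.dot(left.to_numpy(), right.to_numpy())

print(by_label)     # 1*20 + 2*30 + 3*10 = 110
print(by_position)  # 1*10 + 2*20 + 3*30 = 140

# Labels that do not match at all make pandas refuse to multiply.
unrelated = pd.DataFrame({"x": [10, 20, 30]}, index=list("XYZ"))
try:
    left.dot(unrelated)
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```

So label alignment is not just a source of errors: when the labels match but are ordered differently, pandas and NumPy can return different numbers.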

To deal with this error, we need to align the two dataframes by assigning the column labels of the first dataframe, df1, to the index of the second dataframe, df2.

Code:

import pandas as pd
import numpy as np

df1 = pd.DataFrame(np.random.randn(3, 3), columns=list('ABC'), index=[1, 2, 3])
df2 = pd.DataFrame(np.random.randn(3, 3), columns=list('ABC'), index=[1, 2, 3])

print(np.dot(df1, df2))

# Relabel df2's rows so they match df1's column labels.
df2.index = df1.columns
print(df1.dot(df2))

Output:

[[ 1.28220783 -1.36789201  0.16335459]
 [-0.8039172   0.87851003 -0.32282877]
 [ 1.09767978 -0.71870817 -0.23485835]]

          A         B         C
1  1.282208 -1.367892  0.163355
2 -0.803917  0.878510 -0.322829
3  1.097680 -0.718708 -0.234858

Now the error is gone, and both dot() functions compute the same product.
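Assigning to df2.index changes df2 itself. If that is not wanted, two alternatives are sketched below: passing the raw values of df2 so pandas has no labels to align on, or relabeling a copy with set_axis. The fixed seed is an assumption added here so that reruns give the same numbers; it is not part of the original example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed (an assumption) so reruns match
df1 = pd.DataFrame(rng.standard_normal((3, 3)), columns=list("ABC"), index=[1, 2, 3])
df2 = pd.DataFrame(rng.standard_normal((3, 3)), columns=list("ABC"), index=[1, 2, 3])

expected = np.dot(df1.to_numpy(), df2.to_numpy())

# Option 1: pass raw values, so pandas multiplies by position.
# The result keeps df1's index but gets default column labels 0, 1, 2.
via_values = df1.dot(df2.to_numpy())

# Option 2: relabel a copy of df2 and leave df2 itself unchanged.
via_relabel = df1.dot(df2.set_axis(df1.columns, axis=0))

print(np.allclose(via_values, expected))
print(np.allclose(via_relabel, expected))
print(list(df2.index))  # df2 still has its original index
```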

Olorunfemi Akinlua

Olorunfemi is a lover of technology and computers. In addition, I write technology and coding content for developers and hobbyists. When not working, I learn to design, among other things.
