# Calculate the Mean Squared Error in Python

This article explains how to calculate the mean squared error (MSE) in Python: step by step, in a single line with NumPy, and with Scikit-Learn.

## Calculate the Mean Squared Error With the Help of an Algorithm in Python

The MSE tells us how close a regression line is to a set of points by taking the distances from those points to the regression line. These distances are called the errors. The errors are squared to remove any negative signs, and the MSE is the average of those squared errors.

The Mean Squared Error is an important function in machine learning, especially in linear regression. In the first approach, we will calculate the MSE (Mean Squared Error) step by step.

In the second approach, we will calculate the MSE in a single line using `numpy`.

First, we import NumPy. To demonstrate, we will calculate the Mean Squared Error for two arrays: `original_marks`, which holds the actual values, and `estimated_marks`, which holds the estimates. We create both arrays.

``````import numpy as np
original_marks=np.array([87,64,77,91])
estimated_marks=np.array([67,55,71,80])
``````

To display `original_marks`:

``````original_marks
``````

Output:

``````array([87, 64, 77, 91])
``````

To display `estimated_marks`:

``````estimated_marks
``````

Output:

``````array([67, 55, 71, 80])
``````

Now we will follow the formula of the MSE. We subtract `estimated_marks` from `original_marks`, square the differences, and then calculate the mean of the squares.
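For reference, the formula can be written as follows. The symbols are our own notation rather than something defined in this article: $y_i$ stands for an entry of `original_marks`, $\hat{y}_i$ for the matching entry of `estimated_marks`, and $n$ for the number of entries. The second line works through the four marks used here.

```latex
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2

\mathrm{MSE} = \frac{(87-67)^2 + (64-55)^2 + (77-71)^2 + (91-80)^2}{4}
             = \frac{400 + 81 + 36 + 121}{4}
             = \frac{638}{4}
             = 159.5
```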

So first, we calculate the difference between `original_marks` and `estimated_marks` using the `subtract()` function.

``````diff_marks=np.subtract(original_marks,estimated_marks)
diff_marks
``````

Output:

``````array([20,  9,  6, 11])
``````

Now we need to square `diff_marks`. To do this, we use the `square()` function and pass it the differences we calculated.

``````sqr_marks=np.square(diff_marks)
sqr_marks
``````

Output:

``````array([400,  81,  36, 121], dtype=int32)
``````
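Because the differences are squared, the order of the subtraction does not affect the result. The short sketch below checks this with the same two arrays; the names `forward`, `backward`, and `same_squares` are introduced only for this illustration.

```python
import numpy as np

original_marks = np.array([87, 64, 77, 91])
estimated_marks = np.array([67, 55, 71, 80])

# Subtract in both directions; the differences have opposite signs.
forward = np.subtract(original_marks, estimated_marks)
backward = np.subtract(estimated_marks, original_marks)

# Squaring removes the sign, so both orders give the same squared errors.
same_squares = np.array_equal(np.square(forward), np.square(backward))
print(forward, backward, same_squares)
```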

Finally, we take the mean of this array, which gives the Mean Squared Error (MSE) of the marks. We use the `mean()` method.

``````mse_marks=sqr_marks.mean()
mse_marks
``````

Output:

``````159.5
``````

Complete Python Code:

``````import numpy as np

original_marks=np.array([87,64,77,91])
estimated_marks=np.array([67,55,71,80])

diff_marks=np.subtract(original_marks,estimated_marks)

sqr_marks=np.square(diff_marks)

mse_marks=sqr_marks.mean()
``````
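The same three steps (subtract, square, average) can also be written without NumPy. The following is a minimal sketch using only built-ins; the function name `mean_squared_error_plain` and its input checks are our own additions for illustration.

```python
def mean_squared_error_plain(actual, predicted):
    """Return the MSE of two equal-length sequences using only built-ins."""
    if len(actual) != len(predicted):
        raise ValueError("both sequences must have the same length")
    if len(actual) == 0:
        raise ValueError("sequences must not be empty")
    # Step 1 and 2: subtract element by element, then square each difference.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    # Step 3: average the squared differences.
    return sum(squared_errors) / len(squared_errors)


plain_mse = mean_squared_error_plain([87, 64, 77, 91], [67, 55, 71, 80])
print(plain_mse)  # 159.5
```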

## Calculate the Mean Squared Error With the Help of the Numpy Module in Python

Now we will calculate the Mean Squared Error in a single line by chaining the same `square()` and `mean()` operations.

``````mse_marks=np.square(original_marks-estimated_marks).mean()
mse_marks
``````

We can see the output is the same:

``````159.5
``````

Complete Python Code:

``````import numpy as np

original_marks=np.array([87,64,77,91])
estimated_marks=np.array([67,55,71,80])

#using numpy

mse_marks=np.square(original_marks-estimated_marks).mean()
``````
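If the one-liner is needed in several places, it can be wrapped in a small helper. The sketch below is one possible version, not a standard NumPy function: the name `mse` is hypothetical, the shape check is our addition, and the conversion to `float64` is a precaution because squaring arrays with a small integer dtype can wrap around silently.

```python
import numpy as np


def mse(actual, predicted):
    """Mean squared error of two array-likes of the same shape."""
    # Convert to float64 so squaring small integer dtypes cannot overflow.
    actual = np.asarray(actual, dtype=np.float64)
    predicted = np.asarray(predicted, dtype=np.float64)
    if actual.shape != predicted.shape:
        raise ValueError(f"shape mismatch: {actual.shape} vs {predicted.shape}")
    return float(np.mean(np.square(actual - predicted)))


marks_mse = mse([87, 64, 77, 91], [67, 55, 71, 80])
print(marks_mse)  # 159.5
```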

## Calculate the Mean Squared Error With the Help of Scikit-Learn in Python

Now, we will obtain the Mean Squared Error using the `scikit-learn` library. Let’s import `numpy` and prepare the data with `ndmin=2`, which makes the array two-dimensional, and then reshape it so that we have five rows and one column.

In the next line, we define an array that holds the y-values for the training data. We then import the `LogisticRegression` class from the `linear_model` module of `sklearn` and create an instance of this class. Note that `LogisticRegression` is a classifier, so it treats each distinct y-value as a class label.

``````import numpy as np
from sklearn.linear_model import LogisticRegression

x_training_data=np.array([166,151,194,140,139],ndmin=2)
x_training_data=x_training_data.reshape((5,1))
y_training_data=np.array([62,71,67,44,91])
MD=LogisticRegression()
``````
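To make the effect of `ndmin=2` and `reshape()` concrete, the following sketch prints the array shapes at each step. The names `x_row`, `x_column`, and `x_shortcut` are used only here, and the `reshape(-1, 1)` shortcut is an alternative we are suggesting, not something the code above uses.

```python
import numpy as np

# ndmin=2 wraps the five values in an outer dimension: one row, five columns.
x_row = np.array([166, 151, 194, 140, 139], ndmin=2)
print(x_row.shape)  # (1, 5)

# reshape((5, 1)) turns that row into five samples with one feature each.
x_column = x_row.reshape((5, 1))
print(x_column.shape)  # (5, 1)

# Suggested shortcut: reshape(-1, 1) on a plain 1-D array gives the same result.
x_shortcut = np.array([166, 151, 194, 140, 139]).reshape(-1, 1)
print(np.array_equal(x_column, x_shortcut))  # True
```

Scikit-Learn estimators expect the features as a two-dimensional array of shape `(n_samples, n_features)`, which is why the column form is needed.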

Now, we fit the model to the training data with `MD.fit()` and then check how well it reproduces that data. We declare a variable called `y_pr_data`, set it equal to `MD.predict()`, and feed it the `x_training_data`.

``````MD.fit(x_training_data,y_training_data)
y_pr_data=MD.predict(x_training_data)
y_pr_data
``````

Output (illustrative; the exact values depend on the fitted model):

``````array([76, 76, 77, 83, 76])
``````

Now, we will find the Mean Squared Error. We know the formula of the Mean Squared Error, so we will apply it to calculate the error between the predicted value and the actual value.

``````mse=np.mean(((y_training_data-y_pr_data)**2))
mse
``````

Output (illustrative; the value depends on the predictions above):

``````3.2
``````

There is a much simpler way to calculate the Mean Squared Error: the `mean_squared_error()` function. We import it from the `metrics` module and then feed it the actual and predicted data, as we did above.

``````from sklearn.metrics import mean_squared_error
mean_squared_error(y_training_data,y_pr_data)
``````

When we run this cell, we get the same result as the manual calculation above.

``````3.2
``````
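As a quick cross-check, `mean_squared_error()` can also be applied to the two marks arrays from the first sections. This sketch assumes `scikit-learn` is installed; the names `manual_mse` and `sklearn_mse` are ours.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

original_marks = np.array([87, 64, 77, 91])
estimated_marks = np.array([67, 55, 71, 80])

# Manual formula and the scikit-learn helper should agree.
manual_mse = np.mean((original_marks - estimated_marks) ** 2)
sklearn_mse = mean_squared_error(original_marks, estimated_marks)
print(manual_mse, sklearn_mse)
```

Both values should match the 159.5 obtained in the step-by-step calculation.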

Complete Python Code:

``````import numpy as np
from sklearn.linear_model import LogisticRegression

x_training_data=np.array([166,151,194,140,139],ndmin=2)
x_training_data=x_training_data.reshape((5,1))
y_training_data=np.array([62,71,67,44,91])
MD=LogisticRegression()
# fit the model before predicting
MD.fit(x_training_data,y_training_data)

y_pr_data=MD.predict(x_training_data)
# y_pr_data
mse=np.mean(((y_training_data-y_pr_data)**2))
# mse

from sklearn.metrics import mean_squared_error
mean_squared_error(y_training_data,y_pr_data)
``````
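Since the MSE is mostly used for regression, the same workflow may be easier to interpret with a regressor. The sketch below swaps in `LinearRegression`, which is our substitution and not the model used in this article, and assumes the y-values are continuous targets. The names `model`, `y_predicted`, and `mse_value` are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

x_training_data = np.array([166, 151, 194, 140, 139]).reshape(-1, 1)
y_training_data = np.array([62, 71, 67, 44, 91])

# Assumption: the targets are continuous, so a regressor is used here
# instead of the LogisticRegression classifier from the article.
model = LinearRegression()
model.fit(x_training_data, y_training_data)

# Predict on the training data and measure the error against the true values.
y_predicted = model.predict(x_training_data)
mse_value = mean_squared_error(y_training_data, y_predicted)
print(mse_value)
```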

Hello! I am Salman Bin Mehmood (Baum), a software developer, and I help organizations address complex problems. My expertise lies in back-end development, data science, and machine learning. I am a lifelong learner, currently working on the metaverse and enrolled in a course on building an AI application with Python. I love solving problems and developing bug-free software for people. I write content related to Python and emerging technologies.
