# Calculate the Mean Squared Error in Python

- Calculate the Mean Squared Error With the Help of an Algorithm in Python
- Calculate the Mean Squared Error With the Help of the Numpy Module in Python
- Calculate the Mean Squared Error With the Help of Scikit-Learn in Python

This tutorial explains how to calculate the mean squared error (MSE) in Python in three ways: step by step, in a single line with NumPy, and with Scikit-Learn.

## Calculate the Mean Squared Error With the Help of an Algorithm in Python

The mean squared error (MSE) tells us how close a regression line is to a set of points. It takes the vertical distance from each point to the regression line (the error), squares each error so that positive and negative errors do not cancel, and averages the squared errors.
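Written as a formula, this definition reads as follows. The symbols are chosen here for illustration and do not come from the original text: $y_i$ is an observed value, $\hat{y}_i$ the corresponding predicted value, and $n$ the number of points.

$$
\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2
$$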

The Mean Squared Error is an important function in machine learning, especially in linear regression. We will calculate the MSE using two approaches. In the first approach, we calculate it step by step.

In the second approach, we calculate the MSE in a single line using `numpy`.

First, we need to import NumPy. To demonstrate, we will calculate the Mean Squared Error for two arrays: the first is `original_marks`, and the second is `estimated_marks`. We create both arrays.

```
import numpy as np
original_marks=np.array([87,64,77,91])
estimated_marks=np.array([67,55,71,80])
```

To display `original_marks`:

```
original_marks
```

Output:

```
array([87, 64, 77, 91])
```

To display `estimated_marks`:

```
estimated_marks
```

Output:

```
array([67, 55, 71, 80])
```

Now we will proceed according to the MSE formula: subtract `estimated_marks` from `original_marks`, square the differences, and then calculate the mean.

First, we calculate the difference between `original_marks` and `estimated_marks` using the `subtract()` function.

```
diff_marks=np.subtract(original_marks,estimated_marks)
diff_marks
```

Output:

```
array([20, 9, 6, 11])
```

Now we need to square `diff_marks`. To do this, we pass the difference we calculated to the `square()` function.

```
sqr_marks=np.square(diff_marks)
sqr_marks
```

Output:

```
array([400, 81, 36, 121], dtype=int32)
```

Finally, we take the mean of this array, which gives the Mean Squared Error (MSE) of the marks. We will use the `mean()` method.

```
mse_marks=sqr_marks.mean()
mse_marks
```

Output:

```
159.5
```
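As a check on the arithmetic, the same value follows by hand from the four squared differences:

$$
\mathrm{MSE} = \frac{400 + 81 + 36 + 121}{4} = \frac{638}{4} = 159.5
$$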

Complete Python Code:

```
import numpy as np
original_marks=np.array([87,64,77,91])
estimated_marks=np.array([67,55,71,80])
diff_marks=np.subtract(original_marks,estimated_marks)
sqr_marks=np.square(diff_marks)
mse_marks=sqr_marks.mean()
```
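For comparison, the same three steps can be written without NumPy. The following is a minimal sketch rather than a definitive implementation; the function name `mean_squared_error_manual` is hypothetical and chosen only for this illustration.

```python
def mean_squared_error_manual(actual, predicted):
    """Return the mean of the squared differences between two sequences."""
    if len(actual) != len(predicted):
        raise ValueError("both sequences must have the same length")
    if len(actual) == 0:
        raise ValueError("sequences must not be empty")
    # Steps 1 and 2: subtract element by element and square each difference
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    # Step 3: average the squared differences
    return sum(squared_errors) / len(squared_errors)


original_marks = [87, 64, 77, 91]
estimated_marks = [67, 55, 71, 80]
mse_marks = mean_squared_error_manual(original_marks, estimated_marks)
print(mse_marks)  # 159.5
```

Plain lists are enough here because the function only needs element-wise access; NumPy becomes worthwhile once the arrays are large.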

## Calculate the Mean Squared Error With the Help of the Numpy Module in Python

Now we will calculate the Mean Squared Error in a single line, chaining the same `square()` and `mean()` operations we used in the step-by-step version.

```
mse_marks=np.square(original_marks-estimated_marks).mean()
mse_marks
```

We can see the output is the same:

```
159.5
```

Complete Python Code:

```
import numpy as np
original_marks=np.array([87,64,77,91])
estimated_marks=np.array([67,55,71,80])
#using numpy
mse_marks=np.square(original_marks-estimated_marks).mean()
```
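If the one-liner is needed in several places, it can be wrapped in a small helper. This is a sketch under assumptions: the name `mse_numpy` is hypothetical, and converting the inputs to floats is a design choice made here so that plain lists are accepted and small integer dtypes cannot overflow when squared.

```python
import numpy as np


def mse_numpy(actual, predicted):
    """Mean squared error of two equally shaped array-likes (hypothetical helper)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if actual.shape != predicted.shape:
        raise ValueError("inputs must have the same shape")
    # Same computation as the one-liner: square the differences, then average
    return np.square(actual - predicted).mean()


result = mse_numpy([87, 64, 77, 91], [67, 55, 71, 80])
print(result)  # 159.5
```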

## Calculate the Mean Squared Error With the Help of Scikit-Learn in Python

Now, we will obtain the Mean Squared Error using the `scikit-learn` library. Let's import `numpy` and prepare the data with `ndmin` set to two (the number of dimensions), then reshape it so that we have five rows and one column.

In the next line, we define an array that holds the y-values for the training data, and then we import the `LogisticRegression` class from the `linear_model` module of `sklearn`. We then create an instance of this class and fit it to the training data, because `predict()` cannot be called on an estimator that has not been fitted.

```
import numpy as np
from sklearn.linear_model import LogisticRegression
x_training_data=np.array([166,151,194,140,139],ndmin=2)
x_training_data=x_training_data.reshape((5,1))
y_training_data=np.array([62,71,67,44,91])
MD=LogisticRegression()
# fit the model first; predict() raises an error on an unfitted estimator
MD.fit(x_training_data,y_training_data)
```
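To see what `ndmin=2` and `reshape` do to the data, the shapes can be inspected at each step. A short sketch using the same values (the variable names `row` and `column` are illustrative only):

```python
import numpy as np

# ndmin=2 wraps the five values in an outer dimension: one row, five columns
row = np.array([166, 151, 194, 140, 139], ndmin=2)
print(row.shape)  # (1, 5)

# reshape turns that into five rows and one column, the
# (n_samples, n_features) layout scikit-learn estimators expect
column = row.reshape((5, 1))
print(column.shape)  # (5, 1)
```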

Now, we will see how well the model fits the training data. We declare a variable called `y_pr_data` and set it to the result of `MD.predict()`, passing it `x_training_data`.

```
y_pr_data=MD.predict(x_training_data)
y_pr_data
```

Output (the predicted values depend on the fitted model, so yours may differ):

```
array([76, 76, 77, 83, 76])
```

Now, we will find the Mean Squared Error. We know the formula of the Mean Squared Error, so we will apply it to calculate the error between the predicted value and the actual value.

```
mse=np.mean(((y_training_data-y_pr_data)**2))
mse
```

Output (this value follows from the predictions your model returns, so it may differ):

```
3.2
```

There is a much simpler way to compute the Mean Squared Error: the `mean_squared_error()` function. We import it from the `sklearn.metrics` module and pass it the actual and predicted data, as we did above.

```
from sklearn.metrics import mean_squared_error
mean_squared_error(y_training_data,y_pr_data)
```

When we run this cell, we get the same result as above.

```
3.2
```
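As a cross-check that does not depend on a fitted model, `mean_squared_error()` can also be applied to the marks arrays from the first section. A minimal sketch, assuming scikit-learn is installed:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

original_marks = np.array([87, 64, 77, 91])
estimated_marks = np.array([67, 55, 71, 80])

# First argument holds the true values, second the predictions
mse_marks = mean_squared_error(original_marks, estimated_marks)
print(mse_marks)  # 159.5
```

The result agrees with the value obtained step by step and with the NumPy one-liner.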

Complete Python Code:

```
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import mean_squared_error

x_training_data=np.array([166,151,194,140,139],ndmin=2)
x_training_data=x_training_data.reshape((5,1))
y_training_data=np.array([62,71,67,44,91])
MD=LogisticRegression()
# fit the model first; predict() raises an error on an unfitted estimator
MD.fit(x_training_data,y_training_data)
y_pr_data=MD.predict(x_training_data)
# MSE computed manually from the formula
mse=np.mean(((y_training_data-y_pr_data)**2))
# the same value using scikit-learn
mean_squared_error(y_training_data,y_pr_data)
```

**Salman Mehmood**

Hello! I am Salman Bin Mehmood (Baum), a software developer, and I help organizations address complex problems. My expertise lies in back-end development, data science, and machine learning. I am a lifelong learner, currently working on the metaverse and enrolled in a course on building an AI application with Python. I love solving problems and developing bug-free software for people. I write content related to Python and hot technologies.
