How to Perform Chi-Square Test in Python

Preet Sanghavi Feb 02, 2024

The chi-square test of independence checks whether two categorical variables are associated with each other. In this tutorial, we will perform this test in Python using the SciPy module.

We will use the chi2_contingency() function from the scipy.stats submodule to perform the test. Let us start by importing it.

Perform Chi-Square Test in Python

Import SciPy:

from scipy.stats import chi2_contingency

The chi2_contingency() function takes a contingency table as a 2D array-like input, such as a list of lists or a NumPy array. A contingency table summarizes the relationship between categorical variables: each cell holds the observed count for one combination of categories.

So let us create this contingency table.

data = [[207, 282, 241], [234, 242, 232]]
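The table above is already aggregated. When starting from raw observations instead, the counts have to be tallied first. The following is a minimal sketch of that step, assuming each observation is a pair of category labels; the records list and its labels are hypothetical and only illustrate the layout, and counts this small would not be suitable for the test itself.

```python
import numpy as np

# Hypothetical raw observations: one (group, answer) pair per respondent.
records = [
    ("A", "yes"), ("A", "no"), ("B", "yes"),
    ("A", "yes"), ("B", "maybe"), ("B", "no"),
    ("B", "yes"), ("A", "maybe"),
]

# Sorted unique labels define the row and column order of the table.
row_labels = sorted({group for group, _ in records})
col_labels = sorted({answer for _, answer in records})

# Count how many observations fall into each (row, column) cell.
table = np.zeros((len(row_labels), len(col_labels)), dtype=int)
for group, answer in records:
    table[row_labels.index(group), col_labels.index(answer)] += 1

print(row_labels)
print(col_labels)
print(table)
```

The resulting 2D array has one row per group and one column per answer, which is the shape chi2_contingency() expects.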

Let us pass this table to the function.

stat, p, dof1, expected = chi2_contingency(data)

The chi2_contingency() function returns the test statistic, the p-value, the degrees of freedom, and the table of expected frequencies, which can be unpacked like a tuple. We will compare the obtained p-value with a significance level (alpha) of 0.05.

Let us now interpret the p-value using the code below.
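To make the four returned values less of a black box, here is a sketch that recomputes them by hand for the same table, using the standard Pearson formulas: each expected count is row total times column total divided by the grand total, the statistic is the sum of (observed - expected)^2 / expected, and the degrees of freedom are (rows - 1) * (columns - 1). The names ending in _manual are introduced here for illustration only. The comparison works for this table because it has more than one degree of freedom, so no continuity correction is applied.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

observed = np.array([[207, 282, 241], [234, 242, 232]])

# Marginal totals, kept 2D so they broadcast into a full table.
row_totals = observed.sum(axis=1, keepdims=True)  # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 3)
grand_total = observed.sum()

# Expected counts under the assumption of independence.
expected_manual = row_totals * col_totals / grand_total

# Pearson chi-square statistic and degrees of freedom.
stat_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()
dof_manual = (observed.shape[0] - 1) * (observed.shape[1] - 1)

# Upper-tail probability of the chi-square distribution.
p_manual = chi2.sf(stat_manual, dof_manual)

# Compare against SciPy's own result.
stat, p, dof, expected = chi2_contingency(observed)
print(np.isclose(stat, stat_manual), np.isclose(p, p_manual), dof == dof_manual)
```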

alpha = 0.05
print("p val is " + str(p))
if p <= alpha:
    print("Dependent")
else:
    print("Independent")

The output for the above code would be:

p val is 0.1031971404730939
Independent

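If the p-value is greater than the alpha value of 0.05, we fail to reject the null hypothesis of independence: the data show no statistically significant relationship between the two variables. If the p-value is less than or equal to alpha, we reject the null hypothesis and treat the variables as dependent.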
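An equivalent way to reach the same decision is to compare the test statistic with the critical value of the chi-square distribution at the chosen alpha. The sketch below assumes the same table and alpha as above; critical, reject_by_stat, and reject_by_p are names introduced here for illustration.

```python
from scipy.stats import chi2, chi2_contingency

data = [[207, 282, 241], [234, 242, 232]]
stat, p, dof, expected = chi2_contingency(data)

alpha = 0.05
# Value the statistic must reach to reject independence at this alpha.
critical = chi2.ppf(1 - alpha, dof)

# Both decision rules should always agree.
reject_by_stat = stat >= critical
reject_by_p = p <= alpha

print(f"statistic={stat:.3f}, critical={critical:.3f}")
print("Dependent" if reject_by_stat else "Independent")
```

Because the p-value is the upper-tail probability of the statistic, the two rules are two views of the same comparison, so either one can be used.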

In our case, the p-value (about 0.103) is greater than alpha, so we fail to reject the null hypothesis: the data provide no significant evidence that the two variables are related. This is how we can perform the chi-square test in Python.


