How to Perform T-Test in Pandas

Preet Sanghavi Feb 02, 2024
This tutorial shows how to run an independent two-sample T-test on data stored in a Pandas DataFrame, using the ttest_ind function from scipy.stats.

Steps to Perform T-Test in Pandas

The following are the steps to perform a T-test in Pandas.

Import Pertinent Libraries

We must import the Pandas library and ttest_ind from scipy.stats to get started.

import pandas as pd
from scipy.stats import ttest_ind

Create a Pandas DataFrame

Let us create a sample dataframe on which to perform the T-test.

data = {
    "Category": [
        "type2",
        "type1",
        "type2",
        "type1",
        "type2",
        "type1",
        "type2",
        "type1",
        "type1",
        "type1",
        "type2",
    ],
    "values": [1, 2, 3, 1, 2, 3, 1, 2, 3, 5, 1],
}
df = pd.DataFrame(data)

The dataframe has a Category column containing two categories, type1 and type2, and a values column holding one numeric value per row.

Let us view our dataframe below.

print(df)

Output:

   Category  values
0     type2       1
1     type1       2
2     type2       3
3     type1       1
4     type2       2
5     type1       3
6     type2       1
7     type1       2
8     type1       3
9     type1       5
10    type2       1

We will now create a separate dataframe for each category using the code below. The T-test compares two groups, so each group needs its own set of values.

type1 = df[df["Category"] == "type1"]
type2 = df[df["Category"] == "type2"]
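
Before running the test, it can help to confirm that the split produced the groups we expect. The following is a minimal sketch that rebuilds the sample dataframe from above and prints the row count and mean of each category. The summary variable name is only illustrative.

```python
import pandas as pd

# Same sample data as in the tutorial above.
data = {
    "Category": ["type2", "type1", "type2", "type1", "type2", "type1",
                 "type2", "type1", "type1", "type1", "type2"],
    "values": [1, 2, 3, 1, 2, 3, 1, 2, 3, 5, 1],
}
df = pd.DataFrame(data)

# One dataframe per category.
type1 = df[df["Category"] == "type1"]
type2 = df[df["Category"] == "type2"]

# Row count and mean of "values" per category, to sanity-check the split.
summary = df.groupby("Category")["values"].agg(["count", "mean"])
print(summary)
```

The counts in summary should match the lengths of type1 and type2.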

Obtain T-Test Values in Pandas

We will now find the T-test results and store them in a variable using the ttest_ind() function. We use this function in the following way.

res = ttest_ind(type1["values"], type2["values"])

In the above code, we passed the values column of each dataframe to the function as arguments. The function returns the T-test result, which contains the t-statistic and the p-value.
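
To make the returned values less of a black box, here is a rough sketch that unpacks the result into two variables and recomputes the t-statistic by hand. It assumes the default behaviour of ttest_ind, which treats both groups as having equal variance and so uses a pooled variance. The names g1, g2, pooled_var and t_manual are illustrative and not part of SciPy.

```python
import math

import pandas as pd
from scipy.stats import ttest_ind

# Same sample data as in the tutorial above.
data = {
    "Category": ["type2", "type1", "type2", "type1", "type2", "type1",
                 "type2", "type1", "type1", "type1", "type2"],
    "values": [1, 2, 3, 1, 2, 3, 1, 2, 3, 5, 1],
}
df = pd.DataFrame(data)

# Select the "values" column for each category.
g1 = df.loc[df["Category"] == "type1", "values"]
g2 = df.loc[df["Category"] == "type2", "values"]

# The result unpacks like a (statistic, pvalue) tuple.
t_stat, p_value = ttest_ind(g1, g2)

# Hand calculation of the pooled (equal-variance) t-statistic for comparison.
n1, n2 = len(g1), len(g2)
pooled_var = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
t_manual = (g1.mean() - g2.mean()) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

print(t_stat, t_manual, p_value)
```

If the two groups may have unequal variances, ttest_ind also accepts equal_var=False, which performs Welch's T-test instead. The hand calculation above would not apply in that case.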

Let us now print the res variable to see the results.

print(res)

Output:

Ttest_indResult(statistic=1.4927289925706944, pvalue=0.16970867501294376)

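The output shows the t-statistic and the p-value of the test. Depending on the installed SciPy version, the result may be displayed under a different name, such as TtestResult, and may include an additional df field. Thus, we can successfully find the T-test values in Pandas with the above method.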
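
As a short illustration of how the result is typically read, the sketch below compares the p-value with a significance level. The threshold alpha = 0.05 is a common convention and an assumption here, not something fixed by the tutorial. The decision variable is likewise illustrative.

```python
import pandas as pd
from scipy.stats import ttest_ind

# Same sample data as in the tutorial above.
data = {
    "Category": ["type2", "type1", "type2", "type1", "type2", "type1",
                 "type2", "type1", "type1", "type1", "type2"],
    "values": [1, 2, 3, 1, 2, 3, 1, 2, 3, 5, 1],
}
df = pd.DataFrame(data)

type1 = df[df["Category"] == "type1"]
type2 = df[df["Category"] == "type2"]

res = ttest_ind(type1["values"], type2["values"])

alpha = 0.05  # assumed significance level; choose one that suits your analysis

# A p-value below alpha suggests the group means differ; otherwise the
# data do not give enough evidence to say so.
if res.pvalue < alpha:
    decision = "reject the null hypothesis of equal means"
else:
    decision = "fail to reject the null hypothesis of equal means"

print(f"t = {res.statistic:.4f}, p = {res.pvalue:.4f}: {decision}")
```

With the sample data, the p-value of about 0.17 is above 0.05, so under this assumed threshold the test does not show a significant difference between the means of type1 and type2.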

