How to run a one-sample t-test in Python using SciPy

Einblick Content Team - April 25th, 2023

A one-sample t-test is a statistical test that compares the mean of a single sample with a known or hypothesized population mean. It determines whether the sample mean deviates significantly from that theoretical value. In the example below, we examine shoe sales data collected from Adidas retailers, focusing on the operating margin. We’ll go over the basic syntax of the ttest_1samp() function from SciPy, along with some further background on t-tests. If you want to learn how to run a two-sample t-test in SciPy, check out our other Python code post.

Basic syntax: ttest_1samp(a, popmean)

To run a one-sample t-test, we first need to state our null and alternative hypotheses, as with any hypothesis test.

$$H_0: \mu = 0.4 \qquad H_1: \mu \neq 0.4$$

H0: The mean operating margin is 0.4.
H1: The mean operating margin is not 0.4.

We’re testing whether the population from which the sample was drawn has a mean of 0.4. In the context of our data, we’re testing whether the mean operating margin of sales is 0.4.

The two arguments we need to run ttest_1samp() are:

  • a: the sample data
  • popmean: the theoretical population mean against which we’re testing

from scipy import stats

# df is assumed to be a pandas DataFrame holding the shoe sales data, loaded beforehand
stats.ttest_1samp(a = df["Operating Margin"], popmean = 0.40)

Output:

TtestResult(statistic=6.0474151057026955, pvalue=1.704807082569083e-09, df=2375)

The test yielded a t-statistic of 6.047 and a p-value of 1.7e-9. At a significance level (alpha) of 0.05, the p-value is far below 0.05, so we reject the null hypothesis that the population mean operating margin is equal to 0.4.
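To make the t-statistic less of a black box, it can be recomputed by hand as (sample mean - popmean) / (sample standard deviation / sqrt(n)). The following is a minimal sketch under stated assumptions: the margins array is synthetic data generated with NumPy purely for illustration (it is not the Adidas dataset), and all variable names are hypothetical.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for df["Operating Margin"] (hypothetical data, not the article's dataset)
rng = np.random.default_rng(seed=0)
margins = rng.normal(loc=0.42, scale=0.10, size=500)

popmean = 0.40
alpha = 0.05

# Manual t-statistic: distance of the sample mean from popmean, in standard-error units
n = margins.size
standard_error = margins.std(ddof=1) / np.sqrt(n)  # ddof=1 gives the sample standard deviation
t_manual = (margins.mean() - popmean) / standard_error

# The same test through SciPy
result = stats.ttest_1samp(a=margins, popmean=popmean)

print(f"manual t = {t_manual:.4f}, SciPy t = {result.statistic:.4f}")
print(f"p-value = {result.pvalue:.4g}")
print("reject H0" if result.pvalue < alpha else "fail to reject H0")
```

The two t values should agree up to floating-point error, which is a quick sanity check that the test is doing what the formula says.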

More technical information about t-tests

There are two types of one-sample t-tests, and the example above ran one of them:

  • One-sided (or one-tailed) one-sample t-test
  • Two-sided (or two-tailed) one-sample t-test

One-sample two-tailed t-test

The naming refers to how many “sides” or “tails” of the distribution we care about. In a two-sided or two-tailed one-sample t-test, like the one we just ran above, we split the 5% significance level evenly between the two tails of the distribution, with 2.5% in each, as pictured below.

[Figure: One-sample two-sided t-test example distribution]
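The split of the significance level across both tails can be turned into concrete cutoffs with the t-distribution's quantile function, scipy.stats.t.ppf. This is a rough sketch that reuses the statistic and degrees of freedom reported in the output above; the variable names are illustrative assumptions.

```python
from scipy import stats

alpha = 0.05
dof = 2375  # degrees of freedom from the output above

# Two-tailed test: alpha is split evenly, so each tail holds alpha / 2
upper_cutoff = stats.t.ppf(1 - alpha / 2, dof)
lower_cutoff = -upper_cutoff  # the t-distribution is symmetric around 0

t_observed = 6.0474151057026955  # statistic from the output above
in_rejection_region = abs(t_observed) > upper_cutoff

print(f"reject H0 if |t| > {upper_cutoff:.3f}")
print(f"observed t falls in the rejection region: {in_rejection_region}")
```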

One-sample one-tailed t-test

In a one-sided or one-tailed one-sample t-test, by contrast, we place the entire 5% significance level in one tail, as seen below:

[Figure: One-sample one-sided t-test example distribution]
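With all of the significance level in a single tail, the cutoff moves closer to zero, and which tail is used depends on the direction of the alternative hypothesis. The sketch below, again using the statistic and degrees of freedom from the earlier output and hypothetical variable names, computes both one-tailed cutoffs with scipy.stats.t.ppf.

```python
from scipy import stats

alpha = 0.05
dof = 2375  # degrees of freedom from the earlier output

# One-tailed tests put all of alpha in a single tail
cutoff_less = stats.t.ppf(alpha, dof)         # left tail: reject H0 if t < cutoff_less
cutoff_greater = stats.t.ppf(1 - alpha, dof)  # right tail: reject H0 if t > cutoff_greater

t_observed = 6.0474151057026955  # statistic from the earlier output
reject_less = t_observed < cutoff_less
reject_greater = t_observed > cutoff_greater

print(f"left-tail cutoff = {cutoff_less:.3f}, reject: {reject_less}")
print(f"right-tail cutoff = {cutoff_greater:.3f}, reject: {reject_greater}")
```

Because the observed statistic is large and positive, it lands in the right-tail rejection region but nowhere near the left-tail one.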

So with a one-tailed one-sample t-test, we can test the alternative hypothesis that the mean of the population underlying the sample is greater than a theoretical value, or that it is less than a theoretical value. Let’s take a look at an example using the same data.

Running a one-tailed one-sample t-test

First, we need to set up our null and alternative hypotheses:

$$H_0: \mu \geq 0.4 \qquad H_1: \mu < 0.4$$

H0: The mean operating margin is greater than or equal to 0.4.
H1: The mean operating margin is less than 0.4.

Then we can run the corresponding code, which leverages the alternative argument. We can set alternative equal to ‘two-sided’, ‘less’, or ‘greater’ to represent the alternative hypothesis we’re testing.

  • The default value is ‘two-sided’.
  • If alternative = ‘less’, this means we’re testing the alternative hypothesis that the mean of the distribution underlying the sample is less than the theoretical population mean.
  • If alternative = ‘greater’, this means we’re testing the alternative hypothesis that the mean of the distribution underlying the sample is greater than the theoretical population mean.

stats.ttest_1samp(a = df["Operating Margin"], popmean = 0.40, alternative = "less")

Output:

TtestResult(statistic=6.0474151057026955, pvalue=0.9999999991475965, df=2375)

The test yielded a t-statistic of 6.047 with a p-value of approximately 1.0. Because the p-value is far above 0.05, we fail to reject the null hypothesis: the data provide no evidence that the mean operating margin is less than 0.4.
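The t-statistic is the same in both runs; only the p-value changes, because each alternative measures a different tail area of the same t-distribution. As a hedged illustration (not necessarily how SciPy computes these values internally), the three p-values can be derived from the reported statistic and degrees of freedom using scipy.stats.t. Variable names here are assumptions.

```python
from scipy import stats

t_observed = 6.0474151057026955  # statistic reported in both outputs
dof = 2375                       # degrees of freedom reported in both outputs

# Each alternative corresponds to a different tail area under the t-distribution
p_less = stats.t.cdf(t_observed, dof)               # area to the left of t
p_greater = stats.t.sf(t_observed, dof)             # area to the right of t
p_two_sided = 2 * stats.t.sf(abs(t_observed), dof)  # both tails beyond |t|

print(f"alternative='less':      p = {p_less:.10f}")
print(f"alternative='greater':   p = {p_greater:.4g}")
print(f"alternative='two-sided': p = {p_two_sided:.4g}")
```

For a positive t-statistic, the ‘greater’ p-value is half the two-sided one, and the ‘less’ p-value is its complement, which is why the second test above returned a p-value so close to 1.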

