How to Interpret Confidence Intervals

How do you interpret a confidence interval?

If you look online, many of the available resources are technically inaccurate. If you search for any of the following:

  • What is a confidence interval
  • What does a confidence interval tell you
  • What is a confidence interval in simple terms
  • What does 95% confidence level mean

Chances are the first few results are wrong. I spent years trying to find a simple yet accurate explanation and couldn’t find much. I think I finally found one, so let’s get right to it.

Confidence Intervals

Note: Frequentist and Bayesian intervals have different interpretations; this post covers the frequentist version. If those words are unfamiliar, chances are you used a frequentist method.

What is it?

From A Gentle Introduction to Bayesian Analysis: Applications to Developmental Research (funny that it comes from a Bayesian paper):

… 95% of these confidence intervals capture the true parameter under the null hypothesis.

… the correct interpretation is that 95 of 100 replications of exactly the same experiment capture the fixed but unknown parameter, assuming the alternative hypothesis about that parameter is true.

In layman’s terms: if your alternative hypothesis is true and you repeat the same experiment 100 times, then on average 95 out of 100 of those confidence intervals will contain the true parameter (the mean, or whatever you’re estimating).

A visualization of this is provided by a 2015 Minitab blog post - Understanding Hypothesis Tests: Confidence Intervals and Confidence Levels.
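The repeated-experiment idea can also be simulated. The sketch below is not from any of the sources above; it assumes a normal population with a known true mean, samples of size 30, and the usual 95% t-interval for the mean, and then counts how many of 100 intervals capture the true mean.

import numpy as np
from scipy import stats

# Simulation sketch: repeat the "same experiment" 100 times and count how many
# 95% confidence intervals capture the true (fixed but normally unknown) mean.
# The population parameters and sample size are assumptions for illustration.
rng = np.random.default_rng(0)
true_mean, true_sd = 10.0, 2.0
n_experiments, n = 100, 30

captured = 0
for _ in range(n_experiments):
    sample = rng.normal(true_mean, true_sd, size=n)
    low, high = stats.t.interval(0.95, df=n - 1,
                                 loc=sample.mean(), scale=stats.sem(sample))
    captured += int(low <= true_mean <= high)

print(f"{captured} of {n_experiments} intervals captured the true mean")
# On average about 95 of the 100 intervals capture it; any single run can differ.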

Correct Interpretation

From Khan Academy - Interpreting confidence levels and confidence intervals:

Correct Interpretation: We are 95% confident that the interval ( , ) captured the true mean pitch speed.

Combine this with the repeated-experiments explanation above if necessary, and congratulations, we’re done!
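To make the phrasing concrete, here is a minimal sketch with made-up pitch speeds (not the data from the Khan Academy exercise), computing one ordinary 95% t-interval and stating it with the correct wording:

import numpy as np
from scipy import stats

# Hypothetical pitch speeds, for illustration only
speeds = np.array([61.2, 63.5, 59.8, 62.1, 60.4, 64.0, 62.7, 61.9])

low, high = stats.t.interval(0.95, df=len(speeds) - 1,
                             loc=speeds.mean(), scale=stats.sem(speeds))
print(f"We are 95% confident that the interval ({low:.1f}, {high:.1f}) "
      "captured the true mean pitch speed.")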


What it is not

The interpretation is not equivalent to "there is a 95% probability that the true [mean] lies between ___ and ___." That is the interpretation of a Posterior Probability Interval (PPI), more commonly known as a “credible interval”, the Bayesian counterpart of the confidence interval.
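For contrast, a credible interval is the kind of interval that does support that probability statement. A minimal sketch, assuming a normal likelihood with known variance and a conjugate normal prior (none of these numbers come from the sources above):

import numpy as np
from scipy import stats

# Hypothetical data and prior, for illustration only
data = np.array([61.2, 63.5, 59.8, 62.1, 60.4, 64.0, 62.7, 61.9])
sigma = 2.0                       # assumed known data standard deviation
prior_mean, prior_sd = 60.0, 5.0  # prior belief about the mean

# Conjugate normal-normal update for the mean (known variance)
n = len(data)
post_var = 1.0 / (1.0 / prior_sd**2 + n / sigma**2)
post_mean = post_var * (prior_mean / prior_sd**2 + data.sum() / sigma**2)

low, high = stats.norm.interval(0.95, loc=post_mean, scale=np.sqrt(post_var))
# For THIS kind of interval, "there is a 95% probability that the true mean
# lies between low and high" is the intended reading.
print(f"95% credible interval: ({low:.1f}, {high:.1f})")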

As explained by Khan Academy:

We shouldn’t say there is a 95% chance that this specific interval contains the true mean, because it implies that the mean may be within this interval, or it may be somewhere else. This phrasing makes it seem as if the population mean is variable, but it’s not. This interval either captured the mean or didn’t. Intervals change from sample to sample, but the population parameter we’re trying to capture does not.

This point, along with other misconceptions, is described in the 2016 paper Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. Other misconceptions explained there include:

  • Wrong - If two confidence intervals overlap, the difference between two estimates or studies is not significant. (A quick sketch of this point follows the list.)
  • Wrong - An effect size outside the 95% confidence interval has been refuted (or excluded) by the data.
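A quick numerical sketch of the first point, using made-up summary statistics: two 95% intervals can overlap while a two-sample t-test on the same numbers is still significant.

import numpy as np
from scipy import stats

# Made-up summary statistics for two groups
mean_a, mean_b = 10.0, 11.5
sd_a, sd_b = 3.0, 3.0
n_a, n_b = 50, 50

def ci95(mean, sd, n):
    half = stats.t.ppf(0.975, df=n - 1) * sd / np.sqrt(n)
    return mean - half, mean + half

print("Group A 95% CI:", ci95(mean_a, sd_a, n_a))  # roughly (9.1, 10.9)
print("Group B 95% CI:", ci95(mean_b, sd_b, n_b))  # roughly (10.6, 12.4) -> the intervals overlap

t, p = stats.ttest_ind_from_stats(mean_a, sd_a, n_a, mean_b, sd_b, n_b)
print(f"p-value for the difference: {p:.3f}")  # about 0.014, significant at the 0.05 level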

Resources

Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations

  • I highly recommend that everyone at least skim this paper once. It addresses many extremely common misconceptions.

A Gentle Introduction to Bayesian Analysis: Applications to Developmental Research

  • Briefly discusses the difference between frequentist and Bayesian intervals.

Khan Academy - Interpreting confidence levels and confidence intervals

  • Provides a simple quiz and explanation to check for the correct interpretation.