Supplement Science

What a P-Value Actually Tells You (and What It Doesn't)

This content is for informational purposes only and does not constitute medical advice. Statements about dietary supplements have not been evaluated by the FDA and are not intended to diagnose, treat, cure, or prevent any disease. Individual results may vary; consult your healthcare provider before starting any supplement.

A p-value is the probability of seeing a result at least as extreme as the one observed if the treatment actually had no effect. The common p < 0.05 threshold is just a convention — a low p-value is not proof a supplement works, and it says nothing about how large or meaningful the effect is.

Key Takeaways

  • A p-value is the chance of seeing a result at least this extreme if the supplement truly had no effect. It is not the chance that the supplement works.
  • The p < 0.05 cutoff is a convention, not a guarantee that a result is important.
  • A p-value does not measure effect size; a 'significant' result can still be trivially small.
  • Testing many outcomes and reporting only the significant ones (p-hacking) can manufacture significance.
  • Trust findings repeated across well-designed human studies, not a single p-value.

The one-sentence definition

A p-value answers a narrow question: *if this supplement truly did nothing, how often would we see a result this strong (or stronger) by chance alone?* A p-value of 0.04 means that, assuming no real effect, you would expect a result this extreme about 4% of the time [1].
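The definition above can be made concrete with a small simulation. The sketch below is one way to illustrate it, not a method taken from the cited sources: it uses a permutation test, which shuffles the group labels to mimic a world where the supplement does nothing, then counts how often chance alone produces a difference at least as large as the observed one. The function name `permutation_p_value` and the scores are made up for illustration.

```python
import random

def permutation_p_value(treated, placebo, n_shuffles=10_000, seed=0):
    """Estimate a one-sided p-value by shuffling group labels.

    Returns the observed mean difference and the share of shuffles whose
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_treated = len(treated)
    n_placebo = len(placebo)
    observed = sum(treated) / n_treated - sum(placebo) / n_placebo

    pooled = list(treated) + list(placebo)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        # Shuffling labels simulates "the supplement truly did nothing".
        rng.shuffle(pooled)
        diff = (sum(pooled[:n_treated]) / n_treated
                - sum(pooled[n_treated:]) / n_placebo)
        # Small tolerance guards against floating-point rounding on ties.
        if diff >= observed - 1e-12:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / n_shuffles

# Hypothetical scores, invented for this example (not from any study).
treated = [5.1, 6.0, 5.8, 6.3, 5.5, 6.1, 5.9, 6.4]
placebo = [5.0, 5.2, 5.6, 5.1, 5.4, 5.7, 5.3, 5.5]

observed, p = permutation_p_value(treated, placebo)
print(f"observed difference = {observed:.4f}, estimated p = {p:.4f}")
```

The printed p-value is only the fraction of shuffled, no-effect worlds that matched or beat the observed difference. It does not say how likely it is that the supplement works.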

What 'statistically significant' means

Researchers often call a result 'statistically significant' when the p-value falls below 0.05. That 0.05 line is a long-standing convention, not a law of nature. A result at p = 0.049 is not meaningfully different from one at p = 0.051, even though only one clears the bar [1].

Four things a p-value does NOT tell you

  • How big the effect is. A tiny, unimportant change can be 'significant' in a large study. For that, look at the effect size and the [confidence interval](/learn/relative-risk-and-confidence-intervals).
  • Whether the result matters in real life. Statistical significance is not the same as practical importance.
  • The probability your idea is correct. A p-value is calculated *assuming* no effect, so it cannot tell you the odds that the hypothesis is true.
  • That the study was well designed. Bias, small samples, or the wrong [study design](/learn/observational-vs-rct) can produce a low p-value that later fails to replicate.
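To see why a p-value says little about the size of an effect, the sketch below holds a small effect fixed and changes only the number of participants. It assumes a simple two-group z-test with a known, shared standard deviation, which is a simplification chosen for illustration; the numbers and function names are hypothetical.

```python
import math

def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def z_for_mean_difference(diff, sd, n_per_group):
    """z statistic for a difference in means between two equal-sized groups,
    assuming a known standard deviation shared by both groups."""
    standard_error = sd * math.sqrt(2 / n_per_group)
    return diff / standard_error

# Hypothetical effect: a 0.5-point change on a scale whose SD is 10,
# i.e. a standardized effect size of 0.05 (very small).
diff, sd = 0.5, 10.0

results = {}
for n in (100, 1_000, 10_000, 100_000):
    z = z_for_mean_difference(diff, sd, n)
    results[n] = two_sided_p_from_z(z)
    print(f"n per group = {n:>7}: effect size = {diff / sd:.2f}, p = {results[n]:.3g}")
```

The effect size is identical on every row; only the sample size changes. Under these assumptions the same tiny effect is not significant in the smaller studies and becomes significant in the larger ones, which is why effect size and the confidence interval need to be read alongside the p-value.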

Why this matters for supplements

Marketing often leans on the phrase 'statistically significant' to imply a product is effective. But significance can be manufactured by testing many outcomes and reporting only the ones that cross 0.05 — a practice critics call p-hacking. NCCIH's guidance on evaluating research stresses looking past a single number to the whole study: its size, its design, and whether independent trials agree [2].
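A short calculation shows how testing many outcomes can manufacture significance. The sketch below assumes the outcomes are statistically independent and that the supplement has no real effect on any of them; real trial outcomes are often correlated, so treat the figures as a rough guide. The function name is hypothetical.

```python
def chance_of_any_false_positive(n_outcomes, alpha=0.05):
    """Probability that at least one of n independent tests falls below
    alpha when there is no real effect on any outcome."""
    return 1 - (1 - alpha) ** n_outcomes

for n_outcomes in (1, 5, 10, 20):
    chance = chance_of_any_false_positive(n_outcomes)
    print(f"{n_outcomes:>2} outcomes tested: "
          f"{chance:.0%} chance of at least one p < 0.05")
```

Under these assumptions, a study that measures 20 unrelated outcomes is more likely than not to produce at least one 'significant' result by chance, even if the product does nothing.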

How to read a p-value sensibly

Treat a low p-value as one signal among several. Ask how large the effect was, how many people were studied, whether the finding has been repeated, and who funded the work. A trustworthy claim rests on a body of consistent, well-designed human research — not one p-value.

Frequently Asked Questions

Does p < 0.05 mean a supplement definitely works?

No. It means a result this strong would be uncommon if the supplement had no real effect. That is weak-to-moderate evidence at best, and it can arise from chance, bias, or small samples. Confidence grows when independent, well-designed human studies reach the same conclusion.

What is the difference between statistical and practical significance?

Statistical significance asks whether a result this large would be unlikely if there were no real effect. Practical significance asks whether the effect is large enough to matter to a real person. A very large study can find a statistically significant change that is far too small to notice in daily life.

What is p-hacking?

P-hacking refers to analyzing data many ways — testing lots of outcomes, subgroups, or time points — until something crosses the p < 0.05 line, then highlighting only that result. It makes weak findings look stronger than they are, which is why pre-registered study plans are valued.

What should I look at besides the p-value?

Look at the effect size, the confidence interval, the number of participants, the study design, whether the result has been replicated, and the funding source. Together these tell you far more than a single p-value about whether a supplement is genuinely helpful.

References

  1. National Center for Complementary and Integrative Health (2026). Know the Science. U.S. National Institutes of Health.
  2. National Center for Complementary and Integrative Health (2026). How To Make Sense of a Scientific Journal Article. U.S. National Institutes of Health.