Evidence Grading System — A+ through F
What the Grades Mean
Every health claim on every ingredient page receives an independent evidence grade from A+ (strongest) to F (insufficient). Here is what each tier represents:
- A: Multiple meta-analyses or systematic reviews with consistent findings.
- B: Multiple RCTs with positive results.
- C: Limited RCTs or observational data only.
- D: Preliminary human data or pilot studies.
- F: No human evidence, or contradicted by research.
Grading Criteria
The table below details the specific evidence threshold for each grade:
| Grade | Criteria |
|---|---|
| A+ | Multiple concordant meta-analyses or Cochrane reviews; large clinically meaningful effect size; consistent across diverse populations |
| A | At least 1 high-quality meta-analysis with significant results, plus 3 or more confirming RCTs |
| A- | Meta-analysis present but with moderate heterogeneity, OR 5+ concordant RCTs |
| B+ | 3+ high-quality RCTs with consistent positive results |
| B | 2-3 RCTs with positive results; moderate sample sizes (n>50 each) |
| B- | 1-2 RCTs positive, but small samples or mixed effect sizes |
| C+ | 1 positive RCT + supporting observational data |
| C | Multiple observational studies with consistent results; no RCTs |
| C- | Limited observational data; small cohorts |
| D+ | Pilot RCT or single small observational study with positive signal |
| D | Case series, open-label trials, or epidemiological correlation only |
| D- | Single case report or very preliminary human data |
| F | No human evidence; animal/in-vitro only; or contradicted by evidence |
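To make the thresholds concrete, the sketch below maps a summary of the human evidence for one claim to its base letter (A, B, C, D, or F). It is a simplified illustration under stated assumptions, not a definitive implementation: the `EvidenceSummary` fields and the `base_letter` function are hypothetical names, and the cut-offs take the lowest bar for each letter from the table (for example, the A- row for the A tier). Quality judgments such as sample size, heterogeneity, and effect size are left out.

```python
from dataclasses import dataclass


@dataclass
class EvidenceSummary:
    """Counts of human studies supporting one health claim (hypothetical schema)."""
    meta_analyses: int = 0      # meta-analyses or systematic reviews
    rcts: int = 0               # randomized controlled trials with positive results
    observational: int = 0      # supporting observational studies
    pilot_or_case: int = 0      # pilot studies, case series, case reports
    contradicted: bool = False  # the claim is contradicted by the evidence


def base_letter(ev: EvidenceSummary) -> str:
    """Return the base letter before any +/- modifier is applied."""
    if ev.contradicted:
        return "F"
    # A tier: a meta-analysis is present, or 5+ concordant RCTs (A- row).
    if ev.meta_analyses >= 1 or ev.rcts >= 5:
        return "A"
    # B tier: multiple RCTs. Assumption: a single RCT is treated as C tier
    # here, although the table also allows B- for 1-2 small RCTs.
    if ev.rcts >= 2:
        return "B"
    # C tier: one RCT, or multiple observational studies.
    if ev.rcts == 1 or ev.observational >= 2:
        return "C"
    # D tier: a single observational study, pilot data, or case-level evidence.
    if ev.observational == 1 or ev.pilot_or_case >= 1:
        return "D"
    # No human evidence at all.
    return "F"


print(base_letter(EvidenceSummary(meta_analyses=1, rcts=3)))  # A
print(base_letter(EvidenceSummary(observational=3)))          # C
```

Checking the strongest evidence first keeps the rules easy to compare against the table rows, one tier per branch.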
Study Type Hierarchy
We weight evidence according to the established research hierarchy. Higher-ranked study types carry more influence on the final grade:
- Systematic reviews and meta-analyses — highest level of evidence
- Randomized controlled trials (RCTs) — gold standard individual studies
- Observational studies — cohort, cross-sectional
- Case series and case reports — descriptive evidence
- Expert opinion and mechanistic reasoning — lowest level
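The hierarchy above can be modeled as an ordered enumeration, so that study types compare by rank. This is a minimal sketch: the `StudyType` names, the numeric ranks, and the `strongest` helper are illustrative assumptions, and only the ordering itself comes from the list.

```python
from enum import IntEnum
from typing import List, Optional


class StudyType(IntEnum):
    """Study types ranked from weakest (1) to strongest (5)."""
    EXPERT_OPINION = 1     # expert opinion and mechanistic reasoning
    CASE_REPORT = 2        # case series and case reports
    OBSERVATIONAL = 3      # cohort and cross-sectional studies
    RCT = 4                # randomized controlled trials
    SYSTEMATIC_REVIEW = 5  # systematic reviews and meta-analyses


def strongest(studies: List[StudyType]) -> Optional[StudyType]:
    """Return the highest-ranked study type available, or None if there is none."""
    return max(studies, default=None)


print(strongest([StudyType.OBSERVATIONAL, StudyType.RCT]).name)  # RCT
```

Because `IntEnum` members compare as integers, expressions such as `StudyType.RCT > StudyType.OBSERVATIONAL` work without extra code.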
Grade Modifiers
Within each letter grade, the + and - modifiers reflect additional quality considerations:
Plus (+) modifier (upgrade)
- Results replicated across 2+ populations
- Clinically meaningful effect size
- High-quality study designs with low risk of bias
Minus (-) modifier (downgrade)
- Conflicting results across studies
- Narrow population (e.g., only elderly or only athletes)
- Industry-funded studies only
- High heterogeneity in meta-analyses
Base letter (no modifier)
Standard assessment without significant upgrading or downgrading factors.
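One way to picture how modifiers attach to a base letter is the sketch below. The counting rule is an assumption made for illustration: the text lists the upgrade and downgrade factors but does not say how they are weighed against each other. Here ties between conflicting factors resolve downward, and F never takes a modifier because the grade table has no F+ or F-. The function name `apply_modifier` is hypothetical.

```python
def apply_modifier(letter: str, upgrades: int, downgrades: int) -> str:
    """Attach '+', '-', or nothing to a base letter.

    upgrades / downgrades are the number of upgrade and downgrade factors
    that apply to the claim (hypothetical inputs).
    """
    if letter == "F":
        return "F"  # the scale has no F+ or F-
    # Assumption: downgrade factors win ties (conservative grading).
    if downgrades > 0 and downgrades >= upgrades:
        return letter + "-"
    if upgrades > downgrades:
        return letter + "+"
    return letter  # no significant factors either way


print(apply_modifier("B", upgrades=2, downgrades=0))  # B+
print(apply_modifier("B", upgrades=1, downgrades=1))  # B-
print(apply_modifier("C", upgrades=0, downgrades=0))  # C
```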
Our Hard Rules
- When in doubt, grade DOWN — an honest D+ on a hyped supplement builds more authority than an inflated B
- Claims backed only by animal or in-vitro evidence never grade above D; with zero human data they receive an F
- F is reserved for claims actively contradicted by evidence or with zero human data
- Grades are reviewed when new meta-analyses or large RCTs publish
- All PubMed citations are verified — we never fabricate references
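The first three rules can be expressed as a final check applied after a provisional grade is chosen. The sketch below is illustrative only: `GRADES`, `grade_down`, and `enforce_hard_rules` are hypothetical names, and treating "in doubt" as a single one-step downgrade is an assumption. Claims with no human data are sent straight to F, which also satisfies the cap on animal-only and in-vitro evidence.

```python
# The full scale from strongest to weakest, as listed in the criteria table.
GRADES = ["A+", "A", "A-", "B+", "B", "B-",
          "C+", "C", "C-", "D+", "D", "D-", "F"]


def grade_down(grade: str) -> str:
    """Move one step down the scale, stopping at D-.

    F is reserved for contradicted claims or zero human data, so doubt
    alone never pushes a grade below D-.
    """
    if grade == "F":
        return "F"
    floor = GRADES.index("D-")
    return GRADES[min(GRADES.index(grade) + 1, floor)]


def enforce_hard_rules(grade: str, has_human_data: bool,
                       contradicted: bool, in_doubt: bool = False) -> str:
    """Apply the hard rules to a provisional grade."""
    if contradicted or not has_human_data:
        return "F"
    return grade_down(grade) if in_doubt else grade


print(enforce_hard_rules("B", True, False, in_doubt=True))  # B-
print(enforce_hard_rules("A", False, False))                # F
```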
Worked Example: Magnesium for Sleep
To illustrate how the grading system works in practice, here is a step-by-step walkthrough for a specific health claim:
Relationship to SupplScore
Evidence grades and SupplScore are complementary systems that serve different purposes:
- SupplScore (0–100) rates the whole ingredient across 4 dimensions: evidence level, safety profile, dose accessibility, and form bioavailability
- Evidence grades (A–F) rate individual health claims — one ingredient can have different grades for different benefits
- For example, an ingredient with a SupplScore of 85 might have A+ claims for some benefits and C claims for others
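The difference in scope between the two systems can be shown with a small data structure: one score per ingredient, one grade per claim. The `Ingredient` class, its field names, and the placeholder values are assumptions made for illustration and do not describe any real ingredient page.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Ingredient:
    """One ingredient page (hypothetical schema)."""
    name: str
    suppl_score: int  # 0-100, rates the whole ingredient
    claim_grades: Dict[str, str] = field(default_factory=dict)  # claim -> A+..F

    def __post_init__(self) -> None:
        if not 0 <= self.suppl_score <= 100:
            raise ValueError("SupplScore must be between 0 and 100")


# Placeholder values only: a high overall score alongside mixed claim grades.
example = Ingredient(
    name="example-ingredient",
    suppl_score=85,
    claim_grades={"benefit-1": "A+", "benefit-2": "C"},
)
print(example.suppl_score, example.claim_grades["benefit-2"])  # 85 C
```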
Update Policy
- Grades are reviewed when significant new research, such as a new meta-analysis or large RCT, is published
- Changes are reflected in the dateModified timestamp on the ingredient page
- Each quarter, we check for newly published major meta-analyses and Cochrane reviews
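A review trigger along these lines could be sketched as a date comparison. The `needs_review` function and its inputs are hypothetical; the only detail taken from the policy is that a page's last change is recorded in `dateModified`.

```python
from datetime import date
from typing import Iterable


def needs_review(date_modified: date, publication_dates: Iterable[date]) -> bool:
    """Return True if any relevant study was published after the page's last update."""
    return any(published > date_modified for published in publication_dates)


# Placeholder dates for illustration.
last_modified = date(2024, 1, 15)
print(needs_review(last_modified, [date(2023, 11, 2), date(2024, 3, 9)]))  # True
print(needs_review(last_modified, [date(2023, 11, 2)]))                    # False
```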