Khayyam Math

The 68–95–99.7 rule for the normal distribution

Why almost every observation from a normal distribution sits within three standard deviations of the mean — and how the percentages stack up.

[Figure: Normal distribution, the 68 / 95 / 99.7 rule. Horizontal axis from −4σ to +4σ around μ; shaded bands mark 68% within ±1σ, 95% within ±2σ and 99.7% within ±3σ.]

What this shows

The normal (or Gaussian) distribution is the symmetric bell-shaped curve described by f(x) = (1/(σ√(2π)))·exp(−(x − μ)²/(2σ²)), with two parameters: the mean μ (where the peak sits) and the standard deviation σ (how wide the bell is).

For ANY normal distribution — regardless of μ and σ — the following always holds:

    within ±1σ of the mean:  about 68.27% of the probability
    within ±2σ of the mean:  about 95.45% of the probability
    within ±3σ of the mean:  about 99.73% of the probability

These are the exact integrals of the normal density between those limits, quoted to two decimal places; rounding further gives the familiar 68 / 95 / 99.7 percentages. The figure shades the three nested bands around the mean.
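The three percentages can be checked numerically. The sketch below is one way to do it, assuming only Python's standard library: for a standard normal Z, P(|Z| < k) equals erf(k/√2), and `central_mass` is a helper name introduced here for illustration.

```python
import math

def central_mass(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    # For Z ~ N(0, 1), P(|Z| < k) = erf(k / sqrt(2)); mu and sigma drop out.
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within ±{k}σ: {100 * central_mass(k):.2f}%")
# within ±1σ: 68.27%
# within ±2σ: 95.45%
# within ±3σ: 99.73%
```

Because the function takes only k, the same three numbers come out for every choice of μ and σ.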

The rule is the reason σ is such a useful unit for the normal: a distance from the mean measured in σ carries the same probability for every normal distribution, whatever its μ and σ. "Three sigma" is shorthand for "an event so rare it should happen well under 1% of the time" (about 0.27%, counting both tails).

Where it shows up

The 68–95–99.7 rule lets you eyeball whether a measurement is unusual without consulting a normal-distribution table. A body temperature 2σ above your personal mean is in roughly the top 2.3% of your distribution (2.5% by the rounded rule), which is worth taking seriously. An industrial part more than 3σ below the process mean is in the bottom 0.135% of the distribution. Three-sigma limits are the classical process-control threshold; Six Sigma quality programmes target much tighter bounds, conventionally evaluated at 4.5σ after allowing for a 1.5σ drift in the process mean.
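The one-sided figures quoted here can be reproduced with `statistics.NormalDist` from the Python standard library (3.8 or later). This is a minimal sketch: the temperature values (mean 36.6 °C, standard deviation 0.3 °C, reading 37.2 °C) are hypothetical numbers chosen only to give a z-score of 2, and `upper_tail` is a helper name introduced for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

def upper_tail(z: float) -> float:
    """P(Z > z) for a standard normal Z."""
    return 1.0 - std_normal.cdf(z)

# Hypothetical personal baseline, for illustration only.
mean_c, sd_c, reading_c = 36.6, 0.3, 37.2
z = (reading_c - mean_c) / sd_c  # distance from the mean in units of sigma

print(f"z = {z:.1f}, upper tail = {100 * upper_tail(z):.2f}%")
print(f"below -3σ: {100 * std_normal.cdf(-3.0):.3f}%")
print(f"beyond 4.5σ on one side: {1e6 * upper_tail(4.5):.1f} per million")
```

The exact upper tail at 2σ is about 2.28%; the 2.5% figure comes from halving the rounded 5% that lies outside ±2σ. The 4.5σ one-sided tail works out to about 3.4 per million.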

In physics, three sigma is the conventional threshold for an "evidence"-level result and five sigma is required for a discovery claim (probability of a chance fluctuation at least that large: about 0.00006% counting both tails, or about 0.00003% for a single tail). Both conventions are direct applications of the same tail-area integrals.

Frequently asked questions

Does this rule apply to non-normal distributions?

Only approximately, and only for distributions that resemble a normal: symmetric, bell-shaped, with light tails. For skewed distributions (income, waiting times) the percentages can be well off; an exponential distribution, for example, puts about 86% of its mass within ±1σ of its mean rather than 68%. For heavy-tailed distributions (stock returns, earthquake magnitudes) the rule dramatically underestimates the tail probabilities.
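To see how far the rule can drift, the sketch below compares the normal percentages with two distributions whose central mass has a closed form. Both comparison distributions are illustrative choices, not taken from the text above: an exponential with rate 1 (skewed; mean 1, standard deviation 1) and a Laplace distribution (symmetric, but with heavier tails than the normal; standard deviation b√2 for scale b).

```python
import math

def normal_within(k: float) -> float:
    """P(|X - mu| < k*sigma) for a normal distribution."""
    return math.erf(k / math.sqrt(2.0))

def exponential_within(k: float) -> float:
    """Same probability for an exponential with rate 1 (mean 1, sd 1)."""
    lo = max(0.0, 1.0 - k)  # the support starts at 0, so clip the lower limit
    hi = 1.0 + k
    return math.exp(-lo) - math.exp(-hi)  # F(hi) - F(lo) with F(x) = 1 - exp(-x)

def laplace_within(k: float) -> float:
    """Same probability for a Laplace distribution (sd = scale * sqrt(2))."""
    return 1.0 - math.exp(-k * math.sqrt(2.0))

print(" k   normal   exponential   Laplace")
for k in (1, 2, 3):
    print(f"{k:2d}   {100 * normal_within(k):6.2f}%   "
          f"{100 * exponential_within(k):6.2f}%     {100 * laplace_within(k):6.2f}%")
```

The exponential holds noticeably more than 68% within ±1σ, while the Laplace leaves several times more probability beyond ±3σ than the normal does.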

Why these particular percentages?

They come from integrating the normal density between ±1σ, ±2σ, ±3σ. The exact values are 68.27%, 95.45%, 99.73% — but the rounded 68, 95, 99.7 are easier to memorise and accurate enough for back-of-envelope work.
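The integral can be written out to show why neither μ nor σ survives in the result. The derivation below uses the density f(x) given earlier and the substitution z = (x − μ)/σ; the symbol k for the number of standard deviations and the use of the error function erf are notation added here.

```latex
\begin{aligned}
P\bigl(|X-\mu| < k\sigma\bigr)
  &= \int_{\mu-k\sigma}^{\mu+k\sigma}
     \frac{1}{\sigma\sqrt{2\pi}}\,
     e^{-(x-\mu)^2/(2\sigma^2)}\,dx \\
  &= \int_{-k}^{k} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\,dz
     \qquad \bigl(z = (x-\mu)/\sigma,\ dx = \sigma\,dz\bigr) \\
  &= \operatorname{erf}\!\left(\frac{k}{\sqrt{2}}\right)
\end{aligned}
```

Setting k = 1, 2, 3 gives 0.6827, 0.9545 and 0.9973 respectively.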

Where does the normal distribution come from?

From the Central Limit Theorem: the suitably standardised sum (or average) of many independent, identically distributed random variables with finite variance converges in distribution to a normal. That's why measurement errors, test scores, and many natural quantities, each the combined effect of many small independent contributions, turn out to be approximately normal.
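A small simulation makes the theorem concrete. The sketch below adds up Uniform(0, 1) draws, standardises each sum using the known mean (1/2) and variance (1/12) of a single draw, and counts how many standardised sums fall within 1, 2 and 3 standard deviations. The choice of uniform terms, 30 terms per sum, 20,000 samples and the seed are all illustrative assumptions.

```python
import random

def standardised_sums(n_terms: int, n_samples: int, seed: int = 0) -> list[float]:
    """Draw n_samples sums of n_terms Uniform(0, 1) variables, standardised to mean 0, sd 1."""
    rng = random.Random(seed)
    mean = n_terms * 0.5              # each Uniform(0, 1) term has mean 1/2
    sd = (n_terms / 12.0) ** 0.5      # and variance 1/12; variances of independent terms add
    return [
        (sum(rng.random() for _ in range(n_terms)) - mean) / sd
        for _ in range(n_samples)
    ]

def fraction_within(zs: list[float], k: float) -> float:
    """Share of standardised values with absolute value below k."""
    return sum(abs(z) < k for z in zs) / len(zs)

zs = standardised_sums(n_terms=30, n_samples=20_000)
for k in (1, 2, 3):
    print(f"within ±{k}σ: {100 * fraction_within(zs, k):.1f}%")
```

Although no single term is remotely bell-shaped, the printed fractions should land close to 68%, 95% and 99.7%, up to sampling noise.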

What's beyond 3σ?

About 0.27% of observations, counting both tails: roughly 1 in 370. Beyond 4σ: about 1 in 16,000. Beyond 5σ: about 1 in 1.7 million. The tails of a normal are spectacularly thin, which is why real-world data with rare extreme events (financial crashes, viral content) is poorly modelled as normal.
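These odds follow from the two-sided tail area P(|Z| > k). A minimal sketch using the complementary error function from Python's standard library, which stays accurate for very small tails (`two_sided_tail` is a helper name introduced here):

```python
import math

def two_sided_tail(k: float) -> float:
    """P(|Z| > k) for a standard normal Z, i.e. both tails combined."""
    # erfc(x) = 1 - erf(x), computed without the cancellation that 1 - erf(x) would suffer.
    return math.erfc(k / math.sqrt(2.0))

for k in (3, 4, 5):
    p = two_sided_tail(k)
    print(f"beyond ±{k}σ: p = {p:.3g}, about 1 in {1 / p:,.0f}")
```

The results come out near 1 in 370, 1 in 15,800 and 1 in 1.74 million. Halving each probability gives the one-sided tail, so a single-tail 5σ event is about 1 in 3.5 million.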
