Summary Statistics Basics

Summary statistics compress a dataset into a few numbers. The right number depends on whether you need the center, spread, rank or shape of the data.

Center

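The mean is the arithmetic average: the sum of the values divided by how many there are. The median is the middle value after sorting, or the average of the two middle values when the count is even. Extreme outliers shift the mean much more than they shift the median.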

Spread

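Variance and standard deviation measure how far values tend to be from the mean. Variance is the average squared deviation from the mean, and standard deviation is its square root. The population formulas divide the sum of squared deviations by n, while the sample formulas divide by n - 1.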
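
As a minimal sketch of the two denominators, the Python snippet below computes both variances by hand and checks them against the standard library's statistics module. The helper names population_variance and sample_variance are illustrative only, and the data is the five-value set used in the small dataset example later in this guide.

```python
import math
import statistics

data = [2, 4, 4, 9, 11]  # five-value set from this guide's small example


def population_variance(values):
    """Divide the sum of squared deviations by n (data is the whole population)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Divide by n - 1 (data is a sample drawn from a larger population)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)


pop_var = population_variance(data)   # 58 / 5
samp_var = sample_variance(data)      # 58 / 4

# The standard library exposes the same two conventions.
assert math.isclose(pop_var, statistics.pvariance(data))
assert math.isclose(samp_var, statistics.variance(data))

# Standard deviation is the square root of the matching variance.
pop_sd = math.sqrt(pop_var)
samp_sd = math.sqrt(samp_var)

print(f"population variance={pop_var:.2f}, sample variance={samp_var:.2f}")
print(f"population sd={pop_sd:.2f}, sample sd={samp_sd:.2f}")
```

The sample figure is never smaller than the population figure, and the gap shrinks as n grows.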

Rank and distribution

Percentiles describe relative position. A percentile is not the same as a percentage score; it describes how a value compares with other values in a dataset.
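
The difference between a percentage score and a percentile can be sketched in a few lines of Python. The function percentile_rank and the exam scores are hypothetical, and the "less than or equal to" rule used here is only one of several conventions in use.

```python
def percentile_rank(values, x):
    """Percent of values that are less than or equal to x.

    This is one common convention; others count only strictly smaller
    values or give half weight to ties.
    """
    if not values:
        raise ValueError("values must not be empty")
    at_or_below = sum(1 for v in values if v <= x)
    return 100 * at_or_below / len(values)


# Hypothetical exam results, each a percentage score out of 100.
scores = [55, 60, 62, 70, 71, 75, 78, 80, 85, 90]

rank = percentile_rank(scores, 70)
print(f"A score of 70% sits at the {rank:.0f}th percentile of this group")
```

In this made-up group, a 70 percent score lands at the 40th percentile because only four of the ten scores are at or below it.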

Common mistakes

  • Using population formulas for sample data without noticing.
  • Reporting too many decimals for noisy data.
  • Using the mean for heavily skewed data without also checking the median.
  • Treating a correlation or a difference in summary statistics as proof of causation.

Small dataset example

For 2, 4, 4, 9, 11, the mean is 6, the median is 4, the mode is 4 and the range is 9. The mean is pulled upward by the larger values, while the median is simply the middle value of the sorted list. A useful report shows several summary statistics because one number rarely describes a dataset well.
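
These figures can be reproduced with Python's standard statistics module. This is a small sketch rather than a full calculator; the dictionary name summary is an arbitrary choice.

```python
import statistics

data = [2, 4, 4, 9, 11]

# Collect several summaries side by side instead of relying on one number.
summary = {
    "mean": statistics.mean(data),
    "median": statistics.median(data),
    "mode": statistics.mode(data),
    "range": max(data) - min(data),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```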

Outliers can change interpretation. If 11 is replaced by 101, the median remains 4 but the mean jumps to 24. That is why robust summaries such as median, quartiles and interquartile range can be more informative for skewed data.
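
The effect can be checked with a short Python sketch that compares the two versions of the dataset. The helper iqr is illustrative, and it assumes the "inclusive" quartile method of statistics.quantiles (available in Python 3.8 and later); other quartile conventions give different values on a sample this small.

```python
import statistics

original = [2, 4, 4, 9, 11]
with_outlier = [2, 4, 4, 9, 101]  # 11 replaced by 101


def iqr(values):
    """Interquartile range, assuming the 'inclusive' quartile method."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


for label, values in (("original", original), ("with outlier", with_outlier)):
    print(
        f"{label}: mean={statistics.mean(values)}, "
        f"median={statistics.median(values)}, IQR={iqr(values)}"
    )
```

Under that assumption the median and interquartile range are the same for both lists, while the mean moves from 6 to 24.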

FAQ

When should I use median instead of mean?

Use median when outliers or skew would make the mean misleading.

What is the sample standard deviation?

It estimates spread in a wider population from a sample and uses n - 1 in the denominator.

Can summary statistics replace a chart?

No. A chart can show outliers, clusters and distribution shape that one number may hide.

Last reviewed: 2026-05-16.