Representing data numerically

From OpenWetWare
Revision as of 17:58, 2 September 2007 by Barry Canton

Once you have obtained experimental data, it is important to decide how to represent it. You want to summarise without misleading readers. Most people take in a graphical representation faster. A numerical description, on the other hand, is often more compact and can state quantities more precisely.

Mean, median, or mode

The central tendency of a data set is commonly described either by the mean, the median, or the mode.

Take these numbers, for example: 1, 3, 3, 4, 6, 7, 18.

  • The mean is sensitive to outliers, like 18. Here the mean is 6, which is larger than five of the seven values.
  • The median depends only on the order of the values, not on their magnitude, and represents the middle position in the ordered series. Here the median is 4.
  • The mode is the most frequent value and is only useful for discrete values such as integers or categories. Here the mode is 3.
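The three values above can be checked with a short Python sketch. It uses only the standard library `statistics` module. The variable names are illustrative, and dropping the last value to show the effect of the outlier is an addition for demonstration, not part of the original example.

```python
import statistics

# Example data from the text; 18 is the outlier.
data = [1, 3, 3, 4, 6, 7, 18]

mean = statistics.mean(data)      # arithmetic mean: 42 / 7
median = statistics.median(data)  # middle value of the ordered series
mode = statistics.mode(data)      # most frequent value

# Illustration only: the same statistics with the outlier removed.
trimmed = data[:-1]
mean_without_outlier = statistics.mean(trimmed)
median_without_outlier = statistics.median(trimmed)

print("mean:", mean, "median:", median, "mode:", mode)
print("without 18 -> mean:", mean_without_outlier,
      "median:", median_without_outlier)
```

Removing the single outlier moves the mean from 6 to 4 but the median only from 4 to 3.5, which is why the median is often preferred for skewed data.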

Standard deviation or standard error

The standard deviation describes how widely your data are spread. The larger the standard deviation, the further your data points lie from the mean. You will often find the standard deviation abbreviated as SD or s.

The standard error of the mean, on the other hand, is an easy way to represent the precision of the measured mean. This is probably what you want most of the time, unless you are interested in describing the breadth of the spread. The standard error is often abbreviated as SE or SEM.

Both concepts are connected. The standard deviation is the numerator in the formula for the standard error of the mean, which divides it by the square root of the sample size n. For any sample with more than one measurement, the standard error is therefore smaller than the standard deviation.
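Written out, the relationship looks as follows. This is a sketch that assumes the sample standard deviation with the n - 1 denominator, which is the convention that reproduces the figures quoted on this page. The symbols x_i (individual measurements) and x-bar (their mean) are introduced here for illustration.

```latex
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2},
\qquad
\mathrm{SEM} = \frac{s}{\sqrt{n}}

% Worked for the data 1, 3, 3, 4, 6, 7, 18 (n = 7, \bar{x} = 6):
% squared deviations 25 + 9 + 9 + 4 + 0 + 1 + 144 = 192
s = \sqrt{\frac{192}{6}} = \sqrt{32} \approx 5.7,
\qquad
\mathrm{SEM} = \frac{\sqrt{32}}{\sqrt{7}} \approx 2.1
```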

Notation

Typically, you will state the calculated mean to represent your data. Accompany it with the most appropriate accompanying measure: the standard deviation to describe the spread of the data, or the standard error of the mean to describe the precision of the mean. For the numerical example above, this would look like this:

  • mean & standard deviation: mean = 6 (SD = 5.7, n = 7) or mean = 6 +/- 5.7 (SD, n = 7)
  • mean & standard error: mean = 6 (SEM = 2.1, n = 7) or mean = 6 +/- 2.1 (SEM, n = 7)
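As a minimal sketch, both notations can be produced from the raw numbers in Python. It assumes the sample standard deviation (n - 1 denominator), which is what `statistics.stdev` computes. The helper `report` is hypothetical and exists only for this illustration.

```python
import math
import statistics

# Example data from the text.
data = [1, 3, 3, 4, 6, 7, 18]

n = len(data)
mean = statistics.mean(data)
sd = statistics.stdev(data)   # sample SD, n - 1 denominator
sem = sd / math.sqrt(n)       # standard error of the mean


def report(mean, spread, label, n):
    """Hypothetical helper: format a mean with a labelled spread and n."""
    return f"mean = {mean:g} +/- {spread:.1f} ({label}, n={n})"


sd_line = report(mean, sd, "SD", n)
sem_line = report(mean, sem, "SEM", n)

print(sd_line)
print(sem_line)
```

Putting the label (SD or SEM) and n inside the formatted string makes it hard to report a "+/-" value without saying what it is.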

Don't forget to state which of the two measures you are reporting. This is forgotten in every 7th publication on average (Altman & Bland 2005). Also, state the sample size n so that readers know how many measurements you took.

References

Wikipedia entries:

Tutorials:


Papers: