Representing data numerically

From OpenWetWare


Tutorials:
* [http://graphpad.com/FAQ/images/Ci%20of%20quotient.pdf Confidence interval of a ratio of two means]
* [http://www.cdc.gov/descd/MiniModules/SD_SEM/page01.htm standard deviation and standard error tutorial by CDC]
* [http://www.gifted.uconn.edu/siegle/research/Normal/stdexcel.htm mean & standard deviation with Excel]
* [http://support.microsoft.com/kb/214076 standard error of the mean with Excel]


Papers:
* [http://www.bmj.com/cgi/content/full/331/7521/903 "Standard deviations and standard errors", Altman and Bland, BMJ 2005]

Latest revision as of 17:58, 2 September 2007


Once you have obtained experimental data, you need to decide how to represent them. The aim is to summarise without misleading readers. Most people take in a graphical representation faster. A numerical description, on the other hand, is often more compact and can represent quantities more accurately.

Mean, median, or mode

The central tendency of a data set is commonly described either by the mean, the median, or the mode.

Take these numbers as an example: 1, 3, 3, 4, 6, 7, 18.

  • The mean is the sum of the values divided by their number. It is strongly affected by outliers such as 18. Here the mean is 6.
  • The median is the middle value of the ordered series. It depends only on rank order, not on how extreme the values are, so outliers barely affect it. Here the median is 4.
  • The mode is the most frequent value. It is mainly useful for discrete values or categories. Here the mode is 3.
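
The three measures above can be checked with a few lines of Python. This is only a minimal sketch using the standard library `statistics` module; the variable names are chosen for illustration and are not part of the original page.

```python
from statistics import mean, median, mode

# Example data from the text above
data = [1, 3, 3, 4, 6, 7, 18]

data_mean = mean(data)      # sum of values divided by their number
data_median = median(data)  # middle value of the sorted series
data_mode = mode(data)      # most frequent value

print(data_mean, data_median, data_mode)

# Dropping the outlier 18 to see how each measure reacts
trimmed = [x for x in data if x != 18]
trimmed_mean = mean(trimmed)
trimmed_median = median(trimmed)

print(trimmed_mean, trimmed_median)
```

Without the outlier the mean drops from 6 to 4, while the median only moves from 4 to 3.5, which illustrates why the median is the more robust of the two.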

Standard deviation or standard error

The standard deviation describes how widely your data are spread around the mean. The larger the standard deviation, the further apart your data points are. It is often abbreviated as SD or s.

The standard error of the mean, on the other hand, describes the precision of the measured mean, i.e. how far the sample mean is likely to lie from the true mean. This is probably what you want most of the time, unless you are interested in describing the spread of the data themselves. The standard error is often abbreviated as SE or SEM.

The two concepts are connected: the standard deviation is the numerator of the formula for the standard error of the mean, which divides it by the square root of the sample size n. The standard error is therefore smaller than the standard deviation whenever n > 1, and it shrinks as more measurements are taken.
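
Written out, the relationship looks as follows. The symbols x_i (individual measurements) and x̄ (sample mean) are introduced here for illustration, and the sample standard deviation with n - 1 in the denominator is assumed, which agrees with the values quoted for the example data further down.

```latex
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2},
\qquad
\mathrm{SEM} = \frac{s}{\sqrt{n}}
```

For the numbers 1, 3, 3, 4, 6, 7, 18 the squared deviations from the mean of 6 add up to 192, so s = sqrt(192/6) = sqrt(32) ≈ 5.7 and SEM ≈ 5.7 / sqrt(7) ≈ 2.1.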

Notation

Typically, you will state the calculated mean to represent your data. Accompany it with the most appropriate measure: either the standard deviation or the standard error of the mean. For the numerical example above, this looks as follows:

  • mean & standard deviation: mean=6 (SD=5.7, n=7) or mean = 6 +/- 5.7 (SD, n=7)
  • mean & standard error: mean=6 (SEM=2.1, n=7) or mean = 6 +/- 2.1 (SEM, n=7)
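
The values in this notation can be reproduced with a short Python sketch. It assumes the sample standard deviation (n - 1 in the denominator), which is what `statistics.stdev` computes; the variable names and the exact report format are illustrative choices.

```python
from math import sqrt
from statistics import mean, stdev

# Example data from the text above
data = [1, 3, 3, 4, 6, 7, 18]

n = len(data)
m = mean(data)
sd = stdev(data)     # sample standard deviation (divides by n - 1)
sem = sd / sqrt(n)   # standard error of the mean

# Report the mean together with the measure used and the sample size
sd_report = f"mean = {m} +/- {sd:.1f} (SD, n={n})"
sem_report = f"mean = {m} +/- {sem:.1f} (SEM, n={n})"

print(sd_report)
print(sem_report)
```

Spelling out SD or SEM inside the report string keeps the measure from being lost when the number is copied into a table or figure legend.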

Don't forget to state which measure you used. This is forgotten in about one in seven publications on average (Altman & Bland 2005). Also state the sample size n so that readers know how many measurements you took.

References

Wikipedia entries:

Tutorials:


Papers: