# Holcombe:Statistics


## Current revision

### Members

• Alex Holcombe
• Sarah McIntyre
• Fahed Jbarah
• Shih-Yu Lo
• Patrick Goodbourn
• Lizzy Nguyen
• Alumni

### Other

The picturing of data allows us to be sensitive not only to the multiple hypotheses that we hold, but to the many more we have not yet thought of, regard as unlikely, or think impossible -- Tukey, 1974

The great fun of information visualization is that it gives you answers to questions you didn’t know you had -- Ben Shneiderman

Jody Culham's error bars lecture, "Rule of thumb for 95% CIs":

• If the overlap is about half of one one-sided error bar, the difference is significant at ~ p < .05
• If the error bars just abut, the difference is significant at ~ p < .01
• This works if n >= 10 and the error bars don't differ by more than a factor of 2

"If events are dependent (whether causal or not), the aggregate is not going to be Gaussian." Why?

The sum of two independent random variables is distributed according to the convolution of their individual distributions.
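The convolution statement can be checked on a toy case. The following is a minimal sketch, assuming two fair six-sided dice as the random variables (the dice and the helper name `convolve` are illustrative choices, not something from the lab's code). It also shows what changes when the two variables are fully dependent: the sum no longer follows the convolution.

```python
from fractions import Fraction
from itertools import product


def convolve(p, q):
    """Distribution of X + Y for independent X ~ p and Y ~ q.

    p and q are dicts mapping each possible value to its probability.
    """
    out = {}
    for (x, px), (y, qy) in product(p.items(), q.items()):
        out[x + y] = out.get(x + y, 0) + px * qy
    return out


# Assumed toy example: one fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# Independent case: sum of two separate rolls (triangular distribution).
two_dice = convolve(die, die)
print(two_dice[7])  # 1/6

# Fully dependent case: the same roll counted twice, so the sum is 2 * X.
same_die_twice = {2 * face: prob for face, prob in die.items()}
print(same_die_twice.get(7, 0))  # 0
```

With independence the sum is already peaked in the middle, which is the first step toward a bell shape; with complete dependence the sum stays flat, just stretched.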

## Fitting curves to data

• R is often used in the lab
• Python, alone or with SciPy, can be used easily, example here
• MATLAB is sometimes used
• MacCurveFit for OS 9 is rarely used

## Bootstrapping

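• How I fit psychometric functions and bootstrap: see the Holcombe:fit psychometric functions page
• Howell's pages on bootstrapping means: http://www.uvm.edu/~dhowell/StatPages/Resampling/BootstMeans/bootstrapping_means.html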
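For readers who have not bootstrapped before, here is a minimal sketch of a percentile bootstrap confidence interval for a mean. It is not the lab's procedure for psychometric functions; the function name `bootstrap_ci`, the number of resamples, and the data values are all made up for illustration.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(data).

    Resamples the data with replacement, recomputes the statistic each time,
    and reads the interval off the sorted bootstrap statistics.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    boot = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return boot[lo_idx], boot[hi_idx]


# Hypothetical lags in ms, invented for this example.
lags_ms = [62, 48, 71, 55, 80, 66, 59, 74, 50, 68]
ci_low, ci_high = bootstrap_ci(lags_ms)
print(f"mean = {statistics.mean(lags_ms)} ms, 95% CI = [{ci_low}, {ci_high}] ms")
```

The same function works for other statistics by passing a different `stat`, for example `statistics.median`.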

## Effect size rant

In psychology training and in psychology journals, you may have heard of formulas for "effect size" like Cohen's d. An issue that came up in the lab yesterday provoked me into a long rant about this, which might help some of you if you ever wonder whether you should be calculating it:

If you look up "effect size" in a psychology context, you'll find things like Cohen's d. Cohen's d was invented for areas of psychology where the actual raw measure of the size of the effect (in our case the mean number of degrees per cycle of the error, or the ms of the lag) doesn't mean anything, so they had to invent a measure that scales their relatively meaningless raw measure by its variability within conditions. This can make sense in paradigms like an implicit association test, where the raw measure is the difference in milliseconds between two conditions. Here no one actually knows what it means if the brain is 5 ms slower or 10 ms slower at doing something. So they divide the difference by a measure of the random fluctuations, to express how much bigger it is than those fluctuations. If you get something like a Cohen's d of 10, you can say wow, that's ten times bigger than the random fluctuations you get.

Fortunately, in vision science our numbers are more meaningful: a lag of a certain number of milliseconds means we are that many milliseconds behind the visual world. The errors are similar. These measures have a more definite meaning than the difference in response time between two arbitrary tasks. We have the luxury of comparing these numbers across experiments directly.
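To make the scaling concrete, here is a minimal sketch of Cohen's d computed with a pooled within-condition standard deviation, which is one common definition (other variants exist). The function name `cohens_d` and the data are made up for illustration. The raw difference carries units (ms), while d is the same whether the data are expressed in ms or in seconds.

```python
import math
import statistics


def cohens_d(a, b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    pooled_sd = math.sqrt(((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2))
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


# Hypothetical lags in ms for two conditions, invented for this example.
cond_a_ms = [60, 70, 80]
cond_b_ms = [50, 60, 70]

raw_difference_ms = statistics.mean(cond_a_ms) - statistics.mean(cond_b_ms)
d_ms = cohens_d(cond_a_ms, cond_b_ms)
print(raw_difference_ms)  # 10
print(d_ms)  # 1.0

# Same data in seconds: the raw difference changes, d does not (up to rounding).
cond_a_s = [x / 1000 for x in cond_a_ms]
cond_b_s = [x / 1000 for x in cond_b_ms]
d_s = cohens_d(cond_a_s, cond_b_s)
```

When the raw units are meaningful, as with a lag in ms, the raw difference is the number worth reporting and comparing; d discards exactly that information.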

Researchers studying other things aren't that lucky. If they did three experiments, varying the task slightly from one experiment to the next, the baseline of their dependent measure might change a lot; for example, if the number of choices in a task increased, the baseline response time could increase a lot. Then the response time differences from the different experiments could not be compared directly, because elevating the baseline might change the difference you get (for instance, if the relationship between the underlying mental quantity and RT is not linear). In that case it would be better to compare something like Cohen's d, in the hope that dividing by the uncontrolled fluctuation would account for the consequences of the baseline change.