Holcombe:Statistics


The picturing of data allows us to be sensitive not only to the multiple hypotheses that we hold, but to the many more we have not yet thought of, regard as unlikely, or think impossible -- Tukey, 1974

The great fun of information visualization is that it gives you answers to questions you didn’t know you had -- Ben Shneiderman

Jody Culham's error bars lecture gives a rule of thumb for 95% CIs: "If the overlap is about half of one one-sided error bar, the difference is significant at ~p < .05. If the error bars just abut, the difference is significant at ~p < .01. Works if n >= 10 and error bars don't differ by more than a factor of 2."
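A rough simulation can sanity-check this rule. In the sketch below, the sample sizes, true means, and overlap bins are arbitrary illustrative choices; it bins simulated experiments by how much the two 95% CIs overlap and reports the typical t-test p-value, which can be compared with the rule's ~.05 and ~.01:

```python
import numpy as np
from scipy import stats

# Simulate many two-group experiments, measure CI overlap in units of
# one one-sided error bar, and check the t-test p-values in each regime.
rng = np.random.default_rng(1)

def ci_halfwidth(x):
    # Half-width of the 95% CI of the mean: t_crit * SEM
    sem = x.std(ddof=1) / np.sqrt(len(x))
    return stats.t.ppf(0.975, len(x) - 1) * sem

results = []
for _ in range(20000):
    a = rng.normal(0.0, 1.0, 15)
    b = rng.normal(0.8, 1.0, 15)  # arbitrary true difference
    ha, hb = ci_halfwidth(a), ci_halfwidth(b)
    # Positive = the CIs overlap by that fraction of a bar; 0 = they just abut
    overlap = (min(a.mean() + ha, b.mean() + hb)
               - max(a.mean() - ha, b.mean() - hb)) / ((ha + hb) / 2)
    results.append((overlap, stats.ttest_ind(a, b).pvalue))

results = np.array(results)
for label, lo_, hi_ in [("half overlap", 0.45, 0.55), ("just abutting", -0.05, 0.05)]:
    sel = results[(results[:, 0] > lo_) & (results[:, 0] < hi_), 1]
    print(f"{label}: median p = {np.median(sel):.3f}")
```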

"If events are dependent (whether causal or not), the aggregate is not going to be Gaussian. "- why?

The sum of two independent random variables is distributed according to the convolution of their individual distributions.
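In symbols, for independent X and Y with densities f_X and f_Y, the density of the sum is:

```latex
f_{X+Y}(z) \;=\; (f_X * f_Y)(z) \;=\; \int_{-\infty}^{\infty} f_X(x)\, f_Y(z - x)\, dx
```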

Fitting curves to data

  • R is often used in the lab
  • Python alone and with SciPy can be used easily, example here; see also the sketch after this list
  • MATLAB is sometimes used
  • Using MacCurveFit for OS9; rarely used
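For the Python/SciPy route, here is a minimal sketch using scipy.optimize.curve_fit; the logistic form, data values, and starting guesses are illustrative placeholders, not a lab standard:

```python
import numpy as np
from scipy.optimize import curve_fit

# Fit a logistic curve to (x, y) data by nonlinear least squares.
def logistic(x, midpoint, slope):
    return 1.0 / (1.0 + np.exp(-slope * (x - midpoint)))

x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])        # e.g. stimulus levels
y = np.array([0.08, 0.20, 0.46, 0.71, 0.89, 0.97])  # e.g. proportion correct

params, cov = curve_fit(logistic, x, y, p0=[1.5, 2.0])
print("midpoint = %.2f, slope = %.2f" % tuple(params))
print("approximate standard errors:", np.sqrt(np.diag(cov)))
```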

Bootstrapping

How I fit psychometric functions and bootstrap: see Holcombe:fit psychometric functions
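As a generic illustration of the idea (not the psychometric-function procedure on that page), here is a minimal percentile bootstrap for a 95% CI on a mean; the data are placeholder draws:

```python
import numpy as np

# Percentile bootstrap: resample the data with replacement many times,
# recompute the statistic, and take quantiles of the resampled values.
rng = np.random.default_rng(0)
data = rng.normal(50.0, 10.0, 40)  # placeholder sample

boot_means = np.array([
    rng.choice(data, size=len(data), replace=True).mean()
    for _ in range(10000)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"mean = {data.mean():.1f}, 95% bootstrap CI = ({lo:.1f}, {hi:.1f})")
```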

Circular Statistics

Effect size rant

In psychology training and in psychology journals, you may have heard of formulas for "effect size" like Cohen's d. An issue that came up in the lab yesterday provoked me into a long rant about this, which might help some of you if you ever wonder whether you should be calculating it:

If you look up "effect size" in a psychology context, you'll find measures like Cohen's d. Cohen's d was invented for areas of psychology where the raw measure of the size of the effect (in our case, the mean error in degrees per cycle, or the lag in ms) doesn't mean anything on its own, so they had to invent a measure that scales the relatively meaningless raw measure by its variability within conditions. This can make sense in paradigms like an implicit association test, where the raw measure is the difference in milliseconds between two conditions. No one actually knows what it means if the brain is 5 ms slower at doing something versus 10 ms slower, so they want to divide the difference by some measure of how much bigger it is than random fluctuations. Then, if you get something like a Cohen's d of 10, you can say: wow, that's ten times bigger than the random fluctuations you get.

Fortunately, in vision science our numbers are more meaningful: a lag of a certain number of milliseconds means we are that many milliseconds behind the visual world, and the errors are similar. That has a more definite meaning than the difference in response time between two arbitrary tasks, so we have the luxury of comparing these numbers directly across experiments.

Researchers studying other things aren't that lucky. If they did three experiments, varying the task slightly from one experiment to the next, the baseline of their dependent measure might change a lot; for instance, if the number of choices in a task increased, the baseline response time could increase a lot. The response time differences in the two experiments then could not be compared directly, because elevating the baseline might change the difference you get (for instance, if the relationship between the underlying mental quantity and RT were not linear). In that case it would be better to compare something like Cohen's d, in the hope that dividing by the uncontrolled fluctuation would account for the consequences of the baseline change.

Howell's pages: http://www.uvm.edu/~dhowell/StatPages/Resampling/BootstMeans/bootstrapping_means.html
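For reference, the usual two-group Cohen's d divides the mean difference by the pooled within-group standard deviation. A minimal sketch (the sample values are placeholders):

```python
import numpy as np

def cohens_d(a, b):
    """Two-group Cohen's d: mean difference over the pooled within-group SD."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * np.var(a, ddof=1)
                  + (nb - 1) * np.var(b, ddof=1)) / (na + nb - 2)
    return (np.mean(a) - np.mean(b)) / np.sqrt(pooled_var)

# Placeholder example: two conditions differing by half a standard deviation
rng = np.random.default_rng(2)
a = rng.normal(0.5, 1.0, 30)
b = rng.normal(0.0, 1.0, 30)
print(f"Cohen's d = {cohens_d(a, b):.2f}")
```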