Holcombe:Statistics
From OpenWetWare
Current revision (21:08, 26 January 2011)


"The picturing of data allows us to be sensitive not only to the multiple hypotheses that we hold, but to the many more we have not yet thought of, regard as unlikely, or think impossible." (Tukey, 1974)
"The great fun of information visualization is that it gives you answers to questions you didn’t know you had." (Ben Shneiderman)
Jody Culham error bars lecture, rule of thumb for 95% CIs:
 If the overlap is about half of one one-sided error bar, the difference is significant at ~ p < .05
 If the error bars just abut, the difference is significant at ~ p < .01
 Works if n >= 10 and the error bars don’t differ by more than a factor of 2
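The rule of thumb can be checked with a little arithmetic. The sketch below is only a rough check, not part of the lecture. It assumes two independent group means with equal standard errors and uses the normal approximation (1.96 standard errors per error bar), so it ignores the small-sample correction behind the n >= 10 caveat. The function names are made up for illustration.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def p_for_overlap(overlap_fraction, se=1.0, crit=1.96):
    """p-value for the difference between two independent means.

    Assumes equal standard errors (se) in both groups and 95% CIs drawn
    as mean +/- crit * se. overlap_fraction is the CI overlap expressed
    as a fraction of one one-sided error bar (0 = bars just abut).
    """
    arm = crit * se                            # length of one error bar
    diff = 2 * arm - overlap_fraction * arm    # distance between the means
    se_diff = math.sqrt(2) * se                # SE of a difference of independent means
    return two_sided_p(diff / se_diff)

p_half_overlap = p_for_overlap(0.5)   # bars overlap by half of one arm
p_abut = p_for_overlap(0.0)           # bars just touch
print(round(p_half_overlap, 3), round(p_abut, 3))
```

Under these assumptions both p-values come out somewhat below the thresholds in the rule (.05 and .01), which fits the "~" in the rule of thumb.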
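"If events are dependent (whether causal or not), the aggregate is not going to be Gaussian." Why?
The sum of two independent random variables is distributed according to the convolution of their individual distributions. Convolving many distributions together tends toward a Gaussian (the central limit theorem), and that argument relies on independence, so it need not hold when the events are dependent.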
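As a small illustration of the convolution statement, here is a sketch using two fair six-sided dice (an example chosen for this note, not taken from the quote). The helper `convolve` is written out by hand so that nothing beyond plain Python is needed.

```python
def convolve(p, q):
    """Discrete convolution of two probability mass functions given as lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j   # independence: joint probability is the product
    return out

die = [1 / 6] * 6                # index 0 is face 1, index 5 is face 6
two_dice = convolve(die, die)    # index k is a total of k + 2

for total, prob in enumerate(two_dice, start=2):
    print(total, round(prob, 4))
```

The flat distribution of one die already becomes a triangle for two dice. Convolving again for more dice gives progressively more bell-shaped totals, but only because each throw is independent of the others.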
Fitting curves to data
 R is often used in the lab
 Python alone and with SciPy can be used easily, example here
 MATLAB is sometimes used
 Using MacCurveFit for OS9; rarely used
Bootstrapping
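How I fit psychometric functions and bootstrap: Holcombe:fit psychometric functions
Howell's pages on bootstrapping means: http://www.uvm.edu/~dhowell/StatPages/Resampling/BootstMeans/bootstrapping_means.html
See also: Circular Statistics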
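For anyone who has not bootstrapped before, here is a minimal sketch of a percentile bootstrap confidence interval for a mean. It is a generic illustration, not necessarily the procedure on the linked pages. The function name `bootstrap_ci`, its parameters, and the lag values are all made up for the example.

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(data)."""
    rng = random.Random(seed)          # fixed seed so the result is repeatable
    n = len(data)
    # Resample the data with replacement n_boot times and recompute the statistic.
    boot = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    lo = boot[int((alpha / 2) * n_boot)]
    hi = boot[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical lags in ms, invented purely for illustration.
lags_ms = [62, 75, 48, 90, 66, 71, 55, 83, 60, 78, 69, 58]
ci_low, ci_high = bootstrap_ci(lags_ms)
print(statistics.mean(lags_ms), ci_low, ci_high)
```

The same function works for other statistics by passing, say, `stat=statistics.median`. The percentile method is the simplest variant and can be biased for small samples or skewed statistics.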
Effect size rant
In psychology training and in psychology journals, you may have heard of formulas for "effect size" like Cohen's d. An issue that came up in the lab yesterday provoked me into a long rant about this, which might help some of you if you ever wonder whether you should be calculating it:
If you look up "effect size" in a psychology context, you'll find things like Cohen's d. Cohen's d was invented for areas of psychology where the raw measure of the size of the effect (in our case the mean number of degrees per cycle of the error, or the ms of the lag) doesn't mean anything, so they had to invent a measure that scales their relatively meaningless raw measure by its variability within conditions. This can make sense in paradigms like an implicit association test, where the raw measure is the number of milliseconds difference between two conditions. Here no one actually knows what it means if the brain is 5 ms slower or 10 ms slower in doing something. So they want to express the difference relative to the size of the random fluctuations. If you get something like a Cohen's d of 10, you can say wow, that's ten times bigger than the random fluctuations you get.
Fortunately, in vision science our numbers are more meaningful: a lag of a certain number of milliseconds means we are that many milliseconds behind the visual world. The errors are similar. Such a number has a more definite meaning than the difference in response time between two arbitrary tasks. We have the luxury of comparing these numbers across experiments directly.
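For reference, the usual two-group form of Cohen's d divides the difference between the condition means by the pooled within-condition standard deviation. The sketch below assumes that form (there are other variants, for example for repeated measures), and the numbers are invented for illustration.

```python
import math
import statistics

def cohens_d(a, b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)   # sample variances
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd

# Hypothetical response times in ms for two conditions.
cond_a_ms = [520, 540, 560]
cond_b_ms = [510, 530, 550]
d_ms = cohens_d(cond_a_ms, cond_b_ms)

# The same data expressed in seconds: the raw difference shrinks a thousandfold,
# but d is unchanged because the variability shrinks by the same factor.
d_s = cohens_d([x / 1000 for x in cond_a_ms], [x / 1000 for x in cond_b_ms])
print(d_ms, d_s)
```

This makes the trade-off concrete: d does not depend on the units or scale of the raw measure, which is useful when the raw number means little, and which throws away information when the raw number (like a lag in ms) is meaningful in its own right.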
Researchers studying other things aren't that lucky. If they ran several experiments and varied the task slightly from one experiment to the next, the baseline of their dependent measure might change a lot. For example, if the number of choices in a task increased, the baseline response time could increase a lot. The response time differences found in the different experiments could then not be compared directly, because raising the baseline might itself change the difference you get (for instance, if the relationship between the underlying mental quantity and RT is not linear). In that case it would be better to compare something like Cohen's d, in the hope that dividing by the uncontrolled fluctuation would account for the consequences of the baseline change.