Mass Univariate ERP Toolbox: within-subject t-tests

From OpenWetWare
Revision as of 07:34, 24 September 2012 by David M. Groppe (talk | contribs) (→‎cluster mass permutation test)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigationJump to search


Tutorial Sections

  1. Installing the Mass Univariate Toolbox
  2. Creating GND variables from EEGLAB set or ERPLAB erp files
  3. Within-subject t-tests<-You are here
  4. Between-subject t-tests
  5. General advice


The within-subjects analysis section of the tutorial uses the GND variable in the following file for illustration:Visodbl.GND

The GND variable contains ERPs from eight subjects performing two visual oddball tasks. In both oddball tasks, subjects were told to silently count a rare category of stimulus. In the first "X/O task", the stimuli were X's and O's. In the second "Case task", the stimuli were words composed of all upper case or lower case letters (e.g., "BROWN", "brown").


Within-Subject t-Tests

Creating a difference wave

The first step to detecting within-subject differences between conditions is to construct a difference wave from the ERPs from those two conditions. This is done via the function bin_dif.m. For example, let's compute the difference between targets and standards in the X/O oddball task in the demo GND variable. First download the file and then load it into MATLAB by entering something like the following command:
>> load Visodbl.GND -MAT

Now let's see what's in it:
>> headinfo(GND)
CHANNELS
1:MiPf 2:LLPf 3:RLPf 4:LMPf 5:RMPf 6:LDFr 7:RDFr 8:LLFr
9:RLFr 10:LMFr 11:RMFr 12:LMCe 13:RMCe 14:MiCe 15:MiPa 16:LDCe
17:RDCe 18:LDPa 19:RDPa 20:LMOc 21:RMOc 22:LLTe 23:RLTe 24:LLOc
25:RLOc 26:MiOc

BINS
Bin 1: X/O Task Standard All (X,O)
Bin 2: X/0 Task Target All (X,O)
Bin 3: Case Task Standard All (Up,Lo,Co,An)
Bin 4: Case Task Target All (Up,Lo,Co,An)

t-TESTS
No t-test results have been stored with these data.



To visualize the ERPs in the first two bins simultaneously, you can use the function plot_wave.m:
>> plot_wave(GND,[1 2]);


To compute the difference between targets and standards in the X/O task (i.e., Bins 2 and 1 respectively), enter the following:
>> GND=bin_dif(GND,2,1,'X/O Task Target-Standards');

Which should output the following:
<<New bin successfully created>>
Condition Code 1: Oddball Task P=0.33
Bin 5: X/O Task Target-Standards


You can now visualize the difference wave with the ERP GUI:
>>gui_erp(GND,'bin',5);

which should produce this:

Try clicking on the waveforms at various points to see what the topography of the difference wave looks like. Looking at the t-scores, it appears that there are two prominent effects: a P2 effect around 200 ms and P3b effect that peaks around 400 ms. Note that the P2 effect is actually more reliable across participants (hence the larger t-scores).


Time point by time point analysis

tmax permutation test

To determine when and where the ERPs to standards and targets differ, we will perform a permutation test based on the one-sample/repeated measures t-statistic using every time point at every electrode from 100 to 900 ms post-stimulus (i.e., 5226 comparisons). The permutation test will control the family-wise error rate (i.e., the probability of making one or more false discoveries) across the full set of comparisons. The motivation for choosing this time window is that there are unlikely to be effects before 100 ms and effects after 900 ms are not of interest since they occur long after subjects have made a response. We do this with the function tmaxGND.m:
>>GND=tmaxGND(GND,5,'time_wind',[100 900],'output_file','visodbl_XO.txt');
Note, that "5" refers to the bin number of the difference wave we want to analyze, "time_wind", indicates the time window we want to analyze, and "visodbl_XO.txt" is the name of a space-delimited text file containing the results of the analysis that will be created by the function. At the MATLAB command line you should see the following output:

Experiment: VisOddBall Task Run2
8 out of 8 participants have data in relevant bin.
Time Window #1:
Attempting to use time boundaries of 100 to 900 ms for hypothesis test.
Exact window boundaries are 100 to 900 ms (that's from time point 51 to 251).
Testing null hypothesis that the grand average ERPs in Bin 5 (X/O Task Target-Standards) have a mean of 0.000000 microvolts.
Alternative hypothesis is that the ERPs differ from 0.000000 (i.e., two-tailed test).
mxt_perm1: Number of channels: 26
mxt_perm1: Number of time points: 201
mxt_perm1: Total # of comparisons: 5226
mxt_perm1: Number of participants: 8
t-score degrees of freedom: 7
Executing permutation test with 2500 permutations...
Permutations completed: 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000
1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000
2100, 2200, 2300, 2400, 2500
Desired family-wise alpha level: 0.050000
Estimated actual family-wise alpha level: 0.053200
Critical t-score(s):-9.4493 9.4493
That corresponds to a test-wise alpha level of 0.000031.
Bonferroni test-wise alpha would be 0.000010.
Significant differences from zero (in order of earliest to latest):
192 ms, electrode(s): LMCe, MiCe.
196 ms, electrode(s): LMCe, RMCe, MiCe.
200 ms, electrode(s): RMCe, MiCe.
204 ms, electrode(s): RMCe, MiCe, MiPa.
208 ms, electrode(s): MiCe, MiPa.
240 ms, electrode(s): RMCe.
244 ms, electrode(s): RMCe.
364 ms, electrode(s): LLOc.
368 ms, electrode(s): LLOc.
372 ms, electrode(s): LMOc, LLOc.
376 ms, electrode(s): LMOc, LLOc.
380 ms, electrode(s): LMOc, LLOc.
384 ms, electrode(s): LDPa, LMOc.
388 ms, electrode(s): MiPa, LDPa, LMOc.
392 ms, electrode(s): MiPa, LDPa, LMOc, RMOc.
396 ms, electrode(s): MiPa, LDPa, LMOc.
400 ms, electrode(s): MiPa, LDPa, RDPa, LMOc, RMOc.
404 ms, electrode(s): MiPa, LDPa, RDPa, LMOc, RMOc.
408 ms, electrode(s): MiPa, LDPa, LMOc.
412 ms, electrode(s): MiPa, LDPa, RDPa, LMOc.
416 ms, electrode(s): MiPa, LDPa, RDPa, LMOc.
420 ms, electrode(s): MiPa, LMOc.
428 ms, electrode(s): MiPa.
436 ms, electrode(s): LLTe.
452 ms, electrode(s): MiPa, LLTe.
456 ms, electrode(s): MiPa.
460 ms, electrode(s): LLTe.
468 ms, electrode(s): MiPa.
472 ms, electrode(s): MiCe, MiPa.
All significant p-values<=0.044800
Save GND variable as /Users/dgroppe/Documents/MATLAB/DEMO/Visodbl.GND? (y or n) y
GND variable successfully saved to disk.


Two visualizations of the permutation test results are produced by default. The first of these is a "raster" diagram:

Each indicates box in the raster diagram represents the result of a t-test. If the box is white, the difference wave is significantly positive at that time point and electrode (even after effectively correcting for multiple comparisons). If the box is black, the difference wave is significantly negative at that time point and electrode. If the box is gray, the t-test is not significant (after effectively correcting for multiple comparisons). Note that the electrodes are organized along the y-axis somewhat topographically. Electrodes on the left and right sides of the head are grouped on the figure's top and bottom, respectively. Midline electrodes are shown in the middle. Within those three groupings, y-axis top-to-bottom corresponds to scalp anterior-to-posterior. If you click on a box, the electrode and time point corresponding to that box will appear on the figure. Looking at the raster, you can see that there are primarily two effects: the P2 and P3b effect we had noted when first looking at the t-scores.

To save a copy of the figure (say to include in a manuscript) enter something like the following command:
>> print -f1 -depsc odblXO_raster
MATLAB will create an enhanced postscript copy of Figure 1 called odblXO_raster.eps.

Note, some image viewers (e.g., Apple's Preview) may have trouble viewing postscript raster images since some downsample the image to make the file smaller, which blurs the lines. gv and Adobe Illustrator should be able to view the files just fine. If viewing the raster as a postscript file is producing problems, you can save the figure in jpg format via the disk icon on the figure window or the command:
>>print -f1 -djpeg odblXO_raster

The second visualization produced by tmaxGND by default is the ERP GUI. This the same as before, but now we can see the critical t-scores from the permutation test overlaid on the difference wave t-scores. When the difference wave at an electrode reaches significance, the electrode is represented in white instead of black on the two topographies:


To reproduce these figures without redoing the permutation test, enter the following:
>> sig_raster(GND,1);
>> gui_erp(GND,'t_test',1);

You can also create a black and white butterfly plot of the waveforms for publication via the sig_wave.m function:
>> sig_wave(GND,1,'use_color','no');


This example showed you the very basics of using tmaxGND. The function has many optional arguments that may be of interest (e.g., doing one-tailed tests, performing tests in multiple non-contiguous time windows). To see the full list of options for tmaxGND enter:
>> help tmaxGND

t-tests with control of the false discovery rate

Instead of using the tmax permutation procedure to deal with the large number of comparisons, you can use one of various methods from Benjamini and colleagues to control the false discovery rate (i.e., the mean proportion of significant tests that are false discoveries). You can do this with the function tfdrGND.m which functions very similarly to tmaxGND. To repeat the analysis performed with the tmax procedure above using the Benjamini and Yekutieli algorithm for control of the false discovery rate, enter the following:
>> GND=tfdrGND(GND,5,'method','by','time_wind',[100 900],'output_file','p3xo_fdr.txt');

The results of the analysis are shown in the ERP GUI:

Note that instead of "α=0.05" denoting the critical threshold for significance in the GUI, q=0.05 is used. This follows the convention of referring to FDR corrected "p-values" as "q-values." In other words, if you use an FDR control algorithm to limit FDR to 5%, any test result with a "q-value" less than or equal to 5% is deemed significant.

As with tmaxGND, tfdrGND has many optional arguments that may be of interest (e.g., doing one-tailed tests, performing tests in multiple non-contiguous time windows). To see the full list of options for tfdrGND enter:
>> help tfdrGND


cluster mass permutation test

Yet another approach for correcting for multiple comparisons are cluster-based permutation tests (Bullmore et al, 1999). These tests operate like the tmax test described above, but instead of building a null hypothesis distribution from the most extreme statistic from each single time point/electrode, the cluster-based test first forms clusters of neighboring extreme t-scores and then builds the null hypothesis distribution from the most extreme cluster statistic (e.g., the sum of all the t-scores in the cluster or the number of time point/electrodes in the cluster). This approach capitalizes on the fact that ERP effects are more likely than noise to extend across many adjacent electrodes and time points and is probably the most powerful mass univariate procedure for detecting broadly distributed effects (Groppe et al., 2011; Maris & Oostenveld, 2007).

The Mass Univariate Toolbox function clustGND.m implements the cluster mass permutation test and is called similarly to tmaxGND.m:
>>GND=clustGND(GND,5,'time_wind',[100 900],'chan_hood',.61,'thresh_p',.05);

Note, there are two optional input parameters specific to the cluster-based test. The first of these, 'chan_hood', defines which electrodes are considered neighbors of one other. In this example, we've specified .61, which means that all electrodes within a distance of .61 of an electrode are considered neighbors of that electrode. Note, that for these data, the electrode coordinates are idealized and assume a spherical head of unit radius. To convert this value into centimeters, measure the circumference of a participant's head in centimeters and use the following formulas:

  • radius=circumference/(2*pi);
  • radius*max distance in idealized coordinates=max distance in units of cm

For example, assuming that participants on average have a head circumference of 56 cm, an idealized distance of 0.61 corresponds to a distance of 5.44 cm.

The second cluster test optional input parameter is 'thresh_p'. This specifies the threshold for cluster inclusion. If the t-score at an electrode/time point corresponds to an uncorrected p-value that is greater than thresh_p, it will NOT be included in any clusters and will be assigned a corrected p-value of 1.

The command line output of clustGND provides somewhat different information than the previous non-cluster based tests. Specifically, it can summarize the criteria used for defining spatial neighbors:
max_dist value of 0.61 corresponds to an approximate distance of 5.44 cm (assuming
a 56 cm radius head and that your electrode coordinates are based on an idealized
spherical head with unit radius).
Mean (SD) # of neighbors per channel: 3.8 (1.5)
Median (semi-IQR) # of neighbors per channel: 5.0 (1.5)
Min/max # of neighbors per channel: 2 to 5

And it tells you how many clusters of positive and negative t-scores were found and their p-value range:
# of positive clusters found: 20
# of significant positive clusters found: 1
Positive cluster p-values range from 0.0056 to 1.9552.
# of negative clusters found: 23
# of significant negative clusters found: 0
Negative cluster p-values range from 0.9176 to 1.9552.

Note, as when one uses Bonferroni correction, clusters can take a p-value greater than 1. In this example, only one significant cluster was found, but as you can see from the resulting raster plot, the cluster is quite extensive:


The ERP GUI is also produced by default to visualize the results of the test. As with the tmax test and FDR control results, significant differences are indicated by white electrodes on the GUI. However, unlike those procedures there is no dashed red line to indicate a significance threshold. This is because one cannot generally make such a threshold for cluster based tests.



In addition to the ternary-valued raster diagram, one can also use a temperature scale to represent the graded degree of significance at each time point-channel like so:
>>sig_raster(GND,3,'use_color','rgb');



Mean time window analysis

tmax permutation test

Instead of performing t-tests at every single time point, it is also possible to perform t-test on mean difference wave amplitude in a particular time window. For example, say we had reason to believe that the ERPs to targets and standard should differ in two time windows: 150-250 and 300-500 ms. We can perform t-tests in those two windows at every electrode using the tmax permutation procedure as follows:

>> GND=tmaxGND(GND,5,'time_wind',[150 250; 300 500],'mean_wind','yes');

This should produce the following command line output:
Experiment: VisOddBall Task Run2
8 out of 8 participants have data in relevant bin.
Time Window #1:
Attempting to use time boundaries of 150 to 250 ms for hypothesis test.
Exact window boundaries are 148 to 248 ms (that's from time point 63 to 88).
Time Window #2:
Attempting to use time boundaries of 300 to 500 ms for hypothesis test.
Exact window boundaries are 300 to 500 ms (that's from time point 101 to 151).
Testing null hypothesis that the grand average ERPs in Bin 5 (X/O Task Target-Standards) have a mean of 0.000000 microvolts.
Alternative hypothesis is that the ERPs differ from 0.000000 (i.e., two-tailed test).
mxt_perm1: Number of channels: 26
mxt_perm1: Number of time points: 2
mxt_perm1: Total # of comparisons: 52
mxt_perm1: Number of participants: 8
t-score degrees of freedom: 7
Executing permutation test with 2500 permutations...
Permutations completed: 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000
1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000
2100, 2200, 2300, 2400, 2500
Desired family-wise alpha level: 0.050000
Estimated actual family-wise alpha level: 0.058800
Critical t-score(s):-3.9667 3.9667
That corresponds to a test-wise alpha level of 0.005415.
Bonferroni test-wise alpha would be 0.000962.
Significant differences from zero (in order of earliest to latest):
148 to 248 ms, electrode(s): LMPf, RLFr, LMFr, RMFr, LMCe, RMCe, MiCe, MiPa, RDCe, RDPa, RMOc.
300 to 500 ms, electrode(s): LMFr, RMFr, LMCe, RMCe, MiCe, MiPa, LDCe, RDCe, LDPa, RDPa, LMOc, RMOc, LLTe, RLTe, LLOc, RLOc, MiOc.
All significant p-values<=0.049200
Save GND variable as /Users/dgroppe/Documents/MATLAB/DEMO/Visodbl.GND? (y or n) y
GND variable successfully saved to disk.


The following figures should be produced as well. The first is a raster diagram. White squares indicate electrodes/time windows in which the ERPs to targets are significantly more positive.

The other two figures produced are the topographies of the mean amplitude of the target-standard difference wave in the two time windows analyzed. In one figure the topographies are in units of microvolts, in the other, they are shown as t-scores. White circles indicate electrodes for which the ERPs to targets and standards are significantly different.


To reproduce these figures without redoing the permutation test, enter the following:
>> sig_raster(GND,4);
>> sig_topo(GND,4,'units','uV');
>> sig_topo(GND,4,'units','t');

t-tests with control of the false discovery rate

Again, instead of using the tmax permutation procedure to deal with the large number of comparisons, you can use one of various methods from Benjamini and colleagues to control the false discovery rate using the function tfdrGND.m. To repeat the mean time window analysis performed with the tmax procedure above using the Benjamini and Yekutieli algorithm for control of the false discovery rate, enter the following:

>> GND=tfdrGND(GND,5,'time_wind',[150 250; 300 500],'mean_wind','yes','method','by');

Cluster-based permutation tests

One can also use a cluster mass permutation test on the mean ERP amplitudes in a single time window like so:

>> GND=clustGND(GND,5,'time_wind',[300 500],'chan_hood',.61,'thresh_p',.05,'mean_wind','yes');

Unfortunately, clustGND.m currently only supports the use of a single time window.


Sample methods description

To help you write up your own applications of these methods, below are example methods section descriptions of some of the analyses performed above:

tmax permutation test

For the time point by time point analysis performed here:

LONG VERSION:

    To detect reliable differences between the ERPs to standards and targets in the XO oddball task, the ERPs from these conditions were submitted to a repeated measures, two-tailed permutation test based on the tmax statistic (Blair & Karniski, 1993) using a family-wise alpha level of 0.05. All time points between 100 and 900 ms at all 26 scalp electrodes were included in the test (i.e., 5226 total comparisons). Repeated measures t-tests were performed for each comparison using the original data and 2500 random within-participant permutations of the data. The most extreme t-score in each of the 2501 sets of tests (i.e., the "tmax" of each set of tests) was recorded and used to estimate the tmax distribution of the null hypothesis (i.e., no difference between conditions*). Based on this estimate, critical t-scores of +/- 9.45 (df=7) were derived. In other words, any differences in the original data that exceeded a t-score of +/-9.45 were deemed reliable.
   This permutation test analysis was used in lieu of more conventional mean amplitude ANOVAs because it provides much better spatial and temporal resolution than conventional ANOVAs while maintaining a desired family-wise alpha level (i.e., it corrects for the large number of comparisons). Moreover, the tmax statistic was chosen for this permutation test because it has been shown to have relatively good power for data (like ERPs) whose dimensions are highly correlated (Hemmelman et al., 2004). 2500 permutations were used to estimate the distribution of the null hypothesis as it is over twice the number recommend by Manly (1997) for a family-wise alpha level of 0.05.

*Footnote: More specifically, the null hypothesis of the permutation test is that positive differences between conditions could have just as likely been negative differences and vice-versa. Thus, the distribution of the null hypothesis is symmetric around a difference of 0.

SHORT VERSION:

   To detect reliable differences between the ERPs to standards and targets in the XO oddball task, the ERPs from these conditions were submitted to a repeated measures, two-tailed permutation test based on the tmax statistic (Blair & Karniski, 1993) using a family-wise alpha level of 0.05. All time points between 100 and 900 ms at all 26 scalp electrodes were included in the test (i.e., 5226 total comparisons). 2500 random within-participant permutations of the data were used to estimate the distribution of the null hypothesis (i.e., no difference between conditions). Based on this estimate, critical t-scores of +/- 9.45 (df=7) were derived. In other words, any differences in the original data that exceeded a t-score of +/-9.45 were deemed reliable.

References:
  • Blair, R. C., & Karniski, W. (1993). An alternative method for significance testing of waveform difference potentials. Psychophysiology, 30(5), 518-524.
  • Hemmelmann, C., Horn, M., Reiterer, S., Schack, B., Susse, T., & Weiss, S. (2004). Multivariate tests for the evaluation of high-dimensional EEG data. Journal of Neuroscience Methods, 139(1), 111-120
  • Manly, B. F. J. (1997). Randomization, Bootstrap, and Monte Carlo Methods in Biology (2nd ed.). London: Chapman & Hall.


t-tests with control of the false discovery rate

For the time point by time point analysis performed here:

LONG VERSION:

    To detect reliable differences between the ERPs to standards and targets in the XO oddball task, the ERPs from these conditions were submitted to repeated measures, two-tailed t-tests at all time points between 100 and 900 ms at all 26 scalp electrodes (i.e., 5226 total comparisons). The Benjamini & Yekutieli (2001) procedure for control of the false discovery rate* (FDR) was applied to assess the significance of each test using an FDR level of 5%. This particular FDR procedure guarantees that the true FDR will be less than or equal to the nominal FDR level of 5% regardless of the dependency structure of the multiple tests.
   This massive univariate analysis was used in lieu of more conventional mean amplitude ANOVAs because it provides much better spatial and temporal resolution than conventional ANOVAs while maintaining reasonable limits on the likelihood of false discoveries (i.e., it provides a reasonable correction for the large number of comparisons).

*Footnote: The false discovery rate is the mean proportion of significant test results that are actually false discoveries (i.e., Type I errors).

SHORT VERSION:

    To detect reliable differences between the ERPs to standards and targets in the XO oddball task, the ERPs from these conditions were submitted to repeated measures, two-tailed t-tests at all time points between 100 and 900 ms at all 26 scalp electrodes (i.e., 5226 total comparisons). The Benjamini & Yekutieli (2001) procedure for control of the false discovery rate* (FDR) was applied to assess the significance of each test using an FDR level of 5%. This particular FDR procedure guarantees that the true FDR will be less than or equal to the nominal FDR level of 5% regardless of the dependency structure of the multiple tests.


References:
  • Benjamini, Y., & Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics, 29(4), 1165-1188.



cluster mass permutation test

For the time point by time point analysis performed here:

LONG VERSION:

    To detect reliable differences between the ERPs to standards and targets in the XO oddball task, the ERPs from these conditions were submitted to a repeated measures, two-tailed cluster-based permutation test based on the cluster mass statistic (Bullmore et al., 1999) using a family-wise alpha level of 0.05. All time points between 100 and 900 ms at all 26 scalp electrodes were included in the test (i.e., 5226 total comparisons). Repeated measures t-tests were performed for each comparison using the original data and 2500 random within-participant permutations of the data. For each permutation, all t-scores corresponding to uncorrected p-values of 0.05 of less were formed into clusters with any neighboring such t-scores. Electrodes within approximately 5.44 cm of one another were considered spatial neighbors and adjacent time points were considered temporal neighbors. The sum of the t-scores in each cluster is the "mass" of that cluster and the most extreme cluster mass in each of the 2501 sets of tests was recorded and used to estimate the distribution of the null hypothesis (i.e., no difference between conditions*). The permutation cluster mass percentile ranking of each cluster from the observed data was used to derive its p-value. The p-value of the cluster was assigned to each member of the cluster and t-scores that were not included in a cluster were given a p-value of 1.
   This permutation test analysis was used in lieu of more conventional mean amplitude ANOVAs because it provides much better spatial and temporal resolution than conventional ANOVAs while maintaining weak control of the family-wise alpha level (i.e., it corrects for the large number of comparisons). Moreover, the cluster mass statistic was chosen for this permutation test because it has been shown to have relatively good power for broadly distributed ERP effects like the P300 (Groppe et al., 2011; Maris & Oostenveld, 2007). 2500 permutations were used to estimate the distribution of the null hypothesis as it is over twice the number recommend by Manly (1997) for a family-wise alpha level of 0.05.

*Footnote: More specifically, the null hypothesis of the permutation test is that positive differences between conditions could have just as likely been negative differences and vice-versa. Thus, the distribution of the null hypothesis is symmetric around a difference of 0.

SHORT VERSION:

   To detect reliable differences between the ERPs to standards and targets in the XO oddball task, the ERPs from these conditions were submitted to a repeated measures, two-tailed cluster mass permutation test (Bullmore et al., 1999) using a family-wise alpha level of 0.05. All time points between 100 and 900 ms at all 26 scalp electrodes were included in the test (i.e., 5226 total comparisons) and any electrodes within approximately 5.44 cm of one another were considered spatial neighbors. Repeated measures t-tests were performed for each comparison using the original data and 2500 random within-participant permutations of the data. For each permutation, all t-scores corresponding to uncorrected p-values of 0.05 of less were formed into clusters. The sum of the t-scores in each cluster is the "mass" of that cluster and the most extreme cluster mass in each of the 2501 sets of tests was recorded and used to estimate the distribution of the null hypothesis.

References:
  • Bullmore, E. T., Suckling, J., Overmeyer, S., Rabe-Hesketh, S., Taylor, E., & Brammer, M. J. (1999). Global, voxel, and cluster tests, by theory and permutation, for a difference between two groups of structural MR images of the brain. IEEE Transactions on Medical Imaging, 18(1), 32-42
  • Groppe, D. M., Urbach, T. P., & Kutas, M. (2011). Mass univariate analysis of event-related brain potentials/fields II: Simulation studies. Psychophysiology.
  • Manly, B. F. J. (1997). Randomization, Bootstrap, and Monte Carlo Methods in Biology (2nd ed.). London: Chapman & Hall.
  • Maris, E., & Oostenveld, R. (2007). Nonparametric statistical testing of EEG- and MEG-data. Journal of Neuroscience Methods, 164(1), 177-190.



Testing for an interaction between two factors

If you have two binary factors you might be interested in detecting interactions between those conditions. You can do that with the code covered above. For example, as already mentioned, the GND variable used in the demonstration above contains ERPs from two different oddball tasks (an X vs. O discrimination and an upper case vs. lower case word discrimination). Thus we have two binary factors: oddball task type (XO or letter case) and stimulus type (target or standard). We can detect an interaction between those two factors by deriving a difference wave for each factor and then comparing the two difference waves. This is mathematically equivalent to performing a two-factor ANOVA with an interaction term.

To illustrate, we will continue with the above example. We already have a bin in our GND variable for the effect of stimulus type in the XO oddball task. Now, we need to make a difference wave for the effect of stimulus type in the letter case oddball task:
>>GND=bin_dif(GND,4,3,'Targ-Stand (Case)');

If we look at the letter case task oddball effect in the letter case task we, can see a clear P3b effect that is somewhat delayed relative to the XO task oddball effect:
>>gui_erp(GND,'bin',6);


Now we compute the difference between the oddball effects in the two oddball tasks:
>>GND=bin_dif(GND,5,6,'XO-Case (Target Effect)');

And we perform a tmax permutation test on that difference of difference waves:
>>GND=tmaxGND(GND,7,'time_wind',[100 900],'n_perm',5000);

This shows us that the oddball P2 effect begins earlier in the XO oddball task (around 200 ms) and that there is what appears to be an N2 effect in the XO oddball task that isn't there in the letter case oddball task (around 300 ms).




Summary of relevant functions

  • bin_dif: A function that creates a new bin in a GND variable that is the difference between two existing bins.
  • clustGND: Effectively performs one-sample/repeated-measures t-tests on multiple electrodes and multiple time points and then uses a cluster mass permutation test to assess statistical significance. This procedure provides weak control of the family-wise error rate.
  • decimateGND: Downsamples the ERPs in a GND variable to a lower sampling rate (e.g., from 250 Hz to 125 Hz). Useful for reducing the number of comparisons/increasing test power.
  • fast_t1: The function called by tfdrGND.m that computes one-sample/repeated measures t-tests.
  • gui_erp: Creates a GUI for exploring the ERPs in a GND (or GRP) variable. Visualizes ERP standard error of the mean, t-scores, and global field power as well. Useful for visualizing the results of mass univariate t-tests and exploring your data.
  • headinfo: Summarizes the contents of a GND (or GRP) variable.
  • mxt_perm1: The function called by tmaxGND.m that actually computes a one-sample/repeated measures tmax permutation test.
  • rm_bins: Removes one or more bins (and associated permutation test results) from a GND (or GRP) variable.
  • rm_ttest: Removes one or more permutation/FDR t-test test results from a GND (or GRP) variable.
  • save_erps: Saves a GND (or GRP) struct variable to disk.
  • sets2GND: Takes the data from a set of EEGLAB *.set files and bundles them together into a GND struct variable. Note, that you need to apply the function bin_info2EEG.m to each set file before you can run sets2GND on that set file.
  • sig_raster: Creates a two-dimensional binary "raster" diagram (Channel x Time) for visualizing the results of mass univariate tests.
  • sig_topo: Plots ERP/ERP-effect topographies for t-tests performed on ERPs/difference waves after averaging voltage within one or more time windows.
  • spatial_neighbors: Derives a symmetric binary matrix that indicates which electrodes are spatial neighbors of one another. For use with cluster-based permutation tests.
  • tfdrGND: Performs one-sample/repeated-measures t-tests on multiple electrodes and multiple time points. Significance thresholds are determined by using a procedure to control the false discovery rate. The function currently supports three procedures for FDR control: the original Benjamini & Hochberg algorithm (1995) and two variants of that method.
  • tmaxGND: Effectively performs one-sample/repeated-measures t-tests on multiple electrodes and multiple time points via a tmax permutation test. This procedure controls the family-wise error rate at a specified level (like Bonferroni correction).