User:Timothee Flutre/Notebook/Postdoc/2012/01/22: Difference between revisions

From OpenWetWare
Jump to navigationJump to search
(→‎Entry title: first commit)
(→‎Entry title: add gnuplot cmd-line)
Line 34: Line 34:
  plot  "data.txt" using 2:3 title "Data", "" using 2:($3+0.1):1 with labels notitle
  plot  "data.txt" using 2:3 title "Data", "" using 2:($3+0.1):1 with labels notitle
  ## pause 9999
  ## pause 9999
We can obtain the new plot like this:
$ gnuplot < commands.gnuplot > newplot.ps; ps2pdf newplot.ps newplot.pdf


<!-- ##### DO NOT edit below this line unless you know what you are doing. ##### -->
<!-- ##### DO NOT edit below this line unless you know what you are doing. ##### -->

Revision as of 11:30, 22 January 2012

Project name <html><img src="/images/9/94/Report.png" border="0" /></html> Main project page
<html><img src="/images/c/c3/Resultset_previous.png" border="0" /></html>Previous entry<html>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</html>Next entry<html><img src="/images/5/5c/Resultset_next.png" border="0" /></html>

Entry title

  • Use EIGENSOFT to apply PCA on genotypes in order to check how individuals sampled in the study are related with each other.

Assuming that the genotypes are in the IMPUTE format, we need to convert them into the EIGENSTRAT format thanks to this script:

impute2eigenstrat.py -i genotypes_allchrs.impute.gz -o mystudy -l MyStudy

Now we can run the PCA:

smartpca.perl -i mystudy_eigenstrat_genotypes.txt -a mystudy_eigenstrat_snp.txt -b mystudy_eigenstrat_indiv.txt \
-o mystudy_eigenstrat_pca.txt -p mystudy_eigenstrat_plot -e mystudy_eigenstrat_eigenvalues.txt -l mystudy_eigenstrat_log.txt -s 6.0
  • If GNUPLOT is installed, we now have a plot of the first two principal components (in a pdf document). It may be useful to add the labels of the individuals on this plot. For this, one can modify the file "mystudy_eigenstrat_plot.xtxt" containing the commands for GNUPLOT.

As an example, let's assume that the output data file from EIGENSOFT containing the PCs is like this (3 individuals, 2 PCs):

$ cat data.txt
ind1	2.0	2.0
ind2	2.5	1.8
ind3	1.8	2.0

Here is the file with the commands:

$ cat commands.gnuplot
set terminal postscript color
set title  "" 
set xlabel  "eigenvector 1" 
set ylabel  "eigenvector 2" 
#plot  "mystudy_eigenstrat_pca.txt.evec" using 2:3 title "MyStudy"
plot  "data.txt" using 2:3 title "Data", "" using 2:($3+0.1):1 with labels notitle
## pause 9999

We can obtain the new plot like this:

$ gnuplot < commands.gnuplot > newplot.ps; ps2pdf newplot.ps newplot.pdf