# User:Timothee Flutre/Notebook/Postdoc/2011/11/16


## Revision as of 16:43, 16 November 2011


## Entry title

• Try the R/Bioconductor package [snpStats](http://www.bioconductor.org/packages/devel/bioc/html/snpStats.html):
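```
library(snpStats)

# Toy genotypes: 5 SNPs (rows) x 2 individuals (columns),
# coded 1=AA, 2=AB, 3=BB and 0=no call
tmp <- matrix(c(1,3,2,1,3,0,1,3,0,1), ncol=2, dimnames=list(paste("snp", 1:5, sep=""), paste("ind", 1:2, sep="")))
tmp

# SnpMatrix stores individuals in rows and SNPs in columns, hence t()
tmp2 <- new("SnpMatrix", t(tmp))
tmp2
summary(tmp2)

# Convert back to character and numeric matrices
print(as(t(tmp2), 'character'))
print(as(t(tmp2), 'numeric'))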
```

Unfortunately, it does not seem possible to convert a character matrix of genotype calls directly into a SnpMatrix, even though the numeric encoding used above is simply 1=AA, 2=AB, 3=BB and 0=NC (no call):

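```
# Same toy data, but as character genotype calls ("" = no call)
tmp <- matrix(c("A/A","B/B","A/B","A/A","B/B","","A/A","B/B","","A/A"), ncol=2, dimnames=list(paste("snp", 1:5, sep=""), paste("ind", 1:2, sep="")))
tmp
# Attempted direct conversion (does not work)
tmp2 <- new("SnpMatrix", t(tmp))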
```

Thus, when the genotype matrix comes from an Illumina platform (with calls written either as AA or as A/A), it must first be converted to the 1/2/3/0 encoding:

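```
# Recode the calls (the patterns below assume the "A/A" form)
tmp <- gsub("A/A", 1, tmp)
tmp <- gsub("A/B", 2, tmp)
tmp <- gsub("B/B", 3, tmp)
tmp <- gsub("^$", 0, tmp)  # empty string = no call
# Switch from character to numeric, keeping the dimension names
tmp <- matrix(as.numeric(tmp), ncol=ncol(tmp), dimnames=list(rownames(tmp), colnames(tmp)))
tmp
tmp2 <- new("SnpMatrix", t(tmp))
tmp2
summary(tmp2)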
```
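The recoding above can also be wrapped in a small function that accepts both the AA and the A/A forms. This is only a sketch in base R: the name `geno2code` is hypothetical (it is not part of snpStats), and it assumes that any call other than AA, AB or BB (empty string, `NA`, or anything else) should be treated as a no call.

```r
# Hypothetical helper (not part of snpStats): recode a character matrix of
# genotype calls into the 1/2/3/0 encoding (1=AA, 2=AB, 3=BB, 0=no call).
geno2code <- function(calls) {
  # Drop the optional "/" separator so that "A/A" and "AA" are treated alike
  stripped <- gsub("/", "", as.vector(calls), fixed = TRUE)
  lookup <- c(AA = 1, AB = 2, BB = 3)
  codes <- unname(lookup[stripped])  # lookup by name; unknown calls give NA
  codes[is.na(codes)] <- 0           # assumption: anything else is a no call
  matrix(codes, nrow = nrow(calls), ncol = ncol(calls),
         dimnames = dimnames(calls))
}

# Same toy data as above: 5 SNPs (rows) x 2 individuals (columns)
calls <- matrix(c("A/A","B/B","A/B","A/A","B/B","","A/A","B/B","","A/A"),
                ncol = 2,
                dimnames = list(paste("snp", 1:5, sep = ""),
                                paste("ind", 1:2, sep = "")))
codes <- geno2code(calls)
codes
```

The resulting numeric matrix can then be passed to `new("SnpMatrix", t(codes))` as before.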

Then, one can easily look at per-SNP summary statistics, e.g. the histogram of minor allele frequencies (MAF) or of the z-scores for Hardy-Weinberg equilibrium (HWE), and filter the data accordingly:

```
# col.summary() returns one row of statistics per SNP
hist(col.summary(tmp2)$MAF)
hist(col.summary(tmp2)$z.HWE)
```
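As a minimal sketch of the filtering step, the function below flags the SNPs to keep from a table with `MAF` and `z.HWE` columns, such as the one returned by `col.summary()`. The function name `snps_to_keep` and the default thresholds (MAF of at least 0.05, absolute HWE z-score of at most 4) are illustrative assumptions, not recommendations from snpStats. A toy data frame stands in for the real summary so that the example runs without the package.

```r
# Hypothetical helper: flag SNPs that pass simple MAF and HWE filters.
# `stats` is assumed to have columns MAF and z.HWE, like col.summary() output.
# The default thresholds are arbitrary and only serve as an illustration.
snps_to_keep <- function(stats, min_maf = 0.05, max_abs_z = 4) {
  ok <- stats$MAF >= min_maf & abs(stats$z.HWE) <= max_abs_z
  ok[is.na(ok)] <- FALSE  # SNPs with missing statistics are dropped
  ok
}

# Toy summary table standing in for col.summary(tmp2)
toy <- data.frame(MAF   = c(0.30, 0.01, 0.25, 0.00),
                  z.HWE = c(0.5, -1.2, 5.3, NA),
                  row.names = paste("snp", 1:4, sep = ""))
keep <- snps_to_keep(toy)
rownames(toy)[keep]
```

With a real SnpMatrix, the same logical vector could be used to subset the columns, e.g. `tmp2[, snps_to_keep(col.summary(tmp2))]`.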