User:Timothee Flutre/Notebook/Postdoc/2012/02/01: Difference between revisions

From OpenWetWare
(Autocreate 2012/02/01 Entry for User:Timothee_Flutre/Notebook/Postdoc)
 
(→‎Entry title: first version)
Line 6: Line 6:
| colspan="2"|
<!-- ##### DO NOT edit above this line unless you know what you are doing. ##### -->
==Entry title==
==Find SNPs in cis of genes==
* Insert content here...
 
* retrieve the Ensembl gene annotations (hg19) from UCSC:
 
wget -O Ensembl_hg19_UCSC_20111019.txt.gz ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/ensGene.txt.gz
 
* convert transcripts and genes to BED format:
 
zcat Ensembl_hg19_UCSC_20111019.txt.gz | awk '{print $3"\t"$5"\t"$6"\t"$13"|"$2}' | gzip > Ensembl_transcripts.bed.gz
transcripts2genes.py Ensembl_hg19_UCSC_20111019.txt.gz Ensembl_genes.bed.gz
 
* identify SNPs in cis of each gene (from 500 kb upstream of the TSS to 500 kb downstream of the TES):
 
for i in {1..22}; do echo "chr"${i}"..."; awk -v i=${i} -F" " '{print "chr"i"\t"$3-1"\t"$3"\t"$2}' /path/to/chr${i}.impute \
  | windowBed -w 500000 -a links-genes-probes.gz -b stdin \
  | awk '{print $4"\t"$9"|"$8}' \
  | gzip > chr${i}_genes_cisSNPs.txt.gz; done





Revision as of 16:46, 1 February 2012


Find SNPs in cis of genes

  • retrieve the Ensembl gene annotations (hg19) from UCSC:
wget -O Ensembl_hg19_UCSC_20111019.txt.gz ftp://hgdownload.cse.ucsc.edu/goldenPath/hg19/database/ensGene.txt.gz
  • convert transcripts and genes to BED format:
zcat Ensembl_hg19_UCSC_20111019.txt.gz | awk '{print $3"\t"$5"\t"$6"\t"$13"|"$2}' | gzip > Ensembl_transcripts.bed.gz
transcripts2genes.py Ensembl_hg19_UCSC_20111019.txt.gz Ensembl_genes.bed.gz
  • identify SNPs in cis of each gene (from 500 kb upstream of the TSS to 500 kb downstream of the TES):
for i in {1..22}; do echo "chr"${i}"..."; awk -v i=${i} -F" " '{print "chr"i"\t"$3-1"\t"$3"\t"$2}' /path/to/chr${i}.impute \
  | windowBed -w 500000 -a links-genes-probes.gz -b stdin \
  | awk '{print $4"\t"$9"|"$8}' \
  | gzip > chr${i}_genes_cisSNPs.txt.gz; done
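
The script transcripts2genes.py is not shown in this entry. As a rough idea of what such a step could look like, here is a minimal sketch that collapses transcripts into one interval per gene, from the smallest txStart to the largest txEnd. The function name transcripts_to_genes and the BED6 output layout are assumptions made for illustration and may differ from what the original script does. The column positions follow the UCSC ensGene table already used above ($2 transcript, $3 chromosome, $4 strand, $5 txStart, $6 txEnd, $13 gene).

```shell
# Hypothetical stand-in for transcripts2genes.py (the original script is not shown).
# Reads ensGene-style rows on stdin (tab-separated) and prints one BED6 line per
# gene and chromosome: chrom, min txStart, max txEnd, gene, nb of transcripts, strand.
transcripts_to_genes() {
  awk -F'\t' -v OFS='\t' '
    {
      # key on chromosome + gene, since a gene ID can also occur on alternate haplotype contigs
      key = $3 "\t" $13
      if (!(key in gene)) {            # first transcript seen for this gene
        gene[key] = $13; chrom[key] = $3; strand[key] = $4
        start[key] = $5 + 0; stop[key] = $6 + 0
      } else {                         # widen the span to cover this transcript
        if ($5 + 0 < start[key]) start[key] = $5 + 0
        if ($6 + 0 > stop[key])  stop[key]  = $6 + 0
      }
      n[key]++
    }
    END {
      for (key in gene)
        print chrom[key], start[key], stop[key], gene[key], n[key], strand[key]
    }' | sort -k1,1 -k2,2n
}
```

With the files from this entry, it could be used as: zcat Ensembl_hg19_UCSC_20111019.txt.gz | transcripts_to_genes | gzip > Ensembl_genes.bed.gz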