RRedon:Protocols/Variation pipeline/Reference genome

From OpenWetWare


Download

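Download the hg18 (NCBI build 36) human reference genome from UCSC. Per-chromosome files are listed at http://hgdownload.cse.ucsc.edu/goldenPath/hg18/chromosomes; the commands below instead fetch the single archive chromFa.zip from bigZips, verify its MD5 checksum, and concatenate every sequence except the haplotype (_hap) ones into hg18.fa: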

 export http_proxy=${PXYHOST}:${PXYPORT}
 wget "http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/chromFa.zip"
 md5sum chromFa.zip
 7fc7f751134f3800f646118e39f9991d  chromFa.zip ##OK same as http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/md5sum.txt
 unzip  chromFa.zip
 ls chr*.fa | grep -v _hap | xargs cat > hg18.fa
 rm -f chr*.fa
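
The checksum comparison and a sanity check on the concatenated file can be scripted rather than compared by eye. The following is a minimal sketch, not part of the original protocol: the function names verify_md5 and count_fasta_records are hypothetical helpers introduced here, and it assumes the md5sum, cut and grep utilities are available.

```shell
# Hypothetical helper: succeed only if the MD5 digest of a file
# matches the expected hexadecimal digest.
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<digest>  <filename>"; keep the digest only.
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Hypothetical helper: count the sequences in a FASTA file by
# counting its header lines (lines starting with ">").
count_fasta_records() {
    grep -c '^>' "$1"
}
```

A possible use would be `verify_md5 chromFa.zip 7fc7f751134f3800f646118e39f9991d && unzip chromFa.zip`, so that extraction only runs when the digest matches the one published in md5sum.txt. After concatenation, `count_fasta_records hg18.fa` should presumably agree with the number of sequences that maq reports when converting the file.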

Indexing

MAQ

(main article for MAQ).

 maq fasta2bfa hg18.fa hg18.bfa

(...)

 -- 45 sequences have been converted
 
 ls -lah
 -rw-r--r-- 1 root root 1,5G jun  2 17:13 hg18.bfa

BWA

(main article for BWA).

Index the reference genome:

  bwa index -a bwtsw hg18.fa

(....)

  [bwt_gen] Finished constructing BWT in 311 iterations.
  [bwa_index] 2229.02 seconds elapse.
  [bwa_index] Update BWT... 15.79 sec
  [bwa_index] Update reverse BWT... 15.97 sec
  [bwa_index] Construct SA from BWT and Occ... 1001.58 sec
  [bwa_index] Construct SA from reverse BWT and Occ... 987.96 sec
  ls -lah
  -rw-r--r-- 1 root root 3,0G jun  2 14:47 hg18.fa
  -rw-r--r-- 1 root root 123K jun  2 15:15 hg18.fa.amb
  -rw-r--r-- 1 root root 1,8K jun  2 15:15 hg18.fa.ann
  -rw-r--r-- 1 root root 1,1G jun  2 16:30 hg18.fa.bwt
  -rw-r--r-- 1 root root 739M jun  2 15:15 hg18.fa.pac
  -rw-r--r-- 1 root root 1,1G jun  2 16:30 hg18.fa.rbwt
  -rw-r--r-- 1 root root 739M jun  2 15:15 hg18.fa.rpac
  -rw-r--r-- 1 root root 370M jun  2 17:03 hg18.fa.rsa
  -rw-r--r-- 1 root root 370M jun  2 16:47 hg18.fa.sa
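
Because indexing takes a long time, it can be worth checking that every index file was written before starting an alignment. The sketch below is an illustration only: check_bwa_index is a hypothetical helper, and its extension list is copied from the listing above, so other BWA versions may write a different set of files.

```shell
# Hypothetical helper: check that each BWA index file listed on this
# page exists and is non-empty for the given reference prefix.
check_bwa_index() {
    prefix=$1
    missing=0
    # Extensions taken from the directory listing above (assumption:
    # the installed BWA version writes the same set).
    for ext in amb ann bwt pac rbwt rpac rsa sa; do
        if [ ! -s "$prefix.$ext" ]; then
            echo "missing or empty: $prefix.$ext" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For the files shown above, the call would be `check_bwa_index hg18.fa`, which returns a non-zero status and names the offending file if anything is absent.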

Samtools

 samtools faidx hg18.fa

This creates the index file:

 hg18.fa.fai
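
The .fai index is a tab-delimited text file with one line per sequence: name, length in bases, byte offset of the first base, bases per line, and bytes per line. It can therefore be inspected with standard text tools. As a minimal sketch (the function name fai_total_length is a hypothetical helper, and awk is assumed to be available), the total reference length can be summed from the second column:

```shell
# Hypothetical helper: sum the sequence lengths (column 2) of a
# samtools .fai index and print the total number of bases.
fai_total_length() {
    # printf "%.0f" avoids exponent notation for multi-gigabase totals.
    awk -F '\t' '{ total += $2 } END { printf "%.0f\n", total }' "$1"
}
```

For example, `fai_total_length hg18.fa.fai` prints the combined length of all indexed sequences, and `cut -f 1,2 hg18.fa.fai` lists each sequence name with its length. Once the index exists, a region can be extracted with a command such as `samtools faidx hg18.fa chr1:1-100`.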