RRedon:Protocols/Variation pipeline/Reference genome
Download
Download hg18 (NCBI build 36) from UCSC: http://hgdownload.cse.ucsc.edu/goldenPath/hg18/chromosomes
export http_proxy=${PXYHOST}:${PXYPORT}
wget "http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/chromFa.zip"
md5sum chromFa.zip
7fc7f751134f3800f646118e39f9991d chromFa.zip
## OK, same as http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/md5sum.txt
unzip chromFa.zip
ls chr*.fa | grep -v _hap | xargs cat > hg18.fa
rm -f chr*.fa
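The manual checksum comparison above could be scripted. The following is a minimal sketch, assuming GNU md5sum is available; verify_md5 is a hypothetical helper name, not part of any tool used here.

```shell
# verify_md5 FILE EXPECTED_DIGEST
# Hypothetical helper: succeeds only if the MD5 of FILE equals EXPECTED_DIGEST.
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<digest>  <filename>"; keep only the digest
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}
```

For example, verify_md5 chromFa.zip 7fc7f751134f3800f646118e39f9991d && unzip chromFa.zip would unzip the archive only when the digest matches the value quoted above.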
Indexing
MAQ
(main article for MAQ).
maq fasta2bfa hg18.fa hg18.bfa
(...)
-- 45 sequences have been converted
ls -lah
-rw-r--r-- 1 root root 1,5G jun 2 17:13 hg18.bfa
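The sequence count reported by maq can be cross-checked against the FASTA file itself, since every FASTA record begins with a header line starting with '>'. A small sketch; count_fasta_records is a hypothetical helper name.

```shell
# count_fasta_records FILE
# Hypothetical helper: prints the number of records in a FASTA file
# by counting the header lines (those starting with '>').
count_fasta_records() {
    grep -c '^>' "$1"
}
```

Assuming maq converts every record, count_fasta_records hg18.fa should agree with the number of sequences in the maq message above.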
BWA
(main article for BWA).
Index the reference genome:
bwa index -a bwtsw hg18.fa
(....)
[bwt_gen] Finished constructing BWT in 311 iterations.
[bwa_index] 2229.02 seconds elapse.
[bwa_index] Update BWT... 15.79 sec
[bwa_index] Update reverse BWT... 15.97 sec
[bwa_index] Construct SA from BWT and Occ... 1001.58 sec
[bwa_index] Construct SA from reverse BWT and Occ... 987.96 sec
ls -la
-rw-r--r-- 1 root root 3,0G jun 2 14:47 hg18.fa
-rw-r--r-- 1 root root 123K jun 2 15:15 hg18.fa.amb
-rw-r--r-- 1 root root 1,8K jun 2 15:15 hg18.fa.ann
-rw-r--r-- 1 root root 1,1G jun 2 16:30 hg18.fa.bwt
-rw-r--r-- 1 root root 739M jun 2 15:15 hg18.fa.pac
-rw-r--r-- 1 root root 1,1G jun 2 16:30 hg18.fa.rbwt
-rw-r--r-- 1 root root 739M jun 2 15:15 hg18.fa.rpac
-rw-r--r-- 1 root root 370M jun 2 17:03 hg18.fa.rsa
-rw-r--r-- 1 root root 370M jun 2 16:47 hg18.fa.sa
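Because indexing takes a long time, it may be worth confirming that every index file is present before starting alignments. The sketch below assumes the set of extensions shown in the listing above; other BWA versions may write a different set of files, so the list would need adjusting. check_bwa_index is a hypothetical helper name.

```shell
# check_bwa_index PREFIX
# Hypothetical helper: succeeds only if every expected index file exists
# and is non-empty. The extension list is copied from the listing above
# and is an assumption for any other BWA version.
check_bwa_index() {
    prefix=$1
    missing=0
    for ext in amb ann bwt pac rbwt rpac rsa sa; do
        if [ ! -s "$prefix.$ext" ]; then
            echo "missing or empty: $prefix.$ext" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Usage: check_bwa_index hg18.fa && echo "index complete"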
Samtools
samtools faidx hg18.fa
This creates the index file:
hg18.fa.fai
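The .fai index is a tab-separated text file with one line per sequence; the first column is the sequence name and the second its length in bases. As a rough sanity check, the lengths can be summed with awk. fai_total_length is a hypothetical helper name.

```shell
# fai_total_length FILE.fai
# Hypothetical helper: prints the sum of column 2 (sequence length)
# over all lines of a samtools .fai index.
fai_total_length() {
    # printf with %.0f avoids exponent notation for multi-gigabase totals
    awk -F '\t' '{ total += $2 } END { printf "%.0f\n", total }' "$1"
}
```

Usage: fai_total_length hg18.fa.fai prints the total number of bases indexed, and cut -f 1,2 hg18.fa.fai lists each sequence name with its length.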