RRedon:Protocols/Variation pipeline/Reference genome
From OpenWetWare
Latest revision as of 08:27, 2 June 2010
Download
Download the hg18 (build 36) reference genome from UCSC: http://hgdownload.cse.ucsc.edu/goldenPath/hg18/chromosomes
 export http_proxy=${PXYHOST}:${PXYPORT}
 wget "http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/chromFa.zip"
 md5sum chromFa.zip
 7fc7f751134f3800f646118e39f9991d chromFa.zip
 ## OK, same as http://hgdownload.cse.ucsc.edu/goldenPath/hg18/bigZips/md5sum.txt
 unzip chromFa.zip
 ls chr*.fa | grep -v _hap | xargs cat > hg18.fa
 rm -f chr*.fa
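The checksum comparison above is done by eye. A minimal sketch of the same check as a shell function follows; the function name `verify_md5` is hypothetical and not part of the original protocol, and it assumes GNU `md5sum` is on the PATH.

```shell
# verify_md5 FILE EXPECTED_MD5
# Hypothetical helper: prints "OK" and returns 0 when the MD5 of FILE
# equals EXPECTED_MD5, otherwise prints a message on stderr and returns 1.
verify_md5() {
    vm_file="$1"
    vm_expected="$2"
    # md5sum prints "<hash>  <file>"; keep only the hash.
    vm_actual=$(md5sum "$vm_file" | awk '{print $1}')
    if [ "$vm_actual" = "$vm_expected" ]; then
        echo "OK: $vm_file"
    else
        echo "MISMATCH: $vm_file (got $vm_actual, expected $vm_expected)" >&2
        return 1
    fi
}

# Usage with the checksum quoted in this page:
#   verify_md5 chromFa.zip 7fc7f751134f3800f646118e39f9991d && unzip chromFa.zip
```

Chaining `unzip` with `&&` means the archive is only unpacked when the checksum matches.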
Indexing
MAQ
(See the main article for MAQ.)
 maq fasta2bfa hg18.fa hg18.bfa
 (...)
 -- 45 sequences have been converted
 ls -lah
 -rw-r--r-- 1 root root 1,5G jun 2 17:13 hg18.bfa
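MAQ reports that 45 sequences were converted. As a cross-check, the number of records in the concatenated FASTA can be counted directly. This is a sketch only; `count_fasta_records` is a hypothetical helper name, and it assumes every record starts with a `>` header line.

```shell
# count_fasta_records FILE
# Hypothetical helper: prints the number of FASTA header lines
# (lines beginning with '>') in FILE.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Usage:
#   count_fasta_records hg18.fa
# The result should agree with the number of sequences MAQ reports.
```

A mismatch would suggest that the concatenation step picked up more or fewer `chr*.fa` files than intended.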
BWA
(See the main article for BWA.)
Index the reference genome:
 bwa index -a bwtsw hg18.fa
 (....)
 [bwt_gen] Finished constructing BWT in 311 iterations.
 [bwa_index] 2229.02 seconds elapse.
 [bwa_index] Update BWT... 15.79 sec
 [bwa_index] Update reverse BWT... 15.97 sec
 [bwa_index] Construct SA from BWT and Occ... 1001.58 sec
 [bwa_index] Construct SA from reverse BWT and Occ... 987.96 sec
 ls -la
 -rw-r--r-- 1 root root 3,0G jun 2 14:47 hg18.fa
 -rw-r--r-- 1 root root 123K jun 2 15:15 hg18.fa.amb
 -rw-r--r-- 1 root root 1,8K jun 2 15:15 hg18.fa.ann
 -rw-r--r-- 1 root root 1,1G jun 2 16:30 hg18.fa.bwt
 -rw-r--r-- 1 root root 739M jun 2 15:15 hg18.fa.pac
 -rw-r--r-- 1 root root 1,1G jun 2 16:30 hg18.fa.rbwt
 -rw-r--r-- 1 root root 739M jun 2 15:15 hg18.fa.rpac
 -rw-r--r-- 1 root root 370M jun 2 17:03 hg18.fa.rsa
 -rw-r--r-- 1 root root 370M jun 2 16:47 hg18.fa.sa
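Since indexing takes a long time, it can be useful to confirm that all index files were written before starting an alignment. The sketch below checks for the extensions shown in the listing above. The name `check_bwa_index` is hypothetical, and the extension list simply mirrors this listing; other BWA versions may write a different set of files.

```shell
# check_bwa_index PREFIX
# Hypothetical helper: returns 0 if every index file seen in the listing
# above exists and is non-empty for PREFIX (e.g. hg18.fa), 1 otherwise.
check_bwa_index() {
    cbi_prefix="$1"
    cbi_missing=0
    # Extension list copied from the directory listing in this page.
    for cbi_ext in amb ann bwt pac rbwt rpac rsa sa; do
        if [ ! -s "$cbi_prefix.$cbi_ext" ]; then
            echo "missing or empty: $cbi_prefix.$cbi_ext" >&2
            cbi_missing=1
        fi
    done
    return "$cbi_missing"
}

# Usage:
#   check_bwa_index hg18.fa && echo "index complete"
```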
Samtools
 samtools faidx hg18.fa
This creates the index file:
 hg18.fa.fai
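The `.fai` index is a tab-separated text file with one line per sequence: name, length, byte offset, bases per line, and bytes per line. A short sketch for summarising it follows; `fai_summary` is a hypothetical helper name, not a samtools command.

```shell
# fai_summary FILE.fai
# Hypothetical helper: prints the number of sequences and the total
# number of bases recorded in a samtools .fai index.
fai_summary() {
    # Column 2 holds the sequence length. %.0f is used for the total so
    # that genome-sized sums are not limited by 32-bit integer formatting
    # in some awk implementations.
    awk -F '\t' '{ n++; total += $2 }
                 END { printf "%d sequences, %.0f bases\n", n, total }' "$1"
}

# Usage:
#   fai_summary hg18.fa.fai
```

The sequence count printed here can be compared with the count reported by `maq fasta2bfa`.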