User:Lindenb/Notebook/UMR915/20110222: Difference between revisions
From OpenWetWare
Line 61: | Line 61: | ||
for the Align param ( http://code.google.com/p/mosaik-aligner/wiki/ParameterSettings ) | for the Align param ( http://code.google.com/p/mosaik-aligner/wiki/ParameterSettings ) | ||
==Align== | |||
<pre>/usr/local/package/mosaik-aligner/bin/MosaikAligner -mmp 0.05 -act 26 -bw 51 -mhp 100 -p 7 -in ../reads1.mkb -ia _ignore.backup/hg19/reference.dat -j _ignore.backup/hg19/reference_hs15 -out align1.hg19.mka | |||
------------------------------------------------------------------------------ | |||
MosaikAligner 1.1.0018 2010-10-29 | |||
Michael Stromberg & Wan-Ping Lee Marth Lab, Boston College Biology Department | |||
------------------------------------------------------------------------------ | |||
- Using the following alignment algorithm: all positions | |||
- Using the following alignment mode: aligning reads to all possible locations | |||
- Using a maximum mismatch percent threshold of 0.05 | |||
- Using 7 processors | |||
- Using a Smith-Waterman bandwidth of 51 | |||
- Using an alignment candidate threshold of 26bp. | |||
- Setting hash position threshold to 100 | |||
- Using a jump database for hashing. Storing keys & positions in memory. | |||
- Using a homo-polymer gap open penalty of 4 | |||
- loading reference sequence... finished. | |||
- loading jump key database into memory... finished. | |||
- loading jump positions database into memory... finished. | |||
Aligning read library (317992): | |||
100%[============================================================================================================================] 83.7 reads/s in 1:03:18 | |||
Alignment statistics (mates): | |||
=================================== | |||
# filtered out: 19943 ( 6.3 %) | |||
# unique: 288449 ( 90.7 %) | |||
# non-unique: 9600 ( 3.0 %) | |||
----------------------------------- | |||
total: 317992 | |||
total aligned: 298049 ( 93.7 %) | |||
MosaikAligner CPU time: 26744.380 s, wall time: 4015.994 s | |||
/usr/local/package/mosaik-aligner/bin/MosaikAligner -mmp 0.05 -act 26 -bw 51 -mhp 100 -p 7 -in ../reads2.mkb -ia _ignore.backup/hg19/reference.dat -j _ignore.backup/hg19/reference_hs15 -out align2.hg19.mka | |||
------------------------------------------------------------------------------ | |||
MosaikAligner 1.1.0018 2010-10-29 | |||
Michael Stromberg & Wan-Ping Lee Marth Lab, Boston College Biology Department | |||
------------------------------------------------------------------------------ | |||
- Using the following alignment algorithm: all positions | |||
- Using the following alignment mode: aligning reads to all possible locations | |||
- Using a maximum mismatch percent threshold of 0.05 | |||
- Using 7 processors | |||
- Using a Smith-Waterman bandwidth of 51 | |||
- Using an alignment candidate threshold of 26bp. | |||
- Setting hash position threshold to 100 | |||
- Using a jump database for hashing. Storing keys & positions in memory. | |||
- Using a homo-polymer gap open penalty of 4 | |||
- loading reference sequence... finished. | |||
- loading jump key database into memory... finished. | |||
- loading jump positions database into memory... finished. | |||
Aligning read library (563204): | |||
100%[============================================================================================================================] 131.1 reads/s in 1:11:35 | |||
Alignment statistics (mates): | |||
=================================== | |||
# filtered out: 28749 ( 5.1 %) | |||
# unique: 519486 ( 92.2 %) | |||
# non-unique: 14969 ( 2.7 %) | |||
----------------------------------- | |||
total: 563204 | |||
total aligned: 534455 ( 94.9 %) | |||
MosaikAligner CPU time: 30219.450 s, wall time: 4500.977 s</pre> |
Revision as of 06:07, 22 February 2011
Belgium
re-aligning on hg19 . see Makefile in
/GENOTYPAGE/data/users/lindenb/20101108_belgium/20110222
/usr/local/package/mosaik-aligner/bin/MosaikBuild -fr /GENOTYPAGE/data/pubdb/ucsc/hg19/chromosomes/hg19.fa -oa _ignore.backup/hg19/reference.dat ------------------------------------------------------------------------------ MosaikBuild 1.1.0018 2010-10-29 Michael Stromberg & Wan-Ping Lee Marth Lab, Boston College Biology Department ------------------------------------------------------------------------------ - converting /GENOTYPAGE/data/pubdb/ucsc/hg19/chromosomes/hg19.fa to a reference sequence archive. - parsing reference sequences: ref seqs: 25 (0.2884 ref seqs/s) - writing reference sequences: 100%[===========================================================================================================] 0.8177 ref seqs/s in 30 s - calculating MD5 checksums: 100%[===========================================================================================================] 1.43 ref seqs/s in 17 s - writing reference sequence index: 100%[===========================================================================================================] 25.0 ref seqs/s in 1 s - creating concatenated reference sequence: 100%[===========================================================================================================] 7.13 ref seqs/s in 3 s - writing concatenated reference sequence... finished. - creating concatenated 2-bit reference sequence... finished. - writing concatenated 2-bit reference sequence... finished. - writing masking vector... finished. MosaikBuild CPU time: 187.420 s, wall time: 200.587 s /usr/local/package/mosaik-aligner/bin/MosaikJump -ia _ignore.backup/hg19/reference.dat -hs 15 -out _ignore.backup/hg19/reference_hs15 ------------------------------------------------------------------------------ MosaikJump 1.1.0018 2010-10-29 Michael Stromberg Marth Lab, Boston College Biology Department ------------------------------------------------------------------------------ - retrieving reference sequence... finished. - hashing reference sequence: 100%[=============================================================================================================] 2,301,657 hashes/s in 22:24 - serializing final sorting vector... finished. - writing jump positions database: 100%[=====================================================================================================] 1,423,918 hash positions/s in 33:29 - serializing jump keys database (17 blocks): blocks: 17 (0.9169 blocks/s) MosaikJump CPU time: 3327.950 s, wall time: 3405.920 s
mean length for sample 1 is 385.282
gunzip -c 1.TCA.454Reads.fna.gz | egrep ">" | tr " " "\n" | grep length | cut -d '=' -f 2 | awk '{n+=1.0; total+=int($1)*1.0;} END { print total/n;}'
using
-mmp 0.05 -act 26 -bw 51 -mhp 100
for the Align param ( http://code.google.com/p/mosaik-aligner/wiki/ParameterSettings )
Align
/usr/local/package/mosaik-aligner/bin/MosaikAligner -mmp 0.05 -act 26 -bw 51 -mhp 100 -p 7 -in ../reads1.mkb -ia _ignore.backup/hg19/reference.dat -j _ignore.backup/hg19/reference_hs15 -out align1.hg19.mka ------------------------------------------------------------------------------ MosaikAligner 1.1.0018 2010-10-29 Michael Stromberg & Wan-Ping Lee Marth Lab, Boston College Biology Department ------------------------------------------------------------------------------ - Using the following alignment algorithm: all positions - Using the following alignment mode: aligning reads to all possible locations - Using a maximum mismatch percent threshold of 0.05 - Using 7 processors - Using a Smith-Waterman bandwidth of 51 - Using an alignment candidate threshold of 26bp. - Setting hash position threshold to 100 - Using a jump database for hashing. Storing keys & positions in memory. - Using a homo-polymer gap open penalty of 4 - loading reference sequence... finished. - loading jump key database into memory... finished. - loading jump positions database into memory... finished. Aligning read library (317992): 100%[============================================================================================================================] 83.7 reads/s in 1:03:18 Alignment statistics (mates): =================================== # filtered out: 19943 ( 6.3 %) # unique: 288449 ( 90.7 %) # non-unique: 9600 ( 3.0 %) ----------------------------------- total: 317992 total aligned: 298049 ( 93.7 %) MosaikAligner CPU time: 26744.380 s, wall time: 4015.994 s /usr/local/package/mosaik-aligner/bin/MosaikAligner -mmp 0.05 -act 26 -bw 51 -mhp 100 -p 7 -in ../reads2.mkb -ia _ignore.backup/hg19/reference.dat -j _ignore.backup/hg19/reference_hs15 -out align2.hg19.mka ------------------------------------------------------------------------------ MosaikAligner 1.1.0018 2010-10-29 Michael Stromberg & Wan-Ping Lee Marth Lab, Boston College Biology Department ------------------------------------------------------------------------------ - Using the following alignment algorithm: all positions - Using the following alignment mode: aligning reads to all possible locations - Using a maximum mismatch percent threshold of 0.05 - Using 7 processors - Using a Smith-Waterman bandwidth of 51 - Using an alignment candidate threshold of 26bp. - Setting hash position threshold to 100 - Using a jump database for hashing. Storing keys & positions in memory. - Using a homo-polymer gap open penalty of 4 - loading reference sequence... finished. - loading jump key database into memory... finished. - loading jump positions database into memory... finished. Aligning read library (563204): 100%[============================================================================================================================] 131.1 reads/s in 1:11:35 Alignment statistics (mates): =================================== # filtered out: 28749 ( 5.1 %) # unique: 519486 ( 92.2 %) # non-unique: 14969 ( 2.7 %) ----------------------------------- total: 563204 total aligned: 534455 ( 94.9 %) MosaikAligner CPU time: 30219.450 s, wall time: 4500.977 s