User:Lindenb/Notebook/UMR915/20101210: Difference between revisions

From OpenWetWare
Revision as of 07:35, 10 December 2010



Belgium

 /usr/local/package/mosaik-aligner/bin/MosaikSort  -in align2.mka -out align2.sorted.mka
MosaikText -in align2.sorted.mka -bam sample2.bam

recalibrate with GATK:

create a subset of dbsnp_129.rod for my ranges
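One possible way to build that subset, as a rough sketch only. It assumes the .rod file is a tab-delimited UCSC-style dbSNP table dump with the chromosome in column 2 and 0-based start/end in columns 3 and 4; that layout is an assumption, not something recorded in this notebook. The function name `subset_rod` and its arguments are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: keep only the dbSNP rows that fall inside one range.
# ASSUMPTION: tab-delimited table with
#   column 2 = chromosome, column 3 = start, column 4 = end.
# Usage: subset_rod CHROM START END INPUT.rod > OUTPUT.rod
subset_rod() {
    awk -F'\t' -v chrom="$1" -v lo="$2" -v hi="$3" \
        '$2 == chrom && $4 >= lo && $3 <= hi' "$4"
}
```

Called, for instance, as `subset_rod chrX 1000 2000 dbsnp_129.rod > subset.rod` (coordinates and output name made up for illustration). The boundary test is deliberately loose, so a record touching either end of the range is kept.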

java -jar GenomeAnalysisTK.jar -I sample1.bam -R chrXXXXX.fa  -T CountCovariates -l INFO  -recalFile recal_data1.csv  -cov ReadGroupCovariate -cov QualityScoreCovariate  -cov CycleCovariate  -cov DinucCovariate --DBSNP dbsnp_129_chrXXXX.rod -U 

sample1:

INFO  15:31:10,995 TraversalEngine - [PROGRESS] Traversed 5,625,164 sites in 73.96 secs (13.15 secs per 1M sites) 
INFO  15:31:10,996 TraversalEngine - Total runtime 73.96 secs, 1.23 min, 0.02 hours 
INFO  15:31:10,998 TraversalEngine - 89 reads were filtered out during traversal out of 187197 total (0.05%) 
INFO  15:31:10,998 TraversalEngine -   -> 89 reads (0.05% of total) failing ZeroMappingQualityReadFilter

sample2:

What does recal_data1.csv look like?

# Counted Sites    4619191
# Counted Bases    80003987
# Skipped Sites    18142
# Fraction Skipped 1 / 255 bp
ReadGroup,QualityScore,Cycle,Dinuc,nObservations,nMismatches,Qempirical
ZDID8XTBKGO,7,106,AA,2,0,40
ZDID8XTBKGO,7,107,AT,2,0,40
ZDID8XTBKGO,7,107,NN,1,0,40
ZDID8XTBKGO,7,107,TT,5,0,40
ZDID8XTBKGO,7,108,AA,1,0,40
ZDID8XTBKGO,7,108,CT,2,0,40
ZDID8XTBKGO,7,108,GA,1,0,40
ZDID8XTBKGO,7,108,GT,1,0,40
ZDID8XTBKGO,7,108,TA,1,1,1
ZDID8XTBKGO,7,108,TT,1,0,40
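
To get a feel for the table, the rows can be pooled per reported quality score and an empirical quality recomputed from the counts. This is only a sketch: the column order comes from the header above, while the formula (-10·log10(mismatches/observations), capped at 40) is an assumption that happens to agree with the zero-mismatch rows shown here, not a statement of how GATK computes `Qempirical`. The function name `summarise_recal` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: pool a CountCovariates CSV by reported quality score.
# Columns, from the header shown in the file:
#   ReadGroup,QualityScore,Cycle,Dinuc,nObservations,nMismatches,Qempirical
# Output (tab-separated): QualityScore, observations, mismatches, empirical Q
# ASSUMPTION: empirical Q = -10*log10(mismatches/observations), capped at 40.
summarise_recal() {
    awk -F',' '
        # skip "#" comment lines, the header, and anything malformed
        /^#/ || /^ReadGroup,/ || NF < 7 { next }
        { obs[$2] += $5; mis[$2] += $6 }
        END {
            for (q in obs) {
                if (mis[q] > 0)
                    emp = -10 * log(mis[q] / obs[q]) / log(10)
                else
                    emp = 40          # no mismatch observed: use the cap
                if (emp > 40) emp = 40
                printf "%s\t%d\t%d\t%.2f\n", q, obs[q], mis[q], emp
            }
        }
    ' "$1" | sort -n
}
```

Run as `summarise_recal recal_data1.csv`. On just the ten rows excerpted above, quality 7 pools to 17 observations and 1 mismatch, i.e. about Q12.3; the full file will of course give different numbers.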


calling with GATK

 java -jar GenomeAnalysisTK.jar -I sample1.bam -R chrXXXXXXXXXXXXX.fa  -T UnifiedGenotyper -o sample1.vcf  -stand_call_conf 50.0 -U -S SILENT -L "chrX:x1-x2" 

or with a list of positions:

java -jar GenomeAnalysisTK.jar  -I sample1.bam -R chrXXXXXXXXXXXXX.fa  -T UnifiedGenotyper -o sample1.vcf  -stand_call_conf 50.0 -U -S SILENT -L ranges.list
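
A sketch of how ranges.list might be produced, assuming the ranges start out as a BED file (0-based, half-open) and that the list should hold one `chr:start-end` interval per line in 1-based inclusive coordinates, the same notation as the quoted `-L` argument above. Both the BED input and the function name `bed_to_ranges` are hypothetical; the notebook does not say how ranges.list was made.

```shell
#!/bin/sh
# Hypothetical helper: turn BED rows into "chr:start-end" interval lines.
# ASSUMPTION: input is BED (0-based start, end exclusive); output is
# 1-based inclusive, so only the start is shifted by one.
# Usage: bed_to_ranges INPUT.bed > ranges.list
bed_to_ranges() {
    awk -F'\t' '
        # skip BED header lines, comments, and short rows
        /^track/ || /^browser/ || /^#/ || NF < 3 { next }
        { printf "%s:%d-%d\n", $1, $2 + 1, $3 }
    ' "$1"
}
```

For example, `bed_to_ranges my_ranges.bed > ranges.list`, where my_ranges.bed is a made-up file name.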