RRedon:Protocols/Variation pipeline/BWA

From OpenWetWare
Latest revision as of 00:49, 9 June 2010



Download

Indexing the reference sequence

See main article: Reference genome

Map

  • Align each FASTQ file separately

-l INT: take the first INT bases of each read as the seed

-q INT: quality threshold for read trimming

-t INT: number of threads (10 on server 2)

 bwa aln -l 32 -q 15 -t 10 -f output1.aln hg18.fasta file1.fastq.gz
 bwa aln -l 32 -q 15 -t 10 -f output2.aln hg18.fasta file2.fastq.gz
  • Generate alignments in the SAM format given paired-end reads. Repetitive read pairs will be placed randomly.
 bwa sampe hg18.fasta output1.aln output2.aln file1.fastq.gz file2.fastq.gz | gzip --best >  output.sam.gz
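
The two alignment calls and the pairing call above can be generated from one small helper. This is a minimal sketch, not part of the original protocol: the function name map_pair_cmds, the REF and THREADS variables and the output-prefix naming are assumptions made for illustration. It only prints the commands so they can be reviewed before anything is run.

```shell
# Print the mapping commands for one read pair.
# $1, $2: gzipped FASTQ files; $3: output prefix (hypothetical naming scheme)
# REF and THREADS are illustrative variables; defaults follow the values on this page.
map_pair_cmds() {
    fq1=$1; fq2=$2; out=$3
    ref=${REF:-hg18.fasta}
    threads=${THREADS:-10}
    n=1
    # one "bwa aln" call per FASTQ file
    for fq in "$fq1" "$fq2"; do
        echo "bwa aln -l 32 -q 15 -t $threads -f ${out}${n}.aln $ref $fq"
        n=$((n + 1))
    done
    # pair the two alignments and compress the SAM output
    echo "bwa sampe $ref ${out}1.aln ${out}2.aln $fq1 $fq2 | gzip --best > ${out}.sam.gz"
}

# Review the printed commands first; append "| sh" to actually run them.
map_pair_cmds file1.fastq.gz file2.fastq.gz output
```

Because the commands are passed around as plain text, this sketch assumes file names without spaces or shell metacharacters.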


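  • Convert to BAM (-b writes BAM, -S declares SAM input; the SAM file from the previous step is gzipped, so it is decompressed first)
 gunzip -c output.sam.gz | samtools view -bS - > output.bam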


← Fix this! Is this what I used? Alternatively, use the reference genome index built by samtools (the .fai file):

 samtools import hg18.fa.fai output.sam output.bam
 samtools sort output.bam output.bam.sorted
 samtools index output.bam.sorted.bam
  • Sort the BAM file (writes sorted_prefix.bam)
 samtools sort output.bam sorted_prefix
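
For comparison, the conversion, sorting and indexing steps can be written with the option syntax of samtools 1.x, where sort takes an output file via -o instead of an output prefix. This is a sketch under that assumption (samtools 1.x is newer than the commands on this page); the function name bam_cmds and the file naming are hypothetical. As a precaution it prints the commands rather than running them.

```shell
# Print the commands that turn the gzipped SAM into a sorted, indexed BAM.
# Assumes samtools 1.x option syntax, not the older syntax used on this page.
# $1: prefix shared by <prefix>.sam.gz and the BAM files (hypothetical naming)
bam_cmds() {
    p=$1
    # -b writes BAM; "-" reads the decompressed SAM from standard input
    echo "gunzip -c $p.sam.gz | samtools view -b -o $p.bam -"
    # coordinate sort; -o names the output file
    echo "samtools sort -o $p.sorted.bam $p.bam"
    # writes the index $p.sorted.bam.bai next to the BAM
    echo "samtools index $p.sorted.bam"
}

# Review the printed commands first; append "| sh" to actually run them.
bam_cmds output
```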

Compute insert-size statistics from the alignments, e.g. take the 99.8th percentile of the observed insert sizes as the maximum insert size for MAQ. ← Fix this! What does that mean in practice?
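
One possible reading of the note above is to take the signed insert size that SAM stores in column 9 and report a high percentile of its distribution. The sketch below follows that reading and is an assumption, not the method actually used for this pipeline. The function name insert_size_percentile is hypothetical, only reads flagged as properly paired (flag bit 0x2) are counted, and the percentile uses the simple nearest-rank definition.

```shell
# Print the p-th percentile (nearest-rank method) of the insert sizes in a SAM stream.
# $1: percentile, e.g. 99.8; SAM text is read from standard input.
insert_size_percentile() {
    # Skip header lines; keep properly paired reads (flag bit 0x2 set) with a
    # positive insert size, so that each pair is counted exactly once.
    awk -F'\t' '!/^@/ && int($2 / 2) % 2 == 1 && $9 > 0 { print $9 }' |
    sort -n |
    awk -v p="$1" '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1          # no usable pairs
            r = p / 100 * NR             # fractional rank
            k = int(r); if (k < r) k++   # round up to the next integer
            if (k < 1) k = 1
            print v[k]
        }'
}

# Example use (not run here):
#   gunzip -c output.sam.gz | insert_size_percentile 99.8
```

The whole distribution is held in memory by the second awk call, which is fine for a sketch but may be worth replacing with a histogram for very large files.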