User:Timothee Flutre/Notebook/Postdoc/2013/12/01: Difference between revisions

From OpenWetWare
| colspan="2"|
<!-- ##### DO NOT edit above this line unless you know what you are doing. ##### -->
==One-liners for high-throughput sequencing data==

* '''software''': see BWA, Bowtie, MOSAIK, etc.; these aligners take fastq files as input and return SAM/BAM files as output
    IN1="reads_R1.fastq.gz"
    IN2="reads_R2.fastq.gz"
    OUT="alignments.bam"
* '''total number of reads in the fastq file''': each read takes 4 lines
    nbLines=$(zcat "$IN1" | wc -l); echo $((nbLines / 4))
* '''total number of reads in the bam file''': counts unique read names, so the two mates of a pair usually count once; should equal the number of reads in $IN1 if no filtering was done
    samtools view "$OUT" | cut -f1 | sort -u | wc -l
* '''flag statistics in SAM/BAM''':
    samtools flagstat $OUT
which returns something like:
    4635834 + 0 in total (QC-passed reads + QC-failed reads)
    20290 + 0 secondary
    0 + 0 supplimentary
    0 + 0 duplicates
    4443270 + 0 mapped (95.85%:-nan%)
    4615544 + 0 paired in sequencing
    2307772 + 0 read1
    2307772 + 0 read2
    4299122 + 0 properly paired (93.14%:-nan%)
    4412810 + 0 with itself and mate mapped
    10170 + 0 singletons (0.22%:-nan%)
    57898 + 0 with mate mapped to a different chr
    44330 + 0 with mate mapped to a different chr (mapQ>=5)
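The counts in this output can also be pulled out with awk, for instance to derive the number of unmapped entries as total minus mapped. This is only a sketch: the variable names (<code>flagstat_out</code>, <code>total</code>, <code>mapped</code>, <code>unmapped</code>) are made up for the example, and the sample text is the first five lines of the output above rather than a live call to <code>samtools flagstat</code>. Matching on the label instead of the line number is a choice made here because the line order may differ between samtools versions.

```shell
# First five lines of the flagstat output shown above, stored verbatim.
# With a real file one would use: flagstat_out=$(samtools flagstat "$OUT")
flagstat_out='4635834 + 0 in total (QC-passed reads + QC-failed reads)
20290 + 0 secondary
0 + 0 supplimentary
0 + 0 duplicates
4443270 + 0 mapped (95.85%:-nan%)'

# Total number of entries: the line containing "in total".
total=$(printf '%s\n' "$flagstat_out" | awk '/in total/ {print $1}')

# Mapped entries: the line whose 4th field is exactly "mapped"
# (this skips lines such as "with itself and mate mapped").
mapped=$(printf '%s\n' "$flagstat_out" | awk '$4 == "mapped" {print $1}')

# Unmapped entries = total - mapped.
unmapped=$(( total - mapped ))
echo "total=$total mapped=$mapped unmapped=$unmapped"
```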
* '''total number of entries in the bam file''': same as line 1 of the flagstat output; it includes secondary alignments, so it can exceed the number of reads
    samtools view $OUT | wc -l
* '''list of different flags in the bam file''': along with their number of occurrences
    samtools view $OUT | cut -f2 | sort | uniq -c
* '''total number of mapped entries''': same as line 5 of the flagstat output
    samtools view -F 4 $OUT | wc -l
* '''total number of unmapped entries''': same as line 1 minus line 5 of the flagstat output
    samtools view -f 4 $OUT | wc -l
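The numeric flags listed by the <code>cut -f2 | sort | uniq -c</code> one-liner, and the <code>4</code> passed to <code>-F</code>/<code>-f</code>, are bit fields defined by the SAM specification. As a rough aid for reading them, here is a small shell function that splits a flag into its bits. The function name <code>decode_flag</code> is hypothetical, and the labels are short names chosen for this sketch (recent samtools versions also provide a <code>flags</code> subcommand for the same purpose).

```shell
# decode_flag: print the names of the bits set in a SAM flag (hypothetical helper).
# Bits are listed in increasing order: 0x1, 0x2, 0x4, ..., 0x800.
decode_flag() {
    flag=$1
    out=""
    bit=1
    for name in PAIRED PROPER_PAIR UNMAP MUNMAP REVERSE MREVERSE \
                READ1 READ2 SECONDARY QCFAIL DUP SUPPLEMENTARY; do
        # Test whether the current bit is set in the flag.
        if [ $(( flag & bit )) -ne 0 ]; then
            out="${out:+$out,}$name"
        fi
        bit=$(( bit * 2 ))
    done
    echo "$out"
}

decode_flag 4     # the bit tested by "samtools view -f 4"
decode_flag 99    # a typical flag for the first read of a properly paired fragment
```

For example, flag 99 is 1 + 2 + 32 + 64, i.e. paired, properly paired, mate on the reverse strand, first read of the pair.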


<!-- ##### DO NOT edit below this line unless you know what you are doing. ##### -->

Revision as of 06:48, 1 July 2015
