Short read toolbox: Difference between revisions

From OpenWetWare
 
(21 intermediate revisions by 3 users not shown)
=Short read toolbox=


This page has been created to help list resources for working with next generation sequence data.
This page was created to help list resources for working with high throughput sequencing data. You can also check out our individual lab pages, [[Cronn Lab]] or [[Liston:Lab]], to see updates on methods.


 
=Disambiguation=
=Online short-read resources=
Short read toolbox may refer to:
*[http://seqanswers.com/ SEQanswers] - Online forum for next generation sequencing.
* [http://brianknaus.com Short read toolbox] - The website of Brian J. Knaus.
*[http://seqanswers.com/forums/showthread.php?t=43 SEQanswers software post] - Post of software available for next generation sequence data.
*[[Short read toolbox Botany2010]] - Resources provided at the Botany 2010 conference.
*[http://seqanswers.com/wiki/Category:Bioinformatics_application SEQwiki] - SEQ Answers wikilist of bioinformatic applications.
*[[Short read toolbox Botany2012]] - Resources provided at the Botany 2012 conference.
*[http://pathogenomics.bham.ac.uk/blog/2009/09/tips-for-de-novo-bacterial-genome-assembly/ De novo tips] - Blog on de novo assembly.
*[http://genome.ucsc.edu/index.html UCSC Bioinformatics] - UC Santa Cruz's bioinformatics server.
*[http://www.phylo.org/ Cipres] - Cipres.
*[http://gmod.org/wiki/Main_Page GMOD] - Generic model organism database (GMOD) project collection of tools.
*[ftp://ftp.illumina.com/ Illumina Manuals] - Username: guest, password: illumina.
 
=Companies developing next-generation and third generation sequencing technologies=
*[http://www.illumina.com/ Illumina] - Illumina
*[http://www.454.com/ 454] - 454/Roche
*[http://www.appliedbiosystems.com/absite/us/en/home/applications-technologies/solid-next-generation-sequencing.html SOLiD] - ABI by Life Technologies
*[http://www.iontorrent.com/ Ion Torrent Semiconductor] - Ion Torrent
*[http://www.pacificbiosciences.com/ SMRT] - Pacific BioSciences
*[http://www.nanoporetech.com/ Nanopore] - Oxford Nanopore Technologies
 
=Sequence format information=
*[http://brianknaus.com/software/srtoolbox/shortread.html Short Read Toolbox] - Descriptions and examples of qseq, scarf, fastq and fasta formats. Includes scripts to translate these formats to the fastq format standard.
*[http://en.wikipedia.org/wiki/FASTQ_format FASTQ] - Wikipedia's FASTQ page.
*[http://en.wikipedia.org/wiki/FASTA_format FASTA] - Wikipedia's FASTA page.
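
The main practical difference between these FASTQ flavours is the ASCII offset used to encode quality scores: the Sanger standard uses Phred+33, while older Illumina pipelines (1.3 to 1.7) used Phred+64. The following is a minimal sketch of the decoding and re-encoding step, not the code from the Short Read Toolbox scripts. The function names <code>phred_scores</code> and <code>to_sanger</code> are made up for illustration.

```python
def phred_scores(qual, offset=33):
    """Decode a FASTQ quality string into a list of integer Phred scores.

    offset=33 is the Sanger standard; offset=64 is assumed here for
    older Illumina (1.3 to 1.7) quality strings.
    """
    return [ord(char) - offset for char in qual]


def to_sanger(qual, offset=64):
    """Re-encode a quality string from the given offset to Sanger Phred+33."""
    return "".join(chr(ord(char) - offset + 33) for char in qual)
```

For example, <code>to_sanger("hh@@")</code> re-encodes a Phred+64 string so that it can be read by tools expecting the Sanger standard.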
 
=Alignment format information=
*[http://samtools.sourceforge.net/ SAMtools] - SAMtools.
*[http://sourceforge.net/apps/mediawiki/amos/index.php?title=AMOS AMOS] - AMOS.
*[http://genome.ucsc.edu/FAQ/FAQformat.html UCSC] - UCSC's faq on file formats.
 
=Short-read quality control software=
*[http://www.science.oregonstate.edu/~dolanp/tileqc/index.html TileQC] - Requires R, RMySQL and MySQL.
*[http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/ FastQC] - A quality control tool for high throughput sequence data. A Java application.
*[http://brianknaus.com/software/srtoolbox/shortread.html Short Read Toolbox] - Scripts for quality control of Illumina data.
 
=Open source de novo assemblers=
*[http://www.ebi.ac.uk/~zerbino/velvet/ Velvet] - Implements de Bruijn graphs in C. Requires a 64-bit Linux OS.
*[http://www.genomic.ch/edena.php Edena] - 32- and 64-bit Linux.
*[http://www.bcgsc.ca/platform/bioinfo/software/abyss ABySS] - Multi-threaded de novo assembly.
*[http://sourceforge.net/apps/mediawiki/denovoassembler/index.php?title=Main_Page Ray] - Multi-threaded de novo assembly.
 
*[http://qsra.cgrb.oregonstate.edu/ QSRA] - Utilizes quality scores.
 
=Open source reference guided assemblers=
*[http://soap.genomics.org.cn/index.html SOAP] - Short Oligonucleotide Analysis Package.
*[http://maq.sourceforge.net/ MAQ] - Mapping and Assembly with Qualities.
*[http://bowtie-bio.sourceforge.net/index.shtml Bowtie] - Bowtie. An ultrafast, memory-efficient short read aligner.
*[http://bio-bwa.sourceforge.net/ BWA] - Burrows-Wheeler aligner.
*[http://rga.cgrb.oregonstate.edu/ RGA] - Perl script which calls blat to assemble short reads.
 
=Hybrid assemblers (reference guided & de novo)=
*[http://www.bx.psu.edu/miller_lab/ YASRA] - Yet Another Short Read Aligner.
*[http://etda.libraries.psu.edu/theses/approved/WorldWideIndex/ETD-4698/index.html Aakrosh Ratan dissertation] - Description of YASRA.
*[[Liston:Computer_Scripts]] - scripts for post-processing of YASRA contigs.
 
=Assembly viewers=
*[http://bioinf.scri.ac.uk/tablet/ Tablet] - Tablet, visualizes ACE, AFG, MAQ, SOAP, SAM and BAM formats.
*[http://samtools.sourceforge.net/ SAMtools] - SAMtools.
 
=Alignment programs=
*[http://align.bmr.kyushu-u.ac.jp/mafft/online/server/ MAFFT] - MAFFT.
*[http://www.ebi.ac.uk/Tools/t-coffee/index.html T-Coffee] - T-Coffee.
*[http://www.ebi.ac.uk/Tools/muscle/index.html Muscle] - Muscle.
*[http://www.bx.psu.edu/miller_lab/ LASTZ] - LASTZ, hosted at the Miller lab.
*[http://mummer.sourceforge.net/ MUMmer] - MUMmer.
*[http://mulan.dcode.org/ Mulan] - Multiple sequence alignment and visualization tool.
*[http://genome.lbl.gov/vista/ VISTA] - Tools for comparative genomics.
*[http://asap.ahabs.wisc.edu/software/mauve/ Mauve] - Multiple (bacterial) genome alignment.
 
=Nucleotide sequence query programs=
*[http://blast.ncbi.nlm.nih.gov/Blast.cgi BLAST] - BLAST.
*[http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#BLATAlign BLAT] - BLAT.
 
=Linux=
*[[media:Essential_Linux.pdf | Essential Linux]]
 
=Perl=
A very brief example to demonstrate file input/output.
 
Code:<br>
<pre>
#!/usr/bin/perl
use strict;
use warnings;
my (@temp, $in, $out);
my $inf = "data.fq";
my $outf = "data_out.fq";
open($in, "<", $inf) or die "Can't open $inf: $!";
open($out, ">", $outf) or die "Can't open $outf: $!";
while(<$in>){
  chomp($temp[0]=$_); # First line is an identifier (starts with '@').
  chomp($temp[1]=<$in>); # Second line is sequence.
  chomp($temp[2]=<$in>); # Third line is a separator ('+', optionally followed by the identifier).
  chomp($temp[3]=<$in>); # Fourth line is quality.
  print $out join("\t", @temp)."\n"; # Write one tab-delimited record per line.
}
close $in or die "Can't close $inf: $!";
close $out or die "Can't close $outf: $!";
</pre>
*[http://perldoc.perl.org/perlintro.html perlintro] - Introduction to Perl with links to other documentation.
*[http://www.bioperl.org/wiki/HOWTO:Beginners BioPerl beginners] - Introduction to BioPerl (be prepared for object-oriented code).
 
=Python=
*[http://docs.python.org/tutorial/ Python tutorial]
*[http://biopython.org/wiki/Biopython Biopython]
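
A very brief example to demonstrate file input/output in Python, mirroring the Perl example above. It is only a sketch: it assumes every FASTQ record occupies exactly four lines (no wrapped sequences), and the function names <code>fastq_records</code> and <code>fastq_to_tab</code> are made up for illustration.

```python
from itertools import islice


def fastq_records(handle):
    """Yield (identifier, sequence, separator, quality) from an open FASTQ file.

    Assumes four lines per record; a trailing partial record raises an error.
    """
    while True:
        # Take the next four lines and strip their newlines.
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return  # End of file.
        if len(record) < 4:
            raise ValueError("truncated FASTQ record: %r" % (record,))
        yield tuple(record)


def fastq_to_tab(inf, outf):
    """Write each four-line record of inf as one tab-delimited line in outf.

    Returns the number of records written.
    """
    count = 0
    with open(inf) as src, open(outf, "w") as dst:
        for record in fastq_records(src):
            dst.write("\t".join(record) + "\n")
            count += 1
    return count
```

Calling <code>fastq_to_tab("data.fq", "data_out.fq")</code> uses the same file names as the Perl example. The <code>with</code> statement closes both files even if an error occurs part way through.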
 
=R project=
*[http://www.r-project.org/ R project] - Statistical programming environment.
*[http://www.bioconductor.org/ Bioconductor] - R for biologists (micro-array and next generation data).
*[http://ape.mpl.ird.fr/ APE] - Analysis of phylogenetics and evolution R package.
*[http://manuals.bioinformatics.ucr.edu/home/ht-seq/ HT Sequence Analysis with R and Bioconductor]
 
=Useful links=
*[[User:Brian J. Knaus]]
*[[Cronn Lab]]
*[[Liston:Lab | Liston Lab]]

Latest revision as of 14:38, 7 June 2012

This page was created to help list resources for working with high throughput sequencing data. You can also check out our individual lab pages, Cronn Lab or Liston:Lab, to see updates on methods.

Disambiguation

Short read toolbox may refer to: