=Short read toolbox=
This page was created to help list resources for working with high-throughput (next-generation) sequencing data. You can also check out our individual lab pages, [[Cronn Lab]] or [[Liston:Lab]], to see updates on methods.
=Disambiguation=
Short read toolbox may refer to:
*[http://brianknaus.com Short read toolbox] - The website of Brian J. Knaus.
*[[Short read toolbox Botany2010]] - Resources provided at the Botany 2010 conference.
*[[Short read toolbox Botany2012]] - Resources provided at the Botany 2012 conference.

=Online short-read resources=
*[http://seqanswers.com/ SEQanswers] - Online forum for next generation sequencing.
*[http://seqanswers.com/forums/showthread.php?t=43 SEQanswers software post] - Post of software available for next generation sequence data.
*[http://seqanswers.com/wiki/Category:Bioinformatics_application SEQwiki] - SEQ Answers wikilist of bioinformatic applications.
*[http://pathogenomics.bham.ac.uk/blog/2009/09/tips-for-de-novo-bacterial-genome-assembly/ De novo tips] - Blog on de novo assembly.
*[http://genome.ucsc.edu/index.html UCSC Bioinformatics] - UC Santa Cruz's bioinformatics server.
*[http://www.phylo.org/ Cipres] - Cipres.
*[http://gmod.org/wiki/Main_Page GMOD] - Generic model organism database (GMOD) project collection of tools.
*[ftp://ftp.illumina.com/ Illumina Manuals] username: guest password: illumina
=Links to Companies Developing Next-Generation and Third-Generation Sequencing Technologies=
*[http://www.illumina.com/ Illumina] - Illumina
*[http://www.454.com/ 454] - 454/Roche
*[http://www.appliedbiosystems.com/absite/us/en/home/applications-technologies/solid-next-generation-sequencing.html SOLiD] - ABI by Life Technologies
*[http://www.iontorrent.com/ Ion Torrent Semiconductor] - Ion Torrent
*[http://www.pacificbiosciences.com/ SMRT] - Pacific BioSciences
*[http://www.nanoporetech.com/ Nanopore] - Oxford Nanopore Technologies
=List of sequence format information=
*[http://brianknaus.com/software/srtoolbox/shortread.html Short Read Toolbox] - Descriptions and examples of qseq, scarf, fastq and fasta formats. Includes scripts to translate these formats to the fastq format standard.
*[http://en.wikipedia.org/wiki/FASTQ_format FASTQ] - Wikipedia's FASTQ page.
*[http://en.wikipedia.org/wiki/FASTA_format FASTA] - Wikipedia's FASTA page.
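To make the relationship between the fastq and fasta formats listed above concrete, here is a minimal Python sketch that converts FASTQ records to FASTA. It is an illustration only, not one of the Short Read Toolbox scripts. It assumes each record occupies exactly four lines (identifier, sequence, separator, quality) with no line-wrapped sequences, and the function name <code>fastq_to_fasta</code> is hypothetical.

```python
import io


def fastq_to_fasta(fastq_handle, fasta_handle):
    """Convert four-line FASTQ records to FASTA; return the number of records written.

    Assumption: sequences and qualities are not wrapped across lines.
    """
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # End of input.
        header = header.strip()
        if not header:
            continue  # Skip blank lines between records or at the end of the file.
        if not header.startswith("@"):
            raise ValueError(f"Expected '@' at start of record, got: {header!r}")
        sequence = fastq_handle.readline().strip()
        fastq_handle.readline()  # Separator line ('+'), discarded.
        fastq_handle.readline()  # Quality line, discarded (FASTA stores no qualities).
        fasta_handle.write(f">{header[1:]}\n{sequence}\n")
        count += 1
    return count


if __name__ == "__main__":
    # In-memory demonstration with two made-up reads.
    fastq = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\n!!!!\n")
    fasta = io.StringIO()
    print(fastq_to_fasta(fastq, fasta), "records converted")
    print(fasta.getvalue(), end="")
```

The same function works on real files by passing handles from <code>open()</code> instead of the in-memory buffers used in the demonstration.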
=List of alignment format information=
*[http://samtools.sourceforge.net/ SAMtools] - SAMtools.
*[http://sourceforge.net/apps/mediawiki/amos/index.php?title=AMOS AMOS] - AMOS.
*[http://genome.ucsc.edu/FAQ/FAQformat.html UCSC] - UCSC's FAQ on file formats.
=List of short-read quality control software=
*[http://www.science.oregonstate.edu/~dolanp/tileqc/index.html TileQC] - Requires R, RMySQL and MySQL.
*[http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/ FastQC] - A quality control tool for high throughput sequence data. A Java application.
*[http://brianknaus.com/software/srtoolbox/shortread.html Short Read Toolbox] - Scripts for quality control of Illumina data.
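As background for the quality control tools above, the following Python sketch shows how a FASTQ quality string can be decoded into Phred scores. It is not taken from any of the listed tools, and the function names are hypothetical. It assumes the Phred+33 (Sanger) encoding by default; older Illumina pipelines used an offset of 64, so check which encoding your data uses before relying on the numbers.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into a list of integer Phred scores.

    Assumption: offset 33 (Sanger encoding). Pass offset=64 for older
    Illumina data that uses the Phred+64 encoding.
    """
    return [ord(char) - offset for char in quality]


def mean_quality(quality, offset=33):
    """Return the arithmetic mean Phred score of one read."""
    scores = phred_scores(quality, offset)
    if not scores:
        raise ValueError("Empty quality string")
    return sum(scores) / len(scores)


def error_probability(phred):
    """Convert a Phred score Q to an error probability, P = 10^(-Q/10)."""
    return 10 ** (-phred / 10)


if __name__ == "__main__":
    quality = "IIII5555!!"  # Made-up quality string for illustration.
    print(phred_scores(quality))
    print(mean_quality(quality))
    print(error_probability(20))
```

A simple filter could, for example, discard reads whose <code>mean_quality</code> falls below a threshold you choose. The appropriate threshold depends on the data and the downstream analysis.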
=List of open source de novo assemblers=
*[http://www.ebi.ac.uk/~zerbino/velvet/ Velvet] - Implements de Bruijn graphs in C. Requires 64-bit Linux OS.
*[http://www.genomic.ch/edena.php Edena] - 32- and 64-bit Linux.
*[http://www.bcgsc.ca/platform/bioinfo/software/abyss ABySS] - Multi-threaded de novo assembly.
*[http://sourceforge.net/apps/mediawiki/denovoassembler/index.php?title=Main_Page Ray] - Multi-threaded de novo assembly.
*[http://qsra.cgrb.oregonstate.edu/ QSRA] - Utilizes quality scores.
=List of open source reference-guided assemblers=
*[http://soap.genomics.org.cn/index.html SOAP] - Short Oligonucleotide Analysis Package.
*[http://maq.sourceforge.net/ MAQ] - Mapping and Assembly with Qualities.
*[http://bowtie-bio.sourceforge.net/index.shtml Bowtie] - An ultrafast, memory-efficient short read aligner.
*[http://bio-bwa.sourceforge.net/ BWA] - Burrows-Wheeler aligner.
*[http://rga.cgrb.oregonstate.edu/ RGA] - Perl script that calls BLAT to assemble short reads.
=Hybrid assemblers (reference-guided & de novo)=
*[http://www.bx.psu.edu/miller_lab/ YASRA] - Yet Another Short Read Aligner.
*[http://etda.libraries.psu.edu/theses/approved/WorldWideIndex/ETD-4698/index.html Aakrosh Ratan dissertation] - Description of YASRA.
*[[Liston:Computer_Scripts]] - Scripts for post-processing of YASRA contigs.
=List of assembly viewers=
*[http://bioinf.scri.ac.uk/tablet/ Tablet] - Visualizes ACE, AFG, MAQ, SOAP, SAM and BAM formats.
*[http://samtools.sourceforge.net/ SAMtools] - SAMtools.
=List of alignment programs=
*[http://align.bmr.kyushu-u.ac.jp/mafft/online/server/ MAFFT] - MAFFT.
*[http://www.ebi.ac.uk/Tools/t-coffee/index.html T-Coffee] - T-Coffee.
*[http://www.ebi.ac.uk/Tools/muscle/index.html Muscle] - Muscle.
*[http://www.bx.psu.edu/miller_lab/ LASTZ] - LASTZ, hosted at the Miller lab.
*[http://mummer.sourceforge.net/ MUMmer] - MUMmer.
*[http://mulan.dcode.org/ Mulan] - Multiple sequence alignment and visualization tool.
*[http://genome.lbl.gov/vista/ VISTA] - Tools for comparative genomics.
*[http://asap.ahabs.wisc.edu/software/mauve/ Mauve] - Multiple (bacterial) genome alignment.
=List of nucleotide sequence query programs=
*[http://blast.ncbi.nlm.nih.gov/Blast.cgi BLAST] - BLAST.
*[http://genome.ucsc.edu/goldenPath/help/hgTracksHelp.html#BLATAlign BLAT] - BLAT.
=Linux=
*[[media:Essential_Linux.pdf | Essential Linux]]
=Perl=
A very brief example demonstrating file input/output: it reads a FASTQ file and writes each four-line record as a single tab-delimited line.

Code:<br>
<pre>
#!/usr/bin/perl
use strict;
use warnings;

my (@temp, $in, $out);
my $inf = "data.fq";       # Input FASTQ file.
my $outf = "data_out.fq";  # Output file, one tab-delimited record per line.

open($in, "<", $inf) or die "Can't open $inf: $!";
open($out, ">", $outf) or die "Can't open $outf: $!";

# Read the input four lines at a time, one FASTQ record per iteration.
while(<$in>){
  chomp($temp[0]=$_);    # First line is the identifier (begins with '@').
  chomp($temp[1]=<$in>); # Second line is the sequence.
  chomp($temp[2]=<$in>); # Third line is the separator ('+', optionally followed by the identifier).
  chomp($temp[3]=<$in>); # Fourth line is the quality string.
  print $out join("\t", @temp)."\n";
}

# Report the file name rather than the filehandle if closing fails.
close $in or die "Can't close $inf: $!";
close $out or die "Can't close $outf: $!";
</pre>
*[http://perldoc.perl.org/perlintro.html perlintro] - Introduction to perl with links to other documentation.
*[http://www.bioperl.org/wiki/HOWTO:Beginners BioPerl beginners] - Introduction to BioPerl (be prepared for object oriented code).
=Python=
*[http://docs.python.org/tutorial/ Python tutorial]
*[http://biopython.org/wiki/Biopython Biopython]
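For comparison with the Perl file input/output example on this page, here is a rough Python equivalent using only the standard library. It is a sketch under the same assumption as the Perl script (every FASTQ record is exactly four lines), and the function name <code>flatten_fastq</code> is hypothetical.

```python
def flatten_fastq(in_path, out_path):
    """Write each four-line FASTQ record as one tab-delimited line.

    Returns the number of records written. Reading stops at end of file
    or at the first record whose identifier line is empty.
    """
    count = 0
    with open(in_path) as fin, open(out_path, "w") as fout:
        while True:
            # Identifier, sequence, separator ('+'), quality.
            record = [fin.readline().rstrip("\n") for _ in range(4)]
            if not record[0]:
                break  # End of input (or a blank identifier line).
            fout.write("\t".join(record) + "\n")
            count += 1
    return count
```

Calling <code>flatten_fastq("data.fq", "data_out.fq")</code> would mirror the file names used in the Perl example. The <code>with</code> statement closes both files automatically, which replaces the explicit <code>close</code> calls in the Perl version.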
=R project=
*[http://www.r-project.org/ R project] - Statistical programming environment.
*[http://www.bioconductor.org/ Bioconductor] - R for biologists (microarray and next generation data).
*[http://ape.mpl.ird.fr/ APE] - Analysis of phylogenetics and evolution R package.
*[http://manuals.bioinformatics.ucr.edu/home/ht-seq/ HT Sequence Analysis with R and Bioconductor]
=Useful links=
*[[User:Brian J. Knaus]]
*[[Cronn Lab]]
*[[Liston:Lab | Liston Lab]]