Cronn Lab:Informatics
Revision as of 23:18, 12 September 2011
Informatics infrastructure
Many of our computational needs are met by dedicated nodes on the high-performance computing cluster at Oregon State University's Center for Genome Research and Biocomputing. We currently own the following resources:
- pine1 - The original. Dual quad core 2.66 GHz Intel processors with 32 GB of RAM.
- pine2 - Dual quad core 2.13 GHz Intel processors with 96 GB of RAM.
- smokey - 20 TB RAID system.
These systems currently run a 64-bit version of Red Hat Enterprise Linux.
Solexa barcode sorting
Most of our Solexa runs use multiplex massively parallel sequencing (MMPS). Because these micro-reads include a sample-specific barcode (as well as the quality-control 'T'), the first step is to sort the reads by barcode and then remove the barcode. We do this with a custom Perl script.
- Nucl. Acids. Res. - Article describing barcoding.
- Short read toolbox - Includes barcode sorting script.
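The sorting step described above can be sketched roughly as follows. This is a minimal Python illustration, not the lab's Perl script: it assumes four-line FASTQ records, a barcode at the 5' end of each read immediately followed by the single 'T', and exact barcode matching. The function names, the sample names, and the example barcodes are all hypothetical.

```python
from collections import defaultdict


def parse_fastq(lines):
    """Yield (header, sequence, quality) tuples from four-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines between or after records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line, not needed here
        qual = next(it).rstrip("\n")
        yield header.rstrip("\n"), seq, qual


def sort_by_barcode(records, barcodes, linker="T"):
    """Bin reads by their 5' barcode and trim the barcode plus linker base.

    `barcodes` maps a sample name to its barcode sequence (hypothetical
    layout). Reads matching no barcode go to the "unmatched" bin untrimmed.
    """
    bins = defaultdict(list)
    for header, seq, qual in records:
        for sample, barcode in barcodes.items():
            tag = barcode + linker
            if seq.startswith(tag):
                # Trim sequence and quality string by the same length.
                bins[sample].append((header, seq[len(tag):], qual[len(tag):]))
                break
        else:
            bins["unmatched"].append((header, seq, qual))
    return dict(bins)


if __name__ == "__main__":
    example = [
        "@read1", "ACGTGGGG", "+", "IIIIIIII",
        "@read2", "TTTTCCCC", "+", "HHHHHHHH",
    ]
    sorted_reads = sort_by_barcode(
        parse_fastq(example), {"sampleA": "ACG", "sampleB": "TTT"}
    )
    for sample, reads in sorted_reads.items():
        print(sample, len(reads))
```

A production script would probably also need to tolerate sequencing errors in the barcode and write each bin to its own file; both are left out here to keep the sketch short.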
De novo assembly
For de novo assembly of micro-reads from genomic DNA we typically use Velvet. For de novo assembly of RNA-seq data we now use the Trinity package.
Reference based assembly
When a reasonable reference is available, we use either RGA or MAQ.