User talk:Emilio Palumbo/g2f rna: Difference between revisions

From OpenWetWare
(New page: ==RNAseq== link= ===Mapping=== Specific variables to consider when mapping RNAseq: * intron size * overhang (number of bases from each...)
 
No edit summary
Line 1: Line 1:
==RNAseq==


[[File:http://rnaseq.uoregon.edu/img/image1.png|link=]]
http://rnaseq.uoregon.edu/img/image1.png


===Mapping===
Line 15: Line 15:
** block order
** min/max distance
===Input files===
To perform RNAseq analysis we need:
* reference genome sequence
* reference gene annotation
* sequences
'''Important note''': Please make sure the contig names in your reference genome and in your annotation correspond.
===Gemtools===
The [http://algorithms.cnag.cat/wiki/The_GEM_library GEM mapper] is a mapping program for next-generation sequencing data, developed jointly by the CRG and CNAG institutes in Barcelona. Several high-performance standalone programs (splice mapper, conversion tool, etc.) are provided along with the mapper, and new algorithms and tools can easily be implemented on top of these.
[http://gemtools.github.io/ Gemtools] is a set of high-level pipelines that greatly simplifies the use of the GEM mapper. With gemtools one can index references and/or map several kinds of data from a simple command-line interface, without having to type complicated commands. In particular, gemtools contains a fast and accurate pipeline for mapping RNA-sequencing data.
The default gemtools RNAseq pipeline is shown [http://genome.crg.es/~epalumbo/gem-pipeline.pdf here].
===Running the pipeline===
The following steps are needed to run the gemtools rnaseq pipeline:
Index the genome:
<pre>
gemtools index genome.fa
</pre>
Generate the transcriptome and index it:
<pre>
gemtools t-index annotation.gtf -m MAX_READ_LENGTH
</pre>
After this you can run the pipeline:
<pre>
gemtools rna-pipeline -f FASTQ_FILE -q QUALITY_OFFSET -i GENOME_INDEX -a ANNOTATION_FILE -t NUMBER_OF_CORES -o OUTPUT_FOLDER -m MAXIMUM_READ_LENGTH_FOR_DENOVO_JUNCTIONS
</pre>

Revision as of 02:07, 15 November 2013

RNAseq

http://rnaseq.uoregon.edu/img/image1.png

Mapping

Specific variables to consider when mapping RNAseq:

  • intron size
  • overhang (number of bases on each side of the junction that a read must cover)
  • splice site consensus (canonical, extended, non-canonical)
  • donor/acceptor splice site consensus sequences
  • junction “filtering”:
    • chromosome/strand
    • block order
    • min/max distance
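
As an illustration of the min/max distance filter listed above, here is a minimal sketch. It assumes a hypothetical tab-separated junction table with the columns chromosome, donor position, acceptor position and strand; neither this format nor the thresholds 20 and 500000 come from gemtools, they are placeholders chosen for the example.

```shell
#!/bin/sh
# Sketch only: keep junctions whose donor/acceptor distance lies in [min, max].
# The four-column table layout (chrom, donor, acceptor, strand) is hypothetical.
filter_junctions() {
    # $1 = junction table, $2 = minimum distance, $3 = maximum distance
    awk -F'\t' -v min="$2" -v max="$3" '
        { d = $3 - $2; if (d < 0) d = -d }   # absolute distance, strand-independent
        d >= min && d <= max                 # print the line when it passes
    ' "$1"
}

# Self-contained demonstration with made-up junctions.
tmp=$(mktemp)
printf 'chr1\t100\t180\t+\nchr1\t100\t5100\t+\nchr2\t900\t400\t-\nchr1\t100\t900100\t+\n' > "$tmp"
kept=$(filter_junctions "$tmp" 20 500000 | wc -l | tr -d ' ')
echo "junctions kept: $kept"
rm -f "$tmp"
```

The same pattern could be extended to the other filters (chromosome/strand, block order) by adding conditions on the remaining columns.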

Input files

To perform RNAseq analysis we need:

  • reference genome sequence
  • reference gene annotation
  • sequences

Important note: Please make sure the contig names in your reference genome and in your annotation correspond.
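
One way to check this before mapping is sketched below. It lists the contig names that appear in the annotation but not in the genome, assuming an uncompressed FASTA genome and a tab-separated GTF annotation; the function name and the demonstration data are invented for illustration.

```shell
#!/bin/sh
# Sketch only: report contig names used in a GTF file but absent from a FASTA file.
missing_contigs() {
    # $1 = genome FASTA, $2 = annotation GTF
    awk -F'\t' '
        FNR == NR {                          # first file: the genome
            if (/^>/) {
                sub(/^>/, "")                # drop the leading ">"
                split($0, a, /[ \t]/)        # name = first word of the header
                seen[a[1]] = 1
            }
            next
        }
        /^#/ || NF == 0 { next }             # skip GTF comments and blank lines
        !($1 in seen) && !reported[$1]++ { print $1 }
    ' "$1" "$2"
}

# Self-contained demonstration: the genome uses "chr1", the annotation also uses "1".
tmp=$(mktemp -d)
printf '>chr1 test\nACGT\n>chr2\nACGT\n' > "$tmp/genome.fa"
printf 'chr1\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "g1";\n1\tsrc\texon\t1\t4\t.\t+\t.\tgene_id "g2";\n' > "$tmp/annotation.gtf"
missing=$(missing_contigs "$tmp/genome.fa" "$tmp/annotation.gtf")
echo "contigs missing from genome: $missing"
rm -rf "$tmp"
```

An empty result means every contig named in the annotation is present in the genome.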

Gemtools

The GEM mapper is a mapping program for next-generation sequencing data, developed jointly by the CRG and CNAG institutes in Barcelona. Several high-performance standalone programs (splice mapper, conversion tool, etc.) are provided along with the mapper, and new algorithms and tools can easily be implemented on top of these.

Gemtools is a set of high-level pipelines that greatly simplifies the use of the GEM mapper. With gemtools one can index references and/or map several kinds of data from a simple command-line interface, without having to type complicated commands. In particular, gemtools contains a fast and accurate pipeline for mapping RNA-sequencing data.

The default gemtools RNAseq pipeline is shown here.

Running the pipeline

The following steps are needed to run the gemtools rnaseq pipeline:

Index the genome:

gemtools index genome.fa

Generate the transcriptome and index it:

gemtools t-index annotation.gtf -m MAX_READ_LENGTH
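
If the maximum read length is not known in advance, it can be measured from the reads. The sketch below assumes an uncompressed FASTQ file with exactly four lines per record; the function name and the demonstration reads are made up for the example.

```shell
#!/bin/sh
# Sketch only: length of the longest read in a four-line-per-record FASTQ file.
max_read_length() {
    # The sequence is the second line of each four-line record.
    awk 'NR % 4 == 2 { if (length($0) > max) max = length($0) }
         END { print max + 0 }' "$1"
}

# Self-contained demonstration with two made-up reads (6 and 10 bases).
tmp=$(mktemp)
printf '@r1\nACGTAC\n+\nIIIIII\n@r2\nACGTACGTAC\n+\nIIIIIIIIII\n' > "$tmp"
max_len=$(max_read_length "$tmp")
echo "MAX_READ_LENGTH=$max_len"
rm -f "$tmp"
```

The printed value is what would be passed as MAX_READ_LENGTH to the t-index step.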

After this you can run the pipeline:

gemtools rna-pipeline -f FASTQ_FILE -q QUALITY_OFFSET -i GENOME_INDEX -a ANNOTATION_FILE -t NUMBER_OF_CORES -o OUTPUT_FOLDER -m MAXIMUM_READ_LENGTH_FOR_DENOVO_JUNCTIONS
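
The QUALITY_OFFSET value depends on how the FASTQ qualities are encoded, commonly with an ASCII offset of 33 or 64. The sketch below is a heuristic guess from the first 1000 records, not part of gemtools: the cutoff of 59 rests on the assumption that offset-64 files never contain quality characters below that code. Which values the -q option accepts should be checked in the gemtools help.

```shell
#!/bin/sh
# Sketch only: guess the FASTQ quality offset from the smallest quality character.
guess_quality_offset() {
    # Quality strings are the fourth line of each record; inspect the first 1000.
    min=$(awk 'NR % 4 == 0' "$1" | head -n 1000 | tr -d '\n' |
          od -An -v -tu1 | tr -s ' ' '\n' | grep -v '^$' | sort -n | head -n 1)
    # Heuristic threshold (assumption): codes below 59 only occur with offset 33.
    if [ "$min" -lt 59 ]; then echo 33; else echo 64; fi
}

# Self-contained demonstration: "#" has ASCII code 35, so offset 33 is guessed.
tmp=$(mktemp)
printf '@r1\nACGT\n+\n##II\n@r2\nACGT\n+\nIIII\n' > "$tmp"
offset=$(guess_quality_offset "$tmp")
echo "QUALITY_OFFSET=$offset"
rm -f "$tmp"
```

A file that contains only high quality values cannot be told apart this way and would be reported as 64, so the guess is best confirmed against the sequencing provider's documentation.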