User talk:Emilio Palumbo/g2f rna: Difference between revisions
Revision as of 02:07, 15 November 2013
RNAseq
http://rnaseq.uoregon.edu/img/image1.png
Mapping
Specific variables to consider when mapping RNAseq:
- intron size
- overhang (number of bases from each side of the junction that should be covered by a certain read)
- splice site consensus (canonical, extended, non-canonical)
- donor/acceptor splice site consensus sequences
- junction “filtering”:
  - chromosome/strand
  - block order
  - min/max distance
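When choosing intron-size or min/max distance limits, it can help to look at the intron lengths already present in the reference annotation. The following is a minimal sketch, not part of gemtools: `intron_lengths` is a hypothetical helper, and it assumes a GTF file whose exon records carry a `transcript_id` attribute.

```shell
# Hypothetical helper (not a gemtools command): print the length of every
# annotated intron, one per line, from a GTF file given as $1.
intron_lengths() {
    tab=$(printf '\t')
    # Step 1: emit "transcript_id <TAB> start <TAB> end" for each exon record.
    awk -F'\t' '$3 == "exon" {
        if (match($9, /transcript_id "[^"]+"/)) {
            # Skip the 15-character prefix and drop the closing quote.
            tid = substr($9, RSTART + 15, RLENGTH - 16)
            print tid "\t" $4 "\t" $5
        }
    }' "$1" |
    # Step 2: group exons by transcript, ordered by start coordinate.
    sort -t "$tab" -k1,1 -k2,2n |
    # Step 3: an intron is the gap between consecutive exons of one transcript.
    awk -F'\t' '$1 == prev && $2 > last_end + 1 { print $2 - last_end - 1 }
                { if ($1 != prev || $3 > last_end) last_end = $3; prev = $1 }'
}
```

For example, `intron_lengths annotation.gtf | sort -n | tail -n 1` would report the longest annotated intron, which is one possible starting point for a maximum distance setting.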
Input files
To perform RNAseq analysis we need:
- reference genome sequence
- reference gene annotation
- sequences (the reads, as a FASTQ file)
Important note: Please make sure the contig names in your reference genome and annotation correspond.
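One way to check the contig names before indexing is sketched below. It is only an illustration: `check_contig_names` is a hypothetical helper, and it assumes an uncompressed FASTA genome and a tab-separated GTF annotation.

```shell
# Hypothetical helper: report contig names used in the annotation ($2)
# that do not occur in the genome FASTA ($1). Returns 1 if any are missing.
check_contig_names() {
    genome_fa=$1
    annotation_gtf=$2
    tmp_dir=$(mktemp -d) || return 2
    # FASTA contig name: header text after ">" up to the first whitespace.
    grep '^>' "$genome_fa" | sed 's/^>//' | awk '{ print $1 }' |
        LC_ALL=C sort -u > "$tmp_dir/genome"
    # GTF contig name: first tab-separated column; comment lines are skipped.
    grep -v '^#' "$annotation_gtf" | cut -f1 |
        LC_ALL=C sort -u > "$tmp_dir/annotation"
    # Lines unique to the second file: names missing from the genome.
    missing=$(LC_ALL=C comm -13 "$tmp_dir/genome" "$tmp_dir/annotation")
    rm -rf "$tmp_dir"
    if [ -n "$missing" ]; then
        printf 'Contigs in annotation but not in genome:\n%s\n' "$missing" >&2
        return 1
    fi
}
```

Calling `check_contig_names genome.fa annotation.gtf` prints nothing when every annotated contig is present in the genome. A typical mismatch is one file using `chr1` and the other `1`.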
Gemtools
The GEM mapper (http://algorithms.cnag.cat/wiki/The_GEM_library) is a mapping program for next-generation sequencing developed in collaboration between the CRG and CNAG institutes in Barcelona. Many high-performance standalone programs (splice mapper, conversion tool, etc.) are provided along with the mapper; in general, new algorithms and tools can be easily implemented on top of these.
Gemtools (http://gemtools.github.io/) is a set of high-level pipelines that greatly simplify the use of the GEM mapper. Using gemtools one can index references and/or map several kinds of data from a simple command-line interface, without having to type complicated commands. In particular, gemtools contains a fast and accurate pipeline for mapping RNA-sequencing data.
The default gemtools RNAseq pipeline is shown here: http://genome.crg.es/~epalumbo/gem-pipeline.pdf
Running the pipeline
The following steps are needed to run the gemtools rnaseq pipeline:
Index the genome:
gemtools index genome.fa
Generate the transcriptome and index it:
gemtools t-index annotation.gtf -m MAX_READ_LENGTH
After this you can run the pipeline:
gemtools rna-pipeline -f FASTQ_FILE -q QUALITY_OFFSET -i GENOME_INDEX -a ANNOTATION_FILE -t NUMBER_OF_CORES -o OUTPUT_FOLDER -m MAXIMUM_READ_LENGTH_FOR_DENOVO_JUNCTIONS
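The read length and quality offset placeholders in the commands above can be estimated from the FASTQ file itself. The helpers below are a rough sketch, not part of gemtools: `max_read_length` and `guess_quality_offset` are hypothetical names, and both assume an uncompressed FASTQ file with exactly four lines per read. The offset guess is a heuristic that can misreport Phred+33 data containing only high quality values.

```shell
# Hypothetical helper: longest read in a FASTQ file
# (the sequence is the 2nd line of each 4-line record).
max_read_length() {
    awk 'NR % 4 == 2 && length($0) > max { max = length($0) }
         END { print max + 0 }' "$1"
}

# Hypothetical helper: guess the quality offset (33 or 64).
# Characters with ASCII codes 33-58 ("!" to ":") can only occur in
# Phred+33 quality strings (the 4th line of each record).
guess_quality_offset() {
    if awk 'NR % 4 == 0' "$1" | LC_ALL=C grep -q '[!-:]'; then
        echo 33
    else
        echo 64
    fi
}
```

The printed values could then be substituted for MAX_READ_LENGTH and QUALITY_OFFSET, for example `gemtools t-index annotation.gtf -m "$(max_read_length reads.fastq)"`, where `reads.fastq` is a placeholder file name.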