Talk:Wikiomics:Repeat finding: Difference between revisions

From OpenWetWare
Revision as of 11:20, 11 March 2011

RepeatScout possible speedups:

RepeatMasker input_genome_sequence.fas -lib output_repeats.fas.filtered_1 -norna -nolow -no_is

Options that speed this up:

  • -qq: 5-10x faster, a bit less sensitive
  • -pa: number of parallel processes to use on a multiprocessor or multicore machine

If the lower sensitivity of "-qq" is a concern, it can be compensated for by lowering the minimum occurrence threshold (e.g. "--thresh=5") in the next step.
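A minimal dry-run sketch of the two steps discussed above. It only prints the command lines and does not run anything. The file names are the placeholders from the command above. The process count of 4, the helper names `repeatmasker_cmd` and `filter_cmd`, and the assumption that "the next step" is RepeatScout's `filter-stage-2.prl` reading the RepeatMasker `.out` file via `--cat` are mine, so check them against your RepeatScout README before use.

```shell
#!/bin/sh
# Dry-run helpers: they print the command lines instead of executing them.

repeatmasker_cmd() {
    # $1 = genome FASTA, $2 = filtered RepeatScout library, $3 = parallel processes
    # -qq and -pa are the two speedup options mentioned in the note above.
    printf 'RepeatMasker %s -lib %s -norna -nolow -no_is -qq -pa %s\n' "$1" "$2" "$3"
}

filter_cmd() {
    # $1 = stage-1 filtered library, $2 = RepeatMasker .out file, $3 = threshold
    # ASSUMPTION: the "next step" is filter-stage-2.prl from the RepeatScout package.
    printf 'cat %s | filter-stage-2.prl --cat=%s --thresh=%s\n' "$1" "$2" "$3"
}

# Example with the placeholder file names; 4 processes is an arbitrary choice.
repeatmasker_cmd input_genome_sequence.fas output_repeats.fas.filtered_1 4
filter_cmd output_repeats.fas.filtered_1 input_genome_sequence.fas.out 5
```

Redirect the output of the second command to a new library file when running it for real. The threshold of 5 is the value suggested above to offset the sensitivity lost with "-qq".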

  • darked 09:26, 23 March 2010 (EDT):

SeedMasker

"SeedMasker is public domain software for masking genomes based on over-represented words." http://www.drive5.com/seedmasker/

  • darked 15:30, 24 March 2010 (EDT):

ReRep

Repeat detection in genome survey sequences (GSS), including 454 data.

article: http://www.biomedcentral.com/1471-2105/9/366/

www: http://bioinfo.pdtis.fiocruz.br/ReRep/

Tandem repeat finder parser

Perl script on SourceForge, still to be checked: http://sourceforge.net/projects/trfparser/


REPET

http://urgi.versailles.inra.fr/index.php//Tools/REPET

TANTAN

new algorithm from 2011:

www: http://www.cbrc.jp/tantan/

article: http://nar.oxfordjournals.org/content/39/4/e23.full