Data Files: Difference between revisions

From OpenWetWare
***Left half [[Media:FLY.new.left.fa.zip]]
***Right half [[Media:FLY.new.right.fa.zip]]
***All [[Media:FLY.new.seqs.zip]]
**Renamed sequence ids in alignments
***Left half [[Media:Infernal_aligned_reads.new.left.fa.zip]]

Revision as of 16:12, 12 July 2010

Drosophila 16S Paper

Sequence files

  • These are the quality-trimmed reads that did not have enough overlap to assemble (thus "frags").

File:Fly.frags

  • These are the quality-trimmed reads that did not have enough overlap to assemble and had a significant blast hit to the "left" side of a reference 16S sequence.

File:Fly.frags.left

  • These are the quality-trimmed reads that did not have enough overlap to assemble and had a significant blast hit to the "right" side of a reference 16S sequence.

File:Fly.frags.right

  • These are the reads that assembled into complete clones using the JGI's 16S pipeline (genelib).

File:Fly.contigs

  • NAST-aligned sequences from the Corby-Harris paper.

File:Corby.NAST.aligned.fasta

  • NAST-aligned sequences from the Cox and Gilmore paper.

File:Cox.NAST.aligned

  • A fasta file of all the clean, chimera-checked sequences that were unclassified at the genus level --James Angus Chandler 19:16, 24 April 2009 (EDT)

Media:Unclassified.fasta.gz

Taxonomy Assignments

  • OLD, not quality-trimmed data: there are three files here, each corresponding to one of the three chimera-checked sequence files above (i.e., putative, sub-threshold, and clean).
  1. clean sequences File:Classifications All NAST.Bclean.fasta30593.xls
  2. sub-threshold chimeric sequences File:Classifications All NAST.Bambig.fasta27959.xls
  3. putative chimeric sequences File:Classifications All NAST.Bchimera.fasta28223.xls

Alignment

  • Below is the most current version of the NAST-formatted alignment file.

This is the older, not quality-trimmed file. It contains all of our sequences, plus the Corby-Harris and Cox-Gilmore sequences.

File:All.good.gz


  • This is a concatenated alignment with the quality-trimmed data. Each half was aligned using the NAST aligner and then both halves were concatenated. There is a reference sequence in there (called testseq) that was used to decide where to end the left half and begin the right half before concatenating. This alignment does not include all of the full-length sequences that were assembled with genelib.

File:Fly.merged.fasta
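
The concatenation step described above can be sketched roughly as follows. This is only an illustration, not the script that produced Fly.merged.fasta: it assumes each half has already been aligned and trimmed at the boundary chosen with testseq, that reads are matched across halves by sequence id, and that a read missing from one half is padded with gap characters. The function names (parse_fasta, alignment_width, concatenate_halves) are hypothetical.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Read FASTA text into {sequence id: sequence}, joining wrapped lines."""
    records: Dict[str, str] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the id.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def alignment_width(records: Dict[str, str]) -> int:
    """Return the common column count of an alignment (0 if it is empty)."""
    widths = {len(seq) for seq in records.values()}
    if len(widths) > 1:
        raise ValueError("sequences in one half have different aligned lengths")
    return widths.pop() if widths else 0


def concatenate_halves(left: Dict[str, str], right: Dict[str, str],
                       gap: str = "-") -> Dict[str, str]:
    """Join left and right aligned halves by id, gap-padding a missing half."""
    left_pad = gap * alignment_width(left)
    right_pad = gap * alignment_width(right)
    merged = {}
    for name in sorted(set(left) | set(right)):
        merged[name] = left.get(name, left_pad) + right.get(name, right_pad)
    return merged


if __name__ == "__main__":
    # Toy data only; real input would be the two aligned half files.
    left = parse_fasta([">readA", "AC-", ">testseq", "ACG"])
    right = parse_fasta([">readB", "-GT", ">testseq", "TGT"])
    for name, seq in concatenate_halves(left, right).items():
        print(f">{name}\n{seq}")
```

Padding with gaps keeps every row the same width, which most downstream alignment tools expect. Whether the original merge handled left-only and right-only reads this way is not stated here.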

Redoing

Metadata Files

Here is the environment file you asked for. Look it over and tell me if I need to add anything. I left ??? for the Cox-Gilmore samples since I cannot find my copy of the paper and thus do not know what they collected over.

Media:MainEnvFile.xls --James Angus Chandler 20:45, 19 May 2009 (EDT)

  • This file allows the translation from JGI clone IDs to our sample IDs.

File:May09.trans.xls
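
As a rough illustration of how such a translation table might be applied, the sketch below relabels fasta headers from clone IDs to sample IDs. It assumes the spreadsheet has been exported as tab-separated text with the JGI clone ID in the first column and our sample ID in the second; the actual column layout of May09.trans.xls is not described here, and the function names and the IDs in the example are made up.

```python
import csv
from typing import Dict, Iterable, List


def load_translation(rows: Iterable[str]) -> Dict[str, str]:
    """Build {clone id: sample id} from tab-separated lines (assumed layout)."""
    mapping: Dict[str, str] = {}
    for row in csv.reader(rows, delimiter="\t"):
        # Skip blank or incomplete rows rather than guessing at their meaning.
        if len(row) >= 2 and row[0].strip():
            mapping[row[0].strip()] = row[1].strip()
    return mapping


def relabel_fasta(lines: Iterable[str], mapping: Dict[str, str]) -> List[str]:
    """Replace the id in each fasta header when it appears in the mapping."""
    out: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">"):
            # Split the header into the id and an optional description.
            fields = line[1:].split(None, 1)
            if fields and fields[0] in mapping:
                fields[0] = mapping[fields[0]]
                line = ">" + " ".join(fields)
        out.append(line)
    return out


if __name__ == "__main__":
    # Hypothetical ids, for illustration only.
    table = load_translation(["CLONE1\tsampleA\n", "CLONE2\tsampleB\n"])
    for line in relabel_fasta([">CLONE1 len=3\n", "ACG\n"], table):
        print(line)
```

Headers whose id is not in the table are left unchanged, so an unmapped clone stays visible instead of being silently dropped.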

Nearly Complete Taxonomy and Alignment

Media:NoWolb_noNNs_cleaned_noTurrs_noDescrepsAKAfinal.fasta

Media:Infernal_RDP_Correct%_noWolb_noNNs_noTurrs_noDescreps_noMissingsAKAfinal.xlsx

CalTech Presentation

Media:CalTech_Presentation.ppt --James Angus Chandler 00:56, 11 June 2009 (EDT)