Proportal FAQs: Difference between revisions

From OpenWetWare


==Question: How to understand the metagenome data?==
The CAMERA dataset was the first metagenomic data set used as a model for how to store Prochlorococcus metagenomes in ProPortal. However, unlike CAMERA, ProPortal includes only the Pro/Syn/cyanophage metagenome data that are of interest.


The bar graphs provided on the website report the raw read counts assigned to the currently available host/phage genomes. The read counts should also be normalized to genome size. Reporting the raw read counts is intended to answer the simplest questions, such as "Is this gene/genomic region represented at all in the metagenomes?", and to give users a quick answer on whether it is worth proceeding further.


The metagenomic part of ProPortal has been implemented with a very user-friendly UI but is still very much under development. For instance, from the UI we can query a specific Pro/Syn/phage read and see which genome it is recruited to and which gene(s) it overlaps with, e.g. http://proportal.mit.edu/gosread/JCVI_READ_1105499780090/.


For a specific genomic region, we can query how many GOS reads are recruited to that region and where those reads come from. The website reports only the raw counts, with no normalization; the back-end database allows our lab members to run more sophisticated queries. The web UI is currently still very simple.
==Question: Future development?==
 
===Question: Population Dynamics===
Population Dynamics: The data is easy to access. Pro and Syn number information is useful, but easier access to environmental metadata (e.g. nutrients, light intensity, salinity, temperature) would be helpful, rather than having to search an external website.
 
<B>Answer</B>
 
This is true. We could get these data if we do not already have them. Each of those papers did statistics with all the environmental data, so there must be a master spreadsheet. (Check with Allison...)
 
===Question: More citations?===
The citations also need updating on the Synechococcus side, since the key genome papers Dufresne et al., 2008 and Scanlan et al., 2009 are missing from both the website and the manuscript.
 
The manuscript is well written, though the cited literature should extend beyond the Chisholm lab, since non-specialist readers might otherwise find it harder to locate other papers with excellent datasets on the molecular ecology, microarray, genome and metagenomic side.
 
<B>Answer</B>
 
Both the Dufresne et al., 2008 and Scanlan et al., 2009 papers are stored in the DB and can be queried using "Search Data" with the "Publications" option.
 
Dufresne 2008: http://www.ncbi.nlm.nih.gov/pubmed/18507822
 
Scanlan 2009: http://www.ncbi.nlm.nih.gov/pubmed/19487728
 
In addition, the following genomes have been linked to the Dufresne et al., 2008 paper, in which they were first reported.
 
  Synechococcus: BL107, WH5701, RCC307, RS9916, RS9917, WH7803, WH7805, CC9605, CC9902
 
We should cite more references outside the lab.
 
===Question: Cluster analysis?===
I didn’t find Figure 3 particularly useful.
 
<B>Answer</B>
 
We can try to defend it if we think it is useful...
 
===Question: Future development?===
Finally, it would be really nice if some of the things that will only appear in a future ProPortal update (e.g. phylogenetic trees for gene clusters; linking GOS reads to gene clusters and genomes) were actually included at the outset.

Latest revision as of 13:29, 21 November 2011

Question: ProPortal vs IMG, CAMERA and other websites?

While there are several excellent resources available to explore and compare microbial genomes (e.g. CyanoBase, IMG and MicrobesOnline), the unique strength of ProPortal is its comprehensive nature: it includes genomic, transcriptomic, metagenomic and population data from both domesticated and wild populations of cyanobacteria and phage.

Having all data in one place makes large-scale data retrieval and cross-linking between different types of data easier. ProPortal also provides its own way of clustering genes (also described in Kettler et al.) that is perhaps more suitable for the genomes in the ProPortal database. Where available, ProPortal provides external links to MicrobesOnline (from the gene page); from there, users can browse the genomes by KEGG pathway, use MO's comparative genome browser and view the precomputed BLAST results. In addition, ProPortal provides a link to the NCBI BLAST page so that users can run a BLAST search on the fly for the gene in view.

Question: How to use the Search page?

The current keyword search engine in ProPortal has the simplest possible implementation and is pretty aggressive. It is also case sensitive. A keyword is first used to search the gene name, locus tag and gb tag, and is then segregated to search gene descriptions.
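The two-pass behaviour described above can be sketched roughly as follows. This is only an illustration, not the actual ProPortal code: the `Gene` record, the `keyword_search` function, the example genes, the use of substring matching, and the reading of "segregated" as "split on whitespace" are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    # Hypothetical record; the field names are assumptions, not the ProPortal schema.
    name: str
    locus_tag: str
    gb_tag: str
    description: str


def keyword_search(genes, keyword):
    """Case-sensitive two-pass search: identifier fields first, then descriptions."""
    # Pass 1: match the whole keyword against gene name, locus tag and gb tag.
    hits = [
        g for g in genes
        if any(keyword in field for field in (g.name, g.locus_tag, g.gb_tag))
    ]
    seen = {id(g) for g in hits}

    # Pass 2 (assumption): split the keyword into terms and match any term
    # against the gene description.
    terms = keyword.split()
    for g in genes:
        if id(g) not in seen and any(t in g.description for t in terms):
            hits.append(g)
    return hits


# Made-up example records, for illustration only.
genes = [
    Gene("geneA", "LOC_0001", "GB_0001", "phosphate transporter"),
    Gene("geneB", "LOC_0002", "GB_0002", "hypothetical protein related to geneA"),
]

print([g.name for g in keyword_search(genes, "geneA")])
print([g.name for g in keyword_search(genes, "GENEA")])
```

Because the matching is case sensitive, a query such as "GENEA" finds nothing here even though "geneA" matches, so it is worth retrying a search with different capitalization.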

Question: Will more microarray data become available?

Microarray-based readouts of transcript levels in Prochlorococcus strains MED4 and MIT9313 exposed to various phosphate, nitrogen, iron and ambient light conditions have been integrated into ProPortal. Transcript data for changing O2/CO2 ratios will soon be added. Datasets from other groups describing transcriptional response in Synechococcus are not currently integrated, but could be in future releases.

Question: How to understand the metagenome data?

The CAMERA dataset was the first metagenomic data set used as a model for how to store Prochlorococcus metagenomes in ProPortal. However, unlike CAMERA, ProPortal includes only the Pro/Syn/cyanophage metagenome data that are of interest.

The bar graphs provided on the website report the raw read counts assigned to the currently available host/phage genomes. The read counts should also be normalized to genome size. Reporting the raw read counts is intended to answer the simplest questions, such as "Is this gene/genomic region represented at all in the metagenomes?", and to give users a quick answer on whether it is worth proceeding further.
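As a rough illustration of what normalizing to genome size could look like, the sketch below converts raw recruited-read counts into reads per kilobase of genome. The function name, the per-kilobase unit, and the genome labels and numbers are all made up for this example; ProPortal itself reports only the raw counts.

```python
def reads_per_kb(read_counts, genome_sizes_bp):
    """Return recruited reads per kilobase of genome for each genome."""
    normalized = {}
    for genome, count in read_counts.items():
        size_bp = genome_sizes_bp[genome]
        if size_bp <= 0:
            raise ValueError(f"genome size must be positive for {genome}")
        # Divide the raw count by the genome length expressed in kilobases.
        normalized[genome] = count / (size_bp / 1000.0)
    return normalized


# Hypothetical values: two genomes recruit the same number of reads
# but differ in length.
raw_counts = {"genome_A": 1200, "genome_B": 1200}
genome_sizes_bp = {"genome_A": 1_600_000, "genome_B": 2_400_000}

normalized = reads_per_kb(raw_counts, genome_sizes_bp)
for genome, value in normalized.items():
    print(genome, value)
```

With equal raw counts, the shorter genome ends up with the higher normalized value, which is why raw bar heights alone can be misleading when genomes of different sizes are compared.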

The metagenomic part of ProPortal has been implemented with a very user-friendly UI but is still very much under development. For instance, from the UI we can query a specific Pro/Syn/phage read and see which genome it is recruited to and which gene(s) it overlaps with, e.g. http://proportal.mit.edu/gosread/JCVI_READ_1105499780090/.

Question: Future development?

Finally, it would be really nice if some of the things that will only appear in a future ProPortal update (e.g. phylogenetic trees for gene clusters; linking GOS reads to gene clusters and genomes) were actually included at the outset.