DataONE:GEO reuse study

From OpenWetWare

Revision as of 19:17, 20 June 2010

This DataONE OpenWetWare site contains informal notes for several research projects funded through DataONE. DataONE is a collaboration among many partner organizations, and is funded by the US National Science Foundation (NSF) under a Cooperative Agreement.

Analysis of data reuse of NCBI's GEO datasets

Long term Aims

To understand the extent and value of reuse for data stored in the NCBI's GEO database.

Short-term Aims

To fill in the blanks in these sentences:

We collected information on the GEO database using ???, which is possible because GEO citations are indexed in ????. We recorded, per year, all papers that cite GEO data and the number of data sets in GEO. We examined a subset of XXX of those citing papers to estimate the proportion of citations that (1) reused the original data in a significant way (rather than simply alluding to its existence), and (2) did not include an author of the original work (because these authors would have access to the data even in the absence of the archive). We also used this sample to record the nature of the reuse: verification, meta-analysis, or new questions.
For every data set in GEO, there are XXX citations to the data. Moreover, GEO is growing rapidly and there is a necessary time lag between deposition and reuse (on average XXX months after deposition), from which we can estimate that the typical paper is likely to generate YYY citations over the short term. This number should increase as more time passes and citations continue to accumulate for each paper, and it is an underestimate because not all citations to the data use standard references that can be tracked by ????. Of these citations, XXX% are estimated to result in novel scientific work that could not have been performed without the archive, for a total of XXX new pieces of work for each archived data set.
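
The arithmetic behind the last sentence could be sketched roughly as follows. This is only an illustration of how the blanks might combine, not the study's actual method: the class name, function names, and all input numbers are hypothetical placeholders invented to exercise the calculation, and none of them are results.

```python
from dataclasses import dataclass


@dataclass
class SampledCitation:
    """One hand-checked citing paper from the sample (hypothetical record)."""
    reused_data: bool    # data reused in a significant way, not just mentioned
    shares_author: bool  # citing paper shares an author with the original work


def independent_reuse_fraction(sample):
    """Fraction of sampled citations that reuse the data and share no original author."""
    if not sample:
        raise ValueError("sample must not be empty")
    hits = sum(1 for c in sample if c.reused_data and not c.shares_author)
    return hits / len(sample)


def new_work_per_dataset(total_citations, total_datasets, reuse_fraction):
    """Citations per data set, scaled by the estimated independent-reuse fraction."""
    citations_per_dataset = total_citations / total_datasets
    return citations_per_dataset * reuse_fraction


# Placeholder inputs only; these are NOT measurements from GEO.
sample = [
    SampledCitation(reused_data=True, shares_author=False),
    SampledCitation(reused_data=True, shares_author=True),
    SampledCitation(reused_data=False, shares_author=False),
    SampledCitation(reused_data=True, shares_author=False),
]
fraction = independent_reuse_fraction(sample)
estimate = new_work_per_dataset(
    total_citations=200, total_datasets=100, reuse_fraction=fraction
)
print(fraction, estimate)
```

A fuller version would also need to account for the time lag between deposition and reuse, which this sketch ignores.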

Research plan and initial work

Pilot

Preliminary, exploratory work

Brainstorming

Brainstorming about analyses

Useful refs

  • Fry, J and Lockyer, S and Oppenheim, C and Houghton, J and Rasmussen, B (2009) Identifying benefits arising from the curation and open sharing of research data produced by UK Higher Education and research institutes. Project Report http://ie-repository.jisc.ac.uk/279/