DataONE:Notebook/Reuse of repository data/2010/06/24

From OpenWetWare

Notes for June 24, 2010

  • Valerie Enriquez 10:07, 24 June 2010 (EDT): Will resume reviewing past searches to feed a new spreadsheet, located here, that quantifies search hits and misses.
  • Will also finalize the abstract today, and begin work on the paper and presentation once all spreadsheets are complete.

"Re-Searches"

  1. Resource: ISI Web of Science Search term(s): Cited Author=(Beck R*) AND Cited Work=(BMC EVOL BIOL) AND Cited Year=(2007) Timespan=2008-2010. Databases=SCI-EXPANDED, SSCI, A&HCI, CPCI-S, CPCI-SSH. Results: 11
  2. Resource: ISI Web of Science Search term(s): Cited Author=(Menkis A*) AND Cited Work=(PLOS GENET) AND Cited Year=(2008) Timespan=2008-2010. Databases=SCI-EXPANDED, SSCI, A&HCI, CPCI-S, CPCI-SSH. Results: 9
  3. Resource: ISI Web of Science Search term(s): Cited author(Pangaea) Limits: Timespan=2008-2010 (month field not available in advanced search) Language: English Results: 2
  4. Resource: ISI Web of Science Search term(s): Cited Work(Pangaea) Limits: Timespan=2008-2010 (month field not available in advanced search) Language: English Results: 9
  5. Resource: Scirus Search term(s): (exact phrase) doi:10.1594/PANGAEA* (in the complete document) Limits: Only show results published between: 2008 and 2010 (month field not available in advanced search) Only show results that are Abstracts, Articles Journal Sources Results: 12
  6. Resource: ISI Web of Science Search term(s): Cited Author=(Stein R) AND Cited Work=(QUATERNARY SCI REV) AND Cited Year=(2004) Timespan=2008-2010. Databases=SCI-EXPANDED, SSCI, A&HCI, CPCI-S, CPCI-SSH. Results: 16
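
As a rough aid for filling in the spreadsheet, the six re-searches above could be recorded as structured rows and totalled per resource. This is only a sketch: the `Search` class and `totals_by_resource` function are names made up for illustration, the query strings are shortened to the search terms without their limits, and the counts are raw result counts from the list above, not verified hits (the same article may appear in more than one search).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Search:
    """One re-search: where it was run, the search terms, and the raw result count."""
    resource: str
    query: str
    results: int


# Raw counts copied from the "Re-Searches" list; limits and databases omitted.
SEARCHES = [
    Search("ISI Web of Science",
           "Cited Author=(Beck R*) AND Cited Work=(BMC EVOL BIOL) AND Cited Year=(2007)", 11),
    Search("ISI Web of Science",
           "Cited Author=(Menkis A*) AND Cited Work=(PLOS GENET) AND Cited Year=(2008)", 9),
    Search("ISI Web of Science", "Cited author(Pangaea)", 2),
    Search("ISI Web of Science", "Cited Work(Pangaea)", 9),
    Search("Scirus", "doi:10.1594/PANGAEA*", 12),
    Search("ISI Web of Science",
           "Cited Author=(Stein R) AND Cited Work=(QUATERNARY SCI REV) AND Cited Year=(2004)", 16),
]


def totals_by_resource(searches):
    """Sum raw result counts per resource (no de-duplication across searches)."""
    totals = defaultdict(int)
    for search in searches:
        totals[search.resource] += search.results
    return dict(totals)


if __name__ == "__main__":
    for resource, total in totals_by_resource(SEARCHES).items():
        print(f"{resource}: {total}")
```

Keeping each search as its own row leaves room to add hit and miss columns later, once each result has been checked by hand.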

Rough Paper Outline

  1. Abstract: Online data repositories are growing rapidly, but it remains difficult to track how researchers use them. Tools exist for tracking article citations, yet there is no reliable way to track the usage of raw data stored in repositories, much less its reuse. This study examines different ways of finding reused data and assesses how difficult it is to distinguish true hits from false drops.
  2. Introduction: Tracking citations of published articles is becoming easier thanks to unique identifiers such as DOIs and tracking tools such as Scopus and the ISI Web of Science Cited Reference Search, but it is still difficult to find citations of data. Finding citations for reused data from repositories proves an even greater challenge.
  3. Method: Three repositories were selected (TreeBASE, Pangaea, and ORNL-DAAC), along with three tools for finding articles (ISI Web of Science Cited Reference Search, Scirus, and Google Scholar). I conducted initial searches to identify the keywords and limits that returned the most relevant articles. The next round of searching narrowed the results by data author or project title.
  4. Results: For the most part, it was easier to find cited data by searching on the data author's name than by searching for any mention of the name of the repository hosting the data.
  5. Discussion: While some repositories, such as Pangaea and ORNL-DAAC, publish citation recommendations that include DOIs or other unique identifiers, not all citations use them. If researchers adhered more closely to these standards, repository data citation would become more transparent. In turn, just as cited articles can bring prestige to a researcher or institution, prominently cited data could also provide a means of recognition.
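
To illustrate why DOI-based citations are easier to track, here is a minimal sketch that checks a citation string for a Pangaea DOI. It assumes only the prefix used in the Scirus search above (`10.1594/PANGAEA`); the function names and the sample strings in the usage lines are hypothetical and are not real citations.

```python
import re

# Matches the Pangaea DOI prefix from the Scirus search, plus whatever
# non-delimiter characters follow it. The suffix format is an assumption.
PANGAEA_DOI = re.compile(r"10\.1594/PANGAEA[^\s,;\"<>)]*", re.IGNORECASE)


def find_pangaea_dois(text):
    """Return every Pangaea DOI found in text, minus any trailing full stop."""
    return [match.rstrip(".") for match in PANGAEA_DOI.findall(text)]


def cites_by_doi(text):
    """True if the text contains at least one Pangaea DOI."""
    return bool(find_pangaea_dois(text))


if __name__ == "__main__":
    # Hypothetical citation strings, for illustration only.
    with_doi = "Example data set, doi:10.1594/PANGAEA.000000."
    without_doi = "Data were obtained from the Pangaea repository."
    print(find_pangaea_dois(with_doi))
    print(cites_by_doi(without_doi))
```

A citation that names only the repository, like the second sample, cannot be picked up this way, which is the tracking gap described in the discussion.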