Talk:DataONE:Notebook/Reuse of repository data
- Sarah Judson 19:58, 14 June 2010 (EDT): Valerie, I don't know if you've already ruled this out as a search option, but when I got started on the project and was selecting journals, I thought of a technique for the depository approach. It's pretty simple I think and maybe too time consuming, but I'll outline it for you:
- On Dryad (which is probably different from TreeBASE), I clicked on "authors" and "Journal Title" to see which journals/authors had the most citations (i.e. most likely for data reuse).
- I then searched a few articles in ISI to see how many times they were cited and picked one that was cited many times (10-20+).
- Then I quickly searched the citing papers for the last name of the original author (i.e. the author of the paper picked above) and noted whether each citing paper from ISI cited the original for conceptual purposes or, hopefully and more excitingly, for the dataset. Alternatively, I searched the full text of the article for "hdl" (the prefix used in Dryad's accession identifiers).
- This process didn't take too long and helped focus on the journals/authors most likely to practice data reuse. However, it's probably biased towards groups I already thought were more likely to have data reuse. That is probably OK for a preliminary study like this, but it's not as standardized as it could be. Anyways, just an idea.
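The manual check described in the steps above could be roughed out in code. This is only a sketch under assumptions: the function name `classify_citation` and its three labels are made up for illustration, and it assumes you already have the citing article's full text as a plain string. It does not replace reading the paper; it only flags which ones to look at first.

```python
import re

# "hdl" is the string searched for in the steps above; matching it as a
# whole word is an assumption made for this sketch.
HANDLE_PATTERN = re.compile(r"\bhdl\b", re.IGNORECASE)


def classify_citation(full_text, author_last_name):
    """Give a rough label for how a citing paper refers to the original work.

    Labels (hypothetical, for illustration only):
      - "possible_dataset_citation": the text contains "hdl"
      - "author_mentioned_check_manually": only the author's last name appears
      - "no_match": neither was found
    """
    author_pattern = re.compile(
        r"\b" + re.escape(author_last_name) + r"\b", re.IGNORECASE
    )
    if HANDLE_PATTERN.search(full_text):
        return "possible_dataset_citation"
    if author_pattern.search(full_text):
        return "author_mentioned_check_manually"
    return "no_match"
```

A paper labelled "author_mentioned_check_manually" still has to be read to decide whether the citation is conceptual or refers to the dataset.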
- Sarah Judson 19:58, 14 June 2010 (EDT): Let me know when you're ready for my Zotero full-text database for test searching. I'm making note of common search terms as I go through articles and annotating/tagging articles with "Yes_DataCitation", "Yes_Treebase", "Yes_DataSharing" (i.e. not reuse) or "No_DataCitation". These tags can help validate whether your test search catches articles that have been manually noted to have a dataset citation of some sort.
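One way the tags could be used for validation is sketched below. It is an illustration only: `evaluate_search` is a hypothetical helper, the article IDs are whatever keys you export from the library, and treating "Yes_DataCitation" and "Yes_Treebase" as the positive tags (and leaving out "Yes_DataSharing", since it is not reuse) is an assumption.

```python
def evaluate_search(manual_tags, search_hits,
                    positive_tags=("Yes_DataCitation", "Yes_Treebase")):
    """Compare a test search against manually tagged articles.

    manual_tags: dict mapping article ID -> list of tags assigned by hand
    search_hits: iterable of article IDs returned by the test search
    positive_tags: tags counted as a data citation (an assumption here)
    """
    # Articles a person marked as containing some kind of data citation.
    positives = {
        article_id
        for article_id, tags in manual_tags.items()
        if any(tag in positive_tags for tag in tags)
    }
    # Only score hits that were actually coded by hand.
    hits = set(search_hits) & set(manual_tags)
    true_positives = positives & hits

    precision = len(true_positives) / len(hits) if hits else 0.0
    recall = len(true_positives) / len(positives) if positives else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "missed": sorted(positives - hits),
    }
```

The "missed" list is probably the most useful part: those are the articles to read for search terms the test search doesn't cover yet.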
- Valerie Enriquez 16:25, 15 June 2010 (EDT): TreeBASE is sort of funny in that sometimes even the cited accession number might not match: if you search for a number like S1459, you'll only find it under "Legacy ID" and not "Study ID", since the study is technically over. Also, running a full-text search for just "S" might not be useful. I'd definitely appreciate it if you linked me to your Zotero full-text database. Unfortunately, my school has very limited full-text access to articles. Once I have more solid findings for phase II of TreeBASE, I'll re-edit my Connotea/CiteULike and send you the link. Thanks again!
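Instead of searching for a bare "S", a pattern for the whole accession number might narrow things down. The sketch below is an assumption-heavy illustration: it guesses that accession numbers look like S1459 (an "S" followed by three to five digits) and only reports them when the text also mentions TreeBASE. `find_treebase_ids` is a hypothetical name, and the digit range would need checking against real records.

```python
import re

# Assumed shape of an accession number, based only on the example S1459.
ACCESSION_PATTERN = re.compile(r"\bS\d{3,5}\b")
REPOSITORY_PATTERN = re.compile(r"treebase", re.IGNORECASE)


def find_treebase_ids(full_text):
    """Return candidate accession numbers from an article's full text.

    Candidates are reported only if the text mentions TreeBASE somewhere,
    which cuts down on false hits from things like supplementary figures.
    """
    if not REPOSITORY_PATTERN.search(full_text):
        return []
    return sorted(set(ACCESSION_PATTERN.findall(full_text)))
```

Requiring three or more digits skips labels like "Figure S1", but it would also miss any real accession number shorter than that.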
- Sarah Judson 20:11, 15 June 2010 (EDT): I can send you account information to let you sync directly to my Zotero library, but there are two major cons with that: 1. if you modify things, it will delete/change things that I don't want deleted (though on that same note it could be conducive to open science), and 2. my current library houses all my other projects, which means lots of junk for you to sift through, and unfortunately Zotero doesn't let you pick and choose which libraries you want or keep multiple libraries on the same desktop (at least not without extreme hassle). So, I'm thinking I'll send you the static library I have right now: basically January 2010 articles for 4 major journals. The only problem is that I haven't coded all of them yet. So, let me know when you get it and whether it works (i.e. you have Zotero and it opens at all), and then we can get on gchat and perform a few searches at the same time to make sure you're getting the full text too (sometimes it doesn't transfer well between machines). If you have another idea on how this could work, let me know.
- Sarah Judson 20:11, 15 June 2010 (EDT): I can help you get full text...my university has wide privileges (and they aren't scheduled to run out until August even though I graduated...sweet). So if UNM/Todd can't help, give me lists of what you need and I should be able to get them relatively easily.
- Valerie Enriquez 21:37, 15 June 2010 (EDT): Mostly the things I find that I don't have access to are articles in ScienceDirect through Elsevier. I'm going to email Todd to see if I can get UNM permissions. Thanks again!
re: searching success
- Heather A Piwowar 16:49, 22 June 2010 (EDT): Valerie, I just talked to Todd and we had a thought. It seems like generating effective searches for reuse is really difficult. What makes a search effective varies by repository due to repository names, support for DOIs, etc. Evidence of the difficulties would be very useful/motivating for initiatives like DataCite, and interesting to everyone who submits data. As such, describing the difficulties in formulating effective searches for reuse, using three repositories as examples, would make a great publication in and of itself. Maybe a research article, or a perspectives piece, or ??? And insights from the write-up could inform how to proceed for the last few weeks of your internship. See what you think, consider some potential publishing venues that might be appropriate for a case study like this, and let's chat sometime on Wednesday about it?