OpenWetWare:Software/Flexible Science Databases: Difference between revisions

'''[[User:Lucks|Lucks]] 20:14, 3 April 2006 (EDT)''': This page has been created to discuss the idea of Flexible Science Databases originally posted on the ideas page.  Below is the discussion that took place there - I will be adding a more guided framework for discussion in the near future.
{{cpnavbar}}
{| cellspacing="3" width="712px" class="green1" cellpadding="3"
|


== Overview ==


===Flexible Science Databases with OWW ===
There is a need for a flexible database system in the sciences. Databases exist, but how many of us want to learn the messy SQL syntax, or put more than 20 records in an unwieldy Excel spreadsheet? We all have needs for a really simple way to create a database, the way that makes sense to us for our projects, not the way that is shaped by a specific tool, originally built for another purpose. We would also like really simple ways to enter that data. Those two features combined would enable all of the scratching in the lab notebooks and envelopes to be put together digitally - indexing, searching, sharing, extending - they all become trivial.
*'''[[User:Lucks|Lucks]] 17:45, 24 March 2006 (EST)''': I recently came across the desire/need for a flexible database system in my research (bioinformatics). For example, I want to know about phage gene expression. Not knowing enough biology to perform an efficient literature search (coming from a physics background), I have made a little database that allows me to input author name, citation info, abstract, and some relevancy measure while I search in the literature.  (For the specific implementation, please visit [http://lux.wufoo.com/forms/pubmed-searches-for-phage-gene-expression/ PubMed Searches for Phage Gene Expression].) <br> The point that this little example illustrates is the creative use of small, easily customizable databases in scientific investigations. Accumulation of literature search results is one example (you can see that common citation managers like EndNote are just specific, albeit not very flexible, views of a literature database), but you can imagine many more, from entering specific observations from experiments, to bioinformatic data collection, ...  The goal is some flexible framework where it is easy to create and modify databases on the fly, with nice user interfaces for database entry. <br> Which is why I bring this discussion to OWW. There are a couple of ways to do this.  The easiest method (almost available) comes from [http://wufoo.com Wufoo] (used in the example above).  I have been using this recently with a beta-key and it has many nice features, but I think something could be made that is more geared towards scientists (and when released, Wufoo will require a monthly fee). I am wondering if this could be done in the framework of a wiki such as OWW. I imagine a scenario where through template features, users could easily define fields in a database in wikimarkup.  Once this is done, pages would be generated to allow data input, as well as retrieval (dumps in CSV format, or even automatically formatting a wiki table ...)
<br> This idea is more along the lines of the [[Science 2.0]] article, but I think it would be very useful, could be integrated into many aspects of scientific investigation, and could be implemented in something like OWW.
 
**'''[[User:Smeister|Smeister]] 07:58, 25 March 2006 (EST)''': If I understand you right - you are trying to implement database functionality in OWW - this is the direction I always thought and hoped projects like OWW will move into, eventually - after all, almost everything in science has to do with huge amounts of data... Flexible online databases, customizable and maintained by a big group of people with little computer knowledge - but expert knowledge on the subject covered by the database - would add an enormous value to OWW, for sure. One can also envision some sort of peer evaluation for the validity of the entries (e.g. something along the lines of what they use for stories and comments, over at digg.com). That's a huge project, though...
This page is a brainstorming forum for how to shape such a tool - what it should be able to do, how to make it, what it could be used for, etc.  The first major hurdle in building a tool is to come up with a concrete concept - the hope is that we can achieve this through discussion and this forum.
***'''[[User:Lucks|Lucks]] 10:57, 25 March 2006 (EST)''': While a big project, with some discussion, it could be possible to come up with a few hacks that would be relatively easy to implement so that the idea could at least be tested.  I have some experience with running a mediawiki, but I have never looked into the code. Alternatively, some software could be developed external to OWW, but linked so that new database entries are automatically inserted into pages in OWW - a lot of alternatives here. I have put together a database web interface before with Perl-CGI (using [http://search.cpan.org/~domizio/CGI-Builder-1.35/lib/CGI/Builder.pm CGI::Builder]), which works great (see the [http://slyjbl.hopto.org/PLD/ Pica Literature Database]), but requires the user to make an HTML form with limited, but ugly Perl HTML::Template variable references - basically not very user friendly at the initial setup.  I am now learning a little bit about [http://www.rubyonrails.org/ Ruby on Rails], which is fantastically flexible for this sort of thing. (I am pretty sure this is the framework that Wufoo uses.) I can imagine a wiki design page that allows someone to make a form design via wiki-markup, which then gets fed to a rails program that creates the database, and then passes entries back to the wiki.
 
****'''[[User:Lucks|Lucks]] 18:42, 26 March 2006 (EST)''': I am not sure how the templates in Mediawiki work, but perhaps a user could define a database layout by designing a template page. A new record in the database would then be created by making a new page (with some systematic page name) using this template.  There could be some facility for conglomerating all the records together to provide a global view of the dataset, and possibly a database dump for further analysis elsewhere. Not that mediawiki would be the best end-solution, but it might be an easy testbed to vet out the idea in general.  This also keeps everything within OWW.
Your contribution is vital! You are the scientists and end-users. All of your comments will shape whatever comes out of this discussion. Please post ideas on the discussion page; eventually they will be migrated over here as things take shape. Keep following the progress - it will be fun to see how your individual contributions are woven together!
*'''[[User:Rshetty|RS]] 13:54, 25 March 2006 (EST)''': I think this is an interesting idea. Why don't we discuss it at the next [[OpenWetWare:Steering committee]] meeting from 4p-5p on April 3. Can you both make it (i.e. attend in person or phone in)?  Just to be clear, I think we should also discuss it on the wiki, I just think we might want to discuss it in realtime as well.
 
**'''[[User:Lucks|Lucks]] 18:08, 25 March 2006 (EST)''': Fantastic - I think I can make the meeting in person.
== Announcements ==
**'''[[User:Smeister|Smeister]] 04:29, 27 March 2006 (EST)''': I already wanted to join the last time around, but the time difference is a bit of a problem for me...
'''[[User:Lucks|Lucks]] 20:04, 14 April 2006 (EDT)''': I have completed the basic outline and content of this page - I look forward to hearing from all of you!
**'''[[User:Rshetty|RS]] 19:00, 29 March 2006 (EST)''': Sorry about this.  It can be a bit of a problem scheduling a time that works for everyone.  Regardless, we'll post meeting notes on the wiki.
 
== The Idea ==
 
The goal is some flexible framework where it is easy to create and modify databases on the fly, with nice user interfaces for database entry. Flexible online databases, customizable and maintained by a big group of people with little computer knowledge - but expert knowledge on the subject covered by the database - would add enormous value to OWW, and the scientific community as a whole. One can also envision some sort of peer evaluation for the validity of the entries (e.g. something along the lines of what they use for stories and comments, over at digg.com).
 
=== Features ===
 
*Easy to use database creation.
*Easy to specify database design - especially variable codings.  We would like a way to store a variable in the database in a machine-understandable way (for example an NCBI accession number), but display it in a human-understandable way (the gene name).
 
=== Possible Uses ===
 
*Your own fully customizable literature database that would be able to store literature references and associated information, exactly how you wanted.  (For an example and an implementation that sparked this discussion, see the [http://slyjbl.hopto.org/PLD/ Pica Literature Database]).
 
*Quick and dirty literature searches where you want to extract just a few pieces of information from each reference, then search through them later. (For an example, see [http://lux.wufoo.com/forms/pubmed-searches-for-phage-gene-expression/ PubMed Searches for Phage Gene Expression], which is implemented with [http://www.wufoo.com Wufoo]).
 
== Implementation Ideas ==
 
A web-based implementation seems to make the most sense since it will allow powerful data sharing, synchronization, distribution, etc. It will also allow integration with the many new types of information that will be available with Web 2.0. To that end, there are several possibilities:
 
* An XML grammar design scheme whereby user input is translated into an XML grammar.  Some sort of form is filled out, and the record is stored as flat XML.
* Frontend to a serious database.  Here the full gamut of relational databases could be used (without the user knowing).
* Wiki template pages - the database design could come in the form of designing a wiki template. Records are entered by filling out the template, similar to the [[Calendar|OWW Calendar]].  After all, Mediawikis are just user interfaces for database entry of content pages.
 
== Sources of Inspiration ==
 
*[http://www.wufoo.com Wufoo] has many of the features we would want with some shortcomings. It does look really nice though!
*[http://www.rubyonrails.org/screencasts Ruby on Rails Screencasts]: one look at ''Creating a Weblog in 15 Minutes'' will make you realize that the only step not done by Rails is the original database creation!
 
== Want to get this going? ==
If you are interested in more than contributing to this wiki page, see [[User:Lucks#Contact|Lucks]] for contact information.
|-
|}

Latest revision as of 12:39, 23 September 2006


Overview

There is a need for a flexible database system in the sciences. Databases exist, but how many of us want to learn the messy SQL syntax, or put more than 20 records in an unwieldy Excel spreadsheet? We all have needs for a really simple way to create a database, the way that makes sense to us for our projects, not the way that is shaped by a specific tool, originally built for another purpose. We would also like really simple ways to enter that data. Those two features combined would enable all of the scratching in the lab notebooks and envelopes to be put together digitally - indexing, searching, sharing, extending - they all become trivial.

This page is a brainstorming forum for how to shape such a tool - what it should be able to do, how to make it, what it could be used for, etc. The first major hurdle in building a tool is to come up with a concrete concept - the hope is that we can achieve this through discussion and this forum.

Your contribution is vital! You are the scientists and end-users. All of your comments will shape whatever comes out of this discussion. Please post ideas on the discussion page; eventually they will be migrated over here as things take shape. Keep following the progress - it will be fun to see how your individual contributions are woven together!

Announcements

Lucks 20:04, 14 April 2006 (EDT): I have completed the basic outline and content of this page - I look forward to hearing from all of you!

The Idea

The goal is some flexible framework where it is easy to create and modify databases on the fly, with nice user interfaces for database entry. Flexible online databases, customizable and maintained by a big group of people with little computer knowledge - but expert knowledge on the subject covered by the database - would add enormous value to OWW, and the scientific community as a whole. One can also envision some sort of peer evaluation for the validity of the entries (e.g. something along the lines of what they use for stories and comments, over at digg.com).

Features

  • Easy to use database creation.
  • Easy to specify database design - especially variable codings. We would like a way to store a variable in the database in a machine-understandable way (for example an NCBI accession number), but display it in a human-understandable way (the gene name).
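
To make these two features a little more concrete, here is a rough sketch in Python using the standard sqlite3 module. It is only an illustration under assumptions, not a proposed design: the field names, the helper functions (create_database, add_record, display_rows) and the codes in GENE_LABELS are all hypothetical, and the codes are made-up placeholders rather than real NCBI accession numbers. The table is created from a plain field list, and the coded value is what gets stored, while the readable label is only substituted when rows are displayed.

```python
import sqlite3

# Hypothetical field specification: field name -> SQLite column type.
FIELDS = {"gene_id": "TEXT", "observation": "TEXT", "relevance": "INTEGER"}

# Hypothetical coding table: stored code -> human-readable label.
# These codes are made-up placeholders, not real accession numbers.
GENE_LABELS = {"ACC0001": "gene A", "ACC0002": "gene B"}


def create_database(fields):
    """Create an in-memory SQLite table from a field specification."""
    conn = sqlite3.connect(":memory:")
    columns = ", ".join(f'"{name}" {sqltype}' for name, sqltype in fields.items())
    conn.execute(f"CREATE TABLE records ({columns})")
    return conn


def add_record(conn, fields, record):
    """Insert one record; fields missing from the record are stored as NULL."""
    names = list(fields)
    quoted = ", ".join(f'"{name}"' for name in names)
    placeholders = ", ".join("?" for _ in names)
    conn.execute(
        f"INSERT INTO records ({quoted}) VALUES ({placeholders})",
        [record.get(name) for name in names],
    )


def display_rows(conn, labels):
    """Return rows with the stored code replaced by its label, if one exists."""
    rows = conn.execute(
        "SELECT gene_id, observation, relevance FROM records ORDER BY rowid"
    ).fetchall()
    return [(labels.get(code, code), obs, rel) for code, obs, rel in rows]


conn = create_database(FIELDS)
add_record(conn, FIELDS, {"gene_id": "ACC0001", "observation": "expressed early", "relevance": 3})
add_record(conn, FIELDS, {"gene_id": "ACC0003", "observation": "no label yet", "relevance": 1})
shown = display_rows(conn, GENE_LABELS)
```

A code without a known label simply falls back to being shown as-is, so the database stays usable while the coding table is still incomplete.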

Possible Uses

  • Your own fully customizable literature database that would be able to store literature references and associated information, exactly how you wanted. (For an example and an implementation that sparked this discussion, see the Pica Literature Database).
  • Quick and dirty literature searches where you want to extract just a few pieces of information from each reference, then search through them later. (For an example, see PubMed Searches for Phage Gene Expression, which is implemented with Wufoo).

Implementation Ideas

A web-based implementation seems to make the most sense since it will allow powerful data sharing, synchronization, distribution, etc. It will also allow integration with the many new types of information that will be available with Web 2.0. To that end, there are several possibilities:

  • An XML grammar design scheme whereby user input is translated into an XML grammar. Some sort of form is filled out, and the record is stored as flat XML.
  • Frontend to a serious database. Here the full gamut of relational databases could be used (without the user knowing).
  • Wiki template pages - the database design could come in the form of designing a wiki template. Records are entered by filling out the template, similar to the OWW Calendar. After all, Mediawikis are just user interfaces for database entry of content pages.
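
As a rough sketch of how the wiki template and flat XML ideas might fit together, the Python below reads one filled-in template call, treats its parameters as a record, and dumps it as CSV or as a flat XML element. Everything here is an assumption made for illustration: the template name LitRecord, its fields and the function names are hypothetical, and the parser is deliberately simplified (no nested templates, no "|" inside values). It is not how MediaWiki itself parses templates.

```python
import csv
import io
import xml.etree.ElementTree as ET


def parse_template_call(text):
    """Parse a filled-in call like {{Name|key=value|...}} into (name, fields).

    Simplified on purpose: assumes no nested templates and no "|" in values.
    """
    inner = text.strip()
    if not (inner.startswith("{{") and inner.endswith("}}")):
        raise ValueError("not a template call")
    parts = [part.strip() for part in inner[2:-2].split("|")]
    name, fields = parts[0], {}
    for part in parts[1:]:
        key, sep, value = part.partition("=")
        if sep:  # skip unnamed (positional) parameters
            fields[key.strip()] = value.strip()
    return name, fields


def to_csv(records, columns):
    """Dump a list of record dicts as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow({col: record.get(col, "") for col in columns})
    return buf.getvalue()


def to_xml(name, fields):
    """Store one record as a flat XML element (keys must be valid XML names)."""
    root = ET.Element("record", {"template": name})
    for key, value in fields.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical page content: one record entered by filling out a template.
page = "{{LitRecord|author=Doe|year=2006|relevance=high}}"
name, fields = parse_template_call(page)
csv_dump = to_csv([fields], ["author", "year", "relevance"])
xml_dump = to_xml(name, fields)
```

The same parsed record could just as well be handed to a relational database behind the scenes; the template page would then only be the entry form.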

Sources of Inspiration

  • Wufoo has many of the features we would want with some shortcomings. It does look really nice though!
  • Ruby on Rails Screencasts: one look at Creating a Weblog in 15 Minutes will make you realize that the only step not done by Rails is the original database creation!

Want to get this going?

If you are interested in more than contributing to this wiki page, see Lucks for contact information.