The BioBricks Foundation:Standards/Technical
(seeding page with content from BBF Standards mailing list)
Revision as of 15:50, 4 February 2008
Data Exchange Standards
Questions
Discuss and answer these questions concerning data exchange standards:
What is the data model needed to describe a biobrick?
Once the data model is firmly in place, the format should follow as the one that best implements that data model. For example, if we settle on an RDF-like 'everything is a relationship triplet' approach, then some format that can handle these triplets would be most appropriate. In addition, with a model like this, there are XML-based and more human-readable formats that can both implement the model equally well.
I think that tying ourselves to a format too early will keep us from forming a clear model, and we will end up hacking the format to fit. It is best to settle the model first, then the format.
So the thing to think about in a model is: what types of relationships do we want to convey?
- Inheritance (where a particular part was derived from, and by whom = link + data)
- Characterization (something quantitative about that part by itself = data)
- Plays well with others (what other parts can this one interact with - with possible data associated with this interaction = link + data)
- ...
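To make the list above concrete, here is a minimal sketch of the 'everything is a triplet' idea in Python. The predicate names (`derivedFrom`, `derivedBy`, `sequence`, `involves`, `note`), the `interaction_1` node and the values attached to them are placeholders invented for illustration, not a proposed vocabulary. The one design point it tries to show is that a link which carries its own data (an interaction) can be given a node of its own, so data hangs off the link rather than off either part.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    value: str


# All predicate names and the interaction node are hypothetical.
TRIPLES = [
    # Inheritance: a link to the parent part, plus data about who derived it.
    Triple("BBa_0001", "derivedFrom", "BBa_0003"),
    Triple("BBa_0001", "derivedBy", "example lab"),
    # Characterization: data about the part by itself.
    Triple("BBa_0001", "sequence", "AAACCCGGG"),
    # Plays well with others: the interaction gets its own node so that
    # data can be attached to the link itself.
    Triple("interaction_1", "involves", "BBa_0001"),
    Triple("interaction_1", "involves", "BBa_00010"),
    Triple("interaction_1", "note", "placeholder for measured data"),
]


def describe(subject, triples):
    """Return every (predicate, value) pair recorded for one subject."""
    return [(t.predicate, t.value) for t in triples if t.subject == subject]


def partners(part, triples):
    """Return the parts that share an interaction node with `part`."""
    nodes = {
        t.subject
        for t in triples
        if t.predicate == "involves" and t.value == part
    }
    return sorted(
        {
            t.value
            for t in triples
            if t.subject in nodes and t.predicate == "involves" and t.value != part
        }
    )
```

With this layout, `describe("BBa_0001", TRIPLES)` collects the inheritance and characterization entries for one part, and `partners("BBa_0001", TRIPLES)` follows the interaction nodes to find the parts it is recorded as working with.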
What is the best format / technology for exchange?
Suggestions
Please fill in these sections with details
create a new XML format
adapt existing CellML, SBML XML formats
create a custom file format
use Turtle/N3 notation for semantic web documents
Example of Turtle/N3
I somewhat share the reservation about completely new file formats, but the non-readability and general nastiness of XML is also an issue. A good solution, IMO, would be to use the Turtle format (a simplified subset of "Notation 3", or N3) developed by the semantic web folks. It is concise, human-readable and editable (I used it myself some years ago) *AND* is equivalent to RDF/XML. That means there is a well-defined translation back and forth, and many libraries and tools do the conversion. Being semantic web, it also solves the linking problem (everything is a link).
I'll cook up some small example and send it around later today. Quick Preview:
<html><pre>
# ... skipping namespace definitions for rdf, owl, bbf, harvard ...
# define a biobrick hosted at this address
:BBa_0001
    rdf:type bbf:biobrick ;
    bbf:sequence "AAACCCGGG" ;
    bbf:contains :BBa_0003, harvard:BBa_J1000, :BBa_00010 ;
    .
# add information to a biobrick defined elsewhere
harvard:BBa_J1000
    owl:sameAs :BBa_0999 ;
    .
</pre></html>
OK, one can argue about human-readability, but it is at least possible to understand and edit these documents (and much better than the equivalent XML).
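To illustrate the 'model first, format second' point, here is a rough Python sketch that writes one small set of statements out twice: once as Turtle-style text and once as XML. It uses only the standard library. The `example.org` namespace addresses, the `Literal` helper and the XML element names (`parts`, `part`, `property`) are assumptions made up for this sketch; in particular the XML layout is ad hoc and is not RDF/XML. A real implementation would more likely rely on an existing RDF library.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


class Literal(str):
    """Marks a plain string value, as opposed to a prefixed name like ':BBa_0003'."""


# The bbf, harvard and default namespace addresses are placeholders.
PREFIXES = {
    "": "http://example.org/parts#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "bbf": "http://example.org/bbf#",
    "harvard": "http://example.org/harvard#",
}

TRIPLES = [
    (":BBa_0001", "rdf:type", "bbf:biobrick"),
    (":BBa_0001", "bbf:sequence", Literal("AAACCCGGG")),
    (":BBa_0001", "bbf:contains", ":BBa_0003"),
    (":BBa_0001", "bbf:contains", "harvard:BBa_J1000"),
]


def to_turtle(triples, prefixes):
    """Render triples as Turtle-style text, grouped by subject.

    Literals are assumed to contain no quotes or backslashes (no escaping).
    """
    lines = [f"@prefix {name}: <{iri}> ." for name, iri in prefixes.items()]
    lines.append("")
    grouped = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        grouped[subject][predicate].append(obj)
    for subject, predicates in grouped.items():
        lines.append(subject)
        body = []
        for predicate, objects in predicates.items():
            rendered = ", ".join(
                f'"{o}"' if isinstance(o, Literal) else o for o in objects
            )
            body.append(f"    {predicate} {rendered}")
        lines.append(" ;\n".join(body) + " .")
    return "\n".join(lines)


def to_xml(triples):
    """Render the same triples in an ad hoc XML layout (not RDF/XML)."""
    root = ET.Element("parts")
    nodes = {}
    for subject, predicate, obj in triples:
        node = nodes.get(subject)
        if node is None:
            node = nodes[subject] = ET.SubElement(root, "part", {"id": subject})
        prop = ET.SubElement(node, "property", {"name": predicate})
        if isinstance(obj, Literal):
            prop.text = str(obj)  # data value
        else:
            prop.set("ref", obj)  # link to another part or class
    return ET.tostring(root, encoding="unicode")
```

Calling `print(to_turtle(TRIPLES, PREFIXES))` and `print(to_xml(TRIPLES))` shows the same four statements in both forms. Neither function changes the model, which is the list of triples; only the rendering differs.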