The BioBricks Foundation:Standards/Technical/Exchange/Core Data Model: Difference between revisions

Revision as of 11:14, 21 August 2009

Overview

The core data model covers the low-level description of DNA constructs. The definition of a "Part" is at the heart of the model. Parts can be combined with Vectors (which are a sub-class of Part) to describe a full DNA molecule (for example, a complete plasmid). This DNA molecule can then be associated with a DNA- or cell-stock sample. Note that the image does not show all data fields of every class.

Definitions

Part

A part is a building block for synthetic biology. At the moment, we are mostly concerned with the DNA-level description. DNA parts MUST map to a single contiguous stretch of DNA. Multiple disconnected segments have to be broken up into separate parts.

Informally, a part can be either "basic" or "composite". The sequence of composite parts is a concatenation of smaller sub-Parts -- often with intervening "scar" sequences from assembly reactions. However, we do not define special sub-classes for basic and composite parts. Whenever available, the sub-part composition should be described in addition to the plain text sequence.

Later, we also want to represent RNA or protein parts, and we need to represent the relations between them. For example, several different DNA parts may translate to the same protein part. Exactly how we do this remains a matter of discussion. My (Raik's) suggestion is to introduce different "Description levels" (i.e. sub-classes of parts) and allow multiple inheritance between part objects. For example, a part with the beta-lactamase DNA sequence would be of "Type" DnaPart (description level) and would be the child (inheritance) of a ProteinPart that describes the amino acid sequence, structure and activity of this beta-lactamase enzyme.
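As a rough illustration of the "description level" idea, here is a minimal Python sketch. It is only one possible reading of the suggestion, not an agreed design: the `parents` field, the `ancestors()` helper, and the `protein_sequence` field are hypothetical names, and the sequences are toy placeholders rather than real beta-lactamase data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Part:
    """Base class; its sub-classes play the role of 'description levels'."""
    name: str
    # Hypothetical field: object-level parents (inheritance between part objects).
    parents: List["Part"] = field(default_factory=list)

    def ancestors(self) -> List["Part"]:
        """Collect every part reachable by following parent links."""
        seen: List["Part"] = []
        stack = list(self.parents)
        while stack:
            candidate = stack.pop()
            if not any(candidate is s for s in seen):
                seen.append(candidate)
                stack.extend(candidate.parents)
        return seen


@dataclass
class DnaPart(Part):
    dna_sequence: str = ""


@dataclass
class ProteinPart(Part):
    protein_sequence: str = ""  # hypothetical field name


# Toy placeholders: two different DNA parts that encode the same protein part.
bla_protein = ProteinPart(name="beta-lactamase", protein_sequence="MS")
bla_variant_a = DnaPart(name="bla variant A", dna_sequence="ATGAGT", parents=[bla_protein])
bla_variant_b = DnaPart(name="bla variant B", dna_sequence="ATGTCT", parents=[bla_protein])
```

Here the Python class (DnaPart, ProteinPart) stands for the description level, while the `parents` list carries the inheritance between individual part objects, so both DNA variants point at the same ProteinPart.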

Fields

  • name -- string [1], common name of the part
  • shortDescription -- string [0..1], very brief description (less than 100 characters) for display in tables and lists
  • longDescription -- string [0..1], detailed human readable description
  • author -- Person object(s) [0..n], (rename this to the equivalent foaf:maker from the FriendOfAFriend vocabulary?)
  • ?? owl:type -- pointer to class [1..n], which is used as description level
  • dnaSequence -- string [0..1], only applies to DNA parts
  • partSequence -- PartSequence object [0..1], describing the sequence of sub-Parts and scar sequences from which one can construct this part. This should be a sequence of "basic" parts that cannot be decomposed further.
  • feature -- pointer to a SequenceAnnotation object [0..n], which links regions of the sequence to a GenBank-classified functional description (work in progress).
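The field list above could be sketched in Python roughly as follows. This is an illustration under assumptions, not the exchange format itself: the internals of Person, SequenceAnnotation and PartSequence are not specified on this page, so their fields here are hypothetical, as are the `flatten()` and `validate()` helpers and the placeholder sequences. The open "owl:type" field is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Person:
    name: str  # hypothetical minimal content


@dataclass
class SequenceAnnotation:
    # Hypothetical fields: a region of the sequence plus a functional description.
    start: int
    end: int
    description: str


@dataclass
class PartSequence:
    # Ordered elements: either a basic sub-Part or a raw scar sequence (str).
    elements: List[Union["Part", str]] = field(default_factory=list)

    def flatten(self) -> str:
        """Concatenate sub-part sequences and scars into one DNA string."""
        return "".join(
            e if isinstance(e, str) else (e.dna_sequence or "")
            for e in self.elements
        )


@dataclass
class Part:
    name: str                                        # [1]
    short_description: Optional[str] = None          # [0..1], < 100 characters
    long_description: Optional[str] = None           # [0..1]
    author: List[Person] = field(default_factory=list)            # [0..n]
    dna_sequence: Optional[str] = None               # [0..1], DNA parts only
    part_sequence: Optional[PartSequence] = None     # [0..1]
    feature: List[SequenceAnnotation] = field(default_factory=list)  # [0..n]

    def validate(self) -> List[str]:
        """Return a list of human-readable problems (empty if none found)."""
        problems = []
        if not self.name:
            problems.append("name is required")
        if self.short_description is not None and len(self.short_description) >= 100:
            problems.append("shortDescription must be less than 100 characters")
        if self.dna_sequence is not None and self.part_sequence is not None:
            if self.part_sequence.flatten().upper() != self.dna_sequence.upper():
                problems.append("partSequence does not reproduce dnaSequence")
        return problems


# Placeholder sequences, for illustration only.
promoter = Part(name="promoter", dna_sequence="TTGACA")
rbs = Part(name="rbs", dna_sequence="AGGAGG")
scar = "TACTAGAG"
composite = Part(
    name="promoter+rbs",
    dna_sequence="TTGACA" + scar + "AGGAGG",
    part_sequence=PartSequence(elements=[promoter, scar, rbs]),
)
```

Calling `composite.validate()` checks that the sub-part composition, when given alongside the plain text sequence, concatenates back to the same DNA string.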

Missing bits and pieces

  • Keeping track of rating and experiences:
 We should consider using the RDFa vocabulary for user reviews -- it's supported by a large industry consortium including Google.
  • Categorization of parts:
 I suggest using multiple part inheritance to group parts into families and categories.
  • Characterization, experimental data, systems biology models, ...