Endy:Notebook/Synthetic Biology Data Transfer Protocol

From OpenWetWare

Revision as of 04:41, 24 November 2009 by Cesar A. Rodriguez (Talk | contribs)

Project Description/Abstract

Synthetic Biology Data Transfer Protocol (SB/DTP) is a simple, platform-independent and data-model-independent specification for transferring data over the World Wide Web between members of the Synthetic Biology community.

Platform independence is achieved by implementing SB/DTP as a client/server architecture in which clients connect to servers that expose an application programming interface (API) as a RESTful web service. RESTful web services use the standard request methods defined in the Hypertext Transfer Protocol (HTTP), such as GET, POST, and PUT, to receive and transmit data.

Data-model independence gives data providers the freedom to change their data model as needed, usually without having to change the protocol. It is achieved by implementing SB/DTP as an entity-attribute-value (EAV) system, in which a client requests data from a server by naming the entities and attributes it wants. For example, an SB/DTP-compliant client can request from an SB/DTP-compliant server the "sequence" (attribute) for the "basic part" (entity) with the "identifier" (attribute) "equal to" (query function) BBa_K105001 (value). Information about the structure of the data model used by an SB/DTP-compliant server resides outside the protocol specification; for instance, it can be documented in web pages that describe the services a particular site provides.
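
To make the EAV idea concrete, here is a minimal sketch of how a server might answer the example request from the abstract. The in-memory row layout, the `select` function, and the placeholder sequence strings are all assumptions made for illustration; none of them is part of the SB/DTP specification.

```python
from collections import defaultdict

# Hypothetical in-memory EAV store: each row is
# (entity type, record id, attribute, value).
# The sequence strings are placeholders, not real part sequences.
ROWS = [
    ("basic part", 1, "identifier", "BBa_K105001"),
    ("basic part", 1, "sequence", "atgc"),
    ("basic part", 2, "identifier", "BBa_E0040"),
    ("basic part", 2, "sequence", "ggcc"),
]


def select(rows, entity, wanted, where_attr, where_value):
    """Return the `wanted` attribute of every `entity` record whose
    `where_attr` equals `where_value`."""
    # Group the flat rows back into one attribute dict per record.
    records = defaultdict(dict)
    for ent, rec_id, attr, value in rows:
        if ent == entity:
            records[rec_id][attr] = value
    return [
        rec[wanted]
        for rec in records.values()
        if rec.get(where_attr) == where_value and wanted in rec
    ]


result = select(ROWS, "basic part", "sequence", "identifier", "BBa_K105001")
print(result)
```

Because the query names entities and attributes instead of fixed fields, new attributes can be added to the store without changing `select`, which is the property the abstract calls data-model independence.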

Developers

Presentations

Review

Google Data Protocol. http://code.google.com/apis/gdata/.

Use Cases, Requirements, Comments

  • Export BioBrick Part sequences as Genbank files with full annotation.
  • SynBioSS Requirements
    • Parts Feed: It would be useful to have the "type" of part (e.g. Promoter, RBS, Composite/Device...) as a field. At the moment we parse through a part's features and try to deduce its type, but this is inefficient and not all parts are annotated with features. To my knowledge, "type" is required by the Parts Registry.
    • Features Feed: The Registry is highly redundant with regard to features. For example, BBa_E0040 is a Coding DNA BioBrick that has a feature for "GFP Protein". The issue is that any composite brick containing BBa_E0040, e.g. BBa_I8510, also contains that "GFP Protein" feature. To remedy this, the Standard Biological Parts Web Service should only store features for individual parts, not Composites/Devices, unless said features arise from the arrangement of the individual parts in a device. These could be stored as "Device Features" instead of "Part Features".
  • Include the short description for all parts returned by the service
  • Ability to query for the number of parts in a subset (e.g. Favorites, Recents, By Type)
  • Ability to request parts n to m in a subset (e.g. Favorite parts 1 to 20)
  • Ability to request the long description of a part
  • Provide the contact information of the part engineers
  • 12:13, 8 April 2009 (EDT): some rough interface ideas. The goal is to have two levels. At the bottom is a simple, traditional strain database that considers metadata associated with organisms frozen in a freezer, possibly with plasmids in them. The second level considers the physical implementation of 'parts' in these plasmids. Level 1 would be fully functional (and immediately useful) without level 2.
import registries

registry = registries.fetch('MIT')

recents = registry.recents() # grab recent entries, etc.

strain = registry.fetch('MIT_001') # the lowest level of organization is in terms of strains - organisms in a freezer somewhere
base_genotype = strain.base_genotype  # they have genotypes, resistance markers, etc.
comments = strain.usage_notes
resistance_markers = strain.markers
...

plasmids = strain.plasmids   # strains can have plasmids which also have resistance markers, origins, etc.
for plasmid in plasmids:
    plasmid.markers
    plasmid.origin
    part_BBa = plasmid.part('BBa')  # get a part embedded in the plasmid using shortcut notation
    part_BBb = plasmid.part('BBb')
    part_Eco_Bam = plasmid.part(('EcoRI','BamHI')) # get a specific cut of the part by specifying pairs of sites

    part = part_BBa
    part.description  # parts have metadata associated with them
    ...

...
part1 = plasmid1.part('BBa','left') # left and right cuts of parts
part2 = plasmid2.part('BBa','right')
composite_part = part1.compose(part2) # create part composition
...

Design

Query Syntax


Example:


Query String Parameters:

v = SB/DTP Version
e = Entity
a = Attribute
f = Query Function
p = Query Function Parameter
m = Maximum Size of the Result Set
i = Index of the First Entity
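
As a rough sketch of how a client could assemble these parameters into a request URL: the base URL below is a hypothetical placeholder (no server address is given here), the helper name `build_query_url` is invented for illustration, and treating `a` as the attribute that the query function is applied to is an assumption.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; no real SB/DTP server address is implied.
BASE_URL = "http://example.org/sbdtp"


def build_query_url(entity, attribute, function, parameter,
                    version=None, max_results=None, first_index=None):
    """Assemble an SB/DTP-style query URL from the documented parameters.

    Optional parameters (v, m, i) are left out when not given, so the
    server-side defaults apply.
    """
    params = {}
    if version is not None:
        params["v"] = version
    params.update({"e": entity, "a": attribute, "f": function, "p": parameter})
    if max_results is not None:
        params["m"] = max_results
    if first_index is not None:
        params["i"] = first_index
    # urlencode percent-escapes any characters that are unsafe in a query string.
    return BASE_URL + "?" + urlencode(params)


url = build_query_url("part", "identifier", "equals", "BBa_K105001")
print(url)
```

Leaving out `v`, `m`, and `i` keeps short queries short and defers to whatever defaults the server applies.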

SB/DTP Version 
Optional. If omitted, it defaults to the most current version.

Query Function 
Functions supported are:

Function Name            Operator   SB/DTP Syntax
Contains                 N/A        contains
Starts With              N/A        startswith
Equals                   =          equals
Not Equal                !=         notequal
Greater Than             >          greaterthan
Less Than                <          lessthan
Greater Than or Equal    >=         greaterthanorequal
Less Than or Equal       <=         lessthanorequal
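
One way a server might interpret these function names is to map each one to a predicate, as in the sketch below. The names `QUERY_FUNCTIONS` and `matches` are hypothetical, and details such as case sensitivity or how strings and numbers are compared are assumptions, since they are not specified here.

```python
import operator

# Map each SB/DTP function name to a Python predicate that takes
# (attribute value, query function parameter) and returns True or False.
QUERY_FUNCTIONS = {
    "contains": lambda value, param: param in value,
    "startswith": lambda value, param: value.startswith(param),
    "equals": operator.eq,
    "notequal": operator.ne,
    "greaterthan": operator.gt,
    "lessthan": operator.lt,
    "greaterthanorequal": operator.ge,
    "lessthanorequal": operator.le,
}


def matches(value, function, parameter):
    """Apply the named query function; unknown names raise ValueError."""
    try:
        predicate = QUERY_FUNCTIONS[function]
    except KeyError:
        raise ValueError(f"unsupported query function: {function}") from None
    return predicate(value, parameter)


print(matches("BBa_E0040", "startswith", "BBa_"))
```

Keeping the functions in a lookup table means a new function can be supported by adding one entry, without touching the dispatch logic.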

Query Function Parameter 
The value passed to the query function, e.g. the text to match or the number to compare against.

Maximum Size of the Result Set
Indicates the maximum number of entities the web service should return.  The default is 30.

Index of the First Entity
Indicates the numerical position of the first entity to be returned from the total set of retrieved entities.  The default is 0.
E.g., if you want entities 31 through 60 of the 100 entities that meet your search criteria, the URL would be the following:
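
The paging arithmetic behind such a request can be sketched as below. It assumes that the index parameter `i` is zero-based, which is inferred from its default of 0 and is not stated outright; the helper name `page_params` is hypothetical.

```python
def page_params(first, last):
    """Convert a 1-based, inclusive range of entities into (i, m).

    Assumes the SB/DTP index parameter i counts from zero.
    """
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    index = first - 1          # zero-based position of the first entity
    size = last - first + 1    # number of entities in the range
    return index, size


i, m = page_params(31, 60)
print(f"i={i}&m={m}")
```

Under that assumption, entities 31 through 60 correspond to `i=30` and `m=30`, and with the default page size of 30 the same helper yields successive pages at `i=0`, `i=30`, `i=60`, and so on.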


Response Class Model

Image:SB_DTP_Class_Model.png

Implementation

Maintenance

Notes

  • Note I
  • Note II

