BBF RFC 31
Vincent, 16th May 2009
It is a pleasant surprise to read about new PoBoL development. I fully agree with the authors that the Synthetic Biology community needs an open and standardized data model to represent and exchange BioBricks. However, I need to be convinced that this brand new RDF schema is the best way to engage with the rest of the SynBio community, as well as the wider scientific community that works with existing DNA sequence formats.
Please find below some comments about the proposal. I hope they are clear and useful.
> related to [6. Motivation]
What about compatibility with existing DNA sequence standards and their respective databases and tools? I understand that SynBio will require some specific features, but is it really necessary to start from scratch? Defining a standard that extends previous standards might help us avoid reinventing the wheel and reach out to other communities (in terms of people and software solutions/infrastructures). At the least, I would be interested to read the authors' reasons for not considering any of the existing standards for representing DNA parts.
I would also find it useful if the authors could describe two or three relevant scenarios in which such a data format would be required.
*Michal Galdzicki 14:20, 25 May 2009 (PDT): reply
> related to [10.1.1 Class BioBrick]
Is there a concept of uniqueness for a BioBrick? Or does the proposed standard accept duplicates, for example two BioBricks with the same DNASequences and the same Format but different ShortDescriptions? Also, if uniqueness is required, at which level: within a single lab or worldwide? Would it be useful to consider unique identifiers?
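To make the uniqueness question concrete, here is a minimal sketch of one possible approach: deriving an identifier from the content of a part, so that two entries with the same sequence and format are detected as duplicates regardless of their descriptions. The class `BioBrickRecord`, its field names, and the functions `content_id` and `find_duplicates` are hypothetical illustrations and are not part of the PoBoL proposal.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class BioBrickRecord:
    """Hypothetical stand-in for a BioBrick entry; not the PoBoL schema itself."""
    dna_sequence: str
    format: str
    short_description: str

    def content_id(self) -> str:
        # Normalise case and whitespace so that trivially different spellings
        # of the same sequence map to the same identifier.
        normalised = "".join(self.dna_sequence.split()).upper()
        payload = f"{self.format}|{normalised}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def find_duplicates(records):
    """Group records that share sequence and format, ignoring descriptions."""
    groups = {}
    for record in records:
        groups.setdefault(record.content_id(), []).append(record)
    return [group for group in groups.values() if len(group) > 1]
```

A content-derived identifier of this kind would work across labs without a central authority, but it cannot by itself say which of two duplicate descriptions is authoritative. That part would still need a community decision.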
"The BioBrick class May be extended at any time". This built-in flexibility might be difficult to deal with when people try to practically implement the schema.
What about sequence annotations?
> related to [10.1.3 BioBrickBasic]
Quick clarification: let's say that I use direct DNA synthesis to obtain a 5 kb (4-gene) metabolic pathway, with a prefix and suffix chosen to satisfy a particular BioBrick standard (and no incompatible restriction sites within the 5 kb). From what I understand, this would constitute a BioBrickBasic instance. Is that correct?
> related to [10.1.5 BioBrickFormat]
Recombinant DNA is one method among others for joining two pieces of DNA. For example, in vitro recombination could very well become a popular way of physically assembling DNA (with no resulting scar). It looks like the proposed scheme only considers recombinant-DNA-type assemblies. Is this a limitation? Is it acceptable? Or are we saying that pieces of DNA assembled by homologous recombination will never be considered BioBricks?
In the end, if this proposal is restricted to BioBricks, as opposed to generic "DNA-parts" (or assembled DNA sequences), I would say that this scope limits its ability to accommodate future genetic circuit assemblies.
> related to [Class Sample]
What if the sample is a PCR product (linear DNA, no vector)? How would you distinguish between a mini-prep in buffer, dry DNA, or a stab?
In the end, I am not sure that this type of information is very useful. I would prefer to see a community agreement on key attributes before getting into details that are more relevant to a Laboratory Information System.
> General comments
Without denying the descriptive power of RDF, I feel that using an RDF framework at this stage might prevent a majority of people within the community from engaging with the important process of describing the essential features of "DNA-parts". I would prefer to see a "Minimum Information Required for the description of a DNA-part" discussion before getting into a specific knowledge representation such as RDF.
- See MIBBI: Minimum Information for Biological and Biomedical Investigations
- See The minimum information about a genome sequence (MIGS) specification
- A possible way to work toward a DNA-part format could be:
- Step 1: Get the community to agree on a Minimum Information Required document
- Step 2: Generate a Data Model (UML)
- Step 3: Create a proof of concept implementation with associated software tools to validate/read/write the standard (in RDF for example)
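As a rough illustration of how Step 1 could feed into the validation tooling of Step 3, the sketch below checks a part description against a minimum-information checklist. The field names in `REQUIRED_FIELDS` and the function `missing_fields` are placeholders invented for this example. The real checklist would have to come out of the community discussion.

```python
# Placeholder checklist: the actual required fields would be decided by the
# community in Step 1, not by this sketch.
REQUIRED_FIELDS = ("identifier", "dna_sequence", "short_description")


def missing_fields(part: dict) -> list:
    """Return the required fields that are absent or empty in a part description."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = part.get(name)
        # Treat a missing key, None, or a blank string as "not provided".
        if value is None or not str(value).strip():
            missing.append(name)
    return missing
```

A check like this does not depend on the serialization, so the same checklist could be applied whether the part is eventually stored as RDF, XML, or a flat file.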
Could the authors comment on the impact of such an information model on the current MIT registry, and on future part registries?
Characterization of BioBricks is one of the highest priorities for our community. How do the authors suggest integrating this new type of information into PoBoL? Will it be part of the standard, or will it require a different system?