User:Ilya/Registry

From OpenWetWare

Data or Metadata

(from LSID best practices) Data is defined as a sequence of unchanging bytes. Examples of data are microscope images, a protein sequence, or a text file. Metadata is usually information that either describes the data literally (date created, MD5 checksum, size) or describes the relationship between the data and other objects. If you cannot determine from your data model what should be data and what should be metadata, follow this rule of thumb: large byte sequences are easier to manipulate as data, while short byte sequences can be included as data, as metadata, or made available in both forms.
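
As a rough illustration of the distinction above, the sketch below treats a byte sequence as the data and derives two pieces of literal metadata from it (size and MD5 checksum). The function name `describe` and the sample bytes are assumptions made for this example, not part of the LSID best practices.

```python
import hashlib


def describe(data: bytes) -> dict:
    """Return literal metadata (size, MD5 checksum) for a byte sequence."""
    return {
        "size": len(data),                     # number of bytes
        "md5": hashlib.md5(data).hexdigest(),  # 32-character hex digest
    }


# Placeholder bytes standing in for a sequence file (hypothetical content).
sequence = b"ATGGCTAGCAAAGGAGAAGAACTTTTCACT"
metadata = describe(sequence)
print(metadata["size"], metadata["md5"])
```

Because the data never changes, the same bytes always produce the same metadata, so the metadata can be cached or stored separately from the data.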

Abstraction Hierarchy

  • Part - simple biological function encoded in DNA
  • Device - simple logical function; collection of parts
  • System - collection of devices
  • A device is_a part in the context of a system, but a device also has_a part.
  • Device is_a subclass of Part, System is_a subclass of Device
  • How to represent barriers and interfaces between levels of abstraction?
  • Genetic, protein and cell devices
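
One possible reading of the is_a / has_a notes above, sketched as Python classes: a Device is a subclass of Part and also holds a collection of parts, and a System is a subclass of Device. The class layout, the `all_parts` helper, and the instance names are assumptions made for illustration; the open question about barriers and interfaces between levels is not addressed here.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Part:
    """Simple biological function encoded in DNA."""
    name: str


@dataclass
class Device(Part):
    """Is a Part (subclass) and also has Parts (composition)."""
    parts: List[Part] = field(default_factory=list)

    def all_parts(self) -> Iterator[Part]:
        """Yield contained parts depth-first, descending into nested devices."""
        for part in self.parts:
            yield part
            if isinstance(part, Device):
                yield from part.all_parts()


@dataclass
class System(Device):
    """Collection of devices; reuses the Device container."""


# Hypothetical instances, named only for this example.
promoter = Part("promoter")
rbs = Part("rbs")
inverter = Device("inverter", parts=[promoter, rbs])
oscillator = System("oscillator", parts=[inverter])

print([p.name for p in oscillator.all_parts()])
```

Since a Device is itself a Part, it can be placed inside another Device or a System without special handling, which is what the "device is_a part in context of the system" note suggests.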

BE768

Jena

Ruby

Miscellaneous

  • Semantics - the meaning that is implied by words and sentences.
  • A software agent could search distributed registries using an ontology. This is impossible right now because the storage schema is unknown.
  • Data is represented by a graph of triples (statements about resources)
  • Syntax doesn't matter: there are many ways to serialize the data (XML, N3, etc).
  • Ontology vs taxonomy vs thesaurus vs list
  • Ontology vs Taxonomy vs Folksonomy vs Collabulary
    • Taxonomy - ontology that has concepts without attributes.
  • Microformats
    • "lowercase semantic web"
    • humans first, machines second
  • HCLS task forces:
    • BIORDF (Structured data to RDF) - Susie Stephen, Joanne Luciano co-leads
    • T2S (Text to Structured RDF) - Robert Futrelle, Matthew Cockerill
  • Architecture of the World Wide Web @ W3C
  • Reification @ Wikipedia
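
The notes above on triples and serialization can be sketched with plain Python. A graph is held as a set of (subject, predicate, object) statements and written out in two ways: N-Triples lines, and an ad hoc JSON encoding invented for this example (it is not a standard RDF format). The `http://example.org/registry/` namespace and all resource names are hypothetical.

```python
import json

EX = "http://example.org/registry/"  # hypothetical namespace

# The graph: a set of statements about resources.
triples = {
    (EX + "inverter", EX + "is_a", EX + "Device"),
    (EX + "inverter", EX + "has_part", EX + "promoter"),
    (EX + "promoter", EX + "is_a", EX + "Part"),
}


def to_ntriples(graph) -> str:
    """Serialize IRI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(graph))


def to_json(graph) -> str:
    """Serialize the same triples in an ad hoc JSON encoding."""
    return json.dumps([{"s": s, "p": p, "o": o} for s, p, o in sorted(graph)])


def from_json(text: str) -> set:
    """Rebuild the graph from the ad hoc JSON encoding."""
    return {(t["s"], t["p"], t["o"]) for t in json.loads(text)}


def objects(graph, subject: str, predicate: str) -> set:
    """Return every object o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


print(to_ntriples(triples))
```

Both serializations carry the same set of statements, so a query such as `objects(triples, EX + "inverter", EX + "has_part")` gives the same answer whichever syntax the graph was loaded from.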

To Do

  • Use LSID for parts identification
    • Do we need to set up an LSID resolution service?
  • How to represent sequence features?
    • Part has features and has a sequence
    • Sequence has features but a part already has sequence
  • Tools to create and edit ontology and RDF instances?
    • Protege from Stanford?
    • IsaViz from W3C?
  • legacy RDBMS <-> RDF <-> objects (e.g., JavaScript)
  • How to convert a legacy RDBMS to an RDF data store?
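
A minimal sketch of two of the items above, under stated assumptions: each row of a relational table becomes a set of triples, and the row's key is turned into an LSID of the form `urn:lsid:authority:namespace:object[:revision]`. The authority `example.org`, the namespace `parts`, the table layout, and the part ID are all hypothetical, and column names are used directly as predicates, which a real mapping would replace with ontology terms.

```python
import sqlite3

AUTHORITY = "example.org"  # hypothetical LSID authority
NAMESPACE = "parts"        # hypothetical LSID namespace


def make_lsid(object_id, revision=None) -> str:
    """Build urn:lsid:authority:namespace:object[:revision]."""
    lsid = f"urn:lsid:{AUTHORITY}:{NAMESPACE}:{object_id}"
    return f"{lsid}:{revision}" if revision is not None else lsid


def parse_lsid(lsid: str):
    """Split an LSID into (authority, namespace, object, revision or None)."""
    fields = lsid.split(":")
    if len(fields) not in (5, 6) or [f.lower() for f in fields[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {lsid!r}")
    revision = fields[5] if len(fields) == 6 else None
    return fields[2], fields[3], fields[4], revision


def rows_to_triples(conn, table: str, key: str) -> list:
    """Turn each non-null, non-key column of each row into one triple.

    `table` is interpolated into the SQL, so it must be a trusted name.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cursor.description]
    result = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = make_lsid(record[key])
        for column, value in record.items():
            if column != key and value is not None:
                result.append((subject, column, value))
    return result


# In-memory stand-in for a legacy database (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part_id TEXT PRIMARY KEY, kind TEXT, description TEXT)")
conn.execute("INSERT INTO parts VALUES ('P0001', 'promoter', NULL)")
part_triples = rows_to_triples(conn, "parts", "part_id")
print(part_triples)
```

This only generates identifiers; whether a resolution service is needed to dereference them is a separate question that the sketch leaves open.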

From XML to RDF

(from [1])

  • ?

Links

Metadata applications

  • Google Base
  • Biozon - a unified biological resource on DNA sequences, proteins, complexes and cellular pathways. The information in Biozon is logically represented as a graph in which nodes represent some unit of data, and edges indicate a relationship between two nodes.
  • Open Directory RDF dump - The Open Directory Project is the largest, most comprehensive human-edited directory of the Web. It is constructed and maintained by a vast, global community of volunteer editors.

Other

This site is hosted on OpenWetWare and can be edited by all members of the Synthetic Biology community.
Making life better, one part at a time.