User:Ilya/Registry: Difference between revisions

From OpenWetWare
 
(85 intermediate revisions by the same user not shown)
Line 1: Line 1:
{{Synthetic biology top}}
==To do==
<div style="padding: 10px; color: #000000; background-color: #ccccff; width:730px" >
*Map parts database schema to RDF/OWL (D2R Map/Server)
**[http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/Tasks/SenseLab Use RDF/OWL to describe neuronal data available in SenseLab] - similar project
*[http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/Tasks/ HCLSIG BioRDF subgroup tasks] - interesting projects
*Use LSID for parts identification
**set up an LSID resolution service
*How to represent sequence features (do they belong to sequence or part)?
**Part has features and has a sequence (piece of DNA with molecular function combined by BB assembly)
**Sequence has features but a part already has sequence
*Tools to create and edit ontology and RDF instances?
**Protege from Stanford?
**IsaViz from W3C?
*existing RDBMS <-> RDF <-> objects (e.g., JavaScript)
*Do we need "Device"?
*I want to build a NOR gate vs. I have a NOR gate
*Find a way to use MediaWiki software to work with the Semantic Web ontology of biological parts: create a UI from the description of a part in the ontology that would check the entered information for correctness according to the part definition in the ontology.
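
The "existing RDBMS <-> RDF" item can be pictured with a minimal sketch, assuming a hypothetical one-table schema (<code>part</code> with <code>id</code>, <code>name</code>, <code>sequence</code>) and a hypothetical <code>http://example.org/parts/</code> namespace. It only illustrates the general row-to-triples idea (one subject per row, one predicate per column); it is not the D2R mapping language and not the actual parts database schema.

```python
import sqlite3

BASE = "http://example.org/parts/"  # hypothetical namespace, for illustration only


def rows_to_triples(conn, table, key):
    """Map each row to triples: subject from the key column, one predicate per other column.

    `table` is interpolated into the SQL text, so it must be a trusted name.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key]}"
        for column, value in record.items():
            # Skip the key itself and NULL cells: a missing value yields no statement.
            if column != key and value is not None:
                triples.append((subject, f"{BASE}schema#{column}", str(value)))
    return triples


# Hypothetical in-memory table standing in for a legacy parts database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (id INTEGER PRIMARY KEY, name TEXT, sequence TEXT)")
conn.execute("INSERT INTO part VALUES (1, 'example_part', 'ATGC')")
triples = rows_to_triples(conn, "part", "id")
```

Each resulting tuple is a (subject, predicate, object) statement, so the same list could be handed to any RDF toolkit for serialization.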
 
==To read==
*[[doi:10.1371/journal.pone.0000339|Deductive Biocomputing]]
:As biologists increasingly rely upon computational tools, it is imperative that they be able to appropriately apply these tools and clearly understand the methods the tools employ. Such tools must have access to all the relevant data and knowledge and, in some sense, “understand” biology so that they can serve biologists' goals appropriately and “explain” in biological terms how results are computed.
*[http://dig.csail.mit.edu/2007/01/camp/ Semantic Web Boot Camp 2007 IAP]
*[http://www.semantictools.ru/ SemanticTools.ru]
*[http://www.lisperati.com/tellstuff/ How To Tell Stuff To A Computer - The Enigmatic Art of Knowledge Representation]
*[http://esw.w3.org/topic/HCLS/Banff2007Demo HCLS Demo given in Banff at WWW2007]
*[http://dataportability.org/ DataPortability project] - share and remix data using open standards
*[http://theinfo.org/ (theinfo)] - for people with large data sets
*[http://ibm-slrp.sourceforge.net/ IBM Semantic Layered Research Platform]
*[http://esw.w3.org/topic/LinkedData Linked Data] is to spreadsheets and databases what the Web of hypertext documents is to word processor files
*[http://www.w3.org/2001/sw/sweo/ Semantic Web Education and Outreach (SWEO) Interest Group]
**[http://www.w3.org/TR/cooluris/ Cool URIs for the Semantic Web] - document explaining the effective use of URIs to enable the growth of the Semantic Web
*[http://www.freebase.com/ Freebase] is an open, shared database of the world's knowledge
*Using RDF on the Web: [http://thefigtrees.net/lee/blog/2007/01/using_rdf_on_the_web_a_survey.html A Survey], [http://thefigtrees.net/lee/blog/2007/01/using_rdf_on_the_web_a_vision.html A Vision]
*[http://www.w3.org/2005/Incubator/rdb2rdf/ W3C RDB2RDF Incubator Group]
*[http://www.biositemap.org/ Biositemap] allows scientists, engineers, centers and institutions engaged in modeling, software tool development and analysis of biomedical and informatics data to broadcast and disseminate to the world the information about their latest computational biology resources (data, software tools and web-services) - [[Wikipedia:Biositemap|from Wikipedia]].
*[http://bioontology.org/projects/ontologies/SoftwareOntology/ Software Ontology]
*[http://intranet.cs.man.ac.uk/bhig/ Bio-Health Informatics Group] at the University of Manchester
*[http://img.cs.man.ac.uk/ The Information Management Group] at the University of Manchester
*[[Wikipedia:Semantic_search|Semantic search]]
*[http://www.topazproject.org/ Topaz] is a powerful object to RDF persistence and query service ([http://www.plos.org/cms/node/260 used by PLoS]).
 
==BBF standards==
*[[The_BioBricks_Foundation:Standards/Technical|Technical standards]]
**[[The_BioBricks_Foundation:Standards/Technical/Exchange|Data exchange]]
**[[PICA_Framework_Draft_Proposal_Documents|Part Interaction and Composition Assertion Framework Draft Proposal Documents]]
**[http://brickit.wiki.sourceforge.net/Data+model BrickIt data model] aims to create a portable web-based registry that helps synthetic biologists to plan, organize and track their local biobrick samples ([http://brickit.wiki.sourceforge.net/ wiki])
**[http://biohack.sourceforge.net/wiki/index.php/Biobricks Biobricks at Biohack wiki]
 
==External projects==
*[http://research.nokia.com/projects/connectingme ConnectingMe] project will develop a new application architecture that uses a semantic web information repository and data integration engine along with a user customizable presentation engine
*[http://projects.csail.mit.edu/jourknow/ Jourknow] - semantic web personal info organizer ([http://projects.csail.mit.edu/jourknow/study/ FAQ])
 
==Meetings==
*[http://esw.w3.org/topic/CambridgeSemanticWebGatherings/Gatherings/ Cambridge Semantic Web Gatherings]
**Demos:
***[http://e-culture.multimedian.nl/ MultimediaN N9C Eculture project homepage]
***[http://cmch.tv/research/semanticSearch.asp CMCH Database of Literature smart search]
***[http://demo.openlinksw.com/isparql/ OpenLink iSPARQL]


==Data or metadata==
(from [http://www.ibm.com/developerworks/opensource/library/os-lsidbp/ LSID best practices])
Data is defined as a sequence of unchanging bytes. Examples of data are microscope images, a protein sequence, a text file, etc. Metadata is usually information that describes the data either literally (date created, MD5 check sum, size) or contains information describing the relationship between the data and other objects.
If you cannot determine what should be data and what should be metadata from your data model, follow this rule of thumb: Large byte sequences are easier to manipulate as data, while short byte sequences can be included as data, metadata, or made available in both forms.


==From XML to RDF==
[[doi:10.1038/nbt1139|From XML to RDF]]: how semantic web technologies will change the design of 'omic' standards
* ?
 
==Microformats==
*[http://microformats.org/ microformats.org]
*[http://gmpg.org/xfn/ XFN] - XHTML Friends Network

==Abstraction Hierarchy==
*Part - simple biological function encoded in DNA
*Device - simple logical function; collection of parts
*System - collection of devices
*Device is_a part in context of the system but also device has_a part.
*Device is_a subclass of Part, System is_a subclass of Device
*How to represent barriers and interfaces between levels of abstraction?
*Genetic, protein and cell devices
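
The is_a / has_a relations in the Abstraction Hierarchy notes can be sketched as a small composite structure. This is only one possible reading of the notes: the class layout follows "Device is_a subclass of Part, System is_a subclass of Device", while the method <code>leaves</code> and all instance names (<code>promoter</code>, <code>inverter</code>, <code>nor_gate</code>, ...) are hypothetical placeholders.

```python
from typing import List


class Part:
    """Simple biological function encoded in DNA."""

    def __init__(self, name: str) -> None:
        self.name = name

    def leaves(self) -> List["Part"]:
        """Return the basic parts this object is ultimately built from."""
        return [self]


class Device(Part):
    """Collection of parts: is_a Part, and also has_a list of parts."""

    def __init__(self, name: str, parts: List[Part]) -> None:
        super().__init__(name)
        self.parts = list(parts)

    def leaves(self) -> List[Part]:
        # Flatten nested devices down to their basic parts, in order.
        found: List[Part] = []
        for part in self.parts:
            found.extend(part.leaves())
        return found


class System(Device):
    """Collection of devices: is_a Device, per the notes."""


# Placeholder names, not real registry entries.
promoter = Part("promoter")
rbs = Part("rbs")
inverter = Device("inverter", [promoter, rbs])
nor_gate = System("nor_gate", [inverter, Part("terminator")])
```

Because a <code>Device</code> is itself a <code>Part</code>, it can be placed inside another device or system without any special casing, which is the "device is_a part in context of the system" observation.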


==Miscellaneous==
Line 21: Line 75:
*Data is represented by a graph of triples (statements about resources)
*Syntax doesn't matter: there are many ways to serialize the data (XML, N3, etc).
*[http://www.biowisdom.com/ontology/faq_q1.htm Ontology vs taxonomy vs thesaurus vs list]
*[[Wikipedia:Folksonomy|Folksonomy]] vs [[Wikipedia:Collabulary|Collabulary]]
*[http://microformats.org/ Microformats]
**"lowercase semantic web"
**humans first, machines second
*HCLS task forces:
**[http://www.w3.org/2001/sw/hcls/task_forces/BIORDF.doc BIORDF] (Structured data to RDF) - Susie Stephen, Joanne Luciano co-leads
**[http://www.ccs.neu.edu/home/futrelle/W3C-HCLSig/group-report-draft26Jan06.html T2S] (Text to Structured RDF) - Robert Futrelle, Matthew Cockerill
*[http://www.w3.org/TR/webarch/ Architecture of the World Wide Web] @ W3C
*[[Wikipedia:Reification|Reification]] @ Wikipedia
*[[Wikipedia:Metadata|Metadata]]
**[[Wikipedia:Semantic_mapper|Semantic mapper]] is a tool or service that aids in the transformation of data elements from one namespace into another namespace.
**[[Wikipedia:Metadata_registry|Metadata registry]] is a central location in an organization where metadata definitions are stored and maintained in a controlled method.
*[http://www.idealliance.org/proceedings/xtech05/papers/02-03-01/ XSLT vs XQuery]
*[http://www.w3.org/TR/sw-oosd-primer/ A Semantic Web Primer for Object-Oriented Software Developers] - OO vs SW, links to software, etc
*rdfs:label vs rdfs:comment
**both are used to describe a resource with human-readable text in addition to "pure" RDF properties (may have multiple values for internationalization needs)
**rdfs:label is used to give a human-readable name of a resource
**rdfs:comment is used to give a longer description
*[http://www.ibm.com/developerworks/xml/library/x-tiprdfai.html rdf:about and rdf:ID in RDF/XML]
*Resource manipulation and description: URIQA, REST, [http://webdav.org/ WebDAV], WSDL, etc
**[http://www.mnot.net/blog/2004/04/27/webdav4rest WebDAV for REST]
**[http://www.mnot.net/blog/2004/04/14/rest_in_wsdl REST in WSDL]
**[http://esw.w3.org/topic/WebDescriptionProposals Web Description Proposals]
*[[Wikipedia:Logic|Logic]] studies the laws of valid [[Wikipedia:Inference|inference]] (the act or process of deriving a conclusion based solely on what one already knows).
*[[Wikipedia:Closed_world_assumption|Closed world assumption]] is the presumption that what is not currently known to be true is false.
*[[Wikipedia:Open_World_Assumption|Open World Assumption]] assumes that its knowledge of the world is incomplete. If something cannot be proved to be true, then it doesn't automatically become false.
 
==To Do==
*Use LSID for parts identification
*How to represent sequence features?
**Part has features and has a sequence
**Sequence has features but a part already has sequence
*Tools to create and edit ontology and RDF instances?
**Protege from Stanford?
**IsaViz from W3C?
*legacy RDBMS <-> RDF <-> objects (e.g., JavaScript)
*how to convert legacy RDBMS to RDF data store?
 
==From XML to RDF==
(from [http://dx.doi.org/10.1038/nbt1139])
* ?
 
==Links==
===Metadata applications===
*[http://base.google.com/ Google Base]
*[http://biozon.org/ Biozon] - a unified biological resource on DNA sequences, proteins, complexes and cellular pathways. The information in Biozon is logically represented as a graph in which nodes represent some unit of data, and edges indicate a relationship between two nodes.
*[http://rdf.dmoz.org/ Open Directory RDF dump] The Open Directory Project is the largest, most comprehensive human-edited directory of the Web. It is constructed and maintained by a vast, global community of volunteer editors.
 
===Other===
*[http://www.mozilla.org/rdf/doc/ RDF in Mozilla]
 
</div>
{{Synthetic biology bottom}}

Latest revision as of 14:31, 23 June 2008

To do

  • Map parts database schema to RDF/OWL (D2R Map/Server)
  • HCLSIG BioRDF subgroup tasks - interesting projects
  • Use LSID for parts identification
    • set up an LSID resolution service
  • How to represent sequence features (do they belong to sequence or part)?
    • Part has features and has a sequence (piece of DNA with molecular function combined by BB assembly)
    • Sequence has features but a part already has sequence
  • Tools to create and edit ontology and RDF instances?
    • Protege from Stanford?
    • IsaViz from W3C?
  • existing RDBMS <-> RDF <-> objects (e.g., JavaScript)
  • Do we need "Device"?
  • I want to build a NOR gate vs. I have a NOR gate
  • Find a way to use MediaWiki software to work with the Semantic Web ontology of biological parts: create a UI from the description of a part in the ontology that would check the entered information for correctness according to the part definition in the ontology.
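
For the "Use LSID for parts identification" item, a minimal sketch of handling identifiers of the form urn:lsid:authority:namespace:object, with an optional trailing revision. The authority (example.org), namespace (parts) and object id (part-0001) are hypothetical; no real resolution service is assumed, and the functions only split and rebuild the string.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split an LSID URN into its components; raise ValueError if malformed."""
    fields = text.split(":")
    # Expected shape: urn:lsid:<authority>:<namespace>:<object>[:<revision>]
    if len(fields) not in (5, 6) or [f.lower() for f in fields[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {text!r}")
    if any(not f for f in fields[2:5]):
        raise ValueError(f"empty LSID component in {text!r}")
    revision = fields[5] if len(fields) == 6 and fields[5] else None
    return LSID(fields[2], fields[3], fields[4], revision)


def format_lsid(lsid: LSID) -> str:
    """Rebuild the URN string from its components."""
    parts = ["urn", "lsid", lsid.authority, lsid.namespace, lsid.object_id]
    if lsid.revision:
        parts.append(lsid.revision)
    return ":".join(parts)


# Hypothetical identifier for revision 2 of a part.
example = parse_lsid("urn:lsid:example.org:parts:part-0001:2")
```

Keeping the revision as a separate field fits the idea that the bytes behind an identifier never change: a modified part would get a new revision rather than overwrite the old one.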

To read

As biologists increasingly rely upon computational tools, it is imperative that they be able to appropriately apply these tools and clearly understand the methods the tools employ. Such tools must have access to all the relevant data and knowledge and, in some sense, “understand” biology so that they can serve biologists' goals appropriately and “explain” in biological terms how results are computed.

BBF standards

External projects

  • ConnectingMe project will develop a new application architecture that uses a semantic web information repository and data integration engine along with a user customizable presentation engine
  • Jourknow - semantic web personal info organizer (FAQ)

Meetings

Data or metadata

(from LSID best practices) Data is defined as a sequence of unchanging bytes. Examples of data are microscope images, a protein sequence, a text file, etc. Metadata is usually information that describes the data either literally (date created, MD5 check sum, size) or contains information describing the relationship between the data and other objects. If you cannot determine what should be data and what should be metadata from your data model, follow this rule of thumb: Large byte sequences are easier to manipulate as data, while short byte sequences can be included as data, metadata, or made available in both forms.
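
The distinction can be illustrated with a small sketch: the byte sequence is the data, and the literal descriptors mentioned above (creation date, MD5 checksum, size) are metadata derived from or attached to it. The function name describe, the sample sequence and the date are hypothetical and chosen only for illustration.

```python
import hashlib
from datetime import datetime, timezone


def describe(data: bytes, created: datetime) -> dict:
    """Return literal metadata (size, MD5 checksum, creation date) for a byte sequence."""
    return {
        "size": len(data),
        "md5": hashlib.md5(data).hexdigest(),
        "created": created.isoformat(),
    }


# Hypothetical short DNA sequence standing in for the unchanging data bytes.
sequence = b"ATGGCTAGCTAA"
metadata = describe(sequence, datetime(2008, 6, 23, tzinfo=timezone.utc))
```

A sequence this short could equally be embedded in the metadata itself, which is the "short byte sequences can be included as data, metadata, or both" part of the rule of thumb.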

From XML to RDF

From XML to RDF: how semantic web technologies will change the design of 'omic' standards

  • ?

Microformats

Miscellaneous

  • Semantics - the meaning that is implied by words and sentences.
  • A software agent could search distributed registries using an ontology. This is impossible right now because the storage schema is unknown.
  • Data is represented by a graph of triples (statements about resources)
  • Syntax doesn't matter: there are many ways to serialize the data (XML, N3, etc).
  • HCLS task forces:
    • BIORDF (Structured data to RDF) - Susie Stephen, Joanne Luciano co-leads
    • T2S (Text to Structured RDF) - Robert Futrelle, Matthew Cockerill
  • Architecture of the World Wide Web @ W3C
  • Reification @ Wikipedia
  • Metadata
    • Semantic mapper is a tool or service that aids in the transformation of data elements from one namespace into another namespace.
    • Metadata registry is a central location in an organization where metadata definitions are stored and maintained in a controlled method.
  • XSLT vs XQuery
  • A Semantic Web Primer for Object-Oriented Software Developers - OO vs SW, links to software, etc
  • rdfs:label vs rdfs:comment
    • both are used to describe a resource with human-readable text in addition to "pure" RDF properties (may have multiple values for internationalization needs)
    • rdfs:label is used to give a human-readable name of a resource
    • rdfs:comment is used to give a longer description
  • rdf:about and rdf:ID in RDF/XML
  • Resource manipulation and description: URIQA, REST, WebDAV, WSDL, etc
  • Logic studies the laws of valid inference (the act or process of deriving a conclusion based solely on what one already knows).
  • Closed world assumption is the presumption that what is not currently known to be true is false.
  • Open World Assumption assumes that its knowledge of the world is incomplete. If something cannot be proved to be true, then it doesn't automatically become false.
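
Several of the notes above (a graph of triples, syntax independence, rdfs:label vs rdfs:comment, closed vs open world) can be tied together in one minimal sketch using plain Python tuples instead of an RDF library. The namespace http://example.org/parts/, the resource part1 and the tab-separated line syntax are assumptions made for illustration; only the rdfs:label and rdfs:comment property names come from RDF Schema.

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]

EX = "http://example.org/parts/"  # hypothetical namespace
graph: Set[Triple] = {
    (EX + "part1", "rdf:type", EX + "Part"),
    (EX + "part1", "rdfs:label", "Example promoter"),
    (EX + "part1", "rdfs:comment", "A longer, human-readable description of the part."),
}


def to_lines(triples: Set[Triple]) -> str:
    """Serialize as one tab-separated statement per line (an ad hoc syntax, not N-Triples)."""
    return "\n".join("\t".join(t) for t in sorted(triples))


def from_lines(text: str) -> Set[Triple]:
    """Parse the ad hoc line syntax back into a set of triples."""
    result: Set[Triple] = set()
    for line in text.splitlines():
        s, p, o = line.split("\t")
        result.add((s, p, o))
    return result


def holds_closed(triples: Set[Triple], statement: Triple) -> bool:
    """Closed world: a statement that is not known to be true is false."""
    return statement in triples


def holds_open(triples: Set[Triple], statement: Triple) -> Optional[bool]:
    """Open world: a missing statement is unknown (None), not false."""
    return True if statement in triples else None


# A statement the graph says nothing about.
unknown = (EX + "part1", "rdf:type", EX + "Device")
```

Round-tripping through to_lines and from_lines gives back the same set, which is the sense in which the syntax does not matter. The open-world function here never returns False because the sketch has no way to assert a negative statement.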