OSDD:Huh?: Difference between revisions

From OpenWetWare

Latest revision as of 19:20, 9 February 2012


OSDD

Open Source Drug Discovery (OSDD) is drug discovery conducted using methods widely employed in open source software development: all data and ideas are freely shared between participants and anyone may take part.

Open Source Software Development

In software development, open source means that a project is open to anyone and that the final product emerges from a distributed team of participants. There may be a funded kernel of work initially, but the subsequent development by the community is not explicitly funded. There are many examples of high quality, robust and widely used applications that were developed by an open source model, such as the Firefox web browser, the Chromium project on which the Chrome browser is built, the Linux operating system and the Apache web server. There are thriving open source software development communities on the web at, for example, SourceForge and GitHub. Central to the operation of these sites and projects is the sharing of data and ideas in near-real time.

Open data

Many valuable initiatives advocating open data have emerged in which large datasets are deposited to assist groups of researchers (e.g., PubChem, ChEMBL and Sage Bionetworks); the release of GSK's malaria data in 2010 falls into this class. These very important ventures employ the internet as an information resource, rather than as a means for active collaboration. For people to work together on the web, data must be freely available. Yet the posting of open data is a necessary but not sufficient condition for open science: open data may be used without any requirement to work with anyone. The GSK malaria data, for example, may be browsed and used by people engaged in closed, proprietary research projects - there is no obligation to engage in an open research project.

An important feature of open data is that they should be released in a way that maximises re-use. Essentially, the generator of data should avoid making assumptions about what the data are good for. The data acquired by the Hubble Space Telescope have led to more publications by teams analysing the archived data than by the original teams that acquired them.

The Panton Principles describe important recommendations for releasing data into the open.

Open Innovation and Prize-based Incentives

In an effort to stimulate innovation, several pharma companies have adopted an "open innovation" model. This somewhat nebulous term means that companies must try to bring in the best external ideas to complement in-house research (see the NRDD article at http://www.nature.com/nrd/journal/v9/n2/abs/nrd3099.html). The mechanisms for bringing in new ideas are:

  • Prizes for solutions to problems (e.g., InnoCentive). A competition means that teams work in isolation and do not pool ideas. Such a mechanism does not change the nature of the research, only the motivation to participate. The pharmaceutical industry itself essentially already operates on this model.
  • Licensing agreements with academic groups/start-ups (e.g., Eli Lilly’s PD2 program). In such arrangements, companies may purchase the rights to promising ideas. Vigilance over intellectual property may of course shut down any open collaboration at a promising stage. It has therefore been proposed to limit open innovation science to “pre-competitive areas” (e.g., toxicology), but to date the industry has been unable to define what the term “pre-competitive” means beyond the avoidance of duplication of effort and the requirement for public-domain information resources (NRDD article).

For more on the distinction between open science and open innovation, see Will Spooner's article.

Crowdsourcing

Using a widely distributed set of participants to accelerate a project is a strategy that has been employed in many areas. The writing of the Oxford English Dictionary made use of volunteers to identify the first uses, or best examples of the use, of words. Pioneering work on distributing the computing power required by science projects (where the science itself was not necessarily an open activity) was carried out by the SETI@Home and Folding@Home projects.

With the rise of the web, several highly successful crowdsourcing experiments have emerged in which tasks are distributed to thousands of human participants, such as the Foldit and Galaxy Zoo projects. What is notable about such cases is the speed with which the science progresses by harnessing what has been termed the “cognitive surplus”.

Open science

Open science is the application of open source methods to science. Thus data must be released as they are acquired, and it must be possible for any reader of the data to have an impact on the project. Groups should avoid working on parts of the project in isolation and releasing data only periodically - ideally, complete data release and collaboration happen in real time, both to prevent duplication of effort and to maximise useful interaction between participants.

Though there is no formal line to distinguish crowdsourced projects from open science projects, it could be argued that open science projects are mutable at every level. For example, while anyone could participate in the original Galaxy Zoo project, the software and the basic project methodology were not open to change by those who participated. In the Polymath project, on the other hand, there was a question to answer at the outset, but the direction the project took could be influenced by anyone, depending on how the project went. In the Synaptic Leap discovery of a chemical synthesis of a drug, the eventual solution was influenced by project participants as the work proceeded.

Open Source Drug Discovery

Drug discovery is a complex process involving many different stages. Compounds are found to have some biological activity, and these are then improved through iterative chemical synthesis and biological evaluation. Compounds that appear to be promising are assessed for their behaviour and toxicity in biological systems. Evaluation in humans takes place in the clinical trial phase, and there are regulatory phases after that, as well as the need to make the relevant molecule on a large scale.

Since no drug has ever been discovered using an open source approach it is difficult to be certain about how OSDD would work. However it seems likely that the biggest impact of the open approach would be in the early phases before clinical trials have commenced. Open methods could also have an impact on the process chemistry phase, in creating an efficient chemical synthesis on a large scale.

Open work cannot be patented, since there can be no delays to the release of data and no partial buy-ins. If a group opts out of the project to pursue a "fork", they leave the project. Open source drug discovery must therefore operate without patents. The hypothesis is that working in an open mode reduces research and development costs and accelerates research, and that this offsets the lack of capital support for the project. Costs of clinical trials and product registration would have to be sourced from governments and NGOs. Whether this is possible is one of the central questions of OSDD.

What Can I Do?

Open projects rely on participation by interested strangers. To participate, find a project's coordination pages, read them and ask questions. Follow feeds from projects. If things aren't clear, try to contact someone involved with the project. Some sites, like this one, are wikis, so you can edit their pages directly. Others are blogs, where you can leave comments.

This wiki hosts current projects.