OSDD:Huh?: Difference between revisions

From OpenWetWare
Revision as of 05:24, 23 July 2011



This page is intended as a concise description of what open source drug discovery is.

Open Source Drug Discovery (OSDD) is drug discovery where the science is conducted using methods widely employed in open source software development: all data and ideas are freely shared between participants and anyone may take part.

Open Source Software Development

There are many examples of high-quality, robust and widely used applications that were developed under an open source model, such as the Firefox and Chrome web browsers, the Linux operating system and the Apache web server. In these cases development was open to anyone, and the final product emerged from a distributed team of participants. In many cases a kernel of work was funded, while the extra development by the community was not. There are thriving open source software development communities on the web at, for example, SourceForge and GitHub. Central to the operation of these sites and projects is the sharing of data and ideas in near real time.

Open data

Many valuable initiatives advocating open data have emerged in which large datasets are deposited to assist groups of researchers (e.g., PubChem, ChEBI and SAGE Commons); the release of the GSK malaria data falls into this class. These very important ventures employ the internet as an information resource, rather than as a means for active collaboration. For people to work together on the web, data must be freely available, but the posting of open data is a necessary but not sufficient condition for open science. Open data may be used without any requirement to work with anyone else. The GSK malaria data, for example, may be browsed and used by people engaged in closed, proprietary research projects - there is no obligation to engage in an open research project.

An important feature of open data is that it should be released in a way that permits, and ideally maximises, re-use. Essentially the generator of data should avoid making assumptions about what the data are good for. The data acquired by the Hubble space telescope have led to more publications by teams analysing the archived data than by the original teams that acquired the data.

Open Innovation and Prize-based Incentives

In an effort to stimulate innovation, several pharma companies have adopted an "open innovation" model. This is a somewhat nebulous term that means companies must try to bring in the best external ideas to complement in-house research [NRDD article]. The mechanisms for bringing in new ideas are:

  • Prizes for solutions to problems (e.g., InnoCentive). A competition means that teams work in isolation and do not pool ideas. Such a mechanism does not change the nature of the research, but rather the motivation to participate. The pharmaceutical industry itself essentially already operates on this model.
  • Licensing agreements with academic groups/start-ups (e.g., Eli Lilly’s PD2 program). In such arrangements, companies may purchase the rights to promising ideas. Vigilance over intellectual property may of course shut down any open collaboration at a promising stage. It has therefore been proposed to limit open innovation science to “pre-competitive areas” (e.g., toxicology), but to date the industry has been unable to define what the term “pre-competitive” means beyond the avoidance of duplication of effort and the requirement for public-domain information resources [NRDD article].

For more on this distinction see Will Spooner's article.

Crowdsourcing

The use of a widely distributed set of participants to accelerate a project is a strategy that has been employed in many areas. The writing of the Oxford English Dictionary made use of volunteers to identify the first uses, or best examples of the use, of words [see The Professor and the Madman, Simon Winchester, XXX]. Pioneering work on distributing the computing power required by science projects (where the science itself was not necessarily an open activity) was achieved with the SETI@Home and Folding@Home projects.

With the rise of the web, several highly successful crowdsourcing experiments have emerged in which tasks are distributed to thousands of human participants, such as the Foldit and Galaxy Zoo projects. What is notable about such cases is the speed with which the science progresses through the harnessing of what has been termed the “cognitive surplus” [Shirky, C. Cognitive Surplus: Creativity and Generosity in a Connected Age; Penguin, 2010].

Open science

Open science is the application of open source methods to science. Thus data must be released as they are acquired, and it must be possible for any reader of the data to have an impact on the project. Groups should avoid working on parts of the project in isolation and releasing data only periodically - ideally data release and collaboration happen in real time, to prevent duplication of effort and to maximise useful interaction between participants.

Though there is no formal line to distinguish crowdsourced projects from open science projects, it could be argued that open science projects are mutable at every level. For example, while anyone could participate in the original Galaxy Zoo project, the software and the question that the project set out to answer were not open to change by those who participated. In the Polymath project, on the other hand, while there was a question to answer at the outset, the direction the project took could be influenced by anyone, depending on how the project went. In the Synaptic Leap discovery of a chemical synthesis of a drug, the eventual solution was influenced by project participants as the work proceeded.

Open Source Drug Discovery

Drug discovery is a complex process involving many different stages. Compounds are identified as having some biological activity, and these are then improved through iterative chemical synthesis and biological evaluation. Compounds that appear to be promising are assessed for their behaviour and toxicity in biological systems. The move to evaluation in humans is the clinical trial phase, and there are regulatory phases after that, as well as the need to produce the relevant molecule on a large scale.

Since no drug has ever been discovered using an open source approach, it is difficult to be certain about how OSDD would work. However, it seems likely that the biggest impact of the open approach would be in the early phases, before clinical trials have commenced. Open methods could also have an impact on the process chemistry phase, in creating an efficient chemical synthesis on a large scale.

Open work cannot be patented, since there can be no delays to the release of data, and no partial buy-ins. If a group opts out of the project to pursue a "fork", they leave the project. Open source drug discovery must therefore operate without patents. The hypothesis is that by working in an open mode, research and development costs are reduced and research is accelerated, which offsets the lack of capital support for the project. Costs of clinical trials and product registration would have to be sourced from governments and NGOs. Whether this is possible is one of the central questions of OSDD.