Harvard:Biophysics 101/2007/Project

From OpenWetWare

Revision as of 21:16, 30 April 2007

Biophysics 101: Genomics, Computing, and Economics


Editing Note (17:52 4/30/07): The outline below is based on (1) the blackboard diagram of the project's steps and interconnections (incomplete, since some people were absent), and (2) all the scripts available in everyone's personal notebooks. Since the organization reflects my understanding of the scripts (which might be, and very probably is, wrong), please change, add, or reorder as you see fit. If your name or code was left out somewhere, it was unintentional; please insert it. Also, it was not possible to lay out the project as a diagram; the path of steps is therefore noted using outline notation (i.e., I, 1, A, a, i, etc.), as well as the increasing indentation and smaller font of the OWW heading hierarchy (i.e., =, ==, ===, etc.). ---TChan



Overview

  • Project Goal: To develop tools to aid in analysis of personal DNA sequences.

We would like to develop software and documentation that will help people get from sequence to diagnosis. At the moment, we are focusing on identifying and classifying SNPs, but we will broaden this to other variants, such as large deletions, insertions, or repeats, once we have more expertise. We are trying to build on existing tools, and we would also like others to be able to build upon ours. Specifically, our program will eventually be able to determine a sequence's location using BLAST, identify any SNPs using NCBI SNP, and give a prognosis based on OMIM and online medical databases.
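The three stages named above (location, SNPs, disease lookup) can be sketched as a simple chain. This is only an illustration of the intended data flow: the function name `run_pipeline`, its parameters, and every placeholder return value are assumptions made up for this sketch, not the project's actual code, and no real BLAST, dbSNP, or OMIM query is performed.

```python
def run_pipeline(sequence, locate, find_snps, lookup_disease):
    """Chain the three lookups; each step is passed in as a callable."""
    location = locate(sequence)        # stands in for a BLAST-based lookup
    rs_numbers = find_snps(sequence)   # stands in for an NCBI SNP lookup
    # One disease lookup per RS number (stands in for an OMIM query).
    diseases = {rs: lookup_disease(rs) for rs in rs_numbers}
    return {"location": location, "snps": rs_numbers, "diseases": diseases}


# Placeholder steps with made-up return values, for illustration only.
report = run_pipeline(
    "ACGT",
    locate=lambda seq: "unknown location (placeholder)",
    find_snps=lambda seq: ["rs_placeholder"],
    lookup_disease=lambda rs: ["placeholder disease"],
)
print(report)
```

Passing each stage in as a callable keeps the stages independent, so a script from any one notebook could be swapped in without touching the others.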


Path 1: Using OMIM

[I][1]. Sequence -> RS Number (BLAST SNP) -> Disease Name (OMIM)

  • Xiaodi: code prints, and writes to a file, a pickled version of the OMIM output for our sequence
  • Chris: quality-checking code returns the sequences after the dbSNP lookup
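As a minimal sketch of the "pickled version" step, the snippet below saves a result object to a file and reads it back with Python's standard `pickle` module. The helper names and the contents of `omim_result` are hypothetical stand-ins; the real OMIM output will have a different shape.

```python
import os
import pickle
import tempfile


def save_result(result, path):
    """Write a Python object to disk in pickled form."""
    with open(path, "wb") as handle:
        pickle.dump(result, handle)


def load_result(path):
    """Read a pickled object back from disk."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Hypothetical stand-in for parsed OMIM output (placeholder values only).
omim_result = {"rs_number": "rs_placeholder", "disease_names": ["placeholder disease"]}

path = os.path.join(tempfile.mkdtemp(), "omim_output.pkl")
save_result(omim_result, path)
restored = load_result(path)
print(restored)
```

Pickling lets a later script in the path reuse the result without repeating the query, though pickle files should only be loaded from sources you trust.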


[2]. RS Number -> MeSH Terms (PubMed)

  • Cynthia


[3][A]. RS Number (OMIM) -> PubMed PMIDs -> MeSH Terms

  • Resmi: code (1) parses the OMIM XML to (2) get PubMed PMIDs, which are used to (3) get MeSH terms
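The last hop, from a PubMed record to its MeSH terms, might look roughly like the following. This is a sketch, not Resmi's code: `SAMPLE_XML` is a hand-written, heavily simplified record with a made-up PMID, and the function name `mesh_terms_by_pmid` is an assumption. It relies only on the `MedlineCitation`, `PMID`, and `DescriptorName` elements used in PubMed's XML output.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written sample; real PubMed records carry many more fields.
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>00000001</PMID>
      <MeshHeadingList>
        <MeshHeading><DescriptorName>Humans</DescriptorName></MeshHeading>
        <MeshHeading><DescriptorName>Polymorphism, Single Nucleotide</DescriptorName></MeshHeading>
      </MeshHeadingList>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def mesh_terms_by_pmid(xml_text):
    """Map each PMID in the XML to the list of its MeSH descriptor names."""
    root = ET.fromstring(xml_text)
    terms = {}
    for citation in root.iter("MedlineCitation"):
        pmid = citation.findtext("PMID")
        terms[pmid] = [d.text for d in citation.iter("DescriptorName")]
    return terms


print(mesh_terms_by_pmid(SAMPLE_XML))
```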


[B]. MeSH Terms -> Disease Name (in-house code)

  • Deniz


[C]. Disease Name -> Prevalence (-> regulates Useful Patient Info)

  • Zach: code returns incidence data (based on California data) for the named disease


[a]. Disease Name -> Useful Patient Info (Medical search engines)
  • Deniz: example code that returns news updates in several formats
  • Tiff: code (a) displays relevant URLs and (b) returns lists of drugs, clinical trials, experts, etc.
  • Resmi: code displays PubMed review citations in text form
  • Cynthia: code displays PubMed article citations in text form


[i]. All displayable data -> Web Interface
  • Xiaodi and Katie: code and description; displays some of the preceding data in a web interface (the remaining scripts still need to be integrated)



Path 2: Using GenBank and PolyPhen

[I]. Sequence -> Gene Name (BLAST)

  • Zach(?): code returns relevant (normal) BLAST data, including the gene name/ID (?); code returns relevant (normal) BLAST data, including the gene location


[1]. Gene Name -> Mutations (GenBank)

  • Xiaodi
  • Chris: quality-checking code detects mismatches between our sequence and the returned SNPs
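A mismatch check of this kind can be sketched as a position-by-position comparison. The function below is an illustrative assumption rather than Chris's script: it expects two sequences that are already aligned and of equal length, and it does not handle gaps or insertions.

```python
def find_mismatches(query, reference):
    """Return (position, reference_base, query_base) for each difference.

    Positions are 0-based. Both sequences must already be aligned and
    have the same length; gaps are not handled in this sketch.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be the same length")
    return [
        (i, ref_base, query_base)
        for i, (query_base, ref_base) in enumerate(zip(query, reference))
        if query_base != ref_base
    ]


print(find_mismatches("ACGTA", "ACCTA"))
```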


[2]. Gene Name -> ??? (PolyPhen)

  • Mike


[A]. Mutations -> Disease Name (OMIM)

  • Xiaodi


[a]. Disease Name -> Useful Patient Info (Medical search engines)
  • Deniz: example code that returns Google news updates and MedStory news updates in several formats
  • Tiff: code (a) displays relevant URLs and (b) returns lists of drugs, clinical trials, experts, etc.
  • Resmi: code displays PubMed review citations in text form
  • Cynthia: code displays PubMed article citations in text form


[i]. All displayable data -> Web Interface
  • Xiaodi and Katie: code and description; displays some of the preceding data in a web interface (the remaining scripts still need to be integrated)



Diagram

Project Diagram


Project-in-Progress

Project-in-Progress notes have been moved to their own page.

Project Ideas

Project ideas have been moved to their own page.