Revision as of 10:19, 22 March 2007

Biophysics 101: Genomics, Computing, and Economics


Overview

Project Goal: Development of tools to aid in analysis of personal DNA sequences.

We would like to develop software and documentation that will help people get from sequence to diagnosis. At the moment, we are focusing on identifying and classifying SNPs, but we will broaden this to other variants such as large deletions, insertions, and repeats as we gain expertise. We are building on existing tools, and we would like others to be able to build upon ours. Specifically, our program will eventually be able to locate a sequence using BLAST, identify any SNPs using NCBI SNP, and give a prognosis based on OMIM and online medical databases.
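The three stages described above can be sketched as a small pipeline with pluggable steps. This is only a rough outline under stated assumptions: the names Report, run_pipeline, locate, find_snps, and lookup_phenotypes are hypothetical and do not come from the class code, and the stub stages in the usage example return placeholder values rather than real database results.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Report:
    """Collects what each stage learned about one input sequence."""
    sequence: str
    location: Optional[str] = None                        # set by the BLAST stage
    rs_ids: List[str] = field(default_factory=list)       # set by the SNP stage
    phenotypes: List[str] = field(default_factory=list)   # set by the OMIM stage


def run_pipeline(
    sequence: str,
    locate: Callable[[str], str],
    find_snps: Callable[[str, str], List[str]],
    lookup_phenotypes: Callable[[str], List[str]],
) -> Report:
    """Run the stages in order: sequence -> location -> rs numbers -> phenotype text."""
    report = Report(sequence=sequence)
    report.location = locate(sequence)
    report.rs_ids = find_snps(sequence, report.location)
    for rs_id in report.rs_ids:
        report.phenotypes.extend(lookup_phenotypes(rs_id))
    return report


# Usage with stub stages; every returned value here is a placeholder, not real data.
demo = run_pipeline(
    "ACGTTGCAAC",
    locate=lambda seq: "placeholder-location",
    find_snps=lambda seq, loc: ["rs-placeholder"],
    lookup_phenotypes=lambda rs_id: ["placeholder phenotype for " + rs_id],
)
```

Keeping each stage behind a plain function argument means each group's code can be swapped in without touching the integration step.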

Project Sections

ATTENTION: Everyone needs to post their code in one place. Let's have everyone post a working link to their code here, and then I'll be able to combine it all. --Katie Fifer
Could someone who typed this up today please add the other sections that are being worked on? --TChan, 12:47 20 March 2007

The following is from the information that Zach typed up in class (located here) --Hetmann, 5:32 20 March 2007

Edited to add some of my own notes and to reflect some semblance of order. --Cchi 10:00, 22 March 2007 (EDT)

Integration

Katie
PM / encourage documentation

Sequence to BLAST SNP to rs#

Zach, Mike, and Tiffany

BioPython Modification

Parsing XML of Biopython BLAST - Deniz
Relevant file: Python25/Lib/site-packages/Bio/Blast/NCBIWWW.py
Discussion on BLAST SNP can proceed on the discussion page.
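As a rough illustration of the XML parsing step, the sketch below reads BLAST XML output with Python's standard xml.etree.ElementTree (not the Biopython parser in NCBIWWW.py) and reports positions where the query differs from the hit. The element names (Hit, Hit_id, Hsp, Hsp_hit-from, Hsp_qseq, Hsp_hseq) follow the NCBI BLAST XML output format. The sample record is made up for illustration, the function name candidate_mismatches is hypothetical, and the code assumes a plus-strand alignment. Mapping a mismatch to an rs# is not shown.

```python
import xml.etree.ElementTree as ET

# Made-up BLAST XML fragment, trimmed to the fields used below.
SAMPLE_BLAST_XML = """
<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_id>example|hit1</Hit_id>
          <Hit_def>Made-up reference sequence</Hit_def>
          <Hit_hsps>
            <Hsp>
              <Hsp_evalue>1e-20</Hsp_evalue>
              <Hsp_hit-from>101</Hsp_hit-from>
              <Hsp_hit-to>110</Hsp_hit-to>
              <Hsp_qseq>ACGTTGCAAC</Hsp_qseq>
              <Hsp_hseq>ACGTAGCAAC</Hsp_hseq>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""


def candidate_mismatches(blast_xml):
    """Return (hit_id, hit_position, hit_base, query_base) for each mismatch.

    Assumes plus-strand alignments, so hit coordinates increase left to right.
    Gap columns are skipped and are not reported as mismatches.
    """
    root = ET.fromstring(blast_xml)
    results = []
    for hit in root.iter("Hit"):
        hit_id = hit.findtext("Hit_id")
        for hsp in hit.iter("Hsp"):
            position = int(hsp.findtext("Hsp_hit-from"))
            query_seq = hsp.findtext("Hsp_qseq")
            hit_seq = hsp.findtext("Hsp_hseq")
            for query_base, hit_base in zip(query_seq, hit_seq):
                if hit_base == "-":
                    # Insertion in the query: the hit coordinate does not advance.
                    continue
                if query_base != "-" and query_base != hit_base:
                    results.append((hit_id, position, hit_base, query_base))
                position += 1
    return results


mismatches = candidate_mismatches(SAMPLE_BLAST_XML)
```

Each reported position is a candidate to look up against the SNP database in a later step.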

OMIM XML Parse

Xiaodi - completed?
rs -> OMIM XML parse -> phenotype text

OMIM

Resmi, Cynthia, and Hetmann
Handling the text from the parse

Controlled Vocabulary for parsing OMIM records

Masseroli et al. (http://www.biomedcentral.com/1471-2105/6/S4/S18): "Our efforts to derive from the OMIM entries a controlled vocabulary of phenotype locations and descriptions enabled us to normalize and structure the valuable OMIM phenotypic data according to the obtained vocabulary and make them suitable for computational use. Although detailed phenotype descriptions could be further homogenized and standardized, their subdivision in hierarchical levels of detail that we performed allows to group specific phenotypes according to their common general traits, without loosing their specific characteristics. So, for example "Mental retardation, moderate" and "Mental retardation, nonspecific" can be both generally considered as "Mental retardation" and at the same time they can be treated as different types of mental defects. This provides the chance to modulate analysis granularity when searching for phenotypic traits shared among multiple diseases or genotypes. It also ensures more significant and clear results when categorical statistical analyses are performed at lower granularity levels of detail. Such interesting feature, proper of the hierarchical structure and hence belonging also to the defined phenotype location hierarchy, is exploited in the new GFINDer Genetic Disorders modules implemented for the study of genetic disorder related genes."
http://promoter.bioing.polimi.it/gfinder/Phenotypes.txt
http://promoter.bioing.polimi.it/gfinder/Phenotype_Locations.txt

—smd 13:19, 22 March 2007 (EDT)
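The idea of grouping specific phenotypes under a common general trait can be sketched as follows. This is a minimal illustration that assumes the "General trait, qualifier" comma convention seen in the quoted examples. It is not the GFINDer method, the layout of the linked vocabulary files may differ, and the function names split_levels and group_by_general_trait are hypothetical.

```python
from collections import defaultdict


def split_levels(phenotype):
    """Split 'General trait, qualifier' into (general, qualifier or None)."""
    general, _, specific = phenotype.partition(",")
    return general.strip(), (specific.strip() or None)


def group_by_general_trait(phenotypes):
    """Group phenotype strings by their general trait, keeping each qualifier."""
    groups = defaultdict(list)
    for phenotype in phenotypes:
        general, specific = split_levels(phenotype)
        groups[general].append(specific)
    return dict(groups)


grouped = group_by_general_trait([
    "Mental retardation, moderate",
    "Mental retardation, nonspecific",
])
```

Searching at the general level uses the dictionary keys, while the stored qualifiers keep the finer distinctions available.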

Beyond OMIM

Tiffany, Resmi, Deniz, Xiaodi, Mike, Chris (note: ask if API exists)

Wikipedia (Mike) http://meta.wikimedia.org/wiki/API
WebMD (Tiff)
eMedicine (Resmi)
Google, MedStory (Deniz)
Linking out of XML (Xiaodi)
MedStory (Mike?)
PubMed (Chris)
Downloading OMIM, extra functionalities, Eutils (Deniz)
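For the Eutils item, a search request is just a URL with query parameters, so building one can be sketched without any network code. The base address and the esearch.fcgi endpoint with its db, term, and retmax parameters are part of NCBI E-utilities. The helper name esearch_url is hypothetical, and the sketch assumes modern Python 3 rather than the Python 2.5 install mentioned earlier on this page.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch URL for the given Entrez database and search term."""
    query = urlencode({"db": db, "term": term, "retmax": retmax})
    return f"{EUTILS_BASE}/esearch.fcgi?{query}"


omim_search = esearch_url("omim", "sickle cell")
pubmed_search = esearch_url("pubmed", "sickle cell", retmax=5)
```

Fetching the URL and parsing the returned XML is left out here. The same helper covers both the OMIM and PubMed items in the list by changing db.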

Multiple SNPs

Chris
figure out which of multiple SNPs are relevant

Kay

Unassigned

Not in the SNP database... then what? - I'd like to point out a new effort that aims to replace OMIM, called the "Human Variome Project" -- Deniz
OMIM DOA
systematically nonsynonymous -> mutation not in OMIM or dbSNP?
other databases: GeneCards (species conservation, population frequency)
look into linking gene expression with GEO?
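For variants that are in neither OMIM nor dbSNP, one check that needs no database is whether a single-base change alters the encoded amino acid. The sketch below uses the standard genetic code, written as the usual 64-letter string with bases ordered T, C, A, G. It assumes the codon and reading frame are already known, which is itself a nontrivial step not shown here. The function name classify_substitution is hypothetical.

```python
from itertools import product

# Standard genetic code; codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon, position, new_base):
    """Classify a single-base change at 0-based `position` within `codon`."""
    codon = codon.upper()
    mutated = codon[:position] + new_base.upper() + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "synonymous"
    if after == "*":
        return "nonsense"       # the change introduces a stop codon
    return "nonsynonymous"


glu_to_val = classify_substitution("GAG", 1, "T")   # GAG (Glu) -> GTG (Val)
silent = classify_substitution("GAA", 2, "G")       # GAA (Glu) -> GAG (Glu)
```

A nonsynonymous or nonsense result only flags the variant for closer attention. It says nothing by itself about clinical effect.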

Project Ideas

Project ideas have been moved to their own page.


Harvard:Biophysics 101/2007/Project: Difference between revisions
