TChan/Notebook/2007-5-3: Difference between revisions

From OpenWetWare

Revision as of 11:56, 1 May 2007

Allelic Frequency

  • Input: rs#
  • Output: allelic frequency
  • On further investigation of GeneCards, I found that they simply take their allele frequency data from dbSNP (http://www.genecards.org/info.shtml#snp). Also, because a GeneCard covers the whole gene relevant to our rs#, the allelic frequency data in its "SNP/Variants" box does not necessarily give the frequency of our one particular allele. That data covers all alleles for the gene, ranked by whether or not any frequency data exists for each allele.
  • Thus, I will write code to parse the dbSNP XML for the allele frequency data and return either that data or "No data available" if dbSNP has no information.
  • The data for "Population Diversity" is somewhat indecipherable, and the only documentation of it is the following:

Population Diversity Data

The best single measure of a variation's diversity in different populations is its average heterozygosity. This measure serves as the general probability that both alleles are in a diploid individual or in a sample of two chromosomes. Estimates of average heterozygosity have an accompanying standard error based on the sample sizes of the underlying data, which reflects the overall uncertainty of the estimate. dbSNP’s computation of average heterozygosity and standard error for RefSNP clusters is available online. Please note that dbSNP computes heterozygosity based on the submitted allele frequency for each SNP. If the frequency data for a SNP is not submitted, we cannot compute the heterozygosity value, and therefore the refSNP report will show no heterozygosity estimate.

Additional population diversity data include population counts, individuals sampled for a variation, genotype frequencies, and Hardy Weinberg probabilities.
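
To make the quoted definition more concrete: for a single population, the usual textbook expected heterozygosity is one minus the sum of the squared allele frequencies, i.e. the chance that two chromosomes drawn at random carry different alleles. The sketch below computes that quantity. The function name is hypothetical, and this is not dbSNP's own procedure for average heterozygosity and standard error, which the documentation says is described separately online.

```python
def expected_heterozygosity(allele_freqs):
    """Textbook expected heterozygosity: 1 - sum(p_i ** 2).

    allele_freqs: frequencies of every allele at one site, summing to 1.
    Assumption: one population, random sampling of two chromosomes.
    This is NOT dbSNP's averaging procedure, only the basic idea behind it.
    """
    total = sum(allele_freqs)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1, got %r" % total)
    return 1.0 - sum(p * p for p in allele_freqs)


if __name__ == "__main__":
    # A two-allele SNP with frequencies 0.7 and 0.3 gives about 0.42.
    print(round(expected_heterozygosity([0.7, 0.3]), 4))
```

A value near 0.5 means a two-allele SNP is about as variable as it can be; a value near 0 means almost everyone carries the same allele.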

Thus, it seems like a good idea to output the heterozygosity as well, though users may find it somewhat difficult to interpret.
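
A minimal sketch of the planned lookup, in Python with the standard-library XML parser, might look like the following. The sample record and its tag and attribute names (`Rs`, `rsId`, `Het`, `Frequency`, `allele`, `freq`) are placeholders chosen for illustration and have not been checked against the actual dbSNP XML schema; the rs numbers are made up, and fetching the XML from dbSNP is left out entirely.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified record. Real dbSNP XML follows its own schema
# (possibly with namespaces), so these tag/attribute names are assumptions.
SAMPLE_XML = """
<ExchangeSet>
  <Rs rsId="1234">
    <Het value="0.42" stdError="0.05"/>
    <Frequency allele="A" freq="0.70"/>
    <Frequency allele="G" freq="0.30"/>
  </Rs>
  <Rs rsId="5678"/>
</ExchangeSet>
"""

NO_DATA = "No data available"


def find_rs(xml_text, rs_number):
    """Return the <Rs> element whose rsId matches rs_number, or None."""
    root = ET.fromstring(xml_text)
    for rs in root.iter("Rs"):
        if rs.get("rsId") == str(rs_number):
            return rs
    return None


def allele_frequencies(xml_text, rs_number):
    """Return {allele: frequency} for the rs#, or NO_DATA if none is listed."""
    rs = find_rs(xml_text, rs_number)
    if rs is None:
        return NO_DATA
    freqs = {f.get("allele"): float(f.get("freq")) for f in rs.findall("Frequency")}
    return freqs or NO_DATA


def heterozygosity(xml_text, rs_number):
    """Return (value, standard error) for the rs#, or NO_DATA if absent."""
    rs = find_rs(xml_text, rs_number)
    het = rs.find("Het") if rs is not None else None
    if het is None:
        return NO_DATA
    return float(het.get("value")), float(het.get("stdError"))


if __name__ == "__main__":
    for rs_number in (1234, 5678):
        print(rs_number, allele_frequencies(SAMPLE_XML, rs_number),
              heterozygosity(SAMPLE_XML, rs_number))
```

Returning the same "No data available" string for a missing rs# and for an rs# without frequency data keeps the output simple, at the cost of not telling the two cases apart.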


Presentation

    • Lessons learned
    • Stuff done