# User:Marshall Hampton/ScoringMatrices

### From OpenWetWare


## Revision as of 17:33, 6 July 2009

This will eventually be an article about constructing custom amino acid scoring matrices using Biopython. At the moment it is far from done.

## Introduction

A log-odds scoring matrix is constructed from empirically observed frequencies $f_{ij}$ of single-letter alignments (residue $i$ aligned with residue $j$) using the formula

$$s_{ij} = \lambda \log\left(\frac{f_{ij}}{p_i q_j}\right)$$

where $p_i$ and $q_j$ are the background frequencies of the amino acids in the two sets of proteins. In the vast majority of current treatments, the evolution of amino acid frequencies is treated as symmetric in time, and it is assumed that $p_i = q_i$ for each amino acid $i$. In this article we do not make that assumption, so that we can develop scoring matrices for organisms whose amino acid frequencies appear to be biased in time.
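As a rough illustration of the log-odds formula with separate background frequencies, the sketch below computes the scores in plain Python. It is not the article's eventual Biopython implementation: the function name `log_odds_scores`, the dictionary layout, the default scaling factor of 1, and the two-letter toy alphabet with made-up frequencies are all assumptions chosen for the example.

```python
import math

def log_odds_scores(f, p, q, lam=1.0):
    """Return s[(i, j)] = lam * log(f[(i, j)] / (p[i] * q[j])).

    f   -- dict mapping (i, j) to the observed frequency of i aligned with j
    p   -- dict of background frequencies for the first set of proteins
    q   -- dict of background frequencies for the second set of proteins
    lam -- scaling factor (hypothetical default of 1.0)
    """
    scores = {}
    for (i, j), f_ij in f.items():
        if f_ij > 0:
            scores[(i, j)] = lam * math.log(f_ij / (p[i] * q[j]))
        else:
            # A pair that was never observed has log-odds of minus infinity.
            scores[(i, j)] = float("-inf")
    return scores

# Toy two-letter alphabet with invented frequencies (illustration only).
f = {("A", "A"): 0.4, ("A", "G"): 0.1, ("G", "A"): 0.2, ("G", "G"): 0.3}
p = {"A": 0.5, "G": 0.5}  # background frequencies in the first set
q = {"A": 0.6, "G": 0.4}  # background frequencies in the second set

scores = log_odds_scores(f, p, q)
for pair in sorted(scores):
    print(pair, round(scores[pair], 3))
```

Because $p$ and $q$ differ here, the resulting matrix is not symmetric: the score for `("A", "G")` is not the same as the score for `("G", "A")`, which is the situation this article is concerned with.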