BBRFC13: Difference between revisions
From OpenWetWare
(New page: <pre> BBF RFC-13: Rethinking the boundaries and composition of coding regions Tom Knight 19 November 2008 Related RFCs: 9, 11, 12...) |
No edit summary |
||
Line 57: | Line 57: | ||
single internal domain, and a simple Tail domain (the stop codon). | single internal domain, and a simple Tail domain (the stop codon). | ||
(1) Head: The Head | (1) Head Domain: The Head Domain consists of the start codon followed | ||
immediately by zero or more triplets specifiying an N-terminal | immediately by zero or more triplets specifiying an N-terminal | ||
tag, such as a protein export tag or lipoprotein binding tag. | tag, such as a protein export tag or lipoprotein binding tag. | ||
(2) Domains: | (2) Internal Domains: Internal domains consist of a series of codon triplets | ||
coding for an amino acid sequence without a start codon or stop | coding for an amino acid sequence without a start codon or stop | ||
codon. Multiple Domains can be fused. | codon. Multiple Internal Domains can be fused. | ||
(3) Special Domains: Short Domains with specific function may be | (3) Special Internal Domains: Short Internal Domains with specific function may be | ||
separately categorized, but obey the same composition rules as | separately categorized, but obey the same composition rules as | ||
normal domains. Special | normal Internal domains. Special Internal Domains include tags, linkers, | ||
cleavage-sites, intein-sites. | cleavage-sites, intein-sites. | ||
(4) Tail: The | (4) Tail Domain: The Tail Domain consists of zero or more | ||
triplet codons, followed by a pair of TAA stop codons. In the | triplet codons, followed by a pair of TAA stop codons. In the | ||
simplest case, the stop codons | simplest case, the stop codons terminate the protein with an | ||
Stop. More complex | Stop. More complex Tail Domains may include degradation tags | ||
appropriate to the organism (with different degradation rates, | appropriate to the organism (with different degradation rates, | ||
e.g.). | e.g.). | ||
Line 79: | Line 79: | ||
Note that different assembly techniques will, in general, result in | Note that different assembly techniques will, in general, result in | ||
different amino acid sequences for coding regions composed out of the | different amino acid sequences for coding regions composed out of the | ||
same | same Head, Tail, and Internal Domains. We anticipate that users will use | ||
care in thinking about the effects of such differences on their | care in thinking about the effects of such differences on their | ||
experiments, but also feel confident that many such differences will | experiments, but also feel confident that many such differences will | ||
be minor, when the composition uses structures such as export tags, | be minor, when the composition uses structures such as export tags, | ||
degradation tails, and purification tags. | |||
RFC 14 describes the use of these concepts in combination with BB-2 | RFC 14 describes the use of these concepts in combination with BB-2 | ||
assembly standard. | assembly standard. | ||
</pre> | </pre> |
Latest revision as of 10:04, 21 July 2009
BBF RFC-13: Rethinking the boundaries and composition of coding regions

Tom Knight
19 November 2008

Related RFCs: 9, 11, 12
Keywords: protein fusions, domains, protein tags, assembly

Purpose:

With the advent of several assembly standards fostering in-frame
protein domain fusions, it is important to rethink our categorization
of parts to allow the documentation and distribution of parts
containing only a portion of a protein coding region. This RFC
attempts to document initial thoughts on the naming and documentation
of such sub-coding region parts.

Introduction

Proteins typically consist of one or more domains: sequences of amino
acids that fold relatively independently and that are evolutionarily
shuffled as a unit among different protein coding regions. The DNA
sequence of such a domain must maintain in-frame translation, and thus
its length is a multiple of three bases. In our older assembly
technology, the assembly scar was 8 bases long, and failed to maintain
the coding region frame. Several proposals for new assembly
techniques, including the Ira Phillips proposal, Bam/Bgl, BB-2 (see
RFCs 11, 12, 14), and blunt scarless assembly, allow in-frame
composition of protein domains.

The N-terminal domain of a protein coding region is special in a
number of ways. First, it always contains a start codon, spaced at an
appropriate distance from a ribosomal binding site. Second, many
coding regions have special features at the N-terminus, such as
protein export tags and lipoprotein cleavage and attachment tags.
These cannot function when internal to a coding region, and therefore
are termed Head domains.

Similarly, the C-terminal domain of a protein is special, containing
at least a stop codon. Other special features, such as degradation
tags, are also required to be at the extreme C-terminus. Again, these
domains cannot function when internal to a coding region, and are
termed Tail domains.
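
The frame argument in the Introduction reduces to arithmetic modulo
three. The following is a minimal sketch and not part of the RFC: the
function names are hypothetical, and the 6-base scar in the usage note
is an illustrative length rather than a statement about any particular
assembly standard.

```python
def frame_shift(scar_length: int) -> int:
    """Number of bases by which a scar displaces the downstream reading frame."""
    return scar_length % 3


def preserves_frame(scar_length: int) -> bool:
    """True if codons downstream of the scar stay in the original frame."""
    return frame_shift(scar_length) == 0
```

Under these definitions, preserves_frame(8) is False because an 8-base
scar shifts the frame by two bases, while a scar whose length is a
multiple of three, such as a hypothetical 6-base scar, leaves the frame
intact.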

Proposal:

Each coding region will consist logically of at least three domains: a
Head Domain, one or more Internal Domains, and a Tail Domain. A part
in the registry may (similar to any composite part) consist of a
composition of domains. In particular, existing coding regions consist
of a particularly simple Head Domain (the start codon), a single
Internal Domain, and a simple Tail Domain (the stop codon).

(1) Head Domain: The Head Domain consists of the start codon followed
    immediately by zero or more triplets specifying an N-terminal
    tag, such as a protein export tag or lipoprotein binding tag.

(2) Internal Domains: Internal Domains consist of a series of codon
    triplets coding for an amino acid sequence without a start codon
    or stop codon. Multiple Internal Domains can be fused.

(3) Special Internal Domains: Short Internal Domains with specific
    function may be separately categorized, but obey the same
    composition rules as normal Internal Domains. Special Internal
    Domains include tags, linkers, cleavage sites, and intein sites.

(4) Tail Domain: The Tail Domain consists of zero or more triplet
    codons, followed by a pair of TAA stop codons. In the simplest
    case, the stop codons terminate the protein with a stop. More
    complex Tail Domains may include degradation tags appropriate to
    the organism (for example, with different degradation rates).

Note that different assembly techniques will, in general, result in
different amino acid sequences for coding regions composed out of the
same Head, Tail, and Internal Domains. We anticipate that users will
use care in thinking about the effects of such differences on their
experiments, but also feel confident that many such differences will
be minor when the composition uses structures such as export tags,
degradation tails, and purification tags.

RFC 14 describes the use of these concepts in combination with the
BB-2 assembly standard.
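
The composition rules for Head, Internal, and Tail Domains can be
sketched as simple sequence checks. This is an illustrative reading of
the proposal, not code from the RFC, and it rests on several
assumptions: all function and constant names are hypothetical; ATG is
taken as the only start codon; stop codons are the three of the
standard genetic code; an Internal Domain is only required to be free
of in-frame stop codons (in-frame ATG, which also encodes methionine,
is allowed); and assembly scars, which the RFC notes differ between
techniques, are ignored.

```python
START_CODONS = {"ATG"}               # assumption: the RFC says only "the start codon"
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def split_codons(seq):
    """Split a DNA string into in-frame triplets; reject partial codons."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def _no_stop(codons):
    """True if none of the given triplets is a stop codon."""
    return all(c not in STOP_CODONS for c in codons)


def is_head(seq):
    """Start codon followed by zero or more non-stop triplets."""
    codons = split_codons(seq)
    return bool(codons) and codons[0] in START_CODONS and _no_stop(codons)


def is_internal(seq):
    """One or more triplets with no in-frame stop codon."""
    codons = split_codons(seq)
    return bool(codons) and _no_stop(codons)


def is_tail(seq):
    """Zero or more non-stop triplets followed by a pair of TAA codons."""
    codons = split_codons(seq)
    return (len(codons) >= 2
            and codons[-2:] == ["TAA", "TAA"]
            and _no_stop(codons[:-2]))


def compose(head, internals, tail):
    """Concatenate Head + one or more Internal Domains + Tail (scars ignored)."""
    if not is_head(head):
        raise ValueError("invalid Head Domain")
    if not internals or not all(is_internal(d) for d in internals):
        raise ValueError("need one or more valid Internal Domains")
    if not is_tail(tail):
        raise ValueError("invalid Tail Domain")
    return head + "".join(internals) + tail
```

For example, compose("ATG", ["GCAGGCAAA"], "TAATAA") joins the simplest
Head (a bare start codon), one Internal Domain, and the simplest Tail
(a pair of TAA codons) into a single in-frame coding region. A real
assembly would also insert the scar sequence of the chosen standard
between the parts.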