OpenWetWare:Software/Online Database Access/GenBank Accession Numbers: Difference between revisions

From OpenWetWare

Revision as of 09:02, 11 October 2007

GenBank Accession Numbers


ACCESSION

The unique identifier for a sequence record. An accession number applies to the complete record and is usually a combination of letters and numbers, such as a single letter followed by five digits (e.g., U12345) or two letters followed by six digits (e.g., AF123456). Some accessions might be longer, depending on the type of sequence record.
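The two formats described above can be checked with a short pattern match. This is only a minimal sketch: the function name `looks_like_classic_accession` is hypothetical, and the pattern covers just the two formats named in the text, so longer accession formats will be reported as non-matching.

```python
import re

# Covers only the two formats described in the text:
#   one letter + five digits (e.g., U12345)
#   two letters + six digits (e.g., AF123456)
# Longer accession formats exist and are deliberately not matched here.
CLASSIC_ACCESSION = re.compile(r"[A-Za-z][0-9]{5}|[A-Za-z]{2}[0-9]{6}")


def looks_like_classic_accession(text: str) -> bool:
    """Return True if text matches one of the two formats above."""
    return CLASSIC_ACCESSION.fullmatch(text.strip()) is not None


if __name__ == "__main__":
    for candidate in ("U12345", "AF123456", "NM_002111"):
        print(candidate, looks_like_classic_accession(candidate))
```

A `False` result here does not mean the accession is invalid, only that it is not in one of the two formats listed above.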

Accession numbers do not change, even if information in the record is changed at the author's request. Sometimes, however, an original accession number might become secondary to a newer accession number, if the authors make a new submission that combines previous sequences, or if for some reason a new submission supersedes an earlier record.

Records from the RefSeq database of reference sequences have a different accession number format: two letters followed by an underscore and six or more digits, for example:

  • NT_123456 constructed genomic contigs
  • NM_123456 mRNAs
  • NP_123456 proteins
  • NC_123456 chromosomes
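The RefSeq format and the four prefixes listed above could be recognized roughly as follows. This is a sketch under stated assumptions: `describe_refseq` and the lookup table are hypothetical names, and the table contains only the four prefixes given in this list, so any other two-letter prefix is reported as unlisted rather than rejected.

```python
import re
from typing import Optional

# Two letters, an underscore, then six or more digits, as described above.
REFSEQ_PATTERN = re.compile(r"([A-Za-z]{2})_([0-9]{6,})")

# Only the prefixes named in the list above; this is not a complete table.
REFSEQ_PREFIXES = {
    "NT": "constructed genomic contigs",
    "NM": "mRNAs",
    "NP": "proteins",
    "NC": "chromosomes",
}


def describe_refseq(accession: str) -> Optional[str]:
    """Return the record type for a RefSeq-style accession, or None if the
    text does not have the letters_digits shape at all."""
    match = REFSEQ_PATTERN.fullmatch(accession.strip())
    if match is None:
        return None
    prefix = match.group(1).upper()
    return REFSEQ_PREFIXES.get(prefix, "unlisted RefSeq prefix")


if __name__ == "__main__":
    for candidate in ("NM_002111", "NC_123456", "AF123456"):
        print(candidate, "->", describe_refseq(candidate))
```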

Note: compare the accession number with sequence identifiers such as Version and GI for nucleotide sequences, and protein_id and GI for amino acid sequences.

Entrez Search Field: Accession [ACCN]

Search Tip: The letters in the accession number can be written in upper- or lowercase. RefSeq accessions must contain an underscore between the letters and the numbers, e.g., NM_002111.
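As an illustration of the search field and tip, the following sketch builds a query term by appending the `[ACCN]` field tag to an accession. The helper name `accession_search_term` is hypothetical, no request is sent to Entrez, and converting to uppercase is only a cosmetic choice, since the tip states that case does not matter.

```python
def accession_search_term(accession: str) -> str:
    """Build an Entrez search term restricted to the Accession field."""
    cleaned = accession.strip()
    # An accession is a single token; reject empty input or embedded spaces.
    if not cleaned or any(ch.isspace() for ch in cleaned):
        raise ValueError(f"not a single accession: {accession!r}")
    # Case is not significant for the search, so normalize for readability.
    return f"{cleaned.upper()}[ACCN]"


if __name__ == "__main__":
    print(accession_search_term("nm_002111"))
```

Note that this helper cannot tell whether an underscore is missing from a RefSeq accession, because two letters followed directly by six digits is also a legitimate accession format.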