OpenWetWare:Software/Online Database Access/GenBank Accession Numbers: Difference between revisions

From OpenWetWare

Latest revision as of 19:26, 14 November 2010

GenBank Accession Numbers


ACCESSION

The unique identifier for a sequence record. An accession number applies to the complete record and is usually a combination of letters and digits, such as a single letter followed by five digits (e.g., U12345) or two letters followed by six digits (e.g., AF123456). Some accessions might be longer, depending on the type of sequence record.
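As a rough illustration, the two example formats above can be checked with a regular expression. This is only a sketch: the function name `looks_like_classic_accession` is hypothetical, and the pattern deliberately covers nothing beyond the two formats named in the text, so longer accessions will be rejected.

```python
import re

# Matches only the two formats named above:
#   one letter + five digits (U12345), or two letters + six digits (AF123456).
# Assumption: longer accession formats exist and are intentionally not covered.
CLASSIC_ACCESSION = re.compile(r"[A-Z]\d{5}|[A-Z]{2}\d{6}", re.IGNORECASE)


def looks_like_classic_accession(text):
    """Return True if text has one of the two example accession formats."""
    return CLASSIC_ACCESSION.fullmatch(text.strip()) is not None


if __name__ == "__main__":
    for candidate in ("U12345", "AF123456", "NM_002111"):
        print(candidate, looks_like_classic_accession(candidate))
```

A passing check says only that the string has a plausible shape, not that a record with that accession exists.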

Accession numbers do not change, even if information in the record is changed at the author's request. Sometimes, however, an original accession number might become secondary to a newer accession number, if the authors make a new submission that combines previous sequences, or if for some reason a new submission supersedes an earlier record.

Records from the RefSeq database of reference sequences use a different accession number format: two letters, followed by an underscore and six or more digits. For example:

  • NT_123456 constructed genomic contigs
  • NM_123456 mRNAs
  • NP_123456 proteins
  • NC_123456 chromosomes

(A complete list of RefSeq accession number prefixes is available at NCBI: http://www.ncbi.nlm.nih.gov/RefSeq/key.html#accessions.)
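The RefSeq format lends itself to a small lookup by prefix. The sketch below is illustrative only: `describe_refseq` and the `REFSEQ_PREFIXES` table are hypothetical names, and the table holds just the four prefixes listed on this page, not the complete NCBI list.

```python
import re

# Only the four prefixes listed above; NCBI defines more.
REFSEQ_PREFIXES = {
    "NT": "constructed genomic contigs",
    "NM": "mRNAs",
    "NP": "proteins",
    "NC": "chromosomes",
}

# Two letters, an underscore, then six or more digits.
REFSEQ_PATTERN = re.compile(r"([A-Za-z]{2})_(\d{6,})")


def describe_refseq(accession):
    """Return the record type for a RefSeq-style accession.

    Returns None if the text does not have the RefSeq shape, and
    "unlisted prefix" if the shape fits but the prefix is not in the table.
    """
    match = REFSEQ_PATTERN.fullmatch(accession.strip())
    if match is None:
        return None
    return REFSEQ_PREFIXES.get(match.group(1).upper(), "unlisted prefix")


if __name__ == "__main__":
    for candidate in ("NM_002111", "np_123456", "U12345"):
        print(candidate, "->", describe_refseq(candidate))
```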

Note: compare accession number with Sequence Identifiers such as Version and GI for nucleotide sequences and protein_id and GI for amino acid sequences.

Entrez Search Field: Accession [ACCN]

Search Tip: The letters in the accession number can be written in upper- or lowercase. RefSeq accessions must contain an underscore between the letters and the numbers, e.g., NM_002111.
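For scripted searches, the field tag can simply be appended to the accession. The sketch below builds such a search term and, as an assumption not taken from this page, wraps it in a URL for NCBI's E-utilities `esearch.fcgi` endpoint with `db=nuccore`. The function names are hypothetical, and no network request is made.

```python
from urllib.parse import urlencode

# Assumption: the E-utilities ESearch endpoint; not mentioned on this page.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def accession_term(accession):
    """Build an Entrez search term limited to the Accession field.

    Case is left unchanged because the search is case-insensitive.
    """
    return f"{accession.strip()}[ACCN]"


def esearch_url(accession, db="nuccore"):
    """Return an ESearch URL for the accession (assumed database: nuccore)."""
    query = urlencode({"db": db, "term": accession_term(accession)})
    return f"{ESEARCH_URL}?{query}"


if __name__ == "__main__":
    print(accession_term("NM_002111"))
    print(esearch_url("NM_002111"))
```

The same term can be pasted into the Entrez search box directly; the URL form is only needed for programmatic access.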