Angela A. Garibaldi Week 8: Difference between revisions

From OpenWetWare
#Go to [http://www.ncbi.nlm.nih.gov/gorf/gorf.html NCBI ORF Finder]
#Input a DNA sequence for practice
Earlier revision:
  I input the following sequence: >S7V1-1
  GAGATAGTAATTAGATCTGCCAATTTCACGGACAATACTAAGACCATAATAGTACAGCTGAATGTATCTG
  TAGAAATTAATTGTACGAGACCCAACAACAATACAAGAAAAAGTATACCTATAGGACCAGGGAGAGCATT
  TTATGCTACAGGAGAAATAATAGGGAATATAAGACAAGCACATTGTAACATTAGTAGAGCAAAATGGAAT
  AACACTTTAAAACAGATAGCTACAAAATTAAGAAAACAATTTGAGAATAAAACAATAGTCTTTAATCAAT
  CCTCA
Later revision:
  I input the following sequence for the gp120 portion of the gp160: >P04578|33-511
  KLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPNPQEVVLVNVTEN
  FNMWKNDMVEQMHEDIISLWDQSLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGRMIMEKG
  EIKNCSFNISTSIRGKVQKEYAFFYKLDIIPIDNDTTSYKLTSCNTSVITQACPKVSFEP
  IPIHYCAPAGFAILKCNNKTFNGTGPCTNVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVI
  RSVNFTDNAKTIIVQLNTSVEINCTRPNNNTRKRIRIQRGPGRAFVTIGKIGNMRQAHCN
  ISRAKWNNTLKQIASKLREQFGNNKTIIFKQSSGGDPEIVTHSFNCGGEFFYCNSTQLFN
  STWFNSTWSTEGSNNTEGSDTITLPCRIKQIINMWQKVGKAMYAPPISGQIRCSSNITGL
  LLTRDGGNSNNESEIFRPGGGDMRDNWRSELYKYKVVKIEPLGVAPTKAKRRVVQREKR
[[Image:ORFfinderSS.jpg]]
 
Compare your results with the SWISS-PROT entry you found for the protein above to decipher what the output means. ExPASy also has a translation tool you can use [http://www.expasy.org/tools/dna.html here]


*Based on the ExPASy tool, the following amino acid sequence was the only viable ORF. All others had '''stop''' codons within the first few codons
*Based on the ExPASy tool:
E I V I R S A N F T D N T K T I I V Q L N V S V E I N C T R P N N N T R K S I P I G P G R A F Y A T G E I I G N I R Q A H C N I S R A K W N N T L K Q I A T K L R K Q F E N K T I V F N Q S S
[[Image:ExpasySS.jpg]]


==Working with a single protein sequence==

Revision as of 22:52, 14 March 2010

Retrieving Protein Sequences

  1. Go to UniProt
  2. Enter dUTPase in the search window. This returns more than 3 relevant sequences; we found DUT_ECOLI (P06968) on page 4 of the results
  3. Scroll down for the amino acid sequence in FASTA format
  • If your starting information is not enough to find the protein sequence you seek,
  1. look for the advanced search option. It no longer exists under that name. Instead, click the add and search button; a drop-down menu then gives the same search options as described in Figure 2-16 of Bioinformatics for Dummies
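The retrieval steps above can also be scripted. The following is a minimal sketch, not the procedure used in this notebook: it assumes UniProt serves a FASTA file per accession at the REST URL pattern shown in `fasta_url` (an assumption about the current service, which has changed since this page was written), and `fetch_sequence` is a hypothetical helper name.

```python
from urllib.request import urlopen


def fasta_url(accession):
    # Assumed URL pattern for UniProt's REST service; verify before relying on it.
    return f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def parse_fasta(text):
    """Return {header: sequence} for every record in a FASTA-formatted string."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]          # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)
    # Join the wrapped sequence lines of each record into one string.
    return {h: "".join(parts) for h, parts in records.items()}


def fetch_sequence(accession):
    """Download one entry and parse it (requires network access)."""
    with urlopen(fasta_url(accession)) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

For example, `fetch_sequence("P06968")` would request the DUT_ECOLI entry found in step 2, provided the assumed URL pattern is still valid.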

Retrieving a List of Related Protein Sequences

  1. Go to the UniProt advanced search as described above
  2. Because the advanced search is completely different, TrEMBL cannot be deselected. Select Reviewed - Yes as an alternative
  3. Enter dUTPase in the search again. There is no longer a "description" field. This yields many possibilities
  4. Since there were more than 211 possibilities in total, we selected the entire first page of sequences (25)
  5. In the newer version, click retrieve at the bottom right corner instead of the french button
  6. The retrieved sequences are put into a list that you can add to; below it, choose the format you want the sequences in. You no longer have to copy and paste into a document. FASTA format is available
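The same kind of list can be requested as a single URL. This is only a sketch under stated assumptions: the endpoint, the `query`, `format` and `size` parameters, and the `reviewed:true` filter reflect the current UniProt REST interface as far as I know, not the 2010 web form described above (which used Reviewed - Yes). `search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def search_url(term, reviewed_only=True, size=25, fmt="fasta"):
    """Build a UniProt search URL (assumed endpoint and parameter names)."""
    query = term
    if reviewed_only:
        # Stand-in for the "Reviewed - Yes" choice in the web form.
        query = f"({term}) AND (reviewed:true)"
    params = {"query": query, "format": fmt, "size": size}
    return "https://rest.uniprot.org/uniprotkb/search?" + urlencode(params)


# One page of 25 reviewed dUTPase entries in FASTA format, mirroring step 4.
print(search_url("dUTPase"))
```

The printed URL can be opened in a browser or passed to a download tool; nothing here contacts the server.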

Reading a Swiss-Prot Entry

This time we skipped the example and did the activity using HIV gp120.

  1. Select Reviewed - Yes. Our overall query to achieve these results: HIV gp120 AND reviewed:yes
  2. We selected the first option in the list
Entry Name: ENV_HV1H2
Accession Number: P04578
  • Scroll down to Sequence Annotation - Region to look at the V3 sequence specifically.
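Region annotations are given as residue ranges counted from 1 with both ends included, so pulling a region out of the full sequence in code needs an off-by-one adjustment. A minimal sketch, with `subsequence` as a hypothetical helper and toy sequences as input; the V3 coordinates are not recorded on this page, so none are assumed here.

```python
def subsequence(seq, start, end):
    """Return residues start..end of seq, counting from 1, both ends included."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError(f"range {start}-{end} is outside 1-{len(seq)}")
    # Python slices count from 0 and exclude the end, hence start - 1.
    return seq[start - 1:end]


toy = "ABCDEFGH"
print(subsequence(toy, 3, 5))

# A range written as 33-511 (the gp120 range used later on this page)
# covers 511 - 33 + 1 = 479 residues.
print(len(subsequence("A" * 600, 33, 511)))
```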

ORFing your DNA Sequences

  1. Go to NCBI ORF Finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html)
  2. Input a DNA sequence for practice
I input the following sequence for the gp120 portion of the gp160: >P04578|33-511
KLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPNPQEVVLVNVTEN
FNMWKNDMVEQMHEDIISLWDQSLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGRMIMEKG
EIKNCSFNISTSIRGKVQKEYAFFYKLDIIPIDNDTTSYKLTSCNTSVITQACPKVSFEP
IPIHYCAPAGFAILKCNNKTFNGTGPCTNVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVI
RSVNFTDNAKTIIVQLNTSVEINCTRPNNNTRKRIRIQRGPGRAFVTIGKIGNMRQAHCN
ISRAKWNNTLKQIASKLREQFGNNKTIIFKQSSGGDPEIVTHSFNCGGEFFYCNSTQLFN
STWFNSTWSTEGSNNTEGSDTITLPCRIKQIINMWQKVGKAMYAPPISGQIRCSSNITGL
LLTRDGGNSNNESEIFRPGGGDMRDNWRSELYKYKVVKIEPLGVAPTKAKRRVVQREKR

Compare your results with the SWISS-PROT entry you found for the protein above to decipher what the output means. ExPASy also has a translation tool you can use (http://www.expasy.org/tools/dna.html)
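To see what a translation tool does with a nucleotide sequence, here is a minimal six-frame translation sketch. It assumes the standard genetic code and is not the ExPASy or NCBI implementation; the function names are illustrative. The input is the first line of the S7V1-1 DNA sequence from the earlier revision of this page.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Complement each base, then reverse the strand."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate from the first base; an incomplete last codon is ignored."""
    dna = dna.upper()
    # Codons containing anything other than A/C/G/T are reported as 'X'.
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Translate three offsets on each strand, keyed '+1'..'+3' and '-1'..'-3'."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


dna = "GAGATAGTAATTAGATCTGCCAATTTCACGGACAATACTAAGACCATAATAGTACAGCTGAATGTATCTG"
for name, protein in six_frames(dna).items():
    print(name, protein)
```

Frames that contain `*` within the first few residues are interrupted by stop codons, which is the usual reason for ruling them out as the open reading frame.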

  • Based on the ExPASy tool:

Working with a single protein sequence

Utilizing Bioinformatics for Dummies pages 159-195

  1. Go to [1]
  2. Click ProtParam near the top of the page
  3. Enter the sequence in the space provided, or paste the accession number. DO NOT INCLUDE THE FASTA FORMAT FIRST LINE, ONLY RAW DATA.
  4. Click Compute parameters
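A few of the simpler parameters can be approximated locally, which also shows why the FASTA first line has to be removed before the sequence is analysed. This is a rough sketch, not ProtParam's method: the average residue masses below are commonly tabulated values entered by hand, so results may differ slightly from the tool's output, and the function names are illustrative.

```python
from collections import Counter

# Approximate average residue masses in daltons (assumed table, not ProtParam's).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water is added for the free N- and C-termini


def raw_sequence(fasta_text):
    """Drop the '>' header line(s) and join the remaining lines."""
    lines = [ln.strip() for ln in fasta_text.splitlines()]
    return "".join(ln for ln in lines if ln and not ln.startswith(">")).upper()


def composition(seq):
    """Count how often each amino acid occurs."""
    return Counter(seq)


def molecular_weight(seq):
    """Sum of residue masses plus one water; raises KeyError on unknown letters."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in seq) + WATER


seq = raw_sequence(">P04578|33-511\nKLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPNPQEVVLVNVTEN\n")
print(len(seq), composition(seq).most_common(3), round(molecular_weight(seq), 1))
```

Only the first line of the gp120 sequence is used here to keep the example short; pasting the full sequence works the same way.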