Angela A. Garibaldi Week 8: Difference between revisions

From OpenWetWare

Revision as of 23:35, 14 March 2010

Retrieving Protein Sequences

  1. Go to UniProt.
  2. Enter dUTPase in the search window. This produces more than 3 relevant sequences; I found DUT_ECOLI (P06968) on page 4 of the results.
  3. Scroll down to the FASTA format of the amino acid sequence.
  • If your starting information is not enough to find the protein sequence you seek:
  1. Find the advanced search option. It no longer exists in its original form. Instead, click the add and search button; a drop-down menu then gives the same search options as described in Figure 2-16 of Bioinformatics for Dummies.

Retrieving a List of Related Protein Sequences

  1. Go to the UniProt advanced search as described above.
  2. Because the advanced search is completely different, TrEMBL cannot be deselected. Select Reviewed - Yes as an alternative.
  3. Enter dUTPase in the search again. There is no longer a "description" field. The search yields many possibilities.
  4. Since there are more than 211 total possibilities, we selected the entire first page of sequences (25).
  5. In the newer version, click retrieve at the bottom right corner instead of the french button.
  6. The retrieved sequences are placed in a list that you can add to. Below the list, choose the format you want the sequences in; FASTA format is available. You no longer have to copy and paste into a document.

Reading a Swiss-Prot Entry

This time we skipped the example and did the activity using HIV gp120.

  1. Select Reviewed - Yes. Our overall query to achieve these results: HIV gp120 AND reviewed:yes
  2. We selected the first option in the list:
Entry Name: ENV_HV1H2
Accession Number: P04578
  • Scroll down to Sequence Annotation - Region to look at the V3 sequence specifically.

ORFing your DNA Sequences

  1. Go to NCBI ORF Finder.
  2. Input a DNA sequence for practice.
I input the following sequence:
>S7V1-1
GAGATAGTAATTAGATCTGCCAATTTCACGGACAATACTAAGACCATAATAGTACAGCTGAATGTATCTG
TAGAAATTAATTGTACGAGACCCAACAACAATACAAGAAAAAGTATACCTATAGGACCAGGGAGAGCATT
TTATGCTACAGGAGAAATAATAGGGAATATAAGACAAGCACATTGTAACATTAGTAGAGCAAAATGGAAT
AACACTTTAAAACAGATAGCTACAAAATTAAGAAAACAATTTGAGAATAAAACAATAGTCTTTAATCAAT
CCTCA

Compare your results with the SWISS-PROT entry you found for the protein above to decipher what the output means. ExPASy also has a translation tool you can use.

  • Based on the ExPASy tool, the following amino acid sequence was the only viable ORF. All others had stop codons within the first few codons.

E I V I R S A N F T D N T K T I I V Q L N V S V E I N C T R P N N N T R K S I P I G P G R A F Y A T G E I I G N I R Q A H C N I S R A K W N N T L K Q I A T K L R K Q F E N K T I V F N Q S S
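
As a cross-check on the web tools, the frame-by-frame translation can be sketched in a few lines of Python. This is only a minimal sketch, not what ORF Finder or the ExPASy tool actually runs: it assumes the standard genetic code, and the function names (translate, reverse_complement, six_frames) are my own. Stop codons are shown as *.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate one reading frame (0, 1 or 2); '*' marks a stop, 'X' an unknown codon."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frames(dna):
    """Translate all three forward (+1..+3) and three reverse (-1..-3) frames."""
    rc = reverse_complement(dna)
    frames = {}
    for f in range(3):
        frames[f"+{f + 1}"] = translate(dna, f)
        frames[f"-{f + 1}"] = translate(rc, f)
    return frames


# The practice sequence S7V1-1 from the step above.
S7V1_1 = (
    "GAGATAGTAATTAGATCTGCCAATTTCACGGACAATACTAAGACCATAATAGTACAGCTGAATGTATCTG"
    "TAGAAATTAATTGTACGAGACCCAACAACAATACAAGAAAAAGTATACCTATAGGACCAGGGAGAGCATT"
    "TTATGCTACAGGAGAAATAATAGGGAATATAAGACAAGCACATTGTAACATTAGTAGAGCAAAATGGAAT"
    "AACACTTTAAAACAGATAGCTACAAAATTAAGAAAACAATTTGAGAATAAAACAATAGTCTTTAATCAAT"
    "CCTCA"
)

if __name__ == "__main__":
    for name, protein in six_frames(S7V1_1).items():
        print(name, protein)
```

Frames that show a * within the first few residues are the ones rejected above; a frame with no * reads through the whole fragment.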

Working with a single protein sequence

Utilizing Bioinformatics for Dummies pages 159-195

  1. Go to ExPASy.
  2. Click ProtParam near the top of the page.
  3. Enter the sequence in the space provided, or paste the accession number. DO NOT INCLUDE THE FASTA FORMAT FIRST LINE, ONLY RAW DATA.
  4. Click Compute parameters.
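
Since ProtParam wants raw data only, the FASTA header has to come off before pasting. A minimal sketch of that clean-up in Python follows; the function name fasta_to_raw and the example record are made up for illustration and are not part of any ExPASy tool.

```python
def fasta_to_raw(fasta_text):
    """Drop FASTA header lines (those starting with '>') and join the rest into one raw sequence."""
    residues = []
    for line in fasta_text.strip().splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and the FASTA first line
        residues.append(line.replace(" ", ""))
    return "".join(residues)


if __name__ == "__main__":
    # Made-up record, only to show the input and output shapes.
    example = ">example_header made-up record\nACDEFGHIKL\nMNPQRSTVWY\n"
    print(fasta_to_raw(example))
```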

I saved the output file on my personal computer, since OpenWetWare does not allow HTML files. The output gives information about the protein: composition, theoretical pI, stability, etc.

  • For a tool to simulate cutting of your protein, use: [1]

Looking for transmembrane segments

  1. Go to ProtScale.
  2. Enter your sequence in raw format, or enter a Swiss-Prot accession number.
  3. Select the radio button for the scale you want (the threshold given below is for Kyte and Doolittle).
  4. Choose a window size of 19 in the pull-down menu, because this value is best for finding transmembrane helices. 7-11 would be better for globular proteins.

  (Image: Pscale21735.gif)
  • Strong signals are not sensitive to parameters. The recommended threshold for Kyte and Doolittle is 1.6. If you forget this number, do the following:
  1. Place a sheet of paper over your results.
  2. Lower the paper until the tips of the strongest peaks appear.
  3. Keep lowering this threshold as long as you can see nice sharp peaks.
  • 6 of the 7 transmembrane regions are easy to find.
  1. Go to TMHMM (http://www.cbs.dtu.dk/services/TMHMM-2.0). Only FASTA format is recognized.
  2. Keep the Output Format radio buttons at their default value.
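
The ProtScale scan described above (Kyte and Doolittle scale, window of 19, threshold of 1.6) can be approximated offline. The sketch below is an assumption-laden stand-in, not ProtScale itself: it uses a plain unweighted average over each window, whereas ProtScale also offers edge weighting, and the function names hydropathy_profile and peaks_above are my own. The peaks_above function plays the role of the sheet of paper: it reports only the stretches whose window average reaches the threshold.

```python
# Kyte and Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Unweighted sliding-window average of Kyte and Doolittle values.

    Entry i is the average over residues i .. i + window - 1 (0-based).
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


def peaks_above(profile, threshold=1.6, window=19):
    """Return (first, last) 1-based window-center positions of runs scoring >= threshold."""
    half = window // 2
    runs = []
    run_start = None
    for i, score in enumerate(profile):
        if score >= threshold and run_start is None:
            run_start = i
        elif score < threshold and run_start is not None:
            runs.append((run_start + half + 1, i + half))
            run_start = None
    if run_start is not None:
        runs.append((run_start + half + 1, len(profile) + half))
    return runs


if __name__ == "__main__":
    # Artificial test sequence: a 19-residue leucine stretch flanked by aspartate.
    toy = "D" * 20 + "L" * 19 + "D" * 20
    print(peaks_above(hydropathy_profile(toy)))
```

Lowering the threshold argument step by step mimics lowering the paper: the strongest peaks show up first, and weaker ones appear as the threshold drops.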