Chris Rhodes Week 8: Difference between revisions
From OpenWetWare
Revision as of 16:00, 19 October 2011
For today's lab we will be working out of the book Bioinformatics for Dummies, 2nd edition, performing selected activities from Chapters 2, 4, 5, and 6, but modifying the protocols to match the current website formats and to use HIV-1 gp120.
In Class Activities
Retrieving Protein Sequences
- The protein retrieved in this exercise is HIV-1 gp120. It was searched for by going to ExPASy and entering "HIV gp120 envelope protein" in the UniProtKB database, but a verified, independent gp120 entry could not be found. The gp120 protein sequence was instead taken from a gp160 entry, which contains the gp120 sequence. The UniProtKB entry of the gp160 protein used is found here, and the sequence of the gp120 protein, shown as the highlighted residues within the gp160 protein sequence, is found here: http://www.uniprot.org/blast/?about=P04578[33-511] (this address could not be hyperlinked properly because the [33-511] text causes problems with the linking format).
- The FASTA form of the gp120 protein sequence was retrieved from the entry page and is shown here:
>sp|P04578|33-511
KLWVTVYYGVPVWKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPNPQEVVLVNVTENFNMWKNDMVEQMHEDIISL
WDQSLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGRMIMEKGEIKNCSFNISTSIRGKVQKEYAFFYKLDIIPIDNDTTSY
KLTSCNTSVITQACPKVSFEPIPIHYCAPAGFAILKCNNKTFNGTGPCTNVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVI
RSVNFTDNAKTIIVQLNTSVEINCTRPNNNTRKRIRIQRGPGRAFVTIGKIGNMRQAHCNISRAKWNNTLKQIASKLRE
QFGNNKTIIFKQSSGGDPEIVTHSFNCGGEFFYCNSTQLFN
- From the list of gp160 proteins found when searching for gp120 in the first step, 5 sequences were chosen for the multiple retrieval exercise. The UniProt accession numbers of the 5 sequences are P04578, P03377, P03375, P35961, and P05877. From the download options I chose the FASTA format; the txt version of the combined FASTA file can be found here.
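The two steps above (cutting gp120 out of gp160 by residue range, and working with a combined multi-sequence FASTA file) can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the lab: the function names `parse_fasta`, `accession`, and `subsequence` are made up for this example, and it assumes UniProt-style headers in which the accession is the second `|`-separated field.

```python
def parse_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped sequence lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]          # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


def accession(header):
    """Return the accession from a UniProt-style header such as 'sp|P04578|...'.

    Assumption: the accession is the second '|'-separated field; otherwise
    the first whitespace-separated word of the header is returned.
    """
    fields = header.split("|")
    return fields[1] if len(fields) >= 3 else header.split()[0]


def subsequence(seq, start, end):
    """Extract residues start..end using 1-based, inclusive coordinates,
    the convention used by a range such as [33-511]."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("range outside sequence")
    return seq[start - 1:end]
```

With a gp160 sequence held in a string `gp160`, `subsequence(gp160, 33, 511)` would return the 479 residues of the range quoted above, and `{accession(h): s for h, s in parse_fasta(text).items()}` would index a combined FASTA file by accession.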
Reading a Swiss-Prot Entry
- As with the first activity, a UniProt-verified gp120 protein could not be found, so I will be working with the gp160 entry instead.
- The UniProt entry of the gp160 protein used can be found here
- The entry itself is very detailed. Some of its major features include:
- The protein name, along with the names of the proteins that result from cleavage of the original protein.
- The protein sequence, source organism, and in this case the viral host.
- An in-depth description of known functions and mechanisms of function.
- Known secondary structure and ontologies.
- Details about protein domains, cellular location, and additional miscellaneous information.
- References to the studies and labs used to create all the information found in the UniProt protein entry.
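The features listed above correspond to labelled sections of the entry. As a rough illustration, the plain-text (flat-file) version of a UniProtKB entry marks each line with a two-letter line code (for example ID, AC, DE, OS, SQ), with sequence lines indented by spaces and `//` ending the entry. The sketch below groups lines by that code. The function name `group_by_line_code` and the toy entry are hypothetical; the toy entry is not real UniProt data and only imitates the layout.

```python
from collections import defaultdict


def group_by_line_code(entry_text):
    """Group the lines of one flat-text entry by their two-letter line code."""
    sections = defaultdict(list)
    for line in entry_text.splitlines():
        if not line.strip():
            continue                   # skip blank lines
        if line.startswith("//"):      # '//' terminates an entry
            break
        code = line[:2]
        if not code.strip():           # sequence data lines start with spaces
            code = "seq"
        sections[code].append(line[5:].strip())
    return dict(sections)


# Hypothetical toy entry, NOT real UniProt data; only the layout is meant
# to resemble the flat-file format.
TOY_ENTRY = """\
ID   TOY_EXAMPLE             Reviewed;          12 AA.
AC   X00000;
DE   RecName: Full=Toy precursor protein;
OS   Example virus.
SQ   SEQUENCE   12 AA;
     MKLWVTVYYG VP
//
"""
```

Calling `group_by_line_code(TOY_ENTRY)` gives a dictionary keyed by line code, so the description lines, organism lines, and sequence lines can each be read separately.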
ORFing your DNA sequence
- The NCBI ORF Finder can be found here
- The sequence used for this experiment was found by searching the NCBI nucleotide database for gp120 of HIV-1. The NCBI entry page for the sequence chosen can be found here, and the FASTA form is shown below:
>gi|328550457|gb|JF701706.1| HIV-1 isolate gp120_Oct_10 from USA vpu protein (vpu) and envelope glycoprotein (env) genes, partial cds
CAGAAAGAGCAGAAGACAGTGGCAATGAGAGTGAAGGGGATCAGGAAGAATTATCAGCACTTGTGGACAT
GGGGCATCATGCTCCTTGGGATGTTAATGATCTGTAGTGCTGCAGGAAATTGGTGGGTCACAGTCTATTA
TGGAGTRCCTGTGTGGAAAGAAGCAACCACCACTCTATTTTGTGCATCAGATGCTAAAGCATATGATACA
GAGGTACATAATGTTTGGGCCACACATGCCTGTGTACCCACAGACCCCAACCCACAAGAAATATTATTGA
AAAATGTGACAGAAAATTTTAACATGTGGAAAAATGGCATGGTAGAACAAATGCATGAGGATATAATCAG
TTTATGGGATCAAAGCCTAAAGCCATGTGTGAAATTAACCCCACTCTGTGTTACTTTAAATTGCACTARC
TTGAATGTTACTAATACCACTGCTACTAACACAACGAATAATGGCGGGACAACAATGGCGGGAGAAATGA
GAAACTGCTCTTTCAATGTCACCACAAGCATAGGAAATAGGAGACAAAAAGAATATGCGCTTTTGTATAA
ACATGATATAGTACCAATAGATAATAGTACYAACTATATACTAATAAGTTGTAACACCTCAGTCATTACA
CAGGCCTGTCCAAAGATATCCTTTGAACCAATTCCCATACATTATTGTGCCCCAGCTGGTTTTGCGATTC
TAAAGTGTAAYGAGAAGAAGTTCAATGGCACAGGACCATGTAAAAATGTCAGCACAGTACAATGTACACA
TGGAATTAAGCCAGTAGTATCAACTCAACTGTTGTTAAATGGCAGTCTAGCAGAAGAAGAGGTAGTAATT
AGATCTGAAAATTTCACAAACAATGCTAAAACCATAATAGTACAGCTAAACAGTCCTGTATTAATTAATT
GTACAAGACCCAACAACAATACAAGAAAAGGTATACGGATAGGACCAGGGAGAACATTCWTTGCAACAGA
AAGAATAATAGGAGATATAAGACAAGCACATTGYAATCTTAGTAGAGAACAATGGAATAACACTTTAGAA
AAGGTAGCTGCAAAATTAAGAGAACAATTTGAAAATAAGACAATAATCTTTAATCACTCCTCAGGAGGGG
ACCCAGAAATTGTAATGCACAGTTTTAATTGTGGAGGRGAATTTTTCTATTGTAATACAACACAGCTGTT
TAATAGTACTTGGAATAGTACAGGGTCAAATAACRCTAAAGGAGATGAMGTTATCACACTCCCATGCAGA
ATAAAACAAATTGTAAATATGTGGCAGGAAGTAGGAAAAGCAATGTATGCCCCTCCCATCAGWGGACAAA
TTAATTGTTCGTCAAATATTACAGGGCTGCTATTAACAAGAGACGGYGGTAATAATAATAACMTCCAAAA
TGAGACCTTCAGACCTGGAGGAGGAAATATGAAGGACAATTGGAGAAGTGAATTATATAAATATAAAGTA
GTAAARATACAACCATTAGGA
- The ORFs of the gp120 sequence were analyzed by pasting the sequence into the ORF Finder box and pressing OrfFind.
- The ORF Finder output for the gp120 sequence is shown below:
- The results of the ORF Finder tell us the amino acid sequences that would be made by translating the sequence in each of the six possible reading frames (three on each strand). In this case, since the sequence used comes from a gp120 isolate, the reading frame containing the longest or most representative amino acid sequence can usually be assumed to be the correct one, or at least the one most likely to be biologically relevant.
- Based on the ORF Finder results for the gp120 sequence, it can be assumed that the ORF in the +1 frame is the most likely to be biologically relevant.
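The reasoning above (translate all six frames, then prefer the frame with the longest uninterrupted stretch) can be sketched as follows. This is a simplified illustration and not ORF Finder's actual algorithm: it uses the standard genetic code, measures stop-to-stop stretches without requiring an ATG start (a reasonable simplification for a partial cds), and translates any codon containing an IUPAC ambiguity code such as R, Y, W, or M as X. All function names are made up for this example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Complement table that also covers IUPAC ambiguity codes.
COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; ambiguous codons become 'X', stops '*'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Translate the three forward (+1..+3) and three reverse (-1..-3) frames."""
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


def longest_open_stretch(protein):
    """Longest run of residues between stop codons in one translated frame."""
    return max(protein.split("*"), key=len)


def best_frame(dna):
    """Return (frame label, translation) for the frame with the longest stretch."""
    frames = six_frames(dna)
    return max(frames.items(), key=lambda kv: len(longest_open_stretch(kv[1])))
```

Applied to the first 70 bases of the sequence above, the +1 frame translates to QKEQKTVAMRVKGIRKNYQHLWT. Running `best_frame` on the full sequence is one way to cross-check the frame chosen from the ORF Finder output.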
Working with a Single Protein Sequence
HIV Structure Project
Links
- Chris Rhodes User Page
- Week 2 Journal
- Week 3 Journal
- Week 4 Journal
- Week 5 Journal
- Week 6 Journal
- Week 7 Journal
- Week 8 Journal
- Week 9 Journal
- Week 10 Journal
- Week 11 Journal
- Week 12 Journal
- Week 13 Journal
- Week 14 Journal
- Home Page
- Week 5 Assignment Page
- Week 6 Assignment Page
- Week 7 Assignment Page
- Week 8 Assignment Page
- Week 9 Assignment Page
- Week 10 Assignment Page
- Week 11 Assignment Page
- Week 12 Assignment Page
- Week 13 Assignment Page
- Week 14 Assignment Page