Alex A. Cardenas Week 9: Difference between revisions

From OpenWetWare
Jump to navigationJump to search
No edit summary
No edit summary
Line 29: Line 29:
#*There are fewer differences when looking at the protein sequences when compared to the DNA sequences.  
#*There are fewer differences when looking at the protein sequences when compared to the DNA sequences.  
#*This can be accounted for because different DNA sequence codons can can still code for the same amino acid.  
#*This can be accounted for because different DNA sequence codons can can still code for the same amino acid.  
#*All rapid progressors that have AIDs (the last visits where CD4 T Cell coun is below 200), were ran against the amino acid sequences of the Moderate progressor 6. The results are shown:  
#*Amino acid sequences of all rapid progressors that have AIDs (the last visits where CD4 T Cell count was below 200), were ran against the amino acid sequences of the Moderate progressor 6. The results are shown:  


  >S11V4-5      EVIIRSENFSNNAKNIIVQLNESVVINCTRPDNTIKQRIIHIGPGRPFYTT-GIKGNIRQ
  >S11V4-5      EVIIRSENFSNNAKNIIVQLNESVVINCTRPDNTIKQRIIHIGPGRPFYTT-GIKGNIRQ
Line 48: Line 48:
  >S3V6-6      AHCNLSRAGWNSTLERIAIKLREQFQ-NKTIAFNQSS
  >S3V6-6      AHCNLSRAGWNSTLERIAIKLREQFQ-NKTIAFNQSS
               *:**:*  **.**: :. ****::  **** *.:.  
               *:**:*  **.**: :. ****::  **** *.:.  
#*Then all these were ran against the amino acid sequence of subject 13, a non progressor (used as a control). The multiple sequence alignment is shown below.
>S15V4-7        EVVIRSENFTNNAKIIIVHLNESVVINCTRPNNNTR-RKIHIGPGKTFYTG-DIIGNIRQ
>S1V5-4          EVVIRSENITNNAKIIIVQLNESVAISCTRPNNNIKQRIMHIRPGRAFYTK-DITEDIRQ
>S13V5-1[4]      EIVIRSENFTNNAKIIIVQLKESVEINCTRPGNNTR-RSINIGPGRAFYASRGIIGDIRQ
>S10V6-1        EVVIRSENFTDNAKTIIVHLNKSVEINCTRPNNNTR-RSINMGPGRAFYATGEIIGDIRQ
>S4V4-2          EVVIRSENFTNNAKIIIVQLNESVEINCTRPDNHTV-RKIPIGPGRSFYTT-GIVGDIRQ
>S11V4-5        EVIIRSENFSNNAKNIIVQLNESVVINCTRPDNTIKQRIIHIGPGRPFYTT-GIKGNIRQ
>S3V6-6          DIVIRSANFSDNAKTILVQLNETVVMNCTRPGNNTR-KRVTLGPGRVYYTTGQIIGDIRK
                  :::*** *:::*** *:*:*:::* :.****.*    : : : **: :*:  *  :**:


#Which of the procedures from Chapter 6 that you ran on the entire gp120 sequence are applicable to the V3 fragment you are working with now?  
>S15V4-7        AHCNISGSKWNNTLKQIVNKLREQFG-NKTIVFNQSS
#*How are they applicable?
>S1V5-4          AYCNISRTAWNNTLKQIVKKLREHFV-NKTIVFNHSS
>S13V5-1[4]      AYCNISKAKWDNTLGQVAAKLREQFR-NATIVFNQSS
>S10V6-1        AHCNLSRTKWNDTLKQVVAKLREQFR-NKTIIFTQSS
>S4V4-2          AHCNISKTKWNNTLKLIVNKLREQFG-NKTIIFNQSS
>S11V4-5        AHCNVSRGQWNKTLEQVVRKLREQYGPNKTIVFKQPI
>S3V6-6          AHCNLSRAGWNSTLERIAIKLREQFQ-NKTIAFNQSS
                  *:**:*  *:.**  :. ****::  * ** *.:.
#*As seen from the multiple sequence alignments above, there are conserved regions of the V3 loop. However, these amino acid sequences show that there is not a major conserved region in the seqeunces that have AIDs when compared to the sequences that do not have AIDs (subject 6 and 13).
#Which of the procedures from Chapter 6 that you ran on the entire gp120 sequence are applicable to the V3 fragment you are working with now? The procedures from Chapter 6 that are applicable to the V3 fragment that we are working with now are: doing primary structure analysis and finding domains in the protein.
#*How are they applicable? The primary strucutre analysis can be helpful because it can help us determine how the protein will fold. This will be helpful because it allows us to determine/see the hydrophobic regions, the coiled regions, and the hydrophilic regions that are within the protein. The finding domains procedure can help us in looking for a specific domain/sequence that will help is in determining the differeneces and similarities within the different V3 loop conformations. We can run multiple sequence alignments on this and determine how the different domains are similar.
#Chapter 11 contains procedures to use for working with protein 3D structures. Find the section on "Predicting the Secondary Structure of a Protein Sequence" and perform this on both the entire gp120 sequence and on the V3 fragment that we are now working with. You will compare the predictions with the actual structures.
#Chapter 11 contains procedures to use for working with protein 3D structures. Find the section on "Predicting the Secondary Structure of a Protein Sequence" and perform this on both the entire gp120 sequence and on the V3 fragment that we are now working with. You will compare the predictions with the actual structures.
#*Amino acid sequence of gp120 found below. Zeb found this because the pages in the book did not work.  
#*Amino acid sequence of gp120 found below. Zeb found this because the pages in the book did not work.  
Line 60: Line 78:
#* [http://www.ncbi.nlm.nih.gov/Structure/mmdb/mmdbsrv.cgi?uid=25248 Stanfield et al. (2003) structure 1NAK]. [[Image:Stanfield Structure1NAK.png]]
#* [http://www.ncbi.nlm.nih.gov/Structure/mmdb/mmdbsrv.cgi?uid=25248 Stanfield et al. (2003) structure 1NAK]. [[Image:Stanfield Structure1NAK.png]]
#Using StarBiochem to work with gp120 peptide and the ternary complex structures.  
#Using StarBiochem to work with gp120 peptide and the ternary complex structures.  
#*Locate the N-terminus and C-terminus of each (poly)peptide structure.
#*Locate all the secondary structure elements (found below. purple=helices; yellow=sheets; blue=coils. Do these match the predictions you made above? Yes for all of the helices --> they match. As for the sheets and coils, some of them match while others do not.  
#*Locate all the secondary structure elements (found below. purple=helices; yellow=sheets; blue=coils. Do these match the predictions you made above? Yes for all of the helices --> they match. As for the sheets and coils, some of them match while others do not.  
#**Secondary Structure 1GC1 - [[Image:Secondary 1GC1.png]]
#**Secondary Structure 1GC1 - [[Image:Secondary 1GC1.png]]

Revision as of 14:32, 31 October 2011

HIV Structure Project

  • Working with Robert and Zeb.
  • V3 loop conformation is the main area of study and comparing these regions with those who developed AIDs and those who did not. We will investigate these different conformations and how they compare with the given numbers such as CD4 T cell count, ratios, etc.
  • Rapid progressors sequences will be compared to moderate and non progressors. Subjects of interest are 1, 3, 4, 10, 11, and 15. Moderate progessors of interest 6 and non-progressor 13.
  • The following visits for each subject will be studied. We will be comparing pre and post AIDs (in relation to the # of CD4 T cell count). The conformational changes of the V3 loop will be looked at when the subjects have the highest CD4 T cell count and the lowest CD4 T cell count (except for the subjects 6 and 13).
    • Subject 1 - visit 1 and 3.
    • Subject 3 - visit 1 and 5.
    • Subject 4 - visit 1 and 4.
    • Subject 10 - visit 1 and 5.
    • Subject 11 - visit 1 and 4.
    • Subject 15 - visit 1 and 4.
    • Subject 6 (moderate progressor) - visit 9.
    • Subject 13 (non-progressor)- visit 5.

HIV Structure Project Exercises

  1. Converting DNA sequences into protein sequences.
    • The following amino acid sequences can be found [here].
    • This would be done using www.ncbi.nlm.nih.gov/gorf/gorf.html --> and then entering in each of the DNA sequences but the sequence data in BEDROCK already gives us the amino acid sequences.
  2. Perform a multiple sequence alignment on the protein sequences.
    • This was doing through ClustalW (www.ebi.ac.uk/clustalw.). The amino acid sequences were pasted into the ClustalW sequence box and the results were made.
    • Subject 1 multiple sequence alignment:
    • Subject 3 multiple sequence alignment:
    • Subject 4 multiple sequence alignment:
    • Subject 10 multiple sequence alignment:
    • Subject 11 multiple sequence alignment:
    • Subject 15 multiple sequence alignment:
    • Subject 6 multiple sequence alignment - ran with last visits of subjects above:
    • Subject 13 multiple sequence alignment - ran with all subjects last visits not including subject 6:
    • There are fewer differences when looking at the protein sequences when compared to the DNA sequences.
    • This can be accounted for because different DNA sequence codons can can still code for the same amino acid.
    • Amino acid sequences of all rapid progressors that have AIDs (the last visits where CD4 T Cell count was below 200), were ran against the amino acid sequences of the Moderate progressor 6. The results are shown:
>S11V4-5      EVIIRSENFSNNAKNIIVQLNESVVINCTRPDNTIKQRIIHIGPGRPFYTT-GIKGNIRQ
>S1V5-4       EVVIRSENITNNAKIIIVQLNESVAISCTRPNNNIKQRIMHIRPGRAFYTK-DITEDIRQ
>S4V4-2       EVVIRSENFTNNAKIIIVQLNESVEINCTRPDN-HTVRKIPIGPGRSFYTT-GIVGDIRQ
>S15V4-7      EVVIRSENFTNNAKIIIVHLNESVVINCTRPNN-NTRRKIHIGPGKTFYTG-DIIGNIRQ
>S10V6-1      EVVIRSENFTDNAKTIIVHLNKSVEINCTRPNN-NTRRSINMGPGRAFYATGEIIGDIRQ
>S6V9-3       EVVIRSANLTDNAKIIIVHLNESVEMNCTRPNN-NTRKGIHIGPGRAFYATGEIIGNIRQ
>S3V6-6       DIVIRSANFSDNAKTILVQLNETVVMNCTRPGN-NTRKRVTLGPGRVYYTTGQIIGDIRK
              :::*** *:::*** *:*:**::* :.****.*  . : : : **: :*:   *  :**:
>S11V4-5      AHCNVSRGQWNKTLEQVVRKLREQYGPNKTIVFKQPI
>S1V5-4       AYCNISRTAWNNTLKQIVKKLREHF-VNKTIVFNHSS
>S4V4-2       AHCNISKTKWNNTLKLIVNKLREQFG-NKTIIFNQSS
>S15V4-7      AHCNISGSKWNNTLKQIVNKLREQFG-NKTIVFNQSS
>S10V6-1      AHCNLSRTKWNDTLKQVVAKLREQFR-NKTIIFTQSS
>S6V9-3       AHCNLSRAPWNDTLKRIAIKLREQFK-NKTIAFNQSS
>S3V6-6       AHCNLSRAGWNSTLERIAIKLREQFQ-NKTIAFNQSS
              *:**:*   **.**: :. ****::  **** *.:. 
    • Then all these were ran against the amino acid sequence of subject 13, a non progressor (used as a control). The multiple sequence alignment is shown below.
>S15V4-7         EVVIRSENFTNNAKIIIVHLNESVVINCTRPNNNTR-RKIHIGPGKTFYTG-DIIGNIRQ
>S1V5-4          EVVIRSENITNNAKIIIVQLNESVAISCTRPNNNIKQRIMHIRPGRAFYTK-DITEDIRQ
>S13V5-1[4]      EIVIRSENFTNNAKIIIVQLKESVEINCTRPGNNTR-RSINIGPGRAFYASRGIIGDIRQ
>S10V6-1         EVVIRSENFTDNAKTIIVHLNKSVEINCTRPNNNTR-RSINMGPGRAFYATGEIIGDIRQ
>S4V4-2          EVVIRSENFTNNAKIIIVQLNESVEINCTRPDNHTV-RKIPIGPGRSFYTT-GIVGDIRQ
>S11V4-5         EVIIRSENFSNNAKNIIVQLNESVVINCTRPDNTIKQRIIHIGPGRPFYTT-GIKGNIRQ
>S3V6-6          DIVIRSANFSDNAKTILVQLNETVVMNCTRPGNNTR-KRVTLGPGRVYYTTGQIIGDIRK
                 :::*** *:::*** *:*:*:::* :.****.*    : : : **: :*:   *  :**:
>S15V4-7         AHCNISGSKWNNTLKQIVNKLREQFG-NKTIVFNQSS
>S1V5-4          AYCNISRTAWNNTLKQIVKKLREHFV-NKTIVFNHSS
>S13V5-1[4]      AYCNISKAKWDNTLGQVAAKLREQFR-NATIVFNQSS
>S10V6-1         AHCNLSRTKWNDTLKQVVAKLREQFR-NKTIIFTQSS
>S4V4-2          AHCNISKTKWNNTLKLIVNKLREQFG-NKTIIFNQSS
>S11V4-5         AHCNVSRGQWNKTLEQVVRKLREQYGPNKTIVFKQPI
>S3V6-6          AHCNLSRAGWNSTLERIAIKLREQFQ-NKTIAFNQSS
                 *:**:*   *:.**  :. ****::  * ** *.:. 
    • As seen from the multiple sequence alignments above, there are conserved regions of the V3 loop. However, these amino acid sequences show that there is not a major conserved region in the seqeunces that have AIDs when compared to the sequences that do not have AIDs (subject 6 and 13).
  1. Which of the procedures from Chapter 6 that you ran on the entire gp120 sequence are applicable to the V3 fragment you are working with now? The procedures from Chapter 6 that are applicable to the V3 fragment that we are working with now are: doing primary structure analysis and finding domains in the protein.
    • How are they applicable? The primary strucutre analysis can be helpful because it can help us determine how the protein will fold. This will be helpful because it allows us to determine/see the hydrophobic regions, the coiled regions, and the hydrophilic regions that are within the protein. The finding domains procedure can help us in looking for a specific domain/sequence that will help is in determining the differeneces and similarities within the different V3 loop conformations. We can run multiple sequence alignments on this and determine how the different domains are similar.
  2. Chapter 11 contains procedures to use for working with protein 3D structures. Find the section on "Predicting the Secondary Structure of a Protein Sequence" and perform this on both the entire gp120 sequence and on the V3 fragment that we are now working with. You will compare the predictions with the actual structures.
    • Amino acid sequence of gp120 found below. Zeb found this because the pages in the book did not work.
  3. Download the structure files for the papers we read in journal club from the NCBI Structure Database.
  4. Using StarBiochem to work with gp120 peptide and the ternary complex structures.
    • Locate all the secondary structure elements (found below. purple=helices; yellow=sheets; blue=coils. Do these match the predictions you made above? Yes for all of the helices --> they match. As for the sheets and coils, some of them match while others do not.
      • Secondary Structure 1GC1 -
      • Secondary Structure 1F58 -
      • Secondary Structure 2F58 -
      • Secondary Structure 1NAK -
    • Locate the V3 region and figure out which sequences from your alignment are present in the structures and which sequences are absent.
      • V3 region for secondary structure 1GC1 is not present --> has the whole gp120 excluding V3 region. Region 300G-328G is missing! In between the coil (blue) and sheet (yellow) in the following picture is where the V3 region should be.
      • V3 region for secondary structure 1F58 is found 313P-325P.
      • V3 region for secondary structure 2F58 is found 315P-324P.
      • V3 region for secondary structure 1NAK is found 312P-323P.


Links