BE.180:Assignment1: Difference between revisions

From OpenWetWare

Revision as of 10:10, 2 March 2006

Assignment PDF

  • Download Assignment 1 PDF

Parts

Write your code so that it can read any input file with the following structure:
key1
value1
key2
value2
key3
value3...
  • Please plan to submit one .py file containing the code for both question 1 and question 2, named as yourathenaname_assignmentnumber.py. For example, for the first assignment, my file would be called spencers_1.py.
  • Your code should create two output files, one for question 1, called output1.txt, and one for question 2, called output2.txt.
  • NEW! output1.txt should contain only the DNA sequence as a single string.
  • NEW! output2.txt should contain one ORF per line, and nothing else.
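
As an illustration of the input and output formats described above, here is a minimal sketch in Python. The function names parse_parts and write_outputs are hypothetical and not part of the assignment. The sketch assumes blank lines in the input can be ignored, and that "a single string" means output1.txt has no trailing newline. How the parts are combined into the DNA sequence is defined by the assignment PDF and is not shown here.

```python
from pathlib import Path


def parse_parts(text):
    """Split alternating key/value lines into a list of (key, value) pairs.

    Assumption: blank lines carry no meaning and are skipped.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) % 2 != 0:
        raise ValueError("expected key and value lines to come in pairs")
    # Even-indexed lines are keys, odd-indexed lines are their values.
    return list(zip(lines[0::2], lines[1::2]))


def write_outputs(dna, orfs, out_dir="."):
    """Write output1.txt (the DNA string only) and output2.txt (one ORF per line)."""
    out = Path(out_dir)
    # Assumption: no trailing newline, so the file holds the sequence and nothing else.
    (out / "output1.txt").write_text(dna)
    (out / "output2.txt").write_text("".join(orf + "\n" for orf in orfs))
```

A list of pairs is used instead of a dictionary so that the order of the parts in the file is preserved and a repeated key would not silently overwrite an earlier one. The input could be read with Path("Parts.txt").read_text(), keeping the capital P.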

Submission instructions

  • Please email your .py file to be180hw@gmail.com by 5pm Thursday. As stated above, this file will contain the code for question 1 and question 2. You do not need to send the input or output files since we will run your code with our Parts.txt file as the input. (Note capital P in Parts.txt!)
  • On a paper copy of the pset PDF, please handwrite your answers to question 0 as well as the answer to this question: What will this composite part do when placed inside a living bacterium? Please place this paper in the box outside Drew Endy's office (68-564) by 5pm Thursday. This box will be available starting Wednesday at 5pm.
  • Late psets will NOT be accepted.

Questions and Clarifications

  • Note that the stop codon TAA must be in frame, i.e. a multiple of 3 basepairs away from the ATG. For example, ATGxxxxxxTAA would be in frame, but ATGxxxxxTAA would not be. (x is any basepair)
  • Is it significant that the barcode is CAPS and the other parts are lower case?
    • NO/no.
  • Can an ORF be any length over 50, or should its length be a multiple of some small integer?
    • An ORF's length should be a multiple of three, the number of base pairs in a codon.
  • Does the ORF include the start ATG and stop TAA? Suppose the DNA string is "ATG...TAA": is the ORF "..." or "ATG..." or "ATG...TAA" or "...TAA"?
    • The ORF includes the "start" ATG and "stop" TAA.
  • Can ORFs overlap? Suppose the DNA string is "ATG...TAAxxxTAA". The first ORF is obviously (modulo previous question) "ATG...TAA". Is "ATG...TAAxxxTAA" also an ORF? It meets the specification of "a string starting with ATG and ending with TAA". One could imagine a similar situation with overlapping starting tags: "ATG...ATGxxxTAA" might have both "ATG...ATGxxxTAA" and "ATGxxxTAA".
    • Yes, ORFs can overlap.
    • Although "ATG...TAAxxxTAA" has a small chance of occurring in biology, for the purposes of this programming assignment, please end ORFs at the first in-frame TAA.
  • For Q2, ATG...TAA...TAA isn't an ORF, but what if ATG...TAA is less than 50 bp and ATG...TAA...TAA is >50bp?
    • Still not an ORF (assuming the TAA's are in frame). The >50bp is something humans have used as a qualifier to weed out things that are not ORFs, since we've observed that ORFs are usually >50bp. The biology of translation will still see TAA as a stop codon and stop translation at the first TAA, making the sequence less than 50bp.
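
The clarifications above can be collected into one short sketch of an ORF search. This is not the posted solution. The name find_orfs and the min_length parameter are hypothetical, "over 50" is read as strictly more than 50 bp, and the sequence is upper-cased first because case is not significant, so the returned ORFs are upper case.

```python
START = "ATG"
STOP = "TAA"


def find_orfs(dna, min_length=50):
    """Return all ORFs in dna, ordered by start position.

    Each ORF runs from an ATG to the first in-frame TAA, includes both,
    and is kept only if it is longer than min_length base pairs.
    """
    seq = dna.upper()  # assumption: case carries no meaning
    orfs = []
    start = seq.find(START)
    while start != -1:
        # Step one codon at a time so only in-frame stop codons are seen.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] == STOP:
                orf = seq[start:pos + 3]
                if len(orf) > min_length:
                    orfs.append(orf)
                break  # translation ends at the first in-frame TAA
        # Resume one base later so overlapping ORFs are also found.
        start = seq.find(START, start + 1)
    return orfs
```

Because the search steps in whole codons from the ATG, every ORF it returns has a length that is a multiple of three. A short ATG...TAA followed by a later TAA is rejected rather than extended, which matches the last answer above.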


Solutions and General Comments

  • Answers to Q0:
    • False, because == checks for equality
    • 4
    • True, because we previously set a equal to b, so now a and b are equal
    • 1 2 3 4 (each of these will be on a new line)
    • 2 3 4 5 (each of these will be on a new line)
  • Answer to written Q1 section:
    • Under certain conditions, the bacteria will express the mRFP protein encoded by the ORF.
  • One possible way of writing the code for assignment 1 is provided here: File:Spencers 1.txt
  • If, after reviewing this code, you still have questions, please email the TAs.