Harvard:Biophysics 101/2007/Notebook:Katie Fifer/2007-2-8
From OpenWetWare
Latest revision as of 15:28, 7 February 2007
Script
#!/usr/bin/env python

# Katie Fifer
# asst2.py
# 2/7/07

# Description: A script to generate 10,000 strings of 10 random
# coinflips (H or T) and output the tally of contiguous (overlapping)
# stretches of 2,3,4,5,6,7,8,9, and 10 H's or T's in that set of
# 10,000 10-mers. Counts of single H's and T's are printed as well.

import random

# set constants
num_strings = 10000
num_flips = 10
max_repeat = 10
all_strings = []

# random number generation

# generate a list of new strings
for i in range(num_strings):
    new_string = ''.join([random.choice(['H', 'T']) for n in range(num_flips)])
    all_strings.append(new_string)

# figure out how many overlapping stretches of the given letter there
# are. will do this for each string for each substring. in other words
# will find all instances of 'HH' in each of the strings, and then all
# instances of 'HHH' in each of the strings etc.

def analyze(letter):
    for i in range(max_repeat):
        # generate the substring to search for. the i + 1 is to account
        # for the fact that i starts at 0
        substr = ''.join([letter for n in range(i + 1)])

        # for each of the strings in the list, find the number of
        # instances of the substring just set (overlapping)
        total = 0
        for j in range(num_strings):
            curr_string = all_strings[j]
            pos = curr_string.find(substr, 0)
            while pos != -1:
                total = total + 1
                # restart one character past the last hit so that
                # overlapping matches are counted
                pos = curr_string.find(substr, pos + 1)
        print(substr, total)

analyze('H')
analyze('T')
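As a cross-check on the search loop in the script, the same tallies can be derived from run lengths: a maximal run of r identical letters contains r - k + 1 overlapping stretches of length k, for every k up to r. The sketch below is not part of the original assignment; the function name count_stretches is hypothetical, and it assumes Python 3.

```python
from itertools import groupby

def count_stretches(strings, letter, max_repeat=10):
    """Return {k: number of overlapping length-k stretches of `letter`}."""
    totals = {k: 0 for k in range(1, max_repeat + 1)}
    for s in strings:
        # groupby splits the string into maximal runs of identical characters
        for ch, run in groupby(s):
            if ch != letter:
                continue
            r = len(list(run))
            # a run of length r holds r - k + 1 stretches of each length k <= r
            for k in range(1, min(r, max_repeat) + 1):
                totals[k] += r - k + 1
    return totals

if __name__ == "__main__":
    sample = ['HHTTHTHTTT', 'HTTHHHTTTT']
    for k, total in count_stretches(sample, 'T').items():
        print('T' * k, total)
```

Because each run is visited once, this variant avoids rescanning every string for each of the ten substring lengths, which may matter if num_strings or num_flips is increased.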
Output
- Output from a run of the program that generated 10,000 strings
H 49831
HH 22372
HHH 9860
HHHH 4232
HHHHH 1813
HHHHHH 754
HHHHHHH 313
HHHHHHHH 127
HHHHHHHHH 44
HHHHHHHHHH 8
T 50169
TT 22622
TTT 10065
TTTT 4401
TTTTT 1937
TTTTTT 824
TTTTTTT 341
TTTTTTTT 122
TTTTTTTTT 37
TTTTTTTTTT 7
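One way to sanity-check these numbers is to compare them with their expected values. A 10-flip string has 10 - k + 1 possible start positions for a stretch of length k, and for a fair coin each position matches with probability (1/2)^k, so by linearity of expectation the expected tally over 10,000 strings is 10000 * (11 - k) / 2^k. That gives 50,000 for k = 1, 22,500 for k = 2, and about 9.8 for k = 10, in line with the observed counts. The sketch below assumes a fair, independent coin; the function name expected_count is hypothetical.

```python
def expected_count(k, num_strings=10000, num_flips=10):
    """Expected number of overlapping length-k stretches of one letter."""
    windows = num_flips - k + 1          # start positions per string
    return num_strings * windows * 0.5 ** k

if __name__ == "__main__":
    for k in range(1, 11):
        print(k, expected_count(k))
```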
Testing
- Output for just 5 strings so you can double check by hand
all strings: ['HHTTHTHTTT', 'THTTTTTTTH', 'TTTHHTHTHT', 'HTHTHTHTTH', 'HTTHHHTTTT']
H 19
HH 4
HHH 1
HHHH 0
HHHHH 0
HHHHHH 0
HHHHHHH 0
HHHHHHHH 0
HHHHHHHHH 0
HHHHHHHHHH 0
T 31
TT 16
TTT 9
TTTT 5
TTTTT 3
TTTTTT 2
TTTTTTT 1
TTTTTTTT 0
TTTTTTTTT 0
TTTTTTTTTT 0
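The hand check can also be automated by hard-coding the five strings instead of generating them randomly. The sketch below assumes Python 3; the helper names count_overlapping and tally are hypothetical and simply repackage the overlapping find loop as a function so that the tallies listed above can be asserted.

```python
test_strings = ['HHTTHTHTTT', 'THTTTTTTTH', 'TTTHHTHTHT',
                'HTHTHTHTTH', 'HTTHHHTTTT']

def count_overlapping(s, sub):
    """Count occurrences of sub in s, allowing overlaps."""
    count = 0
    pos = s.find(sub)
    while pos != -1:
        count += 1
        # step one character forward so overlapping matches are found
        pos = s.find(sub, pos + 1)
    return count

def tally(strings, letter, length):
    """Total overlapping stretches of `letter` * `length` over all strings."""
    return sum(count_overlapping(s, letter * length) for s in strings)

if __name__ == "__main__":
    # a few of the tallies from the five-string run above
    assert tally(test_strings, 'H', 1) == 19
    assert tally(test_strings, 'H', 2) == 4
    assert tally(test_strings, 'T', 2) == 16
    assert tally(test_strings, 'T', 7) == 1
    print("tallies match the five-string output")
```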