Harvard:Biophysics 101/2007/Notebook:Xiaodi Wu/2007-3-15
From OpenWetWare
Input sequence:
>example1
CACCCTCGCCAGTTACGAGCTGCCGAGCCGCTTCCTAGGCTCTCTGCGAATACGGACACG
CATGCCACCCACAACAACTTTTTAAAAGAATCAGACGTGTGAAGGATTCTATTCGAATTA
CTTCTGCTCTCTGCTTTTATCACTTCACTGTGGGTCTGGGCGCGGGCTTTCTGCCAGCTC
CGCGGACGCTGCCTTCGTCCAGCCGCAGAGGCCCCGCGGTCAGGGTCCCGCGTGCGGGGT
ACCGGGGGCAGAACCAGCGCGTGACCGGGGTCCGCGGTGCCGCAACGCCCCGGGTCTGCG
CAGAGGCCCCTGCAGTCCCTGCCCGGCCCAGTCCGAGCTTCCCGGGCGGGCCCCCAGTCC
GGCGATTTGCAGGAACTTTCCCCGGCGCTCCCACGCGAAGC
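A quick sanity check before aligning: the sketch below parses the FASTA record and counts its bases, which should agree with the 401-base query span that BLAST reports. The helper name `read_fasta` is made up for this note and is not part of any library.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dict."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]          # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


EXAMPLE1 = """>example1
CACCCTCGCCAGTTACGAGCTGCCGAGCCGCTTCCTAGGCTCTCTGCGAATACGGACACG
CATGCCACCCACAACAACTTTTTAAAAGAATCAGACGTGTGAAGGATTCTATTCGAATTA
CTTCTGCTCTCTGCTTTTATCACTTCACTGTGGGTCTGGGCGCGGGCTTTCTGCCAGCTC
CGCGGACGCTGCCTTCGTCCAGCCGCAGAGGCCCCGCGGTCAGGGTCCCGCGTGCGGGGT
ACCGGGGGCAGAACCAGCGCGTGACCGGGGTCCGCGGTGCCGCAACGCCCCGGGTCTGCG
CAGAGGCCCCTGCAGTCCCTGCCCGGCCCAGTCCGAGCTTCCCGGGCGGGCCCCCAGTCC
GGCGATTTGCAGGAACTTTCCCCGGCGCTCCCACGCGAAGC
"""

seqs = read_fasta(EXAMPLE1)
print(len(seqs["example1"]))  # prints 401
```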
First step: Align this sequence with NCBI BLAST. The top hit:
>ref|NT_030059.12|Hs10_30314 Homo sapiens chromosome 10 genomic contig, reference assembly
Length=44617998
Features flanking this part of subject sequence:
3895 bp at 5' side: hypothetical protein
425 bp at 3' side: HtrA serine peptidase 1
Score = 736 bits (398), Expect = 0.0
Identities = 400/401 (99%), Gaps = 0/401 (0%)
Strand=Plus/Plus
Query  1         CACCCTCGCCAGTTACGAGCTGCCGAGCCGCTTCCTAGGCTCTCTGCGAATACGGACACG  60
                 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  42968870  CACCCTCGCCAGTTACGAGCTGCCGAGCCGCTTCCTAGGCTCTCTGCGAATACGGACACG  42968929

Query  61        CATGCCACCCACAACAACTTTTTAAAAGAATCAGACGTGTGAAGGATTCTATTCGAATTA  120
                 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  42968930  CATGCCACCCACAACAACTTTTTAAAAGAATCAGACGTGTGAAGGATTCTATTCGAATTA  42968989

Query  121       CTTCTGCTCTCTGCTTTTATCACTTCACTGTGGGTCTGGGCGCGGGCTTTCTGCCAGCTC  180
                 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  42968990  CTTCTGCTCTCTGCTTTTATCACTTCACTGTGGGTCTGGGCGCGGGCTTTCTGCCAGCTC  42969049

Query  181       CGCGGACGCTGCCTTCGTCCAGCCGCAGAGGCCCCGCGGTCAGGGTCCCGCGTGCGGGGT  240
                 |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Sbjct  42969050  CGCGGACGCTGCCTTCGTCCGGCCGCAGAGGCCCCGCGGTCAGGGTCCCGCGTGCGGGGT  42969109

Query  241       ACCGGGGGCAGAACCAGCGCGTGACCGGGGTCCGCGGTGCCGCAACGCCCCGGGTCTGCG  300
                 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  42969110  ACCGGGGGCAGAACCAGCGCGTGACCGGGGTCCGCGGTGCCGCAACGCCCCGGGTCTGCG  42969169

Query  301       CAGAGGCCCCTGCAGTCCCTGCCCGGCCCAGTCCGAGCTTCCCGGGCGGGCCCCCAGTCC  360
                 ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct  42969170  CAGAGGCCCCTGCAGTCCCTGCCCGGCCCAGTCCGAGCTTCCCGGGCGGGCCCCCAGTCC  42969229

Query  361       GGCGATTTGCAGGAACTTTCCCCGGCGCTCCCACGCGAAGC  401
                 |||||||||||||||||||||||||||||||||||||||||
Sbjct  42969230  GGCGATTTGCAGGAACTTTCCCCGGCGCTCCCACGCGAAGC  42969270
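Conclusion: the sequence we obtained lies on chromosome 10, and the alignment shows a single mismatch (query position 201, contig position 42969070: A in the query, G in the reference), so one SNP is apparent.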
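The mismatch can also be found without reading the alignment by eye. Here is a minimal sketch, assuming an ungapped alignment (true here, since Gaps = 0/401) and using only the one alignment row that differs. The function name `find_mismatches` is invented for this note.

```python
def find_mismatches(query, subject, query_start=1, subject_start=1):
    """List (query_pos, subject_pos, query_base, subject_base) for every
    mismatching column of an ungapped alignment. Positions are 1-based
    and offset by the given start coordinates."""
    if len(query) != len(subject):
        raise ValueError("ungapped alignment rows must have equal length")
    return [
        (query_start + i, subject_start + i, q, s)
        for i, (q, s) in enumerate(zip(query, subject))
        if q != s
    ]


# Alignment row covering query bases 181-240 (contig 42969050-42969109).
QUERY_ROW = "CGCGGACGCTGCCTTCGTCCAGCCGCAGAGGCCCCGCGGTCAGGGTCCCGCGTGCGGGGT"
SBJCT_ROW = "CGCGGACGCTGCCTTCGTCCGGCCGCAGAGGCCCCGCGGTCAGGGTCCCGCGTGCGGGGT"

mismatches = find_mismatches(QUERY_ROW, SBJCT_ROW, 181, 42969050)
print(mismatches)  # [(201, 42969070, 'A', 'G')]
```

The contig coordinate is just the subject start plus the column offset. Converting it to a chromosome coordinate still needs the genome browser.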
Second step: Look at the genome browser to find exactly where this is: 10q25, bases 124210300 to 124210800 (source: [124204783.56%3A124217283.44-r&QUERY_NUMBER=1&RID=1173811506-1957-7708633758.BLASTQ2&GOTO=124210471.01human:10:bp&rsize=1562.484999999404 clicky!]).
Third step: Look up SNPs in Entrez SNP (query: "10[CHR] AND 124210300:124210800[CHRPOS]"). Two exist:
1: rs11200638 [Homo sapiens]
AGCTCCGCGGACGCTGCCTTCGTCC[A/G]GCCGCAGAGGCCCCGCGGTCAGGGT
2: rs2672598 [Homo sapiens]
CGCCGGACTGGGGGCCCGCCCGGGA[A/G]GCTCGGACTGGGCCGGGCAGGGACT
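To see which of these two SNPs corresponds to the mismatch in the alignment, the 25-base flanks from each record can be searched for in the input sequence on both strands. This is a rough sketch, assuming the flanks match the query exactly. `reverse_complement` and `locate_snp` are names made up for this note.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def locate_snp(query, left_flank, right_flank):
    """Find a SNP in `query` from its flanking sequences.

    Returns (1-based query position, strand of the flanks, query base at
    that position), or None if neither orientation matches."""
    orientations = (
        ("+", left_flank, right_flank),
        # On the minus strand the flanks swap sides once reverse-complemented.
        ("-", reverse_complement(right_flank), reverse_complement(left_flank)),
    )
    for strand, left, right in orientations:
        start = query.find(left)
        while start != -1:
            snp_index = start + len(left)
            if query.startswith(right, snp_index + 1):
                return snp_index + 1, strand, query[snp_index]
            start = query.find(left, start + 1)
    return None


QUERY = (
    "CACCCTCGCCAGTTACGAGCTGCCGAGCCGCTTCCTAGGCTCTCTGCGAATACGGACACG"
    "CATGCCACCCACAACAACTTTTTAAAAGAATCAGACGTGTGAAGGATTCTATTCGAATTA"
    "CTTCTGCTCTCTGCTTTTATCACTTCACTGTGGGTCTGGGCGCGGGCTTTCTGCCAGCTC"
    "CGCGGACGCTGCCTTCGTCCAGCCGCAGAGGCCCCGCGGTCAGGGTCCCGCGTGCGGGGT"
    "ACCGGGGGCAGAACCAGCGCGTGACCGGGGTCCGCGGTGCCGCAACGCCCCGGGTCTGCG"
    "CAGAGGCCCCTGCAGTCCCTGCCCGGCCCAGTCCGAGCTTCCCGGGCGGGCCCCCAGTCC"
    "GGCGATTTGCAGGAACTTTCCCCGGCGCTCCCACGCGAAGC"
)

rs11200638 = locate_snp(QUERY, "AGCTCCGCGGACGCTGCCTTCGTCC",
                        "GCCGCAGAGGCCCCGCGGTCAGGGT")
rs2672598 = locate_snp(QUERY, "CGCCGGACTGGGGGCCCGCCCGGGA",
                       "GCTCGGACTGGGCCGGGCAGGGACT")
print(rs11200638)  # (201, '+', 'A')
print(rs2672598)   # (339, '-', 'T')
```

rs11200638 falls on query position 201, the same column as the single mismatch in the BLAST alignment. The rs2672598 flanks are given on the opposite strand and map to position 339, where the query matches the reference.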