User talk:Hossein Azari Soufiani: Difference between revisions

From OpenWetWare

Revision as of 08:54, 10 November 2009

Report of Second Homework

The paper "Foundations for Engineering Biology" was really interesting to me because I am an engineer and I like to look at problems from an engineering point of view. The paper is written in a classic engineering manner, and it made our position in bioengineering clearer to me.

Installing and setting up Python was very different from the other tools I have used before, such as Matlab and C++. I enjoyed using it; its object-oriented programming support in particular makes it really powerful. Plotting in Python is very similar to Matlab, and the commands are the same with small differences. Here you see a plot of the exponential function for three different growth rates, each drawn in a different color.
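The script behind that plot is not on this page, so the following is only a minimal sketch of how such a figure could be produced in Python 3. The function names, the three rates (0.5, 1.0, 1.5) and the time grid are assumptions chosen for illustration, and the plotting function assumes that matplotlib is installed.

```python
import math


def exponential_growth(rate, times, n0=1.0):
    """Return n0 * exp(rate * t) for every t in times."""
    return [n0 * math.exp(rate * t) for t in times]


def growth_curves(rates, t_max=5.0, steps=50):
    """Evaluate one growth curve per rate on a shared, evenly spaced time grid."""
    times = [t_max * k / steps for k in range(steps + 1)]
    return times, {rate: exponential_growth(rate, times) for rate in rates}


def plot_growth(rates, colors=("r", "g", "b")):
    """Draw each curve in its own color (requires matplotlib)."""
    # Imported here so the two functions above need only the standard library.
    import matplotlib.pyplot as plt

    times, curves = growth_curves(rates)
    for rate, color in zip(rates, colors):
        plt.plot(times, curves[rate], color=color, label="rate = %s" % rate)
    plt.xlabel("time")
    plt.ylabel("population")
    plt.legend()
    plt.show()
```

Calling plot_growth([0.5, 1.0, 1.5]) would draw the three curves in red, green and blue, much as plot would in Matlab.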

Code for Third Homework

import random
import numpy as np

print('')
print('')

# Sequence to analyze, split into short pieces so that the lines stay readable.
# Python joins adjacent string literals inside the parentheses into one string.
Code=('cggagcagctcactattcacccgatgagaggggaggagagagagagaaaatgtcctttaggccggttcctcttacttggcagagggaggc'
      'tgctattctccgcctgcatttctttttctggattacttagttatggcctttgcaaaggcaggggtatttgttttgatgcaaacctcaatccctccc'
      'cttctttgaatggtgtgccccaccccccgggtcgcctgcaacctaggcggacgctaccatggcgtagacagggagggaaagaagtgtgcagaaggc'
      'aagcccggaggcactttcaagaatgagcatatctcatcttcccggagaaaaaaaaaaaagaatggtacgtctgagaatgaaattttgaaagagtgc'
      'aatgatgggtcgtttgataatttgtcgggaaaaacaatctacctgttatctagctttgggctaggccattccagttccagacgcaggctgaacgtc'
      'gtgaagcggaaggggcgggcccgcaggcgtccgtgtggtcctccgtgcagccctcggcccgagccggttcttcctggtaggaggcggaactcgaat'
      'tcatttctcccgctgccccatctcttagctcgcggttgtttcattccgcagtttcttcccatgcacctgccgcgtaccggccactttgtgccgtac'
      'ttacgtcatctttttcctaaatcgaggtggcatttacacacagcgccagtgcacacagcaagtgcacaggaagatgagttttggcccctaaccgct'
      'ccgtgatgcctaccaagtcacagacccttttcatcgtcccagaaacgtttcatcacgtctcttcccagtcgattcccgaccccacctttattttga'
      'tctccataaccattttgcctgttggagaacttcatatagaatggaatcaggatgggcgctgtggctcacgcctgcactttggctcacgcctgcact'
      'ttgggaggccgaggcgggcggattacttgaggataggagttccagaccagcgtggccaacgtggtg')
RCCode=Code
TempCode=Code

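print('Code= ' + Code)
print('')
print('')

# One slot per codon for each of the six reading frames
# (pro = forward strand, prom = reverse complement).
pro1=[None]*338
pro2=[None]*338
pro3=[None]*338
prom1=[None]*338
prom2=[None]*338
prom3=[None]*338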

# ---------------------- Problem one ----------------------
# Count the G and C bases over the whole sequence, including the last base.
GCcontent=0
for i in range(0,len(Code)):
    if Code[i]=='c':
        GCcontent=GCcontent+1
    elif Code[i]=='g':
        GCcontent=GCcontent+1

print('GC Content= %d' % GCcontent)
print('')
print('')

# ---------------------- Problem two ----------------------
# Build the reverse complement: read Code from its last base to its first
# and replace every base with its complement.
complement={'a':'t','t':'a','c':'g','g':'c'}
RCCode=''
for i in range(0,len(Code)):
    base=Code[len(Code)-1-i]
    # Any character that is not a base is copied unchanged.
    RCCode=RCCode+complement.get(base,base)

print('Reverse Complement= ' + RCCode)
print('')
print('')

# ---------------------- Problem Three ----------------------
# The codon table goes here: a dictionary named standard that maps each codon
# to its amino acid, with '*' for a stop codon. I did not put it on this page
# because it does not look good.

# Translate codon i in the three forward frames and the three reverse frames.
for i in range(0,338):
    Temp1=Code[3*i]+Code[3*i+1]+Code[3*i+2]
    Temp2=Code[3*i+1]+Code[3*i+2]+Code[3*i+3]
    Temp3=Code[3*i+2]+Code[3*i+3]+Code[3*i+4]

    pro1[i]=standard[Temp1]
    pro2[i]=standard[Temp2]
    pro3[i]=standard[Temp3]

    Temp1=RCCode[3*i]+RCCode[3*i+1]+RCCode[3*i+2]
    Temp2=RCCode[3*i+1]+RCCode[3*i+2]+RCCode[3*i+3]
    Temp3=RCCode[3*i+2]+RCCode[3*i+3]+RCCode[3*i+4]

    prom1[i]=standard[Temp1]
    prom2[i]=standard[Temp2]
    prom3[i]=standard[Temp3]


print('Sequence of (+1) frame')
print(pro1)

print('Sequence of (+2) frame')
print(pro2)

print('Sequence of (+3) frame')
print(pro3)

print('Sequence of (-1) frame')
print(prom1)

print('Sequence of (-2) frame')
print(prom2)

print('Sequence of (-3) frame')
print(prom3)
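The table left out above is presumably the standard genetic code. As a hedged sketch in Python 3 (not the table from the original homework), it can be generated compactly from the conventional ordering of the bases t, c, a, g, and the six frames can then be translated with string slicing. The helper names reverse_complement, translate and six_frames are introduced here for illustration only.

```python
from itertools import product

BASES = "tcag"
# Amino acids of the standard genetic code for the codons ttt, ttc, tta, ttg,
# tct, ... (the third base changes fastest); '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

standard = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}


def reverse_complement(seq):
    """Complement every base and reverse the order."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def translate(seq, offset=0):
    """Translate seq codon by codon, starting at offset (0, 1 or 2)."""
    return "".join(standard[seq[i:i + 3]] for i in range(offset, len(seq) - 2, 3))


def six_frames(seq):
    """Return the translations of frames +1, +2, +3, -1, -2 and -3."""
    rc = reverse_complement(seq)
    frames = {}
    for k in range(3):
        frames["+%d" % (k + 1)] = translate(seq, k)
        frames["-%d" % (k + 1)] = translate(rc, k)
    return frames
```

Unlike the loop above, translate returns a string rather than a list, and it stops at the last complete codon instead of assuming a fixed count of 338.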


# ----------------------------- Problem Four ---------------------------
# Mutate the sequence 1000 times and count how often a new stop codon appears.
counter=0
for j in range(0,1000):

       # Start every trial from the unmutated sequence.
       Code=TempCode
       for i in range(0,10):
               # Pick one random position inside the i-th block of 100 bases.
               Te=int(np.fix(100*random.random()))
               pos=100*i+Te
               # Replace that base with one of the other three bases, each equally likely.
               others=[b for b in 'acgt' if b!=Code[pos]]
               Code=Code[:pos]+random.choice(others)+Code[pos+1:]



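       # Translate the mutated sequence in the three forward frames.
       pro21=[None]*338
       pro22=[None]*338
       pro23=[None]*338

       for i in range(0,338):
               Temp1=Code[3*i]+Code[3*i+1]+Code[3*i+2]
               Temp2=Code[3*i+1]+Code[3*i+2]+Code[3*i+3]
               Temp3=Code[3*i+2]+Code[3*i+3]+Code[3*i+4]
               pro21[i]=standard[Temp1]
               pro22[i]=standard[Temp2]
               pro23[i]=standard[Temp3]

       # Count this trial once if the mutations created a stop codon ('*')
       # at a position that was not a stop codon before the mutation.
       for i in range(0,len(pro21)):
               if   pro21[i]=='*' and pro1[i]!='*':
                       counter=counter+1
                       break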
   

print('Percent of Premature Termination= %d /1000' % counter)
print('')
print('')

print('Before Mutation: ' + str(pro1))
print('')
print('')

print('After Mutation: ' + str(pro21))

input()
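Problem Four can also be written more compactly. The following Python 3 sketch is one possible restatement, not the homework solution itself. It assumes that "premature termination" means a stop codon appearing in frame +1 at a position that was not a stop codon before, and it keeps the scheme used above of exactly one substitution per block of 100 bases. The function names are hypothetical.

```python
import random

STOP_CODONS = {"taa", "tag", "tga"}


def stop_positions(seq):
    """Indices of the codons in reading frame +1 that are stop codons."""
    return {i // 3 for i in range(0, len(seq) - 2, 3) if seq[i:i + 3] in STOP_CODONS}


def mutate(seq, block=100, rng=random):
    """Change exactly one randomly chosen base in every full block of bases."""
    bases = list(seq)
    for start in range(0, len(bases) - block + 1, block):
        pos = start + rng.randrange(block)
        # Choose uniformly among the three bases that differ from the current one.
        bases[pos] = rng.choice([b for b in "acgt" if b != bases[pos]])
    return "".join(bases)


def premature_stop_fraction(seq, trials=1000, rng=random):
    """Fraction of mutated copies that gain a stop codon the original lacks."""
    original = stop_positions(seq)
    hits = sum(
        1 for _ in range(trials)
        if stop_positions(mutate(seq, rng=rng)) - original
    )
    return hits / trials
```

Passing rng=random.Random(0) makes a run repeatable, which helps when comparing results between runs.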

Project

I want to try to understand the information structure in different evolutionary dynamics.

By information structure I mean looking at the entire path of evolution and trying to find a unique structure for it.

We already have a big lab of evolutionary dynamics in nature and in human society. For example, considering a replication and selection structure for language evolution gives us a good sense of how replication and mutation iteratively change selection. We can call teaching a word or spreading it among people replication, and selection happens if a word is useful and easy to use. This path leads us to build structures that we call grammar, and grammar then acts as a new selection on the new objects that we want to generate. Mutation is a change in existing words or the making of new words.


Generalizing the evolutionary idea in this way helps us to develop a measurement for understanding the big sample we have so far and for thinking about its future. There will be strong parallels among different paths, such as the evolution of language, the evolution of science, etc. We will also see some differences, such as the rate of evolution, which will give us an intrinsic property of the path in each case.

Project, Oct 8th

                   "Information is the only difference of life and matter" 

You will see this in many different ideas about life. What looks best to pursue along the evolutionary path is the amount of information at different stages. Intuitively, this information is our knowledge about the environment, such as temperature, taste, etc.

We should define exactly what information is. The best definition of information is the amount of uncertainty in a process.

So we need to distinguish two kinds of information. The first is the information in nature, and the second is the information in a living object, for example our understanding and our reactions.

The first kind of information has been present in nature throughout evolution, but it has not been constant, because every change in nature affects it.

The second kind is the information a living object has gathered along its evolutionary path and now uses. For a cell, for example, it is the cell's understanding of its environment and its reactions to it, and the cell keeps this in its DNA.

So what happens in evolution is that the second kind of information grows by drawing on the first kind. Sometimes an organism no longer needs a piece of information and forgets it, and sometimes it uses it for millions of years.

Let me give an example to clarify what I mean. Humans 15,000 years ago could not identify the seasons as a periodic pattern within the year, so the coming of spring was a very informative event for them, like life starting again. The seasons are therefore information in nature, but because we now understand their periodic nature, we also have this information. Trees found out this fact millions of years ago and adapted themselves.

Even these days there are many events in climate change that we cannot understand, and they are informative for us. By trying to improve our understanding of them, we increase our information.


After this example, we need to measure this information. The best place to start is to look at DNA as the basic information center of the cell and try to find a measure of the unpredictability of a DNA region.

Because we don't have access to the generator of the DNA, we need to rely on the samples that we have from this source of uncertainty.

So to start, we need to look at the same region of a human gene in different people and try to estimate the properties of the generator. This will tell us how informative that generator is.
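One standard way to turn "unpredictability of a DNA region" into a number is the Shannon entropy, H = -sum of p*log2(p), of the bases observed at each position across the sampled people. The sketch below (Python 3) is only an illustration under stated assumptions: the sequences are already aligned and of equal length, the function names are invented here, and summing the per-position entropies treats positions as independent, which real DNA does not guarantee.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Entropy in bits of the empirical distribution of the given symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    # p * log2(1/p) summed over the observed symbols.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def positional_entropy(sequences):
    """Entropy of each aligned position across all sampled sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned and of equal length")
    return [shannon_entropy(column) for column in zip(*sequences)]


def region_entropy(sequences):
    """Total entropy of the region, assuming independent positions."""
    return sum(positional_entropy(sequences))
```

A position where every person carries the same base scores 0 bits, and a position where all four bases are equally common scores 2 bits, so higher totals mark regions whose generator is harder to predict.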

To get these sequences I need to use a database, but I am not sure whether I can get this kind of data from the websites in the assignment section.


What I can do in the Project. Oct 29

I can participate in the genome interaction project by helping to study and understand the interaction rules, applying statistical models, and trying to find the best matching algorithms. We can also redefine our problem and try to find a new model for gene interactions, because we are going to add more features to our project, which will definitely need different modeling and design. I can also help design a part that clusters the information as it arrives over time and uses it to train itself, as a machine learning project, so that it can make predictions about a new question that we ask it.

Because genes are not the only factor that plays an essential role, we also need to design our core to be adaptive, so that it can take in any kind of information the user wants to add, such as diet, weight, etc. The core should cluster the data in real time, which lets us make an instant prediction for any user. The database will also become more accurate over time based on user reports. Processing the data in real time on different cores will help us distribute the complexity in a parallel way, so we don't need to worry about high complexity.
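As one possible reading of "real time clustering", the sketch below (Python 3) uses sequential k-means: each new user record is assigned to the nearest cluster center, and only that center is moved, so old records never need to be revisited. This is a swapped-in standard technique, not a design the project has agreed on. The class name, the choice of the first k records as initial centers, and the two-number feature vectors are assumptions for illustration.

```python
class OnlineKMeans:
    """Sequential k-means: centers are updated one observation at a time."""

    def __init__(self, k):
        self.k = k
        self.centroids = []  # one list of coordinates per cluster
        self.counts = []     # number of points assigned to each cluster

    def predict(self, point):
        """Index of the cluster whose center is closest to point."""
        return min(
            range(len(self.centroids)),
            key=lambda j: sum((c - x) ** 2 for c, x in zip(self.centroids[j], point)),
        )

    def update(self, point):
        """Absorb one new point and return the cluster it was assigned to."""
        point = [float(x) for x in point]
        if len(self.centroids) < self.k:
            # The first k points seed the clusters.
            self.centroids.append(point)
            self.counts.append(1)
            return len(self.centroids) - 1
        j = self.predict(point)
        self.counts[j] += 1
        step = 1.0 / self.counts[j]
        # Running mean: move the center a fraction of the way toward the point.
        self.centroids[j] = [c + step * (x - c) for c, x in zip(self.centroids[j], point)]
        return j
```

Each update costs time proportional to k, independent of how many reports have already been seen, which is what makes an instant prediction for a new user feasible.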


Decision and inference Box. Nov 10th

By learning process I mean updating some parameters of a model. We can add more parameters that also choose one model among several models and then update that model, so the dimension of this parameter vector will grow bigger and bigger.

Two good properties that our decision process needs:

1) Flexibility to add a new model to check (or to add parameters).

2) The ability to update the parameters using only a new experiment result (that is, without needing all the data again).
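Both properties can be illustrated with a small sketch in Python 3. Everything in it is hypothetical: the class names, the two example models (a constant and a straight line), and the choice to rank models by their mean squared prediction error on results they had not yet seen. Each model keeps only running sums, so a new experiment result updates it without the earlier data, and a new model can be registered at any time.

```python
class RunningMean:
    """Constant model y = m; stores only the count and the current mean."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def predict(self, x):
        return self.mean

    def update(self, x, y):
        self.n += 1
        self.mean += (y - self.mean) / self.n


class OnlineLine:
    """Least-squares line y = a + b*x computed from running sums."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def update(self, x, y):
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def predict(self, x):
        denom = self.n * self.sxx - self.sx ** 2
        if denom == 0:  # fewer than two distinct x values seen so far
            return self.sy / self.n if self.n else 0.0
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        return (self.sy - slope * self.sx) / self.n + slope * x


class ModelBank:
    """Holds candidate models and scores each one as results arrive."""

    def __init__(self):
        self.models, self.squared_error, self.seen = {}, {}, {}

    def add(self, name, model):
        self.models[name] = model
        self.squared_error[name] = 0.0
        self.seen[name] = 0

    def observe(self, x, y):
        for name, model in self.models.items():
            # Score the prediction first, then let the model learn from the result.
            self.squared_error[name] += (model.predict(x) - y) ** 2
            self.seen[name] += 1
            model.update(x, y)

    def best(self):
        """Name of the model with the lowest mean squared error, or None."""
        scored = [name for name in self.models if self.seen[name] > 0]
        if not scored:
            return None
        return min(scored, key=lambda name: self.squared_error[name] / self.seen[name])
```

A model added late is compared by its mean error over the results it has seen, so it is not penalized for having missed the earlier ones.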


For a given case study we can:

i) Make our initial model space based on models that already exist for that case study.

ii) Write a program to fit the data, compute the goodness of fit for all the models, and find the best model. (This can be done with methods like maximum likelihood or just exhaustive search.)

iii) Use algorithms to design an updating approach based on recursive estimation methods. (This can be done with the Kalman filtering idea, because we will have noise in our data.)
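Steps ii) and iii) can be sketched as follows in Python 3. This is a minimal illustration under assumptions that are not part of the plan above: the noise is taken to be Gaussian with a known standard deviation, the candidate models are plain functions of x, and the Kalman filter tracks a single parameter modeled as a random walk. All names are invented for the example.

```python
import math


def gaussian_log_likelihood(residuals, sigma):
    """Log-likelihood of the residuals under zero-mean Gaussian noise."""
    n = len(residuals)
    return (-0.5 * n * math.log(2 * math.pi * sigma ** 2)
            - sum(r * r for r in residuals) / (2 * sigma ** 2))


def best_model(models, xs, ys, sigma=1.0):
    """Exhaustive search: score every candidate and keep the most likely one."""
    scores = {
        name: gaussian_log_likelihood([f(x) - y for x, y in zip(xs, ys)], sigma)
        for name, f in models.items()
    }
    return max(scores, key=scores.get), scores


class ScalarKalmanFilter:
    """Recursive estimate of one parameter from noisy measurements."""

    def __init__(self, estimate, variance, process_var, measurement_var):
        self.estimate = estimate
        self.variance = variance
        self.process_var = process_var
        self.measurement_var = measurement_var

    def update(self, measurement):
        # Predict: the parameter may have drifted, so the uncertainty grows.
        self.variance += self.process_var
        # Correct: weight the new measurement by the Kalman gain.
        gain = self.variance / (self.variance + self.measurement_var)
        self.estimate += gain * (measurement - self.estimate)
        self.variance *= 1.0 - gain
        return self.estimate
```

The filter keeps only the current estimate and its variance, so each new experiment result is absorbed in constant time, and a larger measurement_var makes it trust a single noisy result less.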