Parameter Estimation Discussion Feb 6, 2006
These notes were entered by Ty based on his recollection/perspective of the meeting. Feel free to add to and edit.
There were a few questions posed at the start of the meeting, not all of which were answered during the meeting (either because they fell outside of the scope of the meeting, or because we just couldn't answer them).
Q. What algorithms for parameter estimation exists?
Q. How well do these algorithms work?
Parameter sensitivity for parameter estimation
Ty's work on parameter sensitivity was discussed. Let's say we're interested in estimating parameters in a model of the receptor/G protein portion of the pheromone response pathway. Ideally we'd like to be able to design experiments to maximize the information about the different parameters in the model in order to get the best estimate possible. One way to frame this issue is around parameter sensitivity. If we can design an experiment that increases the dependence of the measureable output on a parameter, that experiment should be potentially more informative than other experiments, at least with regards to that one parameter. Since we're interested in using time-varying stimuli, you could imagine checking the parameter sensitivity for hundreds of (experimentally acheivable) stimulus timecourses. We could then select the timecourses that maximally increased the sensitivity to different parameters, and use these timecourses experimentally.
One caveate to basing the stimulus timecourses entirely off of parameter sensitivity is that it does not take into account possible correlation between parameters. For example, we might find that a give experiment increases the sensitivity to both the forward (kf) and reverse (kr) rate constants of a binding reaction. However, it may be that the experimental results depend entirely on the dissociation constant (KD = kr / kf). So the increase in sensitivity can give us a false hope that we can sucessfully estimate the forward and reverse rate constants.
Parameter sensitivity to direct research Parameter sensitivity can be used ot select which parameters are most important to measure in order to gain the most information about the model. Given the current state of understanding, the parameters with the highest sensitivity are the ones that govern the behavior of the model. So the direct measurement of these parameters will have the greatest impact on the model behavior, and thus on the predictive power of the model.
Parameter sensitivity as a testable hypothesis
It is also worth noting that in one sense, parameter sensitivity is a property of the system that could be tested experimentally. For example, we might find that the simulation output is not at all sensitive to changes in abundance in Parameter sensitivity is acheivable experimentally by changing abundances or kinetics. ie model is sensitive to X, so lets do experiment and see if biology is senstitive to changes in X. This is one way that you could try to validate your model.
Parameter estimation and experiment optimization
During the discussion the question was raised as the the difference between parameter estimation and experiment optimization (the Parameter sensitivity for parameter estimation discussed above is an exmaple of the latter). Parameter estimation is just a specific case of an optimization problem. In this case, we want to minimize the cost or objective function by varying certain parameters (which results in an estimate for these parameters). Experiment optimization is a sightly different optimization problem. For experiment optimization, we determine some metric (such as the Fisher information D or E criteria) which we want to maximize by varying some parameters that describe the experiment. In order to describe the experiment with a reasonable (i.e. easily estimatable) number of parameters, we need to greatly limit the amount of 'experiment space' that we are searching.
Parameter estimation from constant or time-varying stimuli
From computational/algorithmic standpoint, parameter estimation from time-varying stimuli is no different than parameter estimation from constant stimuli.
Discriminating between models given sufficient;y different behavior of the models is easy if we can enumerate and build the models of interest. In this case all we have to do is compare experimental results with the simulation results, and eliminate models that do not agree with the expriments. This is a much more difficult task if we would like an optimization routine to automatically identify model structures that are consisitent with experimental data given a single model starting point (i.e. without enumeration of the other possiblities).
Other Model Analysis
Koichi Takahashi talked about metabolic control analysis, which is closely related to parameter sensitivity. It's not entirely clear to me (Ty) what the differences are, but it may be worth looking into.
Below is a pdf where I give a trivial, but probably useful, connection between two big problems that were mentioned at the retreat:
- What's up with estimating parameters from time courses of observations of concentrations (say from quantitative Western blot)?
- What's up with estimating parameters by applying time-dependent stimuli (doses) to the yeast and measuring their responses? (As Ty did in simulation.)
This is not rocket science; I just realized it over morning coffee, so have no math-phobic qualms. I hope it proves useful in thinking about these problems.
There are also some papers and notes on parameter estimation from ODEs on Larry's page.
From final state to initial state
I was posed a question at the end of our meeting about parameter estimation, one which I thought might bear some of my patented "dissertation on the straightforward" treatment. The question, as I recall it, was this:
If you know the reaction rates perfectly, and you know the final state of the system perfectly, can you determine the initial state of the system?
First, let's think about it from the stochastic point of view. Suppose we consider the simple system
A -> A'
where we know the rate of the reaction precisely, r/sec.. Suppose that, at the end of 1 second of simulation, there are 5 molecules of A left, along with 45 molecules of A'. The question would be, then, how many molecules of A and of A' did we start with. Evidently, it depends on the luck of the draw, how many molecules of A decomposed into A' over the second that we were simulating. The best we could hope for would be the (Poisson-like) distribution of probability over the initial states, given the final state.
In very simple cases like this example, this distribution can frequently be described in closed form, but normally I would expect to have very little grasp on it. Recall that one of Shuki's grand schemes is computationally to push probability distributions forward in time, to determine the probability of final states conditional on an initial state.
Now let's think about the problem from a deterministic point of view, which I think is actually more interesting. First, we can run a differential equation backward in time just as well as forward, by making the change of variables u = -t. In an abstract sense, the answer to the question in the deterministic case is therefore yes.
That being said, I think one might want to meditate on the following example, which is both common and cautionary: Consider more or less any biochemical system that has "achieved" an equilibrium state, say the alpha pathway at its basal equilibrium state. Even knowing all the reaction rates and knowing the concentrations of all the species, we cannot realistically tell how long the system has been at equilibrium or what non-equilibrium state it may have occupied at a past time.
But how does this square with being able to run the equation backward in time? Well, in the abstract realm of differential equations, the system never actually reaches equilibrium, it merely approaches it more and more closely. Yes, you can try to run the equations backward in time to figure out your initial state based on your final, infinitesimal difference from true equilibrium. But the result would not even be computationally reliable, let alone useful for dealing with empirical data, since tiny, unavoidable errors in computations involving the infinitesimal differences from true equilibrium are magnified as we work backward in time. The backward equation is sensitive to initial conditions near the equilibrium precisely because our effectively reaching the the equilibrium state does not depend on where, or in a sense when, we start going there. (Ah, life in a nutshell!)
To sound even more weirdly philosophical, the extent to which our future is inevitable matches the extent to which our past is unpredictable. (Well, really, un-post-dictable.)
- What you summarized in the pdf is what's in E-Cell's toolbox. Numerically this works nicely in many cases, and we should give it a try. But let me add one more thing to the three points you wrote for me at the end of your report; if the number of unknowns (unknown rate coefficients plus unknown initial concentrations) exceed, say, 3-5, it becomes extremely hard to make well-grounded scientific arguments on it. It is practically impossible to provide sufficient amount of dataset to constrain the system to converge to a single solution, plus at this stage of the project we cannot be 100% sure about if the structure of the simulated model is identical to the real system. As a result, it is highly likely that fitting error would be distributed over all parameters to be estimated. This is fine for many engineering problems, in which the purpose is to make the system to behave as we want. But the point here is we need to be extremely careful when we feedback the result of parameter estimation to our model. It may be safer to conceive it as more like a means of analysis, from which we could get some indirect info on how the system works, rather than a means of determining parameters in our model.
- Another class of approach you pointed out, and I used to pursue in E-Cell project, is time series analysis which makes use of statistical means rather than error minimization. Such as linear filters, auto-regressive models, and Karman filters. As you pointed out, what we can do with linear filters is limited when the system is nonlinear. But, I think (a) we can get some info with linear analysis, and (b) there are some non-linear analysis methods that might work, with which my expericne is limited.
- The situation that this kind of things works best is when we have highly fluctuating signals. However, I think it might be worth trying to apply these methods to Ty's data IF the cell responds in the same frequency domain as that of the time-varying stimuli given, or if we can give the stimuli at the frequency domain of that level. Otherwise, usefulness of most of time-series analysis techniques that I know of would be less likely.
I had been intending, before suggesting that, to make a stab at trying to pull together some of the interesting threads that have been started here, which are in addition to whatever Ty has been able to do.
- First, here, we have fairly clearly different sets of "easy mutual intelligibility groups". Larry, or Larry and Joyce, comprise one such group. Rich and Andrew, _I believe_, comprise another. Koichi may comprise a third. Ty may represent a fourth.
- Not all things thought to be obvious by one group of people may be understood at all by other groups.
- For examples, neither Koichi nor Nathan (to pick on only two people) gave evidence of being aware of the nature of Ty's simulated data... randomly varying spike pulses. Nathan (to pick on Nathan) was unaware that one might get from a model that did not explicitly incorporate space. Some people have told me (to pick on me) the issue of estimating rates of reactions from time dependent output is no different from the general optimization problems associated with "parameter" estimation, while in my own mind it's not clear to me that the fact that we have starting concentrations of monomeric protein species, and "system architecture", might not somehow constrain or simplify the quest for rates. Some discussions of parameter estimation being well known and doable seem to only hold for linear systems, while I (to pick on me again) remains unclear on which approaches might hold for non-linear systems, which might only apply to linear systems, whether we might not be able to "linearize" subsets of our living systems to make them more tractable, etc.
- So at this point I still think we have probably the hardest cross-cultural problem we have ever addressed (at least, the one with the largest number of different "ethnic groups" to try to communicate across) and are far from where we want to be.
- One suggestion is to focus on a question or two, and one of those would have to be estimation, constraint, delimitation, whatever, of rates of intracellular biochemical reactions from output in response to time variant input. Not "parameters" but rates. Not "periodic time variant input" but rather "time variant input". If you want to add to rates "numbers of individual heteroligomeric molecular species" you can. But we can't be saying "system identification" or "system architecture" either. Each of the words in the Summer 2004 challenge problem matters.
- This discussion, as Drew pointed out, might be appropriate for testing wiki's. I don't want to get bogged down in format, though. In the meantime I hope people will continue to communicate pre-meeting.
Koichi's questions to Ty
- Have you done or considered systematically giving stationary stimuli of unchanging frequency, rather than randomly generated ones? (such as single-frequency pulses or sin waves)
- I actually have been working primarily with either 2 or three pulses, not necessarily evenly spaced. On one of my slides I showed a rather rapidly changing input signal. This is mostly to get the point across that the stimulus timecourse of interests could potentially be wildly varying and non-intuitive in nature. I agree with you that a randomly generated stimulus timecourse is probably not the most interesting/relevant.
- Specifically addressing the idea of sine waves and pulse trains, I've looked at these somewhat, but the problem is that non-linear systems cannot just be chartacterized by their frequency response, so it's unclear exactly what you'd be looking for. Pretty much any stimulus timecourse, the system acts acting like a low-pass filter. But not a very good low pass filter because rapid increases in stimulus result in rapid response, but rapid decreases in stimulus result is very slow response.
- I would like to stick with pulses over more complex stimulus timecourses because they are acheivable experimentally with my setup.
- I ask Q1 because my intuition says (a) something in the frequency spectra of the input might be causing the change in the rate you observed, and if so, (b) it is more likely that a dominant, single frequency component is contributing to it, rather than combinations of frequencies are. Have you considered these possibilities and done some simulation runs to this regard? If your feeling is encouraging, but haven't done anything yet, I think we could try a frequency-sweeping experiment. We could plot a (input-frequency vs change in rate) graph. We might get either a linear, exponential, or bifurcation-like correlation, or we might get nothing. Who knows. (This is a quite similar computation than what we do for bifurcation analysis. E-Cell is a high-performance, generic platform for this kind of computations. I and Nathan can help you.)
- I'm not sure that I know what you mean by the change in rate that I observed...
- But again, in terms of frequency sweeping, I've done a bit of this in matlab, and the results haven't indicated that there is anything interesting there (other than the low-pass filter behavior).
- The kind of observation you got reminds me of the famous textbook example of Ca2+-Calmodulin system in nerve cells, which decodes spike frequencies resulting from action potential firing. This system essentially constitutes some kind of a high-pass or possibly band-pass filter. Although it doesn't exhibit much physiological similarity, and we, or at least I, cannot immediately think of biological meaning of such a frequency dependent behaviors, I suspect this analogy might help us understanding something because the underlying mechanism is exactly the same (mass-action reactions, basically). Have you considered this possibility?
- I can hand-wave one such reason for a low-pass filter, or at the very least something that checks for sustained activation of the pathway- this is so that the cell doesn't activate mating activities (transcription, cell cycle arrest, etc.) for a transient blip of activity, perhaps from a random burst of G-protein dissociation (actually, probably mostly resulting from internal, stochastic activations, since the off-rate of pheromone from the receptor is so low). This idea is consistent with the observed "dip" in MAPK phosphorylation; perhaps you'd see a reversion to near basal-levels of MAPK phosphorylation if the upstream signal didn't last beyond this dip (i.e. sustained for more than a minute or two). Ideally this would be tested by switching on and off an early event (Ste20 activity?) for various periods. -Rich
- What was the characteristic timescale (or frequency spectra) of the time-varying stimuli you gave, in comparison to typical timescales of reaction coefficients in the model?
- Experimentally, I can create vary the input (in a binary fashion) on a timescale of 1s. The singal propagation from the receptor to recruitment of Ste5 occurs on the timescale ~5s (which let's assume is limiting in activation), and the dissociation of pheromone from the receptor has a timescale of several minutes (which is limiting in deactivation).-tmt
An interesting review
The paper below seems to cover every mathematical techniques we've discussed so far. Plus something more.
- Crampin et al. 2004 Mathematical and computational techniques to deduce complex biochemical reaction mechanisms, Progress in Biophysics & Molecular Biology, 2004.
By executive order, the meeting will be at 2:00pm (PST) on Monday February 6th.
Thanks to all those who tried something new with the wiki. Maybe next time.
Below are the proposed meeting times. Please indicate which times you would be able to attend.
- Afonso P, da Conceicao Cunha M. Assessing Parameter Identifiability of Activated Sludge Model Number 1. J Environ Eng 2002 Aug 1; 128(8) 748-754 J Environ Eng
- Bernaerts K, Gysemans KP, Nhan Minh T, and Van Impe JF. . pmid:15854701.