CT03 - MFBM-02

MFBM-02 Contributed Talks

Friday, July 18 from 2:40pm - 3:40pm in Salon 15/16



The chair of this session is Adelle Coster.



Ismaila Muhammed

Khalifa University
"Data-driven Construction of Reduced Size Models Using Computational Singular Perturbation Method."
Most biological systems involve multiple underlying spatial or temporal scales, so reduced-order models are needed to capture their essential dynamics and make them amenable to analysis. However, traditional model reduction techniques, such as Computational Singular Perturbation (CSP), rely on the availability of the governing dynamical equations, which are often unknown in biomedical applications where only data are available. To address this limitation, we propose a data-driven CSP framework that integrates Sparse Identification of Nonlinear Dynamics (SINDy) and neural networks to extract timescale-separated models directly from data. Our approach is validated on the Michaelis-Menten enzyme kinetics model, a well-established multiscale system, by identifying reduced models for the standard Quasi-Steady-State Approximation (sQSSA) and the reverse Quasi-Steady-State Approximation (rQSSA). When the full model cannot be identified by SINDy due to noise, we use neural networks to estimate the Jacobian matrix, allowing CSP to determine the regions where the reduced models are valid. We further analyze the Partial Equilibrium Approximation (PEA) case, where the dynamics span both the sQSSA and rQSSA regimes, requiring the dataset to be split so that region-specific models can be identified accurately. The results demonstrate that, in the presence of noise, SINDy struggles to identify the full model from data with underlying multiple-timescale evolution, but remains effective for identifying reduced models when the dataset is partitioned correctly.
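To make the identification step concrete, here is a minimal sketch (not the authors' code) that simulates Michaelis-Menten mass-action kinetics and recovers the governing equations with the open-source pysindy package; the rate constants, noise-free data, polynomial library, and sparsity threshold are all illustrative assumptions. Because the full two-state (substrate, complex) system is bilinear, a degree-2 polynomial library suffices.

import numpy as np
from scipy.integrate import solve_ivp
import pysindy as ps

# Assumed rate constants and total enzyme (illustrative values only)
k1, km1, k2, e0 = 1.0, 0.5, 0.3, 1.0

def mm_rhs(t, x):
    # Full mass-action Michaelis-Menten dynamics with the enzyme
    # conservation e = e0 - c substituted, leaving polynomial dynamics
    # in substrate s and complex c.
    s, c = x
    ds = -k1 * (e0 - c) * s + km1 * c
    dc = k1 * (e0 - c) * s - (km1 + k2) * c
    return [ds, dc]

t = np.linspace(0.0, 20.0, 2000)
sol = solve_ivp(mm_rhs, (t[0], t[-1]), [1.0, 0.0], t_eval=t, rtol=1e-10, atol=1e-10)
X = sol.y.T  # columns: s(t), c(t)

# Sparse regression over a degree-2 polynomial library recovers the
# bilinear mass-action terms from clean data.
model = ps.SINDy(
    feature_library=ps.PolynomialLibrary(degree=2),
    optimizer=ps.STLSQ(threshold=0.05),
    feature_names=["s", "c"],
)
model.fit(X, t=t)
model.print()  # e.g. (s)' = -1.0 s + 0.5 c + 1.0 s c

With noisy data this identification step is exactly where recovery of the full model tends to fail, which is the regime in which the abstract's neural-network Jacobian estimate takes over.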



Adelle Coster

School of Mathematics & Statistics, UNSW, Sydney Australia
"Cellular protein transport: Queuing models and parameter estimation in stochastic systems"
Real-world systems, especially in biology, exhibit significant complexity and inherent limitations in observability. What methods can enhance our understanding of the mechanisms underlying their functionality? Additionally, how can we develop and test explanatory models within a stochastic environment? Evaluating the effectiveness of these models requires quantitative measurement of the disparity between model outputs and observed data. While mean-field, deterministic models have well-established approaches for such assessments, stochastic systems, particularly those constrained by multiple data types, need carefully designed quantitative comparison methods. Methods for inferring the parameters of stochastic models generally require analytical forms of the model solutions, large data sets, summary statistics, or assumptions about the distribution of model outputs. These approaches can be limiting when one wishes to preserve the information in the variability of the data but lacks sufficient data to reliably fit distributions or determine robust statistics. We present a hierarchical approach to developing a distance measure for the direct comparison of model output distributions to experimentally observed distributions, avoiding both assumptions about the distributions and the need to choose summary statistics. Our distance measure allows the model to be constrained by multiple experiments, not necessarily of the same type, such that each experiment constrains some, or all, of the model parameters. We use this distance for parameter estimation with our queuing model of intracellular GLUT4 translocation. We will explore some practical considerations when using the distance for parameter inference, such as the effects of model output sampling and experimental error. Fitting the queuing model to data allowed us to uncover a possible mechanism of GLUT4 sequestration and release in response to insulin. Authors: Brock D. Sherlock and Adelle C.F. Coster
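The abstract does not specify the exact construction of the distance, so the sketch below is only an assumed illustration of the general idea: compare empirical model-output distributions directly to empirical data distributions (here via the 1-Wasserstein distance from scipy) and aggregate over multiple, possibly heterogeneous, experiments. The experiment names and gamma-distributed samples are hypothetical.

import numpy as np
from scipy.stats import wasserstein_distance

def multi_experiment_distance(model_outputs, observations):
    # model_outputs, observations: dicts mapping experiment name -> 1-D
    # sample arrays. Each experiment may constrain a different subset of
    # the model parameters; distances are summed across experiments.
    return sum(
        wasserstein_distance(model_outputs[name], observations[name])
        for name in observations
    )

rng = np.random.default_rng(0)
obs = {"basal": rng.gamma(2.0, 1.0, 500), "insulin": rng.gamma(3.0, 1.2, 500)}
sim = {"basal": rng.gamma(2.1, 1.0, 500), "insulin": rng.gamma(2.9, 1.3, 500)}
print(multi_experiment_distance(sim, obs))

A measure of this kind can be minimized over model parameters without fitting parametric distributions or choosing summary statistics, which is the property the abstract emphasizes.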



John Vastola

Harvard University
"Bayesian inference of chemical reaction network parameters given reaction degeneracy: an approximate analytic solution"
Although chemical reaction networks (CRNs) provide performant and biophysically plausible models for explaining single-cell genomic data, inference of reaction network parameters in this setting usually assumes that the available data points can be viewed as independent samples from a steady-state distribution. Less is known about how to perform efficient parameter inference given a continuous-time data stream, which adds complications such as nontrivial correlations between samples from different times. In the continuous-time setting, two natural questions arise: (i) given a set of reactions that could plausibly explain the observed data stream, what are reasonable estimates of the associated reaction rate parameters? and (ii) what is the minimal set of reactions necessary to explain the data? Both questions can be formalized as Bayesian inference problems, with the former concerning the inference of a model-dependent parameter posterior and the latter concerning ‘structure’ inference. If one can assume each possible reaction has a distinct stoichiometry vector, there is a well-known analytic solution to both problems; if reactions can share a stoichiometry vector (i.e., there is reaction degeneracy), both problems become substantially more difficult, and no analytic solution is known. We present the first approximate analytic solution to both problems, valid when the number of observations becomes sufficiently large. In its regime of validity, this solution allows one to avoid expensive likelihood computations that can involve summing over an exponentially large number of terms. We discuss interesting consequences of this solution, such as the fact that ‘simpler’ models with fewer reactions are preferred over more complex ones, and the fact that the parameter posteriors of non-identifiable models are strongly prior-dependent.
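For context, the well-known analytic solution in the non-degenerate case follows from conjugacy: for a fully observed continuous-time trajectory of a mass-action network in which every reaction has a distinct stoichiometry vector, the likelihood in each rate k_r factorizes as k_r^{n_r} exp(-k_r ∫ g_r(x_t) dt), so a Gamma prior yields a Gamma posterior. The sketch below computes that posterior; the prior hyperparameters and numerical inputs are illustrative assumptions, and this is the standard conjugate computation rather than the degenerate-case solution presented in the talk.

from scipy.stats import gamma

def rate_posterior(n_r, integrated_propensity, alpha=1.0, beta=1.0):
    # Posterior over rate k_r given n_r observed firings of reaction r and
    # the time integral of its rate-free propensity g_r(x_t) along the path.
    # Likelihood is proportional to k_r**n_r * exp(-k_r * integral), so a
    # Gamma(alpha, beta) prior gives a Gamma(alpha + n_r, beta + integral)
    # posterior.
    return gamma(a=alpha + n_r, scale=1.0 / (beta + integrated_propensity))

post = rate_posterior(n_r=42, integrated_propensity=100.0)
print(post.mean(), post.interval(0.95))  # posterior mean and 95% credible interval

With reaction degeneracy, the observed firings can no longer be attributed to a unique reaction, the likelihood becomes a sum over attributions, and this conjugate structure breaks down; that is the gap the approximate analytic solution addresses.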



Annual Meeting for the Society for Mathematical Biology, 2025.