MS05 - MFBM-13 Part 3 of 4

Modern methods in the data-driven modeling of biological systems (Part 3)

Wednesday, July 16 at 10:20am

Organizers:

Cody FitzGerald (Northwestern University), Rainey Lyons (CU Boulder), Nora Heitzman-Breen (CU Boulder), Susan Rogowski (NCSU)

Description:

Due to recent developments in laboratory technology and data collection techniques, there is an abundance of large and complex datasets resulting from a vast array of biological experiments. This surge of data demands the development of novel data-driven techniques to generate robust, interpretable, and generalizable models of biological systems. The purpose of this minisymposium is to present modern advances in data-driven methods for modeling biological dynamics in the areas of parameter estimation, scientific machine learning, algorithmic model selection, and weak form methods. This minisymposium also aims to discuss common challenges which appear in the context of data-driven modeling, such as sparse data, unobserved states, noisy data, structural and practical identifiability issues, and incorporating multiple biological scales. Applications for such methods will span many active areas of biological research, including cell migration, physiology, neuroscience, epidemiology, and ecology.



David Bortz

CU Boulder
"Weak form Scientific Machine Learning"
The creation and inference of mathematical models is central to modern scientific discovery in the life sciences. As more realism is demanded of models, however, the conventional framework of biology-guided model proposal, discretization, parameter estimation, and model refinement becomes unwieldy, expensive, and computationally daunting. Recent advances in Weak form-based Scientific Machine Learning (WSciML) allow for the creation and inference of interpretable models directly from data via advanced numerical functional analysis, computational statistics, and numerical linear algebra techniques. This class of methods completely bypasses the need for forward-solve numerical discretizations and yields both parsimonious mathematical models and efficient parameter estimates. These methods are orders of magnitude faster and more accurate than traditional approaches and far more robust to the high noise levels common to data in the biological sciences. The combination of these features in a single framework provides a compelling alternative to both traditional modeling approaches and modern black-box neural networks. In this talk, I will present our weak form approach, describing our equation learning (WSINDy) and parameter estimation (WENDy) algorithms. I will demonstrate these performance properties via applications to several canonical problems in structured population modeling, cell migration, and mathematical epidemiology.
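
The core weak-form idea can be illustrated with a small sketch. This is not the WSINDy or WENDy algorithm itself, only a simplified least-squares version under stated assumptions: a scalar ODE u' = theta_1 f_1(u) + theta_2 f_2(u), a uniform time grid, and a hypothetical polynomial bump test function. Multiplying by a compactly supported test function phi and integrating by parts gives -∫phi' u dt = sum_j theta_j ∫phi f_j(u) dt, which is linear in the parameters and needs neither a forward solve nor a derivative of the data. The names `bump` and `weak_form_fit` and the logistic test problem are illustrative choices, not taken from the talk.

```python
import numpy as np

def bump(t, center, radius, p=8):
    """Polynomial test function (1 - s^2)^p on |s| < 1, zero elsewhere, and its derivative."""
    s = (t - center) / radius
    w = np.clip(1.0 - s**2, 0.0, None)   # vanishes outside the support
    phi = w**p
    dphi = -2.0 * p * s * w**(p - 1) / radius
    return phi, dphi

def weak_form_fit(t, u, features, n_test=20, radius_frac=0.2):
    """Estimate theta in u' = sum_j theta_j * f_j(u) from samples u(t) on a uniform grid."""
    dt = t[1] - t[0]
    radius = radius_frac * (t[-1] - t[0])
    centers = np.linspace(t[0] + radius, t[-1] - radius, n_test)
    F = np.column_stack([f(u) for f in features])
    G = np.empty((n_test, F.shape[1]))
    b = np.empty(n_test)
    for k, c in enumerate(centers):
        phi, dphi = bump(t, c, radius)
        # phi is zero at both ends of the grid, so the trapezoid rule is a plain sum.
        G[k] = dt * (phi @ F)      # integral of phi * f_j(u)
        b[k] = -dt * (dphi @ u)    # minus integral of phi' * u
    theta, *_ = np.linalg.lstsq(G, b, rcond=None)
    return theta

# Illustrative data: logistic growth u' = r*u - (r/K)*u^2 with known exact solution.
r_true, K_true, u0 = 1.0, 2.0, 0.1
t = np.linspace(0.0, 10.0, 401)
u_exact = K_true / (1.0 + (K_true / u0 - 1.0) * np.exp(-r_true * t))
features = [lambda u: u, lambda u: u**2]

theta_clean = weak_form_fit(t, u_exact, features)

rng = np.random.default_rng(0)
u_noisy = u_exact + 0.02 * rng.standard_normal(t.size)   # assumed additive noise level
theta_noisy = weak_form_fit(t, u_noisy, features)

print("true       :", [r_true, -r_true / K_true])
print("noise-free :", theta_clean)
print("noisy      :", theta_noisy)
```

Because the noise enters only through integrals against smooth test functions, it is averaged rather than amplified by differentiation, which is one intuition for the robustness the abstract describes.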



Alasdair Hastewell

NITMB
"Discovering dynamical models from partial biological observations with degeneracy-robust algorithms"
Complex multi-component biological, chemical, and physical systems are often only partially observable despite rapid advancements in experimental measurement techniques. Discovering predictive nonlinear dynamical models and their parameters directly from incomplete, noisy experimental observations is essential for understanding such systems. A key challenge introduced by partial observations is model degeneracy, where multiple structurally distinct models match observed variables. By mapping a large class of partially observed ordinary differential equations to a simpler model form and using differential equation sensitivity analysis, we can account for symmetries that lead to degenerate models and efficiently use the resulting structure to develop a dynamical systems inference framework that is robust both to noise and partial observations. After validating our method on the FitzHugh-Nagumo oscillator and a model of cell signaling, we will demonstrate the framework’s broad applicability to various experimental datasets from biological oscillators to animal locomotion.



Xiaojun Wu

University of Southern California
"Data-driven model discovery and model selection for noisy biological systems"
Biological systems exhibit complex dynamics that differential equations can often adeptly represent. Ordinary differential equation models are widespread; until recently their construction has required extensive prior knowledge of the system. Machine learning methods offer alternative means of model construction: differential equation models can be learnt from data via model discovery using sparse identification of nonlinear dynamics (SINDy). However, SINDy struggles with realistic levels of biological noise and is limited in its ability to incorporate prior knowledge of the system. We propose a data-driven framework for model discovery and model selection using hybrid dynamical systems: partial models containing missing terms. Neural networks are used to approximate the unknown dynamics of a system, enabling the denoising of the data while simultaneously learning the latent dynamics. Simulations from the fitted neural network are then used to infer models using sparse regression. We show, via model selection, that model discovery using hybrid dynamical systems outperforms alternative approaches. We find it possible to infer models correctly up to high levels of biological noise of different types. We demonstrate the potential to learn models from sparse, noisy data in application to a canonical cell state transition using data derived from single-cell transcriptomics. Overall, this approach provides a practical framework for model discovery in biology in cases where data are noisy and sparse, of particular utility when the underlying biological mechanisms are partially but incompletely known.
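
For readers unfamiliar with SINDy, the sparse regression step can be sketched as follows. This shows only plain sequentially thresholded least squares on clean data, not the hybrid neural-network framework described in the abstract. The test system (a damped linear oscillator with a known closed-form solution), the quadratic candidate library, and the threshold value are all assumptions chosen for illustration.

```python
import numpy as np

def stlsq(library, dxdt, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small coefficients, refit."""
    xi, *_ = np.linalg.lstsq(library, dxdt, rcond=None)
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for j in range(dxdt.shape[1]):
            keep = ~small[:, j]
            if keep.any():
                # Refit only the surviving candidate terms for state j.
                xi[keep, j] = np.linalg.lstsq(library[:, keep], dxdt[:, j], rcond=None)[0]
    return xi

# Illustrative data: x' = -0.1 x + 2 y, y' = -2 x - 0.1 y (closed-form solution).
t = np.linspace(0.0, 10.0, 1001)
x = np.exp(-0.1 * t) * np.cos(2.0 * t)
y = -np.exp(-0.1 * t) * np.sin(2.0 * t)
X = np.column_stack([x, y])

# Derivatives estimated by finite differences; this is the step that noise degrades.
dXdt = np.gradient(X, t, axis=0, edge_order=2)

# Candidate library of polynomial terms up to degree 2.
names = ["1", "x", "y", "x^2", "x*y", "y^2"]
library = np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])

xi = stlsq(library, dXdt)
for j, state in enumerate(["x'", "y'"]):
    terms = [f"{xi[i, j]:+.3f}*{names[i]}" for i in range(len(names)) if xi[i, j] != 0.0]
    print(state, "=", " ".join(terms))
```

The finite-difference derivative is where measurement noise enters most strongly, which is consistent with the abstract's motivation for denoising the data with a fitted neural network before the sparse regression.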



Nora Heitzman-Breen

CU Boulder
"Weak-form parameter inference of epidemiological systems"
Compartmental modeling of epidemiological systems provides insights into biological and behavioral interactions that cannot be observed directly. Accurate and robust parameter estimation is critical to applying such models as decision-making tools for public health interventions. A common approach to parameter estimation is nonlinear least squares using a forward solver. However, as model size and noise in the data increase, these methods become computationally expensive if accuracy in the parameter estimates is to be maintained. In this work, we establish a weak-form-based method of parameter estimation for systems with unobserved variables by generating weak-form input-output equations using differential algebra techniques. We show that practical identifiability of the weak-form system can be assessed more quickly than with output-error-based parameter estimation, using criteria informed by the additive error in observed variables. Finally, we demonstrate that using a weak-form-based method to estimate transmission rates is computationally efficient and robust to noise, even for complex variations of the classical SIR model.
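
As a rough illustration of a weak-form input-output equation, consider the classical SIR model S' = -beta*S*I, I' = beta*S*I - gamma*I with only I observed. Eliminating S gives (ln I)'' + beta*I' + beta*gamma*I = 0, and integrating against a smooth, compactly supported test function phi yields ∫phi'' ln I dt - beta ∫phi' I dt + beta*gamma ∫phi I dt = 0, which is linear in (beta, beta*gamma). The sketch below is an assumption-laden toy, not the method presented in the talk: it uses noise-free synthetic data, normalized populations, a uniform grid, and hypothetical helper names.

```python
import numpy as np

def simulate_sir(beta, gamma, s0, i0, t):
    """Integrate the SIR model with classical RK4 to produce synthetic data."""
    def rhs(state):
        s, i = state
        return np.array([-beta * s * i, beta * s * i - gamma * i])
    state = np.array([s0, i0], dtype=float)
    out = np.empty((t.size, 2))
    out[0] = state
    for k in range(t.size - 1):
        h = t[k + 1] - t[k]
        k1 = rhs(state)
        k2 = rhs(state + 0.5 * h * k1)
        k3 = rhs(state + 0.5 * h * k2)
        k4 = rhs(state + h * k3)
        state = state + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        out[k + 1] = state
    return out

def bump_with_derivatives(t, center, radius, p=8):
    """Test function (1 - s^2)^p on |s| < 1 with its first and second derivatives."""
    s = (t - center) / radius
    w = np.clip(1.0 - s**2, 0.0, None)
    phi = w**p
    dphi = -2.0 * p * s * w**(p - 1) / radius
    d2phi = (-2.0 * p * w**(p - 1) + 4.0 * p * (p - 1) * s**2 * w**(p - 2)) / radius**2
    return phi, dphi, d2phi

def fit_sir_from_infected(t, infected, n_test=25, radius_frac=0.25):
    """Estimate (beta, gamma) from I(t) alone via the weak input-output equation."""
    dt = t[1] - t[0]
    radius = radius_frac * (t[-1] - t[0])
    centers = np.linspace(t[0] + radius, t[-1] - radius, n_test)
    log_i = np.log(infected)   # requires strictly positive observations
    G = np.empty((n_test, 2))
    b = np.empty(n_test)
    for k, c in enumerate(centers):
        phi, dphi, d2phi = bump_with_derivatives(t, c, radius)
        G[k, 0] = -dt * (dphi @ infected)   # multiplies beta
        G[k, 1] = dt * (phi @ infected)     # multiplies beta * gamma
        b[k] = -dt * (d2phi @ log_i)
    theta, *_ = np.linalg.lstsq(G, b, rcond=None)
    return theta[0], theta[1] / theta[0]

t = np.linspace(0.0, 60.0, 1201)
data = simulate_sir(beta=0.5, gamma=0.2, s0=0.99, i0=0.01, t=t)
beta_hat, gamma_hat = fit_sir_from_infected(t, data[:, 1])
print("beta  estimate:", beta_hat)
print("gamma estimate:", gamma_hat)
```

The susceptible compartment is never used in the fit, only to generate the data. With noisy observations the logarithm and the errors-in-variables structure need more careful treatment than this ordinary least-squares sketch provides.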



Annual Meeting for the Society for Mathematical Biology, 2025.