MS03 - MFBM-03

Methods for whole cell modelling (Part 1)

Tuesday, July 15 at 10:20am




Organizers:

Jennifer Flegg (University of Melbourne), Prof Mat Simpson (Queensland University of Technology)

Description:

Modelling whole cells has been identified as a “grand challenge of 21st century science”. However, we currently lack the mathematics, the modelling technologies, and the computational frameworks to understand and predict cellular behaviour. Without mathematical models, we cannot understand cellular life or explore new ways of rationally guiding or designing cellular behaviour. This minisymposium aims to bring together the latest research in methods for whole cell modelling. A variety of methods will be showcased, including energy-based modelling methods, machine learning, parameter identifiability, model selection and more. The minisymposium is organised by the Australian Research Council Centre of Excellence in the Mathematical Analysis of Cellular Systems (MACSYS). MACSYS is a 7-year multi-institutional centre that involves several SMB members. The aim of MACSYS is to generate the mathematical, statistical and computational technologies required to make biology predictive; establish mathematical whole cell models for in silico biology as a powerful complement to traditional in vivo and in vitro approaches; tackle fundamental biological problems; and establish a world-leading research and biotechnology translation environment.



Ruth Baker

University of Oxford
"Optimal experimental design for parameter estimation in the presence of observation noise"
Using mathematical models to assist in the interpretation of experiments is becoming increasingly important in research across applied mathematics, in particular in fields such as biology and ecology. In this context, accurate parameter estimation is crucial; model parameters are used to quantify observed behaviour, characterise behaviours that cannot be directly measured, and make quantitative predictions. The extent to which parameter estimates are constrained by the quality and quantity of available data is known as parameter identifiability, and it is widely understood that for many dynamical models the uncertainty in parameter estimates can vary over orders of magnitude as the time points at which data are collected are varied. In this talk I will outline recent research that uses both local and global sensitivity measures within an optimisation algorithm to determine the observation times that give rise to the lowest uncertainty in parameter estimates. Applying the framework to models with either correlated or uncorrelated observation noise demonstrates that correlations in observation noise can significantly impact the optimal time points for observing a system, and highlights that proper consideration of observation noise should be a crucial part of the experimental design process.
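To make the role of noise correlation concrete, here is a minimal sketch; it is not the algorithm from the talk. It assumes a toy model y(t) = exp(-k t) with one parameter k, uses only a local sensitivity (the Fisher information), and picks two observation times by brute-force search over a grid. The exponential correlation between the two noise terms, the correlation length and all function names are assumptions made for illustration.

```python
import math
from itertools import combinations


def sensitivity(t, k):
    """Local sensitivity dy/dk of the toy model y(t) = exp(-k * t)."""
    return -t * math.exp(-k * t)


def fisher_information(t1, t2, k, sigma=1.0, corr_length=0.0):
    """Fisher information for k from observations at times t1 and t2.

    The noise covariance is sigma^2 * [[1, rho], [rho, 1]] with
    rho = exp(-|t2 - t1| / corr_length). corr_length = 0 means
    uncorrelated noise (rho = 0). This noise model is an assumption.
    """
    s1, s2 = sensitivity(t1, k), sensitivity(t2, k)
    rho = math.exp(-abs(t2 - t1) / corr_length) if corr_length > 0 else 0.0
    # s^T Sigma^{-1} s written out for the 2 x 2 case.
    return (s1 * s1 - 2 * rho * s1 * s2 + s2 * s2) / (sigma**2 * (1 - rho**2))


def best_pair(times, k, sigma=1.0, corr_length=0.0):
    """Pair of distinct grid times that maximises the Fisher information."""
    return max(
        combinations(times, 2),
        key=lambda p: fisher_information(p[0], p[1], k, sigma, corr_length),
    )


if __name__ == "__main__":
    grid = [0.1 * i for i in range(1, 51)]  # candidate times 0.1, ..., 5.0
    print("uncorrelated:", best_pair(grid, k=1.0))
    print("correlated:  ", best_pair(grid, k=1.0, corr_length=1.0))
```

In this toy setting the uncorrelated design places both observations next to the peak of the sensitivity, whereas the correlated design spreads them apart, because two nearly simultaneous observations with strongly correlated noise carry little more information than one.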



Yong See Foo

University of Melbourne
"Quantifying structural uncertainty in chemical reaction network inference"
Dynamical systems in biochemistry are complex, and one often does not have comprehensive knowledge about the interactions involved. Chemical reaction network (CRN) inference aims to identify, from observing species concentrations, the unknown reactions between the species. Most approaches focus on identifying a single, most likely CRN, without addressing uncertainty about the resulting network structure. However, it is important to quantify structural uncertainty to have confidence in our inference and predictions. To this end, I will discuss how to construct posterior distributions over CRN structures. This is done by keeping a large set of suboptimal solutions found in an optimisation framework with sparse regularisation, in contrast to existing optimisation approaches which discard suboptimal solutions. I will show that inducing reaction sparsity with nonconvex penalty functions results in more parsimonious CRNs compared to the popular lasso regularisation. In a real-data example where multiple CRNs have been previously proposed, reactions proposed in different studies can be simultaneously recovered under structural uncertainty. Moreover, posterior correlations between reactions help identify where structural ambiguities are present. This can be translated into alternative reaction pathways suggested by the available data, which can guide future experimental design.
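As an illustration of how a set of retained solutions could be turned into a distribution over structures, here is a minimal sketch. It assumes that each retained CRN comes with an unnormalised log-posterior score; how such scores are obtained is not specified in the abstract. The reaction names and scores below are made up for illustration and are not results from the talk.

```python
import math


def structure_posterior(log_scores):
    """Normalise unnormalised log-posterior scores into probabilities.

    log_scores maps each candidate structure (a frozenset of reaction
    names) to its score. Subtracting the maximum avoids overflow.
    """
    top = max(log_scores.values())
    weights = {s: math.exp(v - top) for s, v in log_scores.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


def inclusion_probabilities(posterior):
    """Posterior probability that each reaction appears in the network."""
    probs = {}
    for structure, p in posterior.items():
        for reaction in structure:
            probs[reaction] = probs.get(reaction, 0.0) + p
    return probs


if __name__ == "__main__":
    # Hypothetical candidates and scores, for illustration only.
    candidates = {
        frozenset({"A->B", "B->C"}): -10.0,
        frozenset({"A->B", "A->C"}): -10.5,
        frozenset({"A->B", "B->C", "A->C"}): -12.0,
    }
    post = structure_posterior(candidates)
    for reaction, p in sorted(inclusion_probabilities(post).items()):
        print(f"{reaction}: {p:.3f}")
```

A reaction present in every retained structure gets inclusion probability 1, while reactions that trade off against each other across structures receive intermediate values, which is where structural ambiguity shows up.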



Michael Pan

The University of Melbourne
"Thermodynamic modelling of membrane transport processes using bond graphs"
Cellular systems are physical systems, and are therefore governed by the laws of physics and thermodynamics. Energy is fundamental to our understanding of membrane transporters, which will only operate in the direction of decreasing chemical potential. Despite this, energy is often ignored in mathematical models of transporters, leading to unrealistic behaviours analogous to perpetual motion machines. In this talk, we outline a general physics-based framework (the bond graph) that explicitly models energy and therefore inherently accounts for thermodynamic constraints in membrane transporters. We show that this framework also provides a natural means of modelling the voltage dependence of electrogenic transporters. We demonstrate the utility of the bond graph approach in modelling the cardiac Na+/K+ ATPase (sodium-potassium pump) and discuss potential extensions of this approach for whole-cell modelling.
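The thermodynamic constraint can be illustrated with a minimal sketch for a single ion crossing a membrane; this is not the Na+/K+ ATPase model from the talk. It assumes a flux law of the form v = kappa * (exp(A_f/RT) - exp(A_r/RT)), a form often used in bond graph models of biochemistry, with concentrations expressed relative to a common reference. Placing the whole voltage dependence on the inside potential, and all names and parameter values, are assumptions.

```python
import math

R = 8.314    # gas constant, J / (mol K)
F = 96485.0  # Faraday constant, C / mol
T = 310.0    # temperature, K


def electrochemical_potential(conc, z, phi):
    """RT ln(conc) + z F phi, with conc relative to a reference state."""
    return R * T * math.log(conc) + z * F * phi


def transport_flux(c_out, c_in, z, v_m, kappa=1.0):
    """Net inward flux of an ion with charge z.

    v_m is the membrane potential (inside minus outside, in volts).
    The flux is positive only when the outside potential exceeds the
    inside one, so the transporter cannot run 'uphill'.
    """
    a_f = electrochemical_potential(c_out, z, 0.0)  # outside taken as 0 V
    a_r = electrochemical_potential(c_in, z, v_m)
    return kappa * (math.exp(a_f / (R * T)) - math.exp(a_r / (R * T)))


def nernst_potential(c_out, c_in, z):
    """Membrane potential at which the net flux vanishes."""
    return R * T / (z * F) * math.log(c_out / c_in)


if __name__ == "__main__":
    v_eq = nernst_potential(140.0, 10.0, z=1)
    print("Nernst potential (V):", v_eq)
    print("flux at 0 V:", transport_flux(140.0, 10.0, 1, 0.0))
    print("flux at equilibrium:", transport_flux(140.0, 10.0, 1, v_eq))
```

Because the flux is a difference of exponentials of the two potentials, its sign always matches the sign of the potential difference, and it vanishes exactly at the Nernst potential.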



Jean (Jiayu) Wen

The Australian National University
"Advancing Genomic Foundation Models with Electra-Style Pretraining: Efficient and Interpretable Insights into Gene Regulation"
Pretraining large language models on genomic sequences has emerged as a powerful strategy for capturing biologically meaningful representations. While masked language modeling (MLM)-based methods, such as DNABERT and Nucleotide Transformer, achieve strong performance, they are hindered by inefficiencies due to partial token supervision and high computational demands. To address these limitations, we introduce the first Electra-style pretraining framework for genomic foundation models. It replaces the MLM objective with a replaced-token detection task, in which a discriminator network identifies the tokens that a generator has replaced; this enables dense token-level supervision and significantly accelerates training. Unlike conventional methods that tokenize genomic sequences into 6-mers, our model operates at single nucleotide resolution, enhancing both efficiency and interpretability. We pretrain our model on the human genome and fine-tune it across a spectrum of downstream genomic prediction tasks, spanning epigenetics, transcriptional regulation, and post-transcriptional processes, including identification of regulatory elements such as promoters and enhancers, prediction of histone modifications, assessment of chromatin accessibility, as well as prediction of RNA-protein interactions, RNA modifications, RNA stability, translational efficiency, and microRNA binding sites. By addressing these diverse tasks, our model contributes to the advancement of whole cell modeling, which requires an integrated understanding of genomic, transcriptomic, and proteomic interactions. Our approach achieves a 28-fold reduction in pretraining time compared to MLM-based methods while surpassing their performance in most downstream evaluations when benchmarked against state-of-the-art genomic models. Comprehensive ablation studies illuminate the key factors driving this improved efficiency and effectiveness.
Furthermore, the use of 1-mer tokenization allows for nucleotide-level resolution, greatly enhancing the model's interpretability. Visualization and attention analyses demonstrate its ability to capture biologically relevant sequence motifs at a fine-grained level, providing deeper insights into genomic regulatory mechanisms. This work underscores the potential of Electra-style pretraining as a computationally efficient and effective strategy for advancing genomic representation learning, with broad implications for systems biology and whole cell modeling.
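The difference in supervision between MLM and replaced-token detection can be seen in how a training example is built. The sketch below is not the authors' pipeline: it assumes a four-letter vocabulary, a 15% corruption rate, and a uniform random sampler standing in for the generator network. All function names are hypothetical.

```python
import random

VOCAB = "ACGT"  # single-nucleotide (1-mer) vocabulary; other symbols are not handled


def tokenize(seq):
    """Map each nucleotide to an integer token id."""
    return [VOCAB.index(base) for base in seq.upper()]


def make_rtd_example(tokens, mask_rate=0.15, rng=None):
    """Build one replaced-token detection example.

    A random subset of positions is resampled by a stand-in 'generator'
    (uniform over VOCAB here; a trained network in practice). The label
    is 1 where the corrupted token differs from the original and 0
    elsewhere, so every position carries a training signal.
    """
    rng = rng or random.Random(0)
    n_mask = max(1, round(mask_rate * len(tokens)))
    masked = set(rng.sample(range(len(tokens)), n_mask))
    corrupted = [
        rng.randrange(len(VOCAB)) if i in masked else tok
        for i, tok in enumerate(tokens)
    ]
    # A resampled token that happens to equal the original counts as original.
    labels = [int(c != t) for c, t in zip(corrupted, tokens)]
    return corrupted, labels, masked


if __name__ == "__main__":
    tokens = tokenize("ACGTTGCAACGTAGCTAGCT")
    corrupted, labels, masked = make_rtd_example(tokens)
    print("original :", tokens)
    print("corrupted:", corrupted)
    print("labels   :", labels)
```

An MLM loss would be computed only at the positions in `masked`, whereas the discriminator loss uses all entries of `labels`; this is the dense token-level supervision the abstract refers to.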



Annual Meeting for the Society for Mathematical Biology, 2025.