Using Machine Learning to Discover Parsimonious and Physically-Interpretable Representations of Catchment-Scale Rainfall-Runoff Dynamics
- URL: http://arxiv.org/abs/2412.04845v5
- Date: Sat, 08 Nov 2025 04:54:47 GMT
- Title: Using Machine Learning to Discover Parsimonious and Physically-Interpretable Representations of Catchment-Scale Rainfall-Runoff Dynamics
- Authors: Yuan-Heng Wang, Hoshin V. Gupta
- Abstract summary: We develop parsimonious minimally-optimal representations that can facilitate better insight regarding system functioning. We find that both physical interpretability and good predictive performance can be achieved using a distributed-state network. The results indicate that MCP-based ML models with only a few layers (up to two) and relatively few physical flow pathways can play a significant role in ML-based streamflow modelling.
- Score: 0.8594140167290097
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Due largely to challenges associated with physical interpretability of machine learning (ML) methods, and because model interpretability is key to credibility in management applications, many scientists and practitioners are hesitant to discard traditional physical-conceptual (PC) modeling approaches despite their poorer predictive performance. Here, we examine how to develop parsimonious minimally-optimal representations that can facilitate better insight regarding system functioning. The term minimally-optimal indicates that the desired outcome can be achieved with the smallest possible effort and resources, while parsimony is widely held to support understanding. Accordingly, we suggest that ML-based modeling should use computational units that are inherently physically-interpretable, and explore how generic network architectures composed of Mass-Conserving-Perceptron (MCP) units can be used to model dynamical systems in a physically-interpretable manner. In the context of spatially-lumped catchment-scale modeling, we find that both physical interpretability and good predictive performance can be achieved using a distributed-state network with context-dependent gating and information sharing across nodes. The distributed-state mechanism ensures a sufficient number of temporally-evolving properties of system storage while information-sharing ensures proper synchronization of such properties. The results indicate that MCP-based ML models with only a few layers (up to two) and relatively few physical flow pathways (up to three) can play a significant role in ML-based streamflow modelling.
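The idea of a mass-conserving computational unit with context-dependent gating can be illustrated with a small sketch. This is not the authors' MCP formulation: the softmax gating, the linear dependence of the gate logits on storage, and every name and parameter value here (`gate_fractions`, `step`, `simulate`, `PARAMS`) are assumptions chosen only to show how a gated storage node can close its mass balance exactly.

```python
import math


def gate_fractions(storage, params):
    """Return (outflow, loss, retain) fractions that sum to 1.

    Hypothetical parameterization: each gate logit is linear in the current
    storage, so the gating depends on the state of the node itself.
    """
    z_out = params["a_out"] + params["b_out"] * storage
    z_loss = params["a_loss"] + params["b_loss"] * storage
    logits = [z_out, z_loss, 0.0]  # the retain logit is fixed at 0
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return tuple(e / total for e in exps)


def step(storage, inflow, params):
    """Advance one storage node by one time step without creating or losing mass."""
    f_out, f_loss, _ = gate_fractions(storage, params)
    outflow = f_out * storage
    loss = f_loss * storage
    new_storage = storage + inflow - outflow - loss
    return new_storage, outflow, loss


def simulate(inflows, params, storage0=0.0):
    """Run the node over an inflow series; return final storage and flux series."""
    storage = storage0
    outflows, losses = [], []
    for u in inflows:
        storage, o, l = step(storage, u, params)
        outflows.append(o)
        losses.append(l)
    return storage, outflows, losses


# Illustrative values only; they are not calibrated to any catchment.
PARAMS = {"a_out": -2.0, "b_out": 0.05, "a_loss": -3.0, "b_loss": 0.01}
rain = [0.0, 5.0, 12.0, 3.0, 0.0, 0.0, 1.0]
initial_storage = 10.0

final_storage, q, et = simulate(rain, PARAMS, storage0=initial_storage)
balance_error = initial_storage + sum(rain) - (final_storage + sum(q) + sum(et))
print("water balance error:", balance_error)
```

Because the three fractions sum to one, whatever leaves storage is accounted for as either outflow or loss, so the balance error should be zero up to floating-point rounding regardless of the parameter values.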
Related papers
- Towards CONUS-Wide ML-Augmented Conceptually-Interpretable Modeling of Catchment-Scale Precipitation-Storage-Runoff Dynamics [3.442981587977714]
This study uses ML-augmented physically-interpretable-scale models of varying catchment complexity based in the Mass-Conserving Perceptron (MCP).
Results were evaluated using attribute masks such as snow regime, forest cover, and climate zone.
Benchmark comparisons show that physically-interpretable mass-conserving MCP-based models can achieve performance comparable to data-based models based in the Long Short-Term Memory network (LSTM) architecture.
arXiv Detail & Related papers (2025-10-02T22:47:01Z)
- Beyond Benchmarks: Understanding Mixture-of-Experts Models through Internal Mechanisms [55.1784306456972]
Mixture-of-Experts (MoE) architectures have emerged as a promising direction, offering efficiency and scalability by activating only a subset of parameters during inference.
We use an internal metric to investigate the mechanisms of MoE architecture by explicitly incorporating routing mechanisms and analyzing expert-level behaviors.
We uncover several findings: (1) neuron utilization decreases as models evolve, reflecting stronger generalization; (2) training exhibits a dynamic trajectory, where benchmark performance alone provides limited signal; (3) task completion emerges from collaborative contributions of multiple experts, with shared experts driving concentration; and (4) activation patterns at the neuron level provide a fine-grained proxy for data diversity.
arXiv Detail & Related papers (2025-09-28T15:13:38Z)
- Towards explainable decision support using hybrid neural models for logistic terminal automation [1.5364433104428317]
This paper presents a novel framework for interpretable-by-design neural system dynamics modeling.
The proposed hybrid approach enables the construction of neural network models that operate on semantically meaningful and actionable variables.
The framework is conceived to be applied to real-world case-studies from the EU-funded project AutoMoTIF.
arXiv Detail & Related papers (2025-09-09T10:41:08Z)
- Foundation Model for Skeleton-Based Human Action Understanding [56.89025287217221]
This paper presents a Unified Skeleton-based Dense Representation Learning (USDRL) framework.
USDRL consists of a Transformer-based Dense Spatio-Temporal Encoder (DSTE), Multi-Grained Feature Decorrelation (MG-FD), and Multi-Perspective Consistency Training (MPCT).
arXiv Detail & Related papers (2025-08-18T02:42:16Z)
- MAPS: Advancing Multi-Modal Reasoning in Expert-Level Physical Science [62.96434290874878]
Current Multi-Modal Large Language Models (MLLM) have shown strong capabilities in general visual reasoning tasks.
We develop a new framework, named Multi-Modal Scientific Reasoning with Physics Perception and Simulation (MAPS) based on an MLLM.
MAPS decomposes an expert-level multi-modal reasoning task into physical diagram understanding via a Physical Perception Model (PPM) and reasoning with physical knowledge via a simulator.
arXiv Detail & Related papers (2025-01-18T13:54:00Z)
- Data-Efficient Inference of Neural Fluid Fields via SciML Foundation Model [49.06911227670408]
We show that a SciML foundation model can significantly improve the data efficiency of inferring real-world 3D fluid dynamics with improved generalization.
We equip neural fluid fields with a novel collaborative training approach that utilizes augmented views and fluid features extracted by our foundation model.
arXiv Detail & Related papers (2024-12-18T14:39:43Z)
- Benchmarks as Microscopes: A Call for Model Metrology [76.64402390208576]
Modern language models (LMs) pose a new challenge in capability assessment.
To be confident in our metrics, we need a new discipline of model metrology.
arXiv Detail & Related papers (2024-07-22T17:52:12Z)
- Large Language Model-Based Interpretable Machine Learning Control in Building Energy Systems [3.0309252269809264]
This paper investigates and explores Interpretable Machine Learning (IML), a branch of Machine Learning (ML) that enhances transparency and understanding of models and their inferences.
We develop an innovative framework that combines the principles of Shapley values and the in-context learning feature of Large Language Models (LLMs).
The paper presents a case study to demonstrate the feasibility of the developed IML framework for model predictive control-based precooling under demand response events in a virtual testbed.
arXiv Detail & Related papers (2024-02-14T21:19:33Z)
- Towards Interpretable Physical-Conceptual Catchment-Scale Hydrological Modeling using the Mass-Conserving-Perceptron [1.1510009152620668]
This study sets the stage for interpretable regional-scale MCP-based hydrological modeling (using large sample data) by using neural architecture search to determine appropriate minimal representations for catchments in different hydroclimatic regimes.
arXiv Detail & Related papers (2024-01-25T21:26:49Z)
- A Mass-Conserving-Perceptron for Machine Learning-Based Modeling of Geoscientific Systems [1.1510009152620668]
We propose a physically-interpretable Mass Conserving Perceptron (MCP) as a way to bridge the gap between PC-based and ML-based modeling approaches.
The MCP exploits the inherent isomorphism between the directed graph structures underlying both PC models and GRNNs to explicitly represent the mass-conserving nature of physical processes.
arXiv Detail & Related papers (2023-10-12T18:09:33Z)
- Understanding Self-attention Mechanism via Dynamical System Perspective [58.024376086269015]
The self-attention mechanism (SAM) is widely used in various fields of artificial intelligence.
We show that the intrinsic stiffness phenomenon (SP) in the high-precision solution of ordinary differential equations (ODEs) also widely exists in high-performance neural networks (NN).
We show that the SAM is also a stiffness-aware step size adaptor that can enhance the model's representational ability to measure intrinsic SP.
arXiv Detail & Related papers (2023-08-19T08:17:41Z)
- Scaling Vision-Language Models with Sparse Mixture of Experts [128.0882767889029]
We show that mixture-of-experts (MoE) techniques can achieve state-of-the-art performance on a range of benchmarks over dense models of equivalent computational cost.
Our research offers valuable insights into stabilizing the training of MoE models, understanding the impact of MoE on model interpretability, and balancing the trade-offs between compute and performance when scaling vision-language models.
arXiv Detail & Related papers (2023-03-13T16:00:31Z)
- Differentiable modeling to unify machine learning and physical models and advance Geosciences [38.92849886903847]
We outline the concepts, applicability, and significance of differentiable geoscientific modeling (DG).
"Differentiable" refers to accurately and efficiently calculating gradients with respect to model variables.
Preliminary evidence suggests DG offers better interpretability and causality than Machine Learning.
arXiv Detail & Related papers (2023-01-10T15:24:14Z)
- Scientific Inference With Interpretable Machine Learning: Analyzing Models to Learn About Real-World Phenomena [4.312340306206884]
Interpretable machine learning offers a solution by analyzing models holistically to derive interpretations.
Current IML research is focused on auditing ML models rather than leveraging them for scientific inference.
We present a framework for designing IML methods, termed 'property descriptors', that illuminate not just the model, but also the phenomenon it represents.
arXiv Detail & Related papers (2022-06-11T10:13:21Z)
- Combining Machine Learning and Agent-Based Modeling to Study Biomedical Systems [0.0]
Agent-based modeling (ABM) is a well-established paradigm for simulating complex systems via interactions between constituent entities.
Machine learning (ML) refers to approaches whereby statistical algorithms 'learn' from data on their own, without imposing a priori theories of system behavior.
arXiv Detail & Related papers (2022-06-02T15:19:09Z)
- Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z)
- Meta-learning using privileged information for dynamics [66.32254395574994]
We extend the Neural ODE Process model to use additional information within the Learning Using Privileged Information setting.
We validate our extension with experiments showing improved accuracy and calibration on simulated dynamics tasks.
arXiv Detail & Related papers (2021-04-29T12:18:02Z)
- Using machine-learning modelling to understand macroscopic dynamics in a system of coupled maps [0.0]
We consider as a case study the macroscopic motion emerging from a system of globally coupled maps.
We build a coarse-grained Markov process for the macroscopic dynamics both with a machine learning approach and with a direct numerical computation of the transition probability of the coarse-grained process.
We are able to infer important information about the effective dimension of the attractor, the persistence of memory effects and the multi-scale structure of the dynamics.
arXiv Detail & Related papers (2020-11-08T15:38:12Z)
- A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention [96.77554122595578]
We introduce a parametrized representation of fixed size, which embeds and then aggregates elements from a given input set according to the optimal transport plan between the set and a trainable reference.
Our approach scales to large datasets and allows end-to-end training of the reference, while also providing a simple unsupervised learning mechanism with small computational cost.
arXiv Detail & Related papers (2020-06-22T08:35:58Z)
- Towards Interpretable Deep Learning Models for Knowledge Tracing [62.75876617721375]
We propose to adopt the post-hoc method to tackle the interpretability issue for deep learning based knowledge tracing (DLKT) models.
Specifically, we focus on applying the layer-wise relevance propagation (LRP) method to interpret an RNN-based DLKT model.
Experiment results show the feasibility of using the LRP method for interpreting the DLKT model's predictions.
arXiv Detail & Related papers (2020-05-13T04:03:21Z)