Differentiable modeling to unify machine learning and physical models and advance Geosciences
- URL: http://arxiv.org/abs/2301.04027v2
- Date: Wed, 27 Dec 2023 02:15:41 GMT
- Authors: Chaopeng Shen, Alison P. Appling, Pierre Gentine, Toshiyuki Bandai,
Hoshin Gupta, Alexandre Tartakovsky, Marco Baity-Jesi, Fabrizio Fenicia,
Daniel Kifer, Li Li, Xiaofeng Liu, Wei Ren, Yi Zheng, Ciaran J. Harman,
Martyn Clark, Matthew Farthing, Dapeng Feng, Praveen Kumar, Doaa Aboelyazeed,
Farshid Rahmani, Hylke E. Beck, Tadd Bindas, Dipankar Dwivedi, Kuai Fang,
Marvin Höge, Chris Rackauckas, Tirthankar Roy, Chonggang Xu, Binayak
Mohanty, Kathryn Lawson
- Abstract summary: We outline the concepts, applicability, and significance of differentiable geoscientific modeling (DG).
"Differentiable" refers to accurately and efficiently calculating gradients with respect to model variables.
Preliminary evidence suggests DG offers better interpretability and causality than Machine Learning.
- Score: 38.92849886903847
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Process-Based Modeling (PBM) and Machine Learning (ML) are often perceived as
distinct paradigms in the geosciences. Here we present differentiable
geoscientific modeling as a powerful pathway toward dissolving the perceived
barrier between them and ushering in a paradigm shift. For decades, PBM offered
benefits in interpretability and physical consistency but struggled to
efficiently leverage large datasets. ML methods, especially deep networks,
presented strong predictive skills yet lacked the ability to answer specific
scientific questions. While various methods have been proposed for ML-physics
integration, an important underlying theme -- differentiable modeling -- is not
sufficiently recognized. Here we outline the concepts, applicability, and
significance of differentiable geoscientific modeling (DG). "Differentiable"
refers to accurately and efficiently calculating gradients with respect to
model variables, critically enabling the learning of high-dimensional unknown
relationships. DG refers to a range of methods connecting varying amounts of
prior knowledge to neural networks and training them together, capturing a
different scope than physics-guided machine learning and emphasizing first
principles. Preliminary evidence suggests DG offers better interpretability and
causality than ML, improved generalizability and extrapolation capability, and
strong potential for knowledge discovery, while approaching the performance of
purely data-driven ML. DG models require less training data while scaling
favorably in performance and efficiency with increasing amounts of data. With
DG, geoscientists may be better able to frame and investigate questions, test
hypotheses, and discover unrecognized linkages.
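The central mechanism the abstract describes, exact and efficient gradients of a process model's outputs with respect to its internal variables, can be illustrated with a toy example. The sketch below writes out forward-mode sensitivities by hand for a one-parameter linear-reservoir ("bucket") model and uses them for gradient-based learning; the function names, parameter, and synthetic data are illustrative assumptions, not code or results from the paper.

```python
# A toy differentiable process-based model: one linear-reservoir "bucket"
# with storage S, precipitation P, and discharge Q[t] = k * S[t].
# The recession parameter k is learned by gradient descent, with the exact
# sensitivity dS/dk propagated through the simulation (forward-mode
# differentiation written out by hand).

def simulate(k, P, S0=1.0):
    """Run the bucket model; return discharges Q[t] and sensitivities dQ[t]/dk."""
    S, dS = S0, 0.0
    Q, dQ = [], []
    for p in P:
        q = k * S                # discharge this step
        dq = S + k * dS          # product rule: d(k*S)/dk
        S = S + p - q            # water-balance update
        dS = dS - dq             # sensitivity of the update
        Q.append(q)
        dQ.append(dq)
    return Q, dQ

def train(P, Q_obs, k0=0.1, lr=0.05, steps=200):
    """Fit k by gradient descent on the mean squared discharge error."""
    k = k0
    for _ in range(steps):
        Q, dQ = simulate(k, P)
        grad = sum(2 * (q - qo) * dq            # chain rule: dL/dk
                   for q, qo, dq in zip(Q, Q_obs, dQ)) / len(P)
        k -= lr * grad
    return k

# Recover a known parameter from synthetic "observations".
P = [1.0, 0.0, 0.5, 0.0, 0.0, 1.5, 0.0, 0.0]
Q_obs, _ = simulate(0.3, P)
k_fit = train(P, Q_obs)   # approaches the true value 0.3
```

In DG the scalar `k` would typically be replaced by a neural network embedded in the simulation, with automatic differentiation supplying the gradients that are derived manually here.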
Related papers
- MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities [72.68829963458408]
We present MergeNet, which learns to bridge the gap of parameter spaces of heterogeneous models.
The core mechanism of MergeNet lies in the parameter adapter, which operates by querying the source model's low-rank parameters.
MergeNet is learned alongside both models, allowing our framework to dynamically transfer and adapt knowledge relevant to the current stage.
arXiv Detail & Related papers (2024-04-20T08:34:39Z)
- Enhancing Dynamical System Modeling through Interpretable Machine Learning Augmentations: A Case Study in Cathodic Electrophoretic Deposition [0.8796261172196743]
We introduce a comprehensive data-driven framework aimed at enhancing the modeling of physical systems.
As a demonstrative application, we pursue the modeling of cathodic electrophoretic deposition (EPD), commonly known as e-coating.
arXiv Detail & Related papers (2024-01-16T14:58:21Z)
- Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective [106.92016199403042]
We empirically investigate knowledge transfer from larger to smaller models through a parametric perspective.
We employ sensitivity-based techniques to extract and align knowledge-specific parameters between different large language models.
Our findings highlight the critical factors contributing to the process of parametric knowledge transfer.
arXiv Detail & Related papers (2023-10-17T17:58:34Z)
- Physics Inspired Hybrid Attention for SAR Target Recognition [61.01086031364307]
We propose a physics inspired hybrid attention (PIHA) mechanism and the once-for-all (OFA) evaluation protocol to address the issues.
PIHA leverages the high-level semantics of physical information to activate and guide feature groups that are aware of the local semantics of the target.
Our method outperforms other state-of-the-art approaches in 12 test scenarios with the same ASC parameters.
arXiv Detail & Related papers (2023-09-27T14:39:41Z)
- Beyond Convergence: Identifiability of Machine Learning and Deep Learning Models [0.0]
We investigate the notion of model parameter identifiability through a case study focused on parameter estimation from motion sensor data.
We employ a deep neural network to estimate subject-wise parameters, including mass, stiffness, and equilibrium leg length.
The results show that while certain parameters can be identified from the observation data, others remain unidentifiable.
arXiv Detail & Related papers (2023-07-21T03:40:53Z)
- MinT: Boosting Generalization in Mathematical Reasoning via Multi-View Fine-Tuning [53.90744622542961]
Reasoning in mathematical domains remains a significant challenge for small language models (LMs).
We introduce a new method that exploits existing mathematical problem datasets with diverse annotation styles.
Experimental results show that our strategy enables a LLaMA-7B model to outperform prior approaches.
arXiv Detail & Related papers (2023-07-16T05:41:53Z)
- Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z)
- Bayesian Active Learning for Discrete Latent Variable Models [19.852463786440122]
Active learning seeks to reduce the amount of data required to fit the parameters of a model.
Latent variable models play a vital role in neuroscience, psychology, and a variety of other engineering and scientific disciplines.
arXiv Detail & Related papers (2022-02-27T19:07:12Z)
- From calibration to parameter learning: Harnessing the scaling effects of big data in geoscientific modeling [2.9897531698031403]
We propose a differentiable parameter learning framework that efficiently learns a global mapping between inputs and parameters.
As training data increases, dPL achieves better performance, more physical coherence, and better generalizability.
We demonstrate examples that learned from soil moisture and streamflow, where dPL drastically outperformed existing evolutionary and regionalization methods.
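The differentiable parameter learning (dPL) idea summarized above, training one global mapping from site attributes to model parameters rather than calibrating each site separately, can be sketched minimally as follows. This is a deliberately simplified illustration assuming a trivial runoff-ratio process model and a one-weight sigmoid mapping; all names and data are hypothetical, not taken from the paper.

```python
import math

# dPL sketch: learn one global mapping g(attribute) -> parameter k jointly
# across sites, with gradients flowing through the (here trivial) process
# model Q[t] = k * P[t].

def g(w, b, a):
    """Map a site attribute a to a parameter k in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-(w * a + b)))

def fit_mapping(sites, lr=0.5, steps=2000):
    """Fit the mapping weights (w, b) across all sites at once."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        gw = gb = 0.0
        for a, P, Q_obs in sites:
            k = g(w, b, a)
            dk = k * (1.0 - k)              # sigmoid derivative
            for p, qo in zip(P, Q_obs):
                r = k * p - qo              # process-model residual
                gw += 2 * r * p * dk * a    # chain rule through g
                gb += 2 * r * p * dk
        w -= lr * gw / len(sites)
        b -= lr * gb / len(sites)
    return w, b

def make_site(a, k_true, P=(1.0, 2.0, 0.5, 1.5)):
    """Synthetic site whose true runoff ratio depends on its attribute."""
    return (a, list(P), [k_true * p for p in P])

sites = [make_site(-1.0, 0.2), make_site(0.0, 0.5), make_site(1.0, 0.8)]
w, b = fit_mapping(sites)
k_new = g(w, b, 0.5)   # parameter predicted for an unseen site
```

Because the mapping, not the per-site parameter, is what gets trained, it can be applied to ungauged sites, which is the generalizability benefit the summary highlights.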
arXiv Detail & Related papers (2020-07-30T21:38:56Z)
- Modeling System Dynamics with Physics-Informed Neural Networks Based on Lagrangian Mechanics [3.214927790437842]
Two main modeling approaches often fail to meet requirements: first principles methods suffer from high bias, whereas data-driven modeling tends to have high variance.
We present physics-informed neural ordinary differential equations (PINODE), a hybrid model that combines the two modeling techniques to overcome the aforementioned problems.
Our findings are of interest for model-based control and system identification of mechanical systems.
arXiv Detail & Related papers (2020-05-29T15:10:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.