Differentiable modeling to unify machine learning and physical models and advance Geosciences
- URL: http://arxiv.org/abs/2301.04027v2
- Date: Wed, 27 Dec 2023 02:15:41 GMT
- Title: Differentiable modeling to unify machine learning and physical models and advance Geosciences
- Authors: Chaopeng Shen, Alison P. Appling, Pierre Gentine, Toshiyuki Bandai,
Hoshin Gupta, Alexandre Tartakovsky, Marco Baity-Jesi, Fabrizio Fenicia,
Daniel Kifer, Li Li, Xiaofeng Liu, Wei Ren, Yi Zheng, Ciaran J. Harman,
Martyn Clark, Matthew Farthing, Dapeng Feng, Praveen Kumar, Doaa Aboelyazeed,
Farshid Rahmani, Hylke E. Beck, Tadd Bindas, Dipankar Dwivedi, Kuai Fang,
Marvin Höge, Chris Rackauckas, Tirthankar Roy, Chonggang Xu, Binayak
Mohanty, Kathryn Lawson
- Abstract summary: We outline the concepts, applicability, and significance of differentiable geoscientific modeling (DG)
"Differentiable" refers to accurately and efficiently calculating gradients with respect to model variables.
Preliminary evidence suggests DG offers better interpretability and causality than Machine Learning.
- Score: 38.92849886903847
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Process-Based Modeling (PBM) and Machine Learning (ML) are often perceived as
distinct paradigms in the geosciences. Here we present differentiable
geoscientific modeling as a powerful pathway toward dissolving the perceived
barrier between them and ushering in a paradigm shift. For decades, PBM offered
benefits in interpretability and physical consistency but struggled to
efficiently leverage large datasets. ML methods, especially deep networks,
presented strong predictive skills yet lacked the ability to answer specific
scientific questions. While various methods have been proposed for ML-physics
integration, an important underlying theme -- differentiable modeling -- is not
sufficiently recognized. Here we outline the concepts, applicability, and
significance of differentiable geoscientific modeling (DG). "Differentiable"
refers to accurately and efficiently calculating gradients with respect to
model variables, critically enabling the learning of high-dimensional unknown
relationships. DG refers to a range of methods connecting varying amounts of
prior knowledge to neural networks and training them together, capturing a
different scope than physics-guided machine learning and emphasizing first
principles. Preliminary evidence suggests DG offers better interpretability and
causality than ML, improved generalizability and extrapolation capability, and
strong potential for knowledge discovery, while approaching the performance of
purely data-driven ML. DG models require less training data while scaling
favorably in performance and efficiency with increasing amounts of data. With
DG, geoscientists may be better able to frame and investigate questions, test
hypotheses, and discover unrecognized linkages.
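To make "calculating gradients with respect to model variables" concrete, here is a minimal sketch (illustrative only; the model, names, and values are not from the paper): a one-parameter linear-reservoir runoff model whose simulation loop carries the derivative of its state alongside the state itself, so an unknown recession constant can be recovered from observed streamflow by gradient descent. Real DG systems obtain these derivatives by automatic differentiation and put a neural network where the scalar parameter sits here.

```python
# Hand-rolled forward-mode differentiation through a toy hydrology model.
# State update: S[t+1] = S[t] + P[t] - Q[t], with streamflow Q[t] = k * S[t].
# We track dS/dk next to S, so dQ/dk is available at every time step.

def simulate_with_grad(k, precip, s0=0.0):
    """Run the reservoir; return streamflow and dQ/dk per time step."""
    s, ds_dk = s0, 0.0
    flows, grads = [], []
    for p in precip:
        q = k * s                      # outflow from current storage
        dq_dk = s + k * ds_dk          # product rule on Q = k * S
        flows.append(q)
        grads.append(dq_dk)
        ds_dk = (1.0 - k) * ds_dk - s  # tangent of S[t+1] = (1-k)*S[t] + P[t]
        s = s + p - q                  # storage update
    return flows, grads

def fit_k(precip, q_obs, k0=0.1, lr=0.002, steps=500):
    """Gradient descent on the squared-error loss over the hydrograph."""
    k = k0
    for _ in range(steps):
        flows, grads = simulate_with_grad(k, precip)
        dloss_dk = sum(2.0 * (q - qo) * g
                       for q, qo, g in zip(flows, q_obs, grads))
        k -= lr * dloss_dk
    return k

# Synthetic experiment: make observations with a "true" k, then recover it.
precip = [1.0, 0.0, 0.5, 0.0, 0.0, 2.0, 0.0, 0.0, 1.0, 0.0] * 5
true_k = 0.3
q_obs, _ = simulate_with_grad(true_k, precip)
k_hat = fit_k(precip, q_obs)
```

Because the whole simulation is differentiable, the same recipe extends from one scalar to the high-dimensional unknown relationships the abstract describes: replace k with a network's weights and let automatic differentiation supply the gradients.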
Related papers
- MergeNet: Knowledge Migration across Heterogeneous Models, Tasks, and Modalities [72.68829963458408]
We present MergeNet, which learns to bridge the gap of parameter spaces of heterogeneous models.
The core mechanism of MergeNet lies in the parameter adapter, which operates by querying the source model's low-rank parameters.
MergeNet is learned alongside both models, allowing our framework to dynamically transfer and adapt knowledge relevant to the current stage.
arXiv Detail & Related papers (2024-04-20T08:34:39Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Enhancing Dynamical System Modeling through Interpretable Machine Learning Augmentations: A Case Study in Cathodic Electrophoretic Deposition [0.8796261172196743]
We introduce a comprehensive data-driven framework aimed at enhancing the modeling of physical systems.
As a demonstrative application, we pursue the modeling of cathodic electrophoretic deposition (EPD), commonly known as e-coating.
arXiv Detail & Related papers (2024-01-16T14:58:21Z)
- Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective [106.92016199403042]
We empirically investigate knowledge transfer from larger to smaller models through a parametric perspective.
We employ sensitivity-based techniques to extract and align knowledge-specific parameters between different large language models.
Our findings highlight the critical factors contributing to the process of parametric knowledge transfer.
arXiv Detail & Related papers (2023-10-17T17:58:34Z)
- Physics Inspired Hybrid Attention for SAR Target Recognition [61.01086031364307]
We propose a physics inspired hybrid attention (PIHA) mechanism and the once-for-all (OFA) evaluation protocol to address the issues.
PIHA leverages the high-level semantics of physical information to activate and guide feature groups that are aware of the local semantics of the target.
Our method outperforms other state-of-the-art approaches in 12 test scenarios with the same ASC parameters.
arXiv Detail & Related papers (2023-09-27T14:39:41Z)
- When Geoscience Meets Foundation Models: Towards General Geoscience Artificial Intelligence System [6.445323648941926]
Geoscience foundation models (GFMs) are a paradigm-shifting solution, integrating extensive cross-disciplinary data to enhance the simulation and understanding of Earth system dynamics.
The unique strengths of GFMs include flexible task specification, diverse input-output capabilities, and multi-modal knowledge representation.
This review offers a comprehensive overview of emerging geoscientific research paradigms, emphasizing the untapped opportunities at the intersection of advanced AI techniques and geoscience.
arXiv Detail & Related papers (2023-09-13T08:44:09Z)
- Beyond Convergence: Identifiability of Machine Learning and Deep Learning Models [0.0]
We investigate the notion of model parameter identifiability through a case study focused on parameter estimation from motion sensor data.
We employ a deep neural network to estimate subject-wise parameters, including mass, stiffness, and equilibrium leg length.
The results show that while certain parameters can be identified from the observation data, others remain unidentifiable.
arXiv Detail & Related papers (2023-07-21T03:40:53Z)
- Differentiable Agent-based Epidemiology [71.81552021144589]
We introduce GradABM: a scalable, differentiable design for agent-based modeling that is amenable to gradient-based learning with automatic differentiation.
GradABM can quickly simulate million-size populations in a few seconds on commodity hardware, integrate with deep neural networks, and ingest heterogeneous data sources.
arXiv Detail & Related papers (2022-07-20T07:32:02Z)
- Bayesian Active Learning for Discrete Latent Variable Models [19.852463786440122]
Active learning seeks to reduce the amount of data required to fit the parameters of a model.
Latent variable models play a vital role in neuroscience, psychology, and a variety of other engineering and scientific disciplines.
arXiv Detail & Related papers (2022-02-27T19:07:12Z)
- Modeling System Dynamics with Physics-Informed Neural Networks Based on Lagrangian Mechanics [3.214927790437842]
Two main modeling approaches often fail to meet requirements: first-principles methods suffer from high bias, whereas data-driven modeling tends to have high variance.
We present physics-informed neural ordinary differential equations (PINODE), a hybrid model that combines the two modeling techniques to overcome the aforementioned problems.
Our findings are of interest for model-based control and system identification of mechanical systems.
arXiv Detail & Related papers (2020-05-29T15:10:43Z)
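The first-principles-plus-learned-correction idea behind hybrid models such as PINODE can be sketched in miniature (illustrative only; PINODE itself trains a neural network through an ODE solver): keep the pendulum's gravity term as known physics, and fit a single unknown damping coefficient, a one-parameter stand-in for the learned component, to the residual accelerations by least squares.

```python
import math

G_OVER_L = 9.81 / 1.0  # known physics: gravitational acceleration over pendulum length

def simulate(c, theta0=1.0, omega0=0.0, dt=0.01, steps=500):
    """Semi-implicit Euler for a damped pendulum; returns (angle, rate, accel) samples."""
    theta, omega = theta0, omega0
    samples = []
    for _ in range(steps):
        accel = -G_OVER_L * math.sin(theta) - c * omega  # physics term + damping
        samples.append((theta, omega, accel))
        omega += dt * accel
        theta += dt * omega
    return samples

def fit_damping(samples):
    """Least-squares fit of c in: accel + G_OVER_L*sin(theta) = -c * omega."""
    num = sum((a + G_OVER_L * math.sin(th)) * om for th, om, a in samples)
    den = sum(om * om for _, om, _ in samples)
    return -num / den

# The known term explains most of the motion; the fitted term absorbs the rest.
true_c = 0.25
samples = simulate(true_c)
c_hat = fit_damping(samples)
```

Swapping the scalar c for a neural network g(theta, omega) and training it through the integrator gives the physics-informed neural ODE pattern: low bias from the mechanistic term, and controlled variance because the learned component only models the residual.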
This list is automatically generated from the titles and abstracts of the papers in this site.