Learning Deep Hybrid Models with Sharpness-Aware Minimization
- URL: http://arxiv.org/abs/2602.06837v1
- Date: Fri, 06 Feb 2026 16:27:19 GMT
- Title: Learning Deep Hybrid Models with Sharpness-Aware Minimization
- Authors: Naoya Takeishi
- Abstract summary: We propose to focus on the flatness of loss minima in learning hybrid models, aiming to make the model as simple as possible. We employ the idea of sharpness-aware minimization and adapt it to the hybrid modeling setting. Numerical experiments show that the SAM-based method works well across different choices of models and datasets.
- Score: 4.8941886361557625
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hybrid modeling, the combination of machine learning models and scientific mathematical models, enables flexible and robust data-driven prediction with partial interpretability. However, the scientific models may effectively be ignored in prediction because of the flexibility of the machine learning model, defeating the purpose of hybrid modeling. Typically, some regularization is applied in hybrid model learning to avoid such a failure case, but the formulation of the regularizer depends strongly on the model architecture and on domain knowledge. In this paper, we propose to focus on the flatness of loss minima in learning hybrid models, aiming to make the model as simple as possible. We employ the idea of sharpness-aware minimization and adapt it to the hybrid modeling setting. Numerical experiments show that the SAM-based method works well across different choices of models and datasets.
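The sharpness-aware minimization idea described in the abstract can be sketched as follows. This is a minimal illustration on a toy linear-regression loss; the toy problem, hyperparameters, and function names are illustrative assumptions, not details from the paper:

```python
import numpy as np

# Sketch of a sharpness-aware minimization (SAM) step, applied here to a
# toy linear-regression loss. The toy problem, hyperparameters, and names
# are illustrative assumptions, not details taken from the paper.

def loss_and_grad(w, X, y):
    """Squared-error loss of a linear model and its gradient."""
    r = X @ w - y
    return 0.5 * np.mean(r ** 2), X.T @ r / len(y)

def sam_step(w, X, y, lr=0.1, rho=0.05):
    """One SAM update: perturb the weights toward the locally worst-case
    point within an L2 ball of radius rho, then descend using the
    gradient evaluated at that perturbed point."""
    _, g = loss_and_grad(w, X, y)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent direction, norm rho
    _, g_adv = loss_and_grad(w + eps, X, y)      # gradient at w + eps
    return w - lr * g_adv

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))
y = X @ np.array([1.0, -2.0, 0.5])  # noiseless targets from a known model
w = np.zeros(3)
for _ in range(200):
    w = sam_step(w, X, y)
final_loss, _ = loss_and_grad(w, X, y)
print(final_loss)
```

Because SAM descends along the gradient at the perturbed point rather than at the current weights, it favors minima whose loss stays low in a neighborhood, i.e., flat minima.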
Related papers
- Supervised Score-Based Modeling by Gradient Boosting [49.556736252628745]
We propose a Supervised Score-based Model (SSM), which can be viewed as a gradient boosting algorithm combined with score matching. We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy. Our model outperforms existing models in both accuracy and inference time.
arXiv Detail & Related papers (2024-11-02T07:06:53Z) - Learnable & Interpretable Model Combination in Dynamical Systems Modeling [0.0]
This work briefly discusses which types of model are usually combined in dynamical systems modeling. We propose a class of models that is capable of expressing mixed algebraic, discrete, and differential equation-based models. Finally, we propose a new wildcard architecture that is capable of describing arbitrary combinations of models in an easy-to-interpret fashion.
arXiv Detail & Related papers (2024-06-12T11:17:11Z) - Hybrid$^2$ Neural ODE Causal Modeling and an Application to Glycemic Response [5.754225700181611]
We demonstrate our ability to achieve a win-win: state-of-the-art predictive performance *and* causal validity in the challenging task of modeling glucose dynamics post-exercise in individuals with type 1 diabetes.
arXiv Detail & Related papers (2024-02-27T06:01:56Z) - Online Calibration of Deep Learning Sub-Models for Hybrid Numerical Modeling Systems [34.50407690251862]
We present an efficient and practical online learning approach for hybrid systems.
We demonstrate that the method, called EGA for Euler Gradient Approximation, converges to the exact gradients in the limit of infinitely small time steps.
Results show significant improvements over offline learning, highlighting the potential of end-to-end online learning for hybrid modeling.
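The Euler Gradient Approximation idea summarized above can be illustrated on a scalar ODE: differentiating an unrolled forward-Euler scheme recovers the exact parameter gradient as the step size shrinks. The concrete ODE, names, and tolerances below are illustrative assumptions, not details from that paper:

```python
import math

# Illustrative sketch: gradient of an ODE solution w.r.t. a parameter,
# obtained by differentiating an unrolled forward-Euler scheme. The ODE
# dx/dt = -theta * x and all names here are illustrative assumptions.

def euler_state_and_grad(theta, x0=1.0, T=1.0, n_steps=1000):
    """Integrate dx/dt = -theta * x with forward Euler and accumulate
    dx/dtheta by differentiating each Euler step (chain rule)."""
    h = T / n_steps
    x, dx_dtheta = x0, 0.0
    for _ in range(n_steps):
        # x_{k+1} = x_k * (1 - theta*h); differentiate w.r.t. theta:
        dx_dtheta = dx_dtheta * (1 - theta * h) - h * x
        x = x * (1 - theta * h)
    return x, dx_dtheta

theta = 0.7
x_T, g_euler = euler_state_and_grad(theta)
# Analytic solution x(T) = x0 * exp(-theta*T), so dx(T)/dtheta = -T*x(T).
g_exact = -1.0 * math.exp(-theta * 1.0)
print(g_euler, g_exact)
```

As `n_steps` grows, the Euler gradient approaches the analytic one, matching the convergence behavior described in the summary.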
arXiv Detail & Related papers (2023-11-17T17:36:26Z) - Online simulator-based experimental design for cognitive model selection [74.76661199843284]
We propose BOSMOS: an approach to experimental design that can select between computational models without tractable likelihoods.
In simulated experiments, we demonstrate that the proposed BOSMOS technique can accurately select models in up to 2 orders of magnitude less time than existing LFI alternatives.
arXiv Detail & Related papers (2023-03-03T21:41:01Z) - Minimal Value-Equivalent Partial Models for Scalable and Robust Planning in Lifelong Reinforcement Learning [56.50123642237106]
Common practice in model-based reinforcement learning is to learn models that model every aspect of the agent's environment.
We argue that such models are not particularly well-suited for performing scalable and robust planning in lifelong reinforcement learning scenarios.
We propose new kinds of models that only model the relevant aspects of the environment, which we call "minimal value-equivalent partial models".
arXiv Detail & Related papers (2023-01-24T16:40:01Z) - Dataless Knowledge Fusion by Merging Weights of Language Models [47.432215933099016]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models. This creates a barrier to fusing knowledge across individual models to yield a better single model. We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z) - Integrating Physics-Based Modeling with Machine Learning for Lithium-Ion Batteries [4.946066838162504]
This paper proposes two new frameworks to integrate physics-based models with machine learning to achieve high-precision modeling for LiBs.
The frameworks are characterized by informing the machine learning model of the state information of the physical model.
The study further expands to conduct aging-aware hybrid modeling, leading to the design of a hybrid model conscious of the state-of-health to make prediction.
arXiv Detail & Related papers (2021-12-24T07:39:02Z) - Hybrid modeling of the human cardiovascular system using NeuralFMUs [0.0]
We show that the hybrid modeling process is more convenient, requires less system knowledge, and is less error-prone compared to modeling based solely on first principles.
The resulting hybrid model shows improved computational performance compared to a pure first-principles white-box model.
The considered use-case can serve as example for other modeling and simulation applications in and beyond the medical domain.
arXiv Detail & Related papers (2021-09-10T13:48:43Z) - Model-agnostic multi-objective approach for the evolutionary discovery of mathematical models [55.41644538483948]
In modern data science, it is often more important to understand the properties of a model and which of its parts could be replaced to obtain better results.
We use multi-objective evolutionary optimization for composite data-driven model learning to obtain the algorithm's desired properties.
arXiv Detail & Related papers (2021-07-07T11:17:09Z) - Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z) - Hybrid modeling: Applications in real-time diagnosis [64.5040763067757]
We outline a novel hybrid modeling approach that combines machine learning inspired models and physics-based models.
We are using such models for real-time diagnosis applications.
arXiv Detail & Related papers (2020-03-04T00:44:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.