Explainable data-driven modeling via mixture of experts: towards
effective blending of grey and black-box models
- URL: http://arxiv.org/abs/2401.17118v1
- Date: Tue, 30 Jan 2024 15:53:07 GMT
- Title: Explainable data-driven modeling via mixture of experts: towards
effective blending of grey and black-box models
- Authors: Jessica Leoni, Valentina Breschi, Simone Formentin, Mara Tanelli
- Abstract summary: We propose a comprehensive framework based on a "mixture of experts" rationale.
This approach enables the data-based fusion of diverse local models.
We penalize abrupt variations in the experts' combination to enhance interpretability.
- Score: 6.331947318187792
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional models grounded in first principles often struggle with accuracy
as the system's complexity increases. Conversely, machine learning approaches,
while powerful, face challenges in interpretability and in handling physical
constraints. Efforts to combine these models often stumble upon
difficulties in finding a balance between accuracy and complexity. To address
these issues, we propose a comprehensive framework based on a "mixture of
experts" rationale. This approach enables the data-based fusion of diverse
local models, leveraging the full potential of first-principle-based priors.
Our solution allows independent training of experts, drawing on techniques from
both machine learning and system identification, and it supports both
collaborative and competitive learning paradigms. To enhance interpretability,
we penalize abrupt variations in the experts' combination. Experimental results
validate the effectiveness of our approach in producing an interpretable
combination of models closely resembling the target phenomena.
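As a toy illustration of the blending rationale described in the abstract — a minimal sketch in which the two experts, the gating parameterizations, and the squared-difference smoothness penalty are all illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

# Two local experts: a first-principles (grey-box) prior and a
# data-driven (black-box) stand-in. Both are hypothetical examples.
def grey_box_expert(x):
    # Assumed known linear law y = 2x, valid in one regime.
    return 2.0 * x

def black_box_expert(x):
    # Fixed quadratic stand-in for a learned nonlinear model.
    return 0.5 * x ** 2

def moe_predict(x, w):
    # Pointwise convex combination of the experts via gating weights w.
    return w * grey_box_expert(x) + (1.0 - w) * black_box_expert(x)

def moe_loss(x, y, w, lam=1.0):
    # Fit error plus a penalty on abrupt variations in the experts'
    # combination (squared differences of consecutive gating weights).
    fit = np.mean((moe_predict(x, w) - y) ** 2)
    smooth = np.sum(np.diff(w) ** 2)
    return fit + lam * smooth

x = np.linspace(0.0, 4.0, 50)
y = np.where(x < 2.0, 2.0 * x, 0.5 * x ** 2)    # two functional regimes
w_smooth = 1.0 / (1.0 + np.exp(5.0 * (x - 2.0)))  # gradual hand-over
w_abrupt = (x < 2.0).astype(float)                # hard switch
loss_smooth = moe_loss(x, y, w_smooth)
loss_abrupt = moe_loss(x, y, w_abrupt)
```

Under this penalty, the gradual hand-over between experts is preferred over the hard switch even though the hard switch fits the data exactly, which is the interpretability trade-off the abstract alludes to.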
Related papers
- Causality can systematically address the monsters under the bench(marks) [64.36592889550431]
Benchmarks are plagued by various biases, artifacts, or leakage.
Models may behave unreliably due to poorly explored failure modes.
Causality offers an ideal framework to systematically address these challenges.
arXiv Detail & Related papers (2025-02-07T17:01:37Z)
- Beyond Interpolation: Extrapolative Reasoning with Reinforcement Learning and Graph Neural Networks [18.982541044390384]
This study focuses on the impact of the inductive bias of the architecture, different reward systems and the role of recurrent modeling in enabling sequential reasoning.
We show how these elements contribute to successful extrapolation on increasingly complex puzzles.
arXiv Detail & Related papers (2025-02-06T08:07:35Z)
- Orthogonal projection-based regularization for efficient model augmentation [2.6071013155805556]
Deep-learning-based nonlinear system identification has shown the ability to produce reliable and highly accurate models in practice.
Black-box models lack physical interpretability, and often a considerable part of the learning effort is spent on capturing already expected/known behavior.
A potential solution is to integrate prior physical knowledge directly into the model structure, combining the strengths of physics-based modeling and deep-learning identification.
arXiv Detail & Related papers (2025-01-10T10:33:13Z)
- Imitate, Explore, and Self-Improve: A Reproduction Report on Slow-thinking Reasoning Systems [92.89673285398521]
o1-like reasoning systems have demonstrated remarkable capabilities in solving complex reasoning tasks.
We introduce an "imitate, explore, and self-improve" framework to train the reasoning model.
Our approach achieves competitive performance compared to industry-level reasoning systems.
arXiv Detail & Related papers (2024-12-12T16:20:36Z)
- Dynamic Post-Hoc Neural Ensemblers [55.15643209328513]
In this study, we explore employing neural networks as ensemble methods.
Motivated by the risk of learning low-diversity ensembles, we propose regularizing the model by randomly dropping base model predictions.
We demonstrate that this approach lower-bounds the diversity within the ensemble, reducing overfitting and improving generalization capabilities.
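The base-prediction dropping idea can be sketched as follows — a hypothetical illustration in which the drop rate, shapes, and averaging ensembler are assumptions, not details from the paper:

```python
import numpy as np

def drop_base_predictions(preds, p_drop, rng):
    """preds: (n_samples, n_models) array of base model predictions.
    Randomly drop each base prediction independently per sample and
    rescale the survivors (inverted-dropout style) so the expected
    aggregate is unchanged."""
    keep = rng.random(preds.shape) >= p_drop
    return preds * keep / (1.0 - p_drop)

rng = np.random.default_rng(0)
preds = rng.normal(size=(1000, 5))          # 5 base models, 1000 samples
dropped = drop_base_predictions(preds, p_drop=0.2, rng=rng)
ensemble = dropped.mean(axis=1)             # a simple averaging ensembler
```

Because the ensembler never sees all base predictions at once during training, it cannot collapse onto a single base model, which is the low-diversity failure mode the summary mentions.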
arXiv Detail & Related papers (2024-10-06T15:25:39Z)
- Model-Agnostic Interpretation Framework in Machine Learning: A Comparative Study in NBA Sports [0.2937071029942259]
We propose an innovative framework to reconcile the trade-off between model performance and interpretability.
Our approach is centered around modular operations on high-dimensional data, which enable end-to-end processing while preserving interpretability.
We have extensively tested our framework and validated its superior efficacy in achieving a balance between computational efficiency and interpretability.
arXiv Detail & Related papers (2024-01-05T04:25:21Z)
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer [59.43462055143123]
The Mixture of Experts (MoE) has emerged as a highly successful technique in deep learning.
In this study, we shed light on the homogeneous representation problem, wherein experts in the MoE fail to specialize and lack diversity.
We propose an alternating training strategy that encourages each expert to update in a direction orthogonal to the subspace spanned by other experts.
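The orthogonal-update idea can be sketched as a projection step — a minimal illustration assuming flattened parameter vectors and a plain complement projection, not the paper's exact optimizer:

```python
import numpy as np

def orthogonal_update(grad, others):
    """Remove from `grad` its component lying in the subspace spanned
    by the columns of `others` (the other experts' directions), so the
    update moves this expert away from its peers."""
    q, _ = np.linalg.qr(others)        # orthonormal basis of the subspace
    return grad - q @ (q.T @ grad)     # projection onto the complement

rng = np.random.default_rng(0)
others = rng.normal(size=(8, 3))       # 3 other experts, 8-dim parameters
grad = rng.normal(size=8)              # this expert's raw gradient
g_perp = orthogonal_update(grad, others)
```

After the projection, `g_perp` has no overlap with the other experts' span, which is one way to encourage the specialization and diversity the summary describes.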
arXiv Detail & Related papers (2023-10-15T07:20:28Z)
- A Competitive Learning Approach for Specialized Models: A Solution for Complex Physical Systems with Distinct Functional Regimes [0.0]
We propose a novel competitive learning approach for obtaining data-driven models of physical systems.
The primary idea behind the proposed approach is to employ dynamic loss functions for a set of models that are trained concurrently on the data.
arXiv Detail & Related papers (2023-07-19T23:29:40Z)
- Joint Training of Deep Ensembles Fails Due to Learner Collusion [61.557412796012535]
Ensembles of machine learning models have been well established as a powerful method of improving performance over a single model.
Traditionally, ensembling algorithms train their base learners independently or sequentially with the goal of optimizing their joint performance.
Directly minimizing the loss of the ensemble, however, appears to rarely be applied in practice.
arXiv Detail & Related papers (2023-01-26T18:58:07Z)
- A non-cooperative meta-modeling game for automated third-party calibrating, validating, and falsifying constitutive laws with parallelized adversarial attacks [6.113400683524824]
The evaluation of models, especially for high-risk and high-regret engineering applications, requires efficient and rigorous third-party calibration, validation and falsification.
This work attempts to introduce concepts from game theory and machine learning techniques to overcome many of these existing difficulties.
We introduce an automated meta-modeling game where two competing AI agents generate experimental data to calibrate a given model and to explore its weakness, in order to improve experiment design and model robustness through competition.
arXiv Detail & Related papers (2020-04-13T18:43:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.