Statistical Inference of Minimally Complex Models
- URL: http://arxiv.org/abs/2008.00520v2
- Date: Mon, 27 Sep 2021 22:32:38 GMT
- Title: Statistical Inference of Minimally Complex Models
- Authors: Clélia de Mulatier, Paolo P. Mazza, Matteo Marsili
- Abstract summary: Minimally Complex Models (MCMs) are spin models with interactions of arbitrary order.
We show that Bayesian model selection restricted to these models is computationally feasible.
Their evidence, which trades off goodness-of-fit against model complexity, can be computed easily without any parameter fitting.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Finding the model that best describes a high dimensional dataset is a
daunting task. For binary data, we show that this becomes feasible when
restricting the search to a family of simple models that we call Minimally
Complex Models (MCMs). These are spin models, with interactions of arbitrary
order, that are composed of independent components of minimal complexity
(Beretta et al., 2018). They tend to be simple in information-theoretic terms,
which means that they fit well only specific types of data and are therefore
easy to falsify. We show that Bayesian model selection restricted to
these models is computationally feasible and has many other advantages. First,
their evidence, which trades off goodness-of-fit against model complexity, can
be computed easily without any parameter fitting. This allows selecting the
best MCM among all of them, even though the number of models is astronomically large.
Furthermore, MCMs can be inferred and sampled from without any computational
effort. Finally, model selection among MCMs is invariant with respect to
changes in the representation of the data. MCMs portray the structure of
dependencies among variables in a simple way, as illustrated in several
examples, and thus provide robust predictions on dependencies in the data. MCMs
contain interactions of any order between variables, and thus may reveal the
presence of interactions of order higher than pairwise.
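To make the "no parameter fitting" point concrete: the evidence of an MCM factorizes over its independent complete components, and each factor reduces to a multinomial-Dirichlet marginal likelihood that depends only on how often each component state occurs in the data. The sketch below assumes the standard Dirichlet(1/2) (Jeffreys-type) closed form, which is how we read the paper's result; the toy data and the particular partition of spins into components are ours, not the paper's.

```python
import numpy as np
from scipy.special import gammaln

def log_evidence_mcm(data, partition):
    """Log-evidence of an MCM for binary data, with no parameter fitting.

    data      : (N, n) array of 0/1 spins.
    partition : list of index lists; each block is an independent
                complete component of rank r = len(block).
    """
    N = data.shape[0]
    log_e = 0.0
    for block in partition:
        r = len(block)
        q = 2 ** r                                     # states of this component
        states = data[:, block] @ (2 ** np.arange(r))  # sample -> component state
        counts = np.bincount(states.astype(int), minlength=q)
        # Multinomial-Dirichlet(1/2) marginal likelihood of the counts:
        # Gamma(q/2)/Gamma(N+q/2) * prod_s Gamma(k_s+1/2)/Gamma(1/2)
        log_e += gammaln(q / 2) - gammaln(N + q / 2)
        log_e += np.sum(gammaln(counts + 0.5) - gammaln(0.5))
    return log_e

# Toy comparison of two MCMs on random data: one 4-spin component
# versus four independent spins; the larger log-evidence wins.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1000, 4))
print(log_evidence_mcm(X, [[0, 1, 2, 3]]))
print(log_evidence_mcm(X, [[0], [1], [2], [3]]))
```

Because the evidence is a sum of per-component terms of this form, comparing candidate partitions requires only counting, which is what makes the search over the astronomically many MCMs tractable.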
Related papers
- AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization [86.8133939108057]
We propose AdaMMS, a novel model merging method tailored for heterogeneous MLLMs.
Our method tackles the challenges in three steps: mapping, merging and searching.
As the first model merging method capable of merging heterogeneous MLLMs without labeled data, AdaMMS outperforms previous model merging methods on various vision-language benchmarks.
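As a rough illustration of the merging and searching steps on already-aligned parameters (the mapping step for heterogeneous architectures is omitted, and `disagreement` is a placeholder for the paper's unsupervised criterion; all names here are ours):

```python
import numpy as np

def merge(theta_a, theta_b, alpha):
    """Merging step: linear interpolation of two aligned state dicts."""
    return {k: alpha * theta_a[k] + (1 - alpha) * theta_b[k] for k in theta_a}

def search_alpha(theta_a, theta_b, disagreement, grid=np.linspace(0.0, 1.0, 11)):
    """Searching step: pick the coefficient minimizing an unsupervised
    disagreement score on unlabeled inputs, so no labels are needed."""
    scores = [disagreement(merge(theta_a, theta_b, a)) for a in grid]
    return float(grid[int(np.argmin(scores))])
```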
arXiv Detail & Related papers (2025-03-31T05:13:02Z)
- Inferring High-Order Couplings with Neural Networks [3.55026004901472]
We introduce a new method that maps Restricted Boltzmann Machines to generalized Potts models, allowing for the extraction of interactions up to any specified order.
Our validation on synthetic datasets confirms the method's ability to recover two- and three-body interactions accurately.
When applied to protein sequence data, the framework competently reconstructs protein contact maps and provides performance comparable to the best inverse Potts models.
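For intuition only, here is a brute-force version of coupling extraction for a small RBM with ±1 visible units: expand the unnormalized log-probability in products of spins, whose coefficients are the effective interactions of every order. The paper's scalable Potts-model mapping replaces this exhaustive enumeration; the code and names below are ours.

```python
import itertools
import numpy as np

def rbm_log_score(v, a, b, W):
    """Unnormalized log-probability of visible spins v in {-1,+1}^n."""
    return v @ a + np.sum(np.logaddexp(0.0, b + v @ W))

def effective_couplings(a, b, W):
    """Coefficients J[A] of the spin-product expansion of the log-score,
    i.e. interactions of every order (fields, pairwise, 3-body, ...)."""
    n = len(a)
    states = np.array(list(itertools.product([-1, 1], repeat=n)))
    f = np.array([rbm_log_score(v, a, b, W) for v in states])
    return {A: float(np.mean(f * np.prod(states[:, A], axis=1)))
            for k in range(1, n + 1)
            for A in itertools.combinations(range(n), k)}
```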
arXiv Detail & Related papers (2025-01-10T17:01:09Z)
- Model aggregation: minimizing empirical variance outperforms minimizing empirical error [0.29008108937701327]
We propose a data-driven framework that aggregates predictions from diverse models into a single, more accurate output.
It is non-intrusive (it treats models as black-box functions), model-agnostic, requires minimal assumptions, and can combine outputs from a wide range of models.
We show how it successfully integrates traditional solvers with machine learning models to improve both robustness and accuracy.
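A hedged sketch of the variance-minimizing idea: given held-out residuals from each black-box model, choose weights summing to one that minimize the empirical variance of the combined error, which has the classical closed form w proportional to C^-1 1. The paper's exact estimator may differ in its details.

```python
import numpy as np

def min_variance_weights(residuals):
    """residuals: (n_samples, n_models) held-out errors of each model.
    Returns weights (summing to 1) minimizing the aggregate's variance."""
    C = np.cov(residuals, rowvar=False)
    C += 1e-8 * np.eye(C.shape[0])        # small ridge for numerical stability
    w = np.linalg.solve(C, np.ones(C.shape[0]))
    return w / w.sum()

def aggregate(predictions, weights):
    """Combine (n_samples, n_models) black-box predictions into one output."""
    return predictions @ weights
```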
arXiv Detail & Related papers (2024-09-25T18:33:21Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We propose Elect, Mask & Rescale-Merging (EMR-Merging), which shows outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, thus requiring no data availability or any additional training while showing impressive performance.
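A sketch of the Elect, Mask & Rescale steps as we read them, on flat parameter vectors for simplicity (exact details may differ from the paper):

```python
import numpy as np

def emr_merge(theta_pre, thetas):
    """theta_pre: pretrained weights; thetas: task-specific fine-tuned weights."""
    taus = np.stack([t - theta_pre for t in thetas])     # task vectors
    sign = np.sign(taus.sum(axis=0))                     # Elect: unified sign
    agree = np.sign(taus) == sign
    mag = np.where(agree, np.abs(taus), 0.0).max(axis=0)
    tau_uni = sign * mag                                 # unified task vector
    masks = agree.astype(float)                          # Mask: per-task, tuning-free
    denom = (masks * np.abs(tau_uni)).sum(axis=1)
    scales = np.abs(taus).sum(axis=1) / np.maximum(denom, 1e-12)  # Rescale
    return tau_uni, masks, scales

def task_model(theta_pre, tau_uni, mask, scale):
    """Per-task model reconstructed at inference time."""
    return theta_pre + scale * mask * tau_uni
```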
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- Induced Model Matching: Restricted Models Help Train Full-Featured Models [1.4963011898406866]
We consider scenarios where a very accurate (often small) predictive model using restricted features is available when training a full-featured (often larger) model.
How can the restricted model be useful to the full model?
We introduce a methodology called Induced Model Matching (IMM), which aligns the context-restricted, or induced, version of the large model with the restricted model.
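A hypothetical sketch of such an alignment penalty: approximate the large model's induced prediction by Monte-Carlo averaging its outputs over full contexts consistent with each restricted context, then match it to the restricted model with a KL term. The sampling scheme and all names are our illustration, not the paper's recipe.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def imm_penalty(full_logits_fn, restricted_probs, completions):
    """completions: (B, K, d) full contexts sampled consistently with each of
    B restricted contexts; restricted_probs: (B, C) restricted-model outputs.
    Returns mean KL(restricted || induced) to add to the training loss."""
    B, K, d = completions.shape
    p = softmax(full_logits_fn(completions.reshape(B * K, d))).reshape(B, K, -1)
    induced = p.mean(axis=1)                 # induced model: averaged prediction
    kl = np.sum(restricted_probs * (np.log(restricted_probs + 1e-12)
                                    - np.log(induced + 1e-12)), axis=-1)
    return float(kl.mean())
```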
arXiv Detail & Related papers (2024-02-19T20:21:09Z)
- Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL [57.745700271150454]
We study the sample complexity of reinforcement learning in Mean-Field Games (MFGs) with model-based function approximation.
We introduce the Partial Model-Based Eluder Dimension (P-MBED), a more effective notion to characterize the model class complexity.
arXiv Detail & Related papers (2024-02-08T14:54:47Z)
- Representation Surgery for Multi-Task Model Merging [57.63643005215592]
Multi-task learning (MTL) compresses the information from multiple tasks into a unified backbone to improve computational efficiency and generalization.
Recent work directly merges multiple independently trained models to perform MTL instead of collecting their raw data for joint training.
By visualizing the representation distribution of existing model merging schemes, we find that the merged model often suffers from the dilemma of representation bias.
arXiv Detail & Related papers (2024-02-05T03:39:39Z)
- Sample Complexity Characterization for Linear Contextual MDPs [67.79455646673762]
Contextual Markov decision processes (CMDPs) describe a class of reinforcement learning problems in which the transition kernels and reward functions can change over time with different MDPs indexed by a context variable.
CMDPs serve as an important framework to model many real-world applications with time-varying environments.
We study CMDPs under two linear function approximation models: Model I with context-varying representations and common linear weights for all contexts; and Model II with common representations for all contexts and context-varying linear weights.
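In symbols (our notation, following the usual linear-MDP convention in which features may also depend on the next state; this exact parametrization is an assumption):

```latex
% Model I: context-varying features, weights shared across contexts
\mathbb{P}_c(s' \mid s, a) = \langle \phi_c(s, a, s'),\, \theta \rangle
% Model II: shared features, context-varying weights
\mathbb{P}_c(s' \mid s, a) = \langle \phi(s, a, s'),\, \theta_c \rangle
```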
arXiv Detail & Related papers (2024-02-05T03:25:04Z)
- Exact and general decoupled solutions of the LMC Multitask Gaussian Process model [28.32223907511862]
The Linear Model of Co-regionalization (LMC) is a very general multitask Gaussian process model for regression or classification.
Recent work has shown that under some conditions the latent processes of the model can be decoupled, leading to a complexity that is only linear in the number of said processes.
We extend these results, showing under the most general assumptions that the only condition necessary for an efficient exact computation of the LMC is a mild hypothesis on the noise model.
arXiv Detail & Related papers (2023-10-18T15:16:24Z)
- Bayesian Learning of Coupled Biogeochemical-Physical Models [28.269731698116257]
Predictive models for marine ecosystems are used for a variety of needs.
Due to sparse measurements and limited understanding of the myriad of ocean processes, there is significant uncertainty.
We develop a Bayesian model learning methodology that allows handling uncertainty in the space of candidate models and the discovery of new models.
arXiv Detail & Related papers (2022-11-12T17:49:18Z)
- PAC Reinforcement Learning for Predictive State Representations [60.00237613646686]
We study online Reinforcement Learning (RL) in partially observable dynamical systems.
We focus on the Predictive State Representations (PSRs) model, which is an expressive model that captures other well-known models.
We develop a novel model-based algorithm for PSRs that can learn a near-optimal policy with sample complexity scaling polynomially in the relevant problem parameters.
arXiv Detail & Related papers (2022-07-12T17:57:17Z)
- Low-Rank Constraints for Fast Inference in Structured Models [110.38427965904266]
This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models.
Experiments with neural parameterized structured models for language modeling, polyphonic music modeling, unsupervised grammar induction, and video modeling show that our approach matches the accuracy of standard models at large state spaces.
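As one concrete instance of this kind of rank trick (our toy example, not the paper's implementation): if a hidden Markov model's q x q transition matrix is stored in factored form T = U V^T with U, V of shape (q, k), the forward pass costs O(qk) per step instead of O(q^2).

```python
import numpy as np

def forward_lowrank(log_init, U, V, log_emit):
    """Scaled HMM forward pass with a low-rank transition T = U @ V.T.

    log_init : (q,) initial log-probabilities.
    U, V     : (q, k) factors of the transition matrix.
    log_emit : (T_steps, q) per-step emission log-likelihoods.
    Returns the total log-likelihood of the observation sequence."""
    alpha = np.exp(log_init + log_emit[0])
    s = alpha.sum(); log_lik = np.log(s); alpha /= s
    for t in range(1, len(log_emit)):
        alpha = (V @ (U.T @ alpha)) * np.exp(log_emit[t])  # O(qk) transition
        s = alpha.sum(); log_lik += np.log(s); alpha /= s
    return log_lik
```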
arXiv Detail & Related papers (2022-01-08T00:47:50Z)
- Revisiting minimum description length complexity in overparameterized models [38.21167656112762]
We provide an extensive theoretical characterization of MDL-COMP for linear models and kernel methods.
For kernel methods, we show that MDL-COMP informs minimax in-sample error, and can decrease as the dimensionality of the input increases.
We also prove that MDL-COMP bounds the in-sample mean squared error (MSE).
arXiv Detail & Related papers (2020-06-17T22:45:14Z)