A theoretical framework for overfitting in energy-based modeling
- URL: http://arxiv.org/abs/2501.19158v1
- Date: Fri, 31 Jan 2025 14:21:02 GMT
- Title: A theoretical framework for overfitting in energy-based modeling
- Authors: Giovanni Catania, Aurélien Decelle, Cyril Furtlehner, Beatriz Seoane,
- Abstract summary: We investigate the impact of limited data on training pairwise energy-based models for inverse problems aimed at identifying interaction networks.
We dissect training trajectories across the eigenbasis of the coupling matrix, exploiting the independent evolution of eigenmodes.
We show that finite data corrections can be accurately modeled through random matrix theory calculations.
- Score: 5.1337384597700995
- Abstract: We investigate the impact of limited data on training pairwise energy-based models for inverse problems aimed at identifying interaction networks. Utilizing the Gaussian model as a testbed, we dissect training trajectories across the eigenbasis of the coupling matrix, exploiting the independent evolution of eigenmodes and revealing that the learning timescales are tied to the spectral decomposition of the empirical covariance matrix. We see that optimal points for early stopping arise from the interplay between these timescales and the initial conditions of training. Moreover, we show that finite data corrections can be accurately modeled through asymptotic random matrix theory calculations and provide the counterpart of generalized cross-validation in the energy-based model context. Our analytical framework extends to binary-variable maximum-entropy pairwise models with minimal variations. These findings offer strategies to control overfitting in discrete-variable models through empirical shrinkage corrections, improving the management of overfitting in energy-based generative models.
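The independent evolution of eigenmodes mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code and rests on assumptions not stated in the listing: a zero-mean Gaussian model p(x) ∝ exp(-x^T J x / 2), plain gradient ascent on the log-likelihood (update J ← J + lr (J^{-1} - C), with C the empirical covariance), and an initial J proportional to the identity. The function names, learning rate, and synthetic data are hypothetical. Under these assumptions J stays diagonal in the eigenbasis of C, so each eigenvalue j follows its own scalar update j ← j + lr (1/j - c).

```python
import numpy as np

def train_gaussian_ebm(C, lr=0.01, steps=2000, j0=1.0):
    """Gradient ascent on the Gaussian log-likelihood for the full matrix J.

    Assumed update: J <- J + lr * (J^{-1} - C), started at j0 * identity.
    """
    J = j0 * np.eye(C.shape[0])
    for _ in range(steps):
        J = J + lr * (np.linalg.inv(J) - C)
    return J

def train_modes(c, lr=0.01, steps=2000, j0=1.0):
    """Same dynamics, run independently on each eigenvalue c of C."""
    j = np.full(c.shape, j0, dtype=float)
    for _ in range(steps):
        j = j + lr * (1.0 / j - c)
    return j

# Synthetic data with few samples (m) relative to the dimension (n).
rng = np.random.default_rng(0)
n, m = 5, 50
scales = np.array([0.5, 0.8, 1.0, 1.5, 2.0])  # hypothetical per-variable std
X = rng.standard_normal((m, n)) * scales
C = X.T @ X / m                               # empirical covariance

c, V = np.linalg.eigh(C)                      # eigenvalues in ascending order
J_full = train_gaussian_ebm(C)
j_modes = train_modes(c)
J_from_modes = (V * j_modes) @ V.T            # rebuild J from its eigenmodes

# Relative distance of each mode from its fixed point j = 1/c.
gap = np.abs(j_modes * c - 1.0)
print("eigenvalues of C:", np.round(c, 3))
print("relative gap to 1/c:", np.round(gap, 6))
```

In this sketch the mode with the largest empirical variance has converged to 1/c after the chosen number of steps, while the mode with the smallest variance is still below its fixed point. How far each mode has travelled depends on both its eigenvalue and the starting value j0, which is the kind of interplay the abstract ties to early stopping.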
Related papers
- Connectivity Shapes Implicit Regularization in Matrix Factorization Models for Matrix Completion [2.8948274245812335]
We investigate the implicit regularization of matrix factorization for solving matrix completion problems.
We empirically discover that the connectivity of observed data plays a crucial role in the implicit bias.
Our work reveals the intricate interplay between data connectivity, training dynamics, and implicit regularization in matrix factorization models.
arXiv Detail & Related papers (2024-05-22T15:12:14Z) - Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z) - CoCoGen: Physically-Consistent and Conditioned Score-based Generative Models for Forward and Inverse Problems [1.0923877073891446]
This work extends the reach of generative models into physical problem domains.
We present an efficient approach to promote consistency with the underlying PDE.
We showcase the potential and versatility of score-based generative models in various physics tasks.
arXiv Detail & Related papers (2023-12-16T19:56:10Z) - A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by e.g. the combination of model, parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z) - Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important for forecasting nonstationary processes or processes with a complex mixture of distributions.
A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems.
It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z) - Optimal regularizations for data generation with probabilistic graphical models [0.0]
Empirically, well-chosen regularization schemes dramatically improve the quality of the inferred models.
We consider the particular case of L2 and L1 regularizations in the Maximum A Posteriori (MAP) inference of generative pairwise graphical models.
arXiv Detail & Related papers (2021-12-02T14:45:16Z) - Emergent fractal phase in energy stratified random models [0.0]
We study the effects of partial correlations in kinetic hopping terms of long-range random matrix models on their localization properties.
We show that any deviation from the completely correlated case leads to the emergent non-ergodic delocalization in the system.
arXiv Detail & Related papers (2021-06-07T18:00:01Z) - Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z) - Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z) - Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference.
We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)