Related papers: Preventing Model Collapse via Contraction-Conditioned Neural Filters

Preventing Model Collapse via Contraction-Conditioned Neural Filters

URL: http://arxiv.org/abs/2512.00757v1
Date: Sun, 30 Nov 2025 06:48:21 GMT
Title: Preventing Model Collapse via Contraction-Conditioned Neural Filters
Authors: Zongjian Han, Yiran Liang, Ruiwen Wang, Yiwei Luo, Yilin Huang, Xiaotong Song, Dongqing Wei,
Abstract summary: Our approach completely eliminates the dependence on increasing sample sizes within an unbiased estimation framework.<n>We develop specialized neural network architectures and loss functions that enable the filter to actively learn contraction conditions.<n> Experimental results show that our neural network filter effectively learns contraction conditions and prevents model collapse under fixed sample size settings.
Score: 0.7720091178131443
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper presents a neural network filter method based on contraction operators to address model collapse in recursive training of generative models. Unlike \cite{xu2024probabilistic}, which requires superlinear sample growth ($O(t^{1+s})$), our approach completely eliminates the dependence on increasing sample sizes within an unbiased estimation framework by designing a neural filter that learns to satisfy contraction conditions. We develop specialized neural network architectures and loss functions that enable the filter to actively learn contraction conditions satisfying Assumption 2.3 in exponential family distributions, thereby ensuring practical application of our theoretical results. Theoretical analysis demonstrates that when the learned contraction conditions are satisfied, estimation errors converge probabilistically even with constant sample sizes, i.e., $\limsup_{t\to\infty}\mathbb{P}(\|\mathbf{e}_t\|>δ)=0$ for any $δ>0$. Experimental results show that our neural network filter effectively learns contraction conditions and prevents model collapse under fixed sample size settings, providing an end-to-end solution for practical applications.

Related papers

Generative Modeling with Continuous Flows: Sample Complexity of Flow Matching [60.37045080890305]
We provide the first analysis of the sample complexity for flow-matching based generative models.<n>We decompose the velocity field estimation error into neural-network approximation error, statistical error due to the finite sample size, and optimization error due to the finite number of optimization steps for estimating the velocity field.
arXiv Detail & Related papers (2025-12-01T05:14:25Z)
Scaling and renormalization in high-dimensional regression [72.59731158970894]
We present a unifying perspective on recent results on ridge regression.<n>We use the basic tools of random matrix theory and free probability, aimed at readers with backgrounds in physics and deep learning.<n>Our results extend and provide a unifying perspective on earlier models of scaling laws.
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
A Pseudo-Semantic Loss for Autoregressive Models with Logical Constraints [87.08677547257733]
Neuro-symbolic AI bridges the gap between purely symbolic and neural approaches to learning. We show how to maximize the likelihood of a symbolic constraint w.r.t the neural network's output distribution. We also evaluate our approach on Sudoku and shortest-path prediction cast as autoregressive generation.
arXiv Detail & Related papers (2023-12-06T20:58:07Z)
Distribution learning via neural differential equations: a nonparametric statistical perspective [1.4436965372953483]
This work establishes the first general statistical convergence analysis for distribution learning via ODE models trained through likelihood transformations. We show that the latter can be quantified via the $C1$-metric entropy of the class $mathcal F$. We then apply this general framework to the setting of $Ck$-smooth target densities, and establish nearly minimax-optimal convergence rates for two relevant velocity field classes $mathcal F$: $Ck$ functions and neural networks.
arXiv Detail & Related papers (2023-09-03T00:21:37Z)
Understanding Pathologies of Deep Heteroskedastic Regression [25.509884677111344]
Heteroskedastic models predict both mean and residual noise for each data point. At one extreme, these models fit all training data perfectly, eliminating residual noise entirely. At the other, they overfit the residual noise while predicting a constant, uninformative mean. We observe a lack of middle ground, suggesting a phase transition dependent on model regularization strength.
arXiv Detail & Related papers (2023-06-29T06:31:27Z)
Conditional Distribution Function Estimation Using Neural Networks for Censored and Uncensored Data [0.0]
We consider estimating the conditional distribution function using neural networks for both censored and uncensored data. We show the proposed method possesses desirable performance, whereas the partial likelihood method yields biased estimates when model assumptions are violated.
arXiv Detail & Related papers (2022-07-06T01:12:22Z)
Robust Implicit Networks via Non-Euclidean Contractions [63.91638306025768]
Implicit neural networks show improved accuracy and significant reduction in memory consumption. They can suffer from ill-posedness and convergence instability. This paper provides a new framework to design well-posed and robust implicit neural networks.
arXiv Detail & Related papers (2021-06-06T18:05:02Z)
Asymptotic Risk of Overparameterized Likelihood Models: Double Descent Theory for Deep Neural Networks [12.132641563193582]
We investigate the risk of a general class of overvisibilityized likelihood models, including deep models. We demonstrate that several explicit models, such as parallel deep neural networks and ensemble learning, are in agreement with our theory.
arXiv Detail & Related papers (2021-02-28T13:02:08Z)
Provable Model-based Nonlinear Bandit and Reinforcement Learning: Shelve Optimism, Embrace Virtual Curvature [61.22680308681648]
We show that global convergence is statistically intractable even for one-layer neural net bandit with a deterministic reward. For both nonlinear bandit and RL, the paper presents a model-based algorithm, Virtual Ascent with Online Model Learner (ViOL)
arXiv Detail & Related papers (2021-02-08T12:41:56Z)
Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward. We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
Random Vector Functional Link Networks for Function Approximation on Manifolds [8.535815777849786]
We show that single layer neural-networks with random input-to-hidden layer weights and biases have seen success in practice. We further adapt this randomized neural network architecture to approximate functions on smooth, compact submanifolds of Euclidean space.
arXiv Detail & Related papers (2020-07-30T23:50:44Z)

This list is automatically generated from the titles and abstracts of the papers in this site.