Shallow diffusion networks provably learn hidden low-dimensional structure
- URL: http://arxiv.org/abs/2410.11275v1
- Date: Tue, 15 Oct 2024 04:55:56 GMT
- Title: Shallow diffusion networks provably learn hidden low-dimensional structure
- Authors: Nicholas M. Boffi, Arthur Jacot, Stephen Tu, Ingvar Ziemann
- Abstract summary: Diffusion-based generative models provide a powerful framework for learning to sample from a complex target distribution.
We show that these models provably adapt to simple forms of low dimensional structure, thereby avoiding the curse of dimensionality.
We combine our results with recent analyses of sampling with diffusion models to provide an end-to-end sample complexity bound for learning to sample from structured distributions.
- Score: 17.563546018565468
- License:
- Abstract: Diffusion-based generative models provide a powerful framework for learning to sample from a complex target distribution. The remarkable empirical success of these models applied to high-dimensional signals, including images and video, stands in stark contrast to classical results highlighting the curse of dimensionality for distribution recovery. In this work, we take a step towards understanding this gap through a careful analysis of learning diffusion models over the Barron space of single layer neural networks. In particular, we show that these shallow models provably adapt to simple forms of low dimensional structure, thereby avoiding the curse of dimensionality. We combine our results with recent analyses of sampling with diffusion models to provide an end-to-end sample complexity bound for learning to sample from structured distributions. Importantly, our results do not require specialized architectures tailored to particular latent structures, and instead rely on the low-index structure of the Barron space to adapt to the underlying distribution.
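As a minimal sketch of the setting described above (the notation here is illustrative and not taken from the paper): the score model is a single-layer network of bounded Barron norm, fit by denoising score matching at each noise level,
\[
\widehat{s}_t \;\in\; \arg\min_{s \in \mathcal{B}_R}\;
\mathbb{E}_{x_0 \sim p_{\mathrm{data}}}\,\mathbb{E}_{x_t \mid x_0}
\big\| s(x_t) - \nabla_{x_t} \log p_t(x_t \mid x_0) \big\|^{2},
\qquad
s(x) \;=\; \int a\,\sigma\!\big(\langle w, x \rangle + b\big)\, \mathrm{d}\mu(a, w, b),
\]
where \mathcal{B}_R is a ball of radius R in the Barron space: functions admitting the single-layer integral representation on the right (with vector-valued outer weights a) and bounded total weight. The abstract's claim is that estimators of this form adapt to hidden low-dimensional structure in the target, so the resulting sample complexity does not scale exponentially in the ambient dimension.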
Related papers
- Scaling Laws with Hidden Structure [2.474908349649168]
Recent advances suggest that text and image data contain such hidden structures, which help mitigate the curse of dimensionality.
In this paper, we present a controlled experimental framework to test whether neural networks can indeed exploit such "hidden factorial structures".
We find that they do leverage these latent patterns to learn discrete distributions more efficiently, and derive scaling laws linking model sizes, hidden factorizations, and accuracy.
arXiv Detail & Related papers (2024-11-02T22:32:53Z)
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of its surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering [15.326641037243006]
Diffusion models can effectively learn the image distribution and generate new samples.
We provide theoretical insights into this phenomenon by leveraging key empirical observations.
We show that the minimal number of samples required to learn the underlying distribution scales linearly with the intrinsic dimensions.
arXiv Detail & Related papers (2024-09-04T04:14:02Z)
- Structure-Guided Adversarial Training of Diffusion Models [27.723913809313125]
We introduce Structure-guided Adversarial training of Diffusion Models (SADM).
We compel the model to learn manifold structures between samples in each training batch.
SADM substantially improves existing diffusion transformers and outperforms existing methods in image generation and fine-tuning tasks.
arXiv Detail & Related papers (2024-02-27T15:05:13Z)
- Neural Network Parameter Diffusion [50.85251415173792]
Diffusion models have achieved remarkable success in image and video generation.
In this work, we demonstrate that diffusion models can also generate high-performing neural network parameters.
arXiv Detail & Related papers (2024-02-20T16:59:03Z)
- Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement [58.9768112704998]
Disentangled representation learning strives to extract the intrinsic factors within observed data.
We introduce a new perspective and framework, demonstrating that diffusion models with cross-attention can serve as a powerful inductive bias.
This is the first work to reveal the potent disentanglement capability of diffusion models with cross-attention, requiring no complex designs.
arXiv Detail & Related papers (2024-02-15T05:07:54Z)
- The Hidden Linear Structure in Score-Based Models and its Application [2.1756081703276]
We show that for well-trained diffusion models, the learned score at a high noise scale is well approximated by the linear score of a Gaussian.
Our finding of the linear structure in the score-based model has implications for better model design and data pre-processing.
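For reference, this rests on the standard fact (stated here in generic notation) that the score of a Gaussian \mathcal{N}(\mu, \Sigma) is linear in x,
\[
\nabla_x \log \mathcal{N}(x;\, \mu, \Sigma) \;=\; -\,\Sigma^{-1}(x - \mu),
\]
so at high noise scales, where the perturbed data distribution is approximately Gaussian, the learned score should be approximately linear in its input.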
arXiv Detail & Related papers (2023-11-17T22:25:07Z)
- Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption [73.98706049140098]
We propose a novel phasic content fusing few-shot diffusion model with directional distribution consistency loss.
Specifically, we design a phasic training strategy with phasic content fusion to help our model learn content and style information when the diffusion time step t is large.
Finally, we propose a cross-domain structure guidance strategy that enhances structure consistency during domain adaptation.
arXiv Detail & Related papers (2023-09-07T14:14:11Z)
- Learning multi-scale local conditional probability models of images [7.07848787073901]
Deep neural networks can learn powerful prior probability models for images, as evidenced by the high-quality generations obtained with recent score-based diffusion methods.
But the means by which these networks capture complex global statistical structure, apparently without suffering from the curse of dimensionality, remain a mystery.
We incorporate diffusion methods into a multi-scale decomposition, reducing dimensionality by assuming a stationary local Markov model for wavelet coefficients conditioned on coarser-scale coefficients.
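A schematic of the locality assumption this describes (notation is illustrative, not the paper's): with x^{(j)} the wavelet coefficients at scale j, x^{(j+1)} the coarser-scale coefficients, and \mathcal{N}(k) a small spatial neighborhood of position k, the conditional model assumes
\[
p\big(x^{(j)}_{k} \,\big|\, x^{(j)}_{-k},\, x^{(j+1)}\big)
\;=\;
p\big(x^{(j)}_{k} \,\big|\, x^{(j)}_{\mathcal{N}(k)\setminus\{k\}},\, x^{(j+1)}_{\mathcal{N}(k)}\big),
\]
with the same conditional used at every position k (stationarity), so only a low-dimensional local model needs to be learned at each scale.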
arXiv Detail & Related papers (2023-03-06T09:23:14Z)
- Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data [68.62134204367668]
This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace.
We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated.
The generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution.
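A sketch of why a linear subspace helps (illustrative notation, under the stated assumption that the data lie on a subspace): if x_0 = A z with A \in \mathbb{R}^{D \times d} having orthonormal columns and the forward process is x_t = \alpha_t x_0 + \sigma_t \varepsilon with isotropic Gaussian noise, the perturbed density factorizes across the subspace and its orthogonal complement, and the score decomposes as
\[
\nabla_x \log p_t(x)
\;=\;
A\,\nabla \log q_t\!\big(A^{\top} x\big)
\;-\;
\frac{1}{\sigma_t^{2}}\,\big(I - A A^{\top}\big)x,
\]
where q_t is the d-dimensional law of \alpha_t z + \sigma_t \varepsilon_{\parallel}. Only the d-dimensional term must be learned; the off-subspace term is an explicit linear map, consistent with a sample complexity governed by the intrinsic rather than ambient dimension.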
arXiv Detail & Related papers (2023-02-14T17:02:35Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy or quality of the information presented and is not responsible for any consequences of its use.