Shallow diffusion networks provably learn hidden low-dimensional structure
- URL: http://arxiv.org/abs/2410.11275v1
- Date: Tue, 15 Oct 2024 04:55:56 GMT
- Title: Shallow diffusion networks provably learn hidden low-dimensional structure
- Authors: Nicholas M. Boffi, Arthur Jacot, Stephen Tu, Ingvar Ziemann
- Abstract summary: Diffusion-based generative models provide a powerful framework for learning to sample from a complex target distribution.
We show that these models provably adapt to simple forms of low dimensional structure, thereby avoiding the curse of dimensionality.
We combine our results with recent analyses of sampling with diffusion models to provide an end-to-end sample complexity bound for learning to sample from structured distributions.
- Score: 17.563546018565468
- Abstract: Diffusion-based generative models provide a powerful framework for learning to sample from a complex target distribution. The remarkable empirical success of these models applied to high-dimensional signals, including images and video, stands in stark contrast to classical results highlighting the curse of dimensionality for distribution recovery. In this work, we take a step towards understanding this gap through a careful analysis of learning diffusion models over the Barron space of single layer neural networks. In particular, we show that these shallow models provably adapt to simple forms of low dimensional structure, thereby avoiding the curse of dimensionality. We combine our results with recent analyses of sampling with diffusion models to provide an end-to-end sample complexity bound for learning to sample from structured distributions. Importantly, our results do not require specialized architectures tailored to particular latent structures, and instead rely on the low-index structure of the Barron space to adapt to the underlying distribution.
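The training objective behind this line of work, denoising score matching with a single-hidden-layer network on data confined to a low-dimensional subspace, can be sketched as follows. This is a toy illustration only, not the paper's actual setup: the subspace data, tanh network, single noise level, and plain gradient descent are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Data with hidden low-dimensional structure: samples lie on a
# k-dimensional linear subspace of R^d (a stand-in for the kind of
# structured target distribution the abstract describes).
d, k, n = 16, 2, 2000
U = np.linalg.qr(rng.standard_normal((d, k)))[0]  # orthonormal basis
X = rng.standard_normal((n, k)) @ U.T             # points on the subspace

# Shallow (single-hidden-layer) score network, trained with the standard
# denoising score-matching regression target -eps/sigma at one noise level.
sigma, m, lr = 0.5, 64, 1e-2
W1 = rng.standard_normal((d, m)) / np.sqrt(d)
W2 = np.zeros((m, d))

def dsm_loss(W1, W2):
    """Monte-Carlo denoising score-matching loss at noise level sigma."""
    eps = rng.standard_normal((n, d))
    x = X + sigma * eps
    pred = np.tanh(x @ W1) @ W2
    return np.mean((pred + eps / sigma) ** 2)

loss0 = dsm_loss(W1, W2)
for _ in range(2000):
    eps = rng.standard_normal((n, d))
    x = X + sigma * eps               # noised samples
    h = np.tanh(x @ W1)               # hidden features
    err = h @ W2 + eps / sigma        # prediction minus DSM target
    gW2 = (h.T @ err) / n             # MSE gradient w.r.t. W2
    gW1 = (x.T @ ((err @ W2.T) * (1 - h ** 2))) / n
    W2 -= lr * gW2
    W1 -= lr * gW1
loss1 = dsm_loss(W1, W2)
```

Note that nothing in the network is tailored to the subspace: the hope, in the spirit of the paper's Barron-space analysis, is that the shallow model adapts to the hidden structure on its own, which here shows up as the loss dropping well below its initial value.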
Related papers
- A precise asymptotic analysis of learning diffusion models: theory and insights [37.30894159200853]
We consider the problem of learning a flow or diffusion-based generative model parametrized by a two-layer auto-encoder.
We derive a tight characterization of low-dimensional projections of the distribution of samples generated by the learned model.
arXiv Detail & Related papers (2025-01-07T16:56:40Z)
- Nonparametric estimation of a factorizable density using diffusion models [3.5773675235837974]
In this paper, we study diffusion models as an implicit approach to nonparametric density estimation.
We show that an implicit density estimator constructed from diffusion models adapts to the factorization structure and achieves the minimax optimal rate.
In constructing the estimator, we design a sparse weight-sharing neural network architecture.
arXiv Detail & Related papers (2025-01-03T12:32:19Z)
- Learning with Hidden Factorial Structure [2.474908349649168]
Recent advances suggest that text and image data contain such hidden structures, which help mitigate the curse of dimensionality.
We present a controlled experimental framework to test whether neural networks can indeed exploit such "hidden factorial structures".
arXiv Detail & Related papers (2024-11-02T22:32:53Z)
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of its surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering [15.326641037243006]
Diffusion models can effectively learn the image distribution and generate new samples.
We provide theoretical insights into this phenomenon by leveraging key empirical observations.
We show that the minimal number of samples required to learn the underlying distribution scales linearly with the intrinsic dimension of the data.
arXiv Detail & Related papers (2024-09-04T04:14:02Z)
- A Phase Transition in Diffusion Models Reveals the Hierarchical Nature of Data [51.03144354630136]
Recent advancements show that diffusion models can generate high-quality images.
We study this phenomenon in a hierarchical generative model of data.
We find that the backward diffusion process acting after a time $t$ is governed by a phase transition.
arXiv Detail & Related papers (2024-02-26T19:52:33Z)
- Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement [58.9768112704998]
Disentangled representation learning strives to extract the intrinsic factors within observed data.
We introduce a new perspective and framework, demonstrating that diffusion models with cross-attention can serve as a powerful inductive bias.
This is the first work to reveal the potent disentanglement capability of diffusion models with cross-attention, requiring no complex designs.
arXiv Detail & Related papers (2024-02-15T05:07:54Z)
- Phasic Content Fusing Diffusion Model with Directional Distribution Consistency for Few-Shot Model Adaption [73.98706049140098]
We propose a novel phasic content fusing few-shot diffusion model with directional distribution consistency loss.
Specifically, we design a phasic training strategy with phasic content fusion to help our model learn content and style information when $t$ is large.
Finally, we propose a cross-domain structure guidance strategy that enhances structure consistency during domain adaptation.
arXiv Detail & Related papers (2023-09-07T14:14:11Z)
- Learning multi-scale local conditional probability models of images [7.07848787073901]
Deep neural networks can learn powerful prior probability models for images, as evidenced by the high-quality generations obtained with recent score-based diffusion methods.
But the means by which these networks capture complex global statistical structure, apparently without suffering from the curse of dimensionality, remain a mystery.
We incorporate diffusion methods into a multi-scale decomposition, reducing dimensionality by assuming a stationary local Markov model for wavelet coefficients conditioned on coarser-scale coefficients.
arXiv Detail & Related papers (2023-03-06T09:23:14Z)
- Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data [68.62134204367668]
This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace.
We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated.
The generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution.
arXiv Detail & Related papers (2023-02-14T17:02:35Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.