Score-based generative models learn manifold-like structures with
constrained mixing
- URL: http://arxiv.org/abs/2311.09952v1
- Date: Thu, 16 Nov 2023 15:15:15 GMT
- Title: Score-based generative models learn manifold-like structures with
constrained mixing
- Authors: Li Kevin Wenliang, Ben Moran
- Abstract summary: How do score-based generative models learn the data distribution supported on a low-dimensional manifold?
We investigate the score model of a trained SBM through its linear approximations and subspaces spanned by local feature vectors.
We find that the learned vector field mixes samples via a non-conservative field within the manifold, while it denoises with normal projections as if there were an energy function in off-manifold directions.
- Score: 2.843124313496295
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: How do score-based generative models (SBMs) learn the data distribution
supported on a low-dimensional manifold? We investigate the score model of a
trained SBM through its linear approximations and subspaces spanned by local
feature vectors. During diffusion, as the noise decreases, the local
dimensionality increases and becomes more varied between different sample
sequences. Importantly, we find that the learned vector field mixes samples via
a non-conservative field within the manifold, while it denoises with normal
projections as if there were an energy function in off-manifold directions. At
each noise level, the subspace spanned by the local features overlaps with an
effective density function. These observations suggest that SBMs can flexibly
mix samples with the learned score field while carefully maintaining a
manifold-like structure of the data distribution.
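To make the analysis concrete, here is a minimal NumPy sketch (not the authors' code) of the two diagnostics the abstract describes: the singular values of the score Jacobian expose a sharp off-manifold direction, and hence the local dimensionality, while the antisymmetric part of the Jacobian detects a non-conservative, within-manifold mixing component. The toy score field and its scales are assumptions for illustration.

```python
import numpy as np

def jacobian_fd(f, x, eps=1e-5):
    """Finite-difference Jacobian of a vector field f at point x."""
    d = x.size
    J = np.zeros((d, d))
    for i in range(d):
        e = np.zeros(d); e[i] = eps
        J[:, i] = (f(x + e) - f(x - e)) / (2 * eps)
    return J

# Toy score field in 3D: data concentrated near the (x0, x1) plane.
# Off-manifold (x2): gradient of a quadratic energy (conservative).
# Within-manifold: a rotational, non-conservative component that
# "mixes" samples along the manifold, mimicking the paper's finding.
sigma_off = 0.05          # small off-manifold noise scale (assumption)
def score(x):
    grad = np.array([-x[0], -x[1], -x[2] / sigma_off**2])  # energy part
    rot = 2.0 * np.array([-x[1], x[0], 0.0])               # curl part, in-plane
    return grad + rot

x = np.array([0.7, -0.3, 0.01])
J = jacobian_fd(score, x)

# Local dimensionality: count dominant singular values of the Jacobian.
s = np.linalg.svd(J, compute_uv=False)
print("singular values:", np.round(s, 2))       # one large off-manifold mode

# Non-conservativeness: size of the antisymmetric part of the Jacobian.
asym = 0.5 * (J - J.T)
print("non-conservative norm:", np.linalg.norm(asym))  # > 0: not a gradient field
```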
Related papers
- Reconstructing Galaxy Cluster Mass Maps using Score-based Generative Modeling [9.386611764730791]
We present a novel approach to reconstruct gas and dark matter projected density maps of galaxy clusters using score-based generative modeling.
Our diffusion model takes in mock SZ and X-ray images as conditional observations, and generates realizations of corresponding gas and dark matter maps by sampling from a learned data posterior.
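The sampling recipe described here is conditional score-based generation; a schematic sketch follows, where `cond_score` is a hypothetical stand-in that a trained conditional score network (conditioned on the mock SZ/X-ray observation) would replace.

```python
import numpy as np

rng = np.random.default_rng(0)

def cond_score(x, y, sigma):
    """Hypothetical conditional score s(x | y, sigma): a placeholder
    that pulls x toward the observation y. A trained network would
    replace this function."""
    return (y - x) / (sigma**2 + 1.0)

def posterior_sample(y, sigmas, n_steps=20):
    """Annealed Langevin sampling from p(x | y) over a noise schedule."""
    x = rng.normal(size=y.shape) * sigmas[0]
    for sigma in sigmas:
        step = 0.1 * sigma**2
        for _ in range(n_steps):
            noise = rng.normal(size=x.shape)
            x = x + step * cond_score(x, y, sigma) + np.sqrt(2 * step) * noise
    return x

y = np.ones(4)                          # mock observation (e.g., image pixels)
sigmas = np.geomspace(1.0, 0.01, 10)    # decreasing noise levels
print(posterior_sample(y, sigmas))
```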
arXiv Detail & Related papers (2024-10-03T18:00:03Z) - Understanding the Local Geometry of Generative Model Manifolds [14.191548577311904]
We study the relationship between the local geometry of the learned manifold and downstream generation.
We provide quantitative and qualitative evidence showing that for a given latent, the local descriptors are correlated with generation aesthetics, artifacts, uncertainty, and even memorization.
arXiv Detail & Related papers (2024-08-15T17:59:06Z) - Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian
Mixture Models [59.331993845831946]
Diffusion models benefit from the injection of task-specific information into the score function to steer sample generation towards desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
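Gaussian mixtures make guidance analytically transparent: the unconditional score is a responsibility-weighted pull toward the component means, and guidance adds a weighted grad log p(y|x) term for a chosen label y. A 1D sketch under my own simplifications (isotropic components, the component index as the label):

```python
import numpy as np

mus = np.array([-2.0, 2.0])      # GMM component means
pis = np.array([0.5, 0.5])       # mixing weights
sig = 1.0

def responsibilities(x):
    logp = -0.5 * ((x - mus) / sig) ** 2 + np.log(pis)
    w = np.exp(logp - logp.max())
    return w / w.sum()

def score(x):
    """Exact GMM score: sum_k w_k(x) * (mu_k - x) / sig^2."""
    w = responsibilities(x)
    return np.sum(w * (mus - x)) / sig**2

def guided_score(x, y, guidance=3.0):
    """Score + guidance * grad log p(y | x), with the component index as
    the label; the gradient is taken by finite differences for brevity."""
    eps = 1e-5
    g = (np.log(responsibilities(x + eps)[y])
         - np.log(responsibilities(x - eps)[y])) / (2 * eps)
    return score(x) + guidance * g

x = 0.3
print("unguided:", score(x))                      # mild pull to the nearer mode
print("guided to mode 1:", guided_score(x, y=1))  # stronger pull toward mu=+2
```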
arXiv Detail & Related papers (2024-03-03T23:15:48Z) - Diffusion map particle systems for generative modeling [0.0]
We propose a novel diffusion map particle system (DMPS) for generative modeling, based on diffusion maps and Laplacian-adjusted Wasserstein gradient descent (LAWGD).
Diffusion maps are used to approximate the generator of the corresponding Langevin diffusion process from samples, and hence to learn the underlying data-generating manifold. LAWGD enables efficient sampling from the target distribution given a suitable choice of kernel, which we construct here via a spectral approximation of the generator, computed with diffusion maps.
Our method requires no offline training and minimal tuning, and can outperform other approaches on data sets of moderate dimension.
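The diffusion-map step is easy to sketch: form a Gaussian kernel, correct for the sampling density (the anisotropic alpha-normalization), and read off a Markov generator whose spectrum approximates that of the underlying Langevin diffusion. A minimal version, assuming the standard alpha = 1/2 choice and leaving out LAWGD's spectral kernel construction:

```python
import numpy as np

def diffusion_map_generator(X, eps, alpha=0.5):
    """Approximate the Langevin generator from samples X of shape (n, d)
    via anisotropic diffusion maps (Coifman-Lafon normalization)."""
    D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-D2 / eps)
    q = K.sum(axis=1)
    K_alpha = K / np.outer(q**alpha, q**alpha)   # density-corrected kernel
    d = K_alpha.sum(axis=1)
    P = K_alpha / d[:, None]                     # row-stochastic Markov matrix
    return (P - np.eye(len(X))) / eps            # generator approximation

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))                    # samples from the target
L = diffusion_map_generator(X, eps=0.5)
evals = np.sort(np.linalg.eigvals(L).real)[::-1]
print("leading eigenvalues:", np.round(evals[:4], 3))  # ~0 first, then spectrum
```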
arXiv Detail & Related papers (2023-04-01T02:07:08Z) - Score Approximation, Estimation and Distribution Recovery of Diffusion
Models on Low-Dimensional Data [68.62134204367668]
This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace.
We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated.
The generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution.
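In the linear-subspace case the population score has a clean closed form that separates into a gentle tangential pull and a stiff normal pull back onto the subspace, which is the geometric structure the paper shows a well-chosen network can capture. A sketch assuming Gaussian latents (my simplification; the paper treats more general latent distributions):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, sig = 10, 2, 0.1

# Data supported (approximately) on a k-dim linear subspace of R^d.
A, _ = np.linalg.qr(rng.normal(size=(d, k)))    # orthonormal basis
P = A @ A.T                                     # projector onto the subspace

def score(x):
    """Exact score of N(0, A A^T + sig^2 I), split into tangential
    and normal components."""
    tangential = -(P @ x) / (1.0 + sig**2)
    normal = -((np.eye(d) - P) @ x) / sig**2
    return tangential + normal

x = A @ rng.normal(size=k) + 0.05 * rng.normal(size=d)   # near the subspace
s = score(x)
print("tangential norm:", np.linalg.norm(P @ s))
print("normal norm:    ", np.linalg.norm((np.eye(d) - P) @ s))  # much larger
```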
arXiv Detail & Related papers (2023-02-14T17:02:35Z) - Unsupervised Learning of Sampling Distributions for Particle Filters [80.6716888175925]
We put forward four methods for learning sampling distributions from observed measurements.
Experiments demonstrate that learned sampling distributions exhibit better performance than designed, minimum-degeneracy sampling distributions.
arXiv Detail & Related papers (2023-02-02T15:50:21Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
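The forward process here is a continuous-time Markov chain specified by a rate matrix; with a uniform-jump rate matrix (one simple choice, my assumption rather than the paper's prescription) the transition kernel is an explicit matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

V, beta = 5, 1.0                               # vocabulary size, jump rate
Q = beta * (np.ones((V, V)) / V - np.eye(V))   # uniform-jump rate matrix
assert np.allclose(Q.sum(axis=1), 0)           # generator rows sum to 0

def corrupt(tokens, t, rng):
    """Sample x_t | x_0 from the CTMC: the transition matrix is expm(t Q)."""
    Pt = expm(t * Q)
    return np.array([rng.choice(V, p=Pt[x]) for x in tokens])

rng = np.random.default_rng(0)
x0 = np.array([0, 1, 2, 3, 4])
for t in (0.1, 1.0, 10.0):
    print(t, corrupt(x0, t, rng))   # -> increasingly uniform over {0..V-1}
```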
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Reweighted Manifold Learning of Collective Variables from Enhanced Sampling Simulations [2.6009298669020477]
We provide a framework based on anisotropic diffusion maps for manifold learning.
We show that our framework reverts the biasing effect, yielding CVs that correctly describe the equilibrium density.
We show that it can be used in many manifold learning techniques on data from both standard and enhanced sampling simulations.
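The reweighting idea can be sketched on top of the diffusion-map construction above: each kernel entry is multiplied by importance weights so the learned geometry reflects the unbiased equilibrium density. This is a sketch of the general idea, not the paper's exact estimator, and the log-weight formula below is a hypothetical placeholder:

```python
import numpy as np

def reweighted_markov(X, logw, eps):
    """Diffusion-map Markov matrix with importance weights that revert
    the bias of an enhanced-sampling run (general idea only)."""
    D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    w = np.exp(logw - logw.max())                # stabilized unbiasing weights
    K = np.exp(-D2 / eps) * np.outer(w, w)       # reweighted Gaussian kernel
    q = K.sum(axis=1)
    K = K / np.outer(np.sqrt(q), np.sqrt(q))     # anisotropic normalization
    return K / K.sum(axis=1, keepdims=True)      # row-stochastic Markov matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                    # samples from a biased run
logw = 0.1 * np.sum(X**2, axis=1)                # hypothetical beta * V_bias values
P = reweighted_markov(X, logw, eps=0.5)
print(P.shape, np.allclose(P.sum(axis=1), 1.0))
```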
arXiv Detail & Related papers (2022-07-29T08:59:56Z) - Contrastive Neighborhood Alignment [81.65103777329874]
We present Contrastive Neighborhood Alignment (CNA), a manifold learning approach to maintain the topology of learned features.
The target model aims to mimic the local structure of the source representation space using a contrastive loss.
CNA is illustrated in three scenarios: manifold learning, where the model maintains the local topology of the original data in a dimension-reduced space; model distillation, where a small student model is trained to mimic a larger teacher; and legacy model update, where an older model is replaced by a more powerful one.
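The core mechanism is a contrastive objective in which each sample's neighbors in the source feature space act as positives for the target model; a NumPy sketch of that idea follows (the paper's exact loss and sampling scheme may differ):

```python
import numpy as np

def cna_loss(f_src, f_tgt, k=5, tau=0.1):
    """Contrastive loss encouraging f_tgt to reproduce the k-NN
    neighborhood structure of f_src (a sketch of the idea)."""
    n = len(f_src)
    d_src = np.linalg.norm(f_src[:, None] - f_src[None, :], axis=-1)
    np.fill_diagonal(d_src, np.inf)
    neigh = np.argsort(d_src, axis=1)[:, :k]      # positives: source neighbors
    f = f_tgt / np.linalg.norm(f_tgt, axis=1, keepdims=True)
    sim = f @ f.T / tau                           # target cosine similarities
    np.fill_diagonal(sim, -np.inf)                # exclude self-pairs
    logZ = np.log(np.exp(sim).sum(axis=1))        # softmax normalizer per anchor
    pos = sim[np.arange(n)[:, None], neigh].mean(axis=1)
    return float(np.mean(logZ - pos))             # InfoNCE-style objective

rng = np.random.default_rng(0)
f_src = rng.normal(size=(64, 16))                 # source (teacher) features
f_tgt = f_src + 0.05 * rng.normal(size=(64, 16))  # near-copy target: low loss
print(cna_loss(f_src, f_tgt))
```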
arXiv Detail & Related papers (2022-01-06T04:58:31Z) - Sampling in Combinatorial Spaces with SurVAE Flow Augmented MCMC [83.48593305367523]
Hybrid Monte Carlo is a powerful Markov Chain Monte Carlo method for sampling from complex continuous distributions.
We introduce a new approach based on augmenting Monte Carlo methods with SurVAE Flows to sample from discrete distributions.
We demonstrate the efficacy of our algorithm on a range of examples from statistics, computational physics and machine learning, and observe improvements compared to alternative algorithms.
arXiv Detail & Related papers (2021-02-04T02:21:08Z) - Flows for simultaneous manifold learning and density estimation [12.451050883955071]
Manifold-learning flows (M-flows) represent datasets with a manifold structure more faithfully than standard flows.
M-flows learn the data manifold and allow for better inference than standard flows in the ambient data space.
arXiv Detail & Related papers (2020-03-31T02:07:48Z)