Distributional Autoencoders Know the Score
- URL: http://arxiv.org/abs/2502.11583v3
- Date: Mon, 27 Oct 2025 00:26:42 GMT
- Title: Distributional Autoencoders Know the Score
- Authors: Andrej Leban
- Abstract summary: The Distributional Principal Autoencoder (DPA) combines distributionally correct reconstruction with principal-component-like interpretability of the encodings. We provide exact theoretical guarantees on both fronts. We show that a single model can learn the data distribution and its intrinsic dimension with exact guarantees simultaneously.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Distributional Principal Autoencoder (DPA) combines distributionally correct reconstruction with principal-component-like interpretability of the encodings. In this work, we provide exact theoretical guarantees on both fronts. First, we derive a closed-form relation linking each optimal level-set geometry to the data-distribution score. This result explains DPA's empirical ability to disentangle factors of variation of the data and allows the score to be recovered directly from samples. When the data follows the Boltzmann distribution, we demonstrate that this relation yields an approximation of the minimum free-energy path for the Mueller-Brown potential in a single fit. Second, we prove that if the data lies on a manifold that can be approximated by the encoder, latent components beyond the manifold dimension are conditionally independent of the data distribution, carrying no additional information, and thus reveal the intrinsic dimension. Together, these results show that a single model can learn the data distribution and its intrinsic dimension with exact guarantees simultaneously, unifying two longstanding goals of unsupervised learning.
Related papers
- Efficiently Verifiable Proofs of Data Attribution [9.05608916348947]
We propose an interactive verification paradigm for data attribution. We provide formal completeness, soundness, and efficiency guarantees in the sense of Probably-Approximately-Correct (PAC) verification.
arXiv Detail & Related papers (2025-08-14T17:36:01Z) - Watermarking Generative Categorical Data [9.087950471621653]
Our method embeds secret signals by splitting the data distribution into two components and modifying one distribution based on a deterministic relationship with the other.
To verify the watermark, we introduce an insertion inverse algorithm and detect its presence by measuring the total variation distance between the inverse-decoded data and the original distribution.
arXiv Detail & Related papers (2024-11-16T21:57:45Z) - Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional dependencies for general score-mismatched diffusion samplers. We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions. This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z) - Adaptive Learning of the Latent Space of Wasserstein Generative Adversarial Networks [7.958528596692594]
We propose a novel framework called the latent Wasserstein GAN (LWGAN).
It fuses the Wasserstein auto-encoder and the Wasserstein GAN so that the intrinsic dimension of the data manifold can be adaptively learned.
We show that LWGAN is able to identify the correct intrinsic dimension under several scenarios.
arXiv Detail & Related papers (2024-09-27T01:25:22Z) - Learned Compression of Encoding Distributions [1.4732811715354455]
The entropy bottleneck is a common component used in many learned compression models.
We propose a method that adapts the encoding distribution to match the latent data distribution for a specific input.
Our method achieves a Bjontegaard-Delta (BD)-rate gain of -7.10% on the Kodak test dataset.
arXiv Detail & Related papers (2024-06-18T21:05:51Z) - Distributional Principal Autoencoders [2.519266955671697]
Dimension reduction techniques usually lose information in the sense that reconstructed data are not identical to the original data.
We propose Distributional Principal Autoencoder (DPA) that consists of an encoder that maps high-dimensional data to low-dimensional latent variables.
For reconstructing data, the DPA decoder aims to match the conditional distribution of all data that are mapped to a certain latent value.
arXiv Detail & Related papers (2024-04-21T12:52:04Z) - Beyond the Known: Adversarial Autoencoders in Novelty Detection [2.7486022583843233]
In novelty detection, the goal is to decide if a new data point should be categorized as an inlier or an outlier.
We use a similar framework but with a lightweight deep network, and we adopt a probabilistic score with reconstruction error.
Our results indicate that our approach is effective at learning the target class, and it outperforms recent state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2024-04-06T00:04:19Z) - Fact Checking Beyond Training Set [64.88575826304024]
We show that the retriever-reader suffers from performance deterioration when it is trained on labeled data from one domain and used in another domain.
We propose an adversarial algorithm to make the retriever component robust against distribution shift.
We then construct eight fact checking scenarios from these datasets, and compare our model to a set of strong baseline models.
arXiv Detail & Related papers (2024-03-27T15:15:14Z) - SMaRt: Improving GANs with Score Matching Regularity [114.43433222721025]
Generative adversarial networks (GANs) usually struggle in learning from highly diverse data, whose underlying manifold is complex. We find that score matching serves as a promising solution to this issue thanks to its capability of persistently pushing the generated data points towards the real data manifold. We show that our approach can consistently boost the performance of various state-of-the-art GANs on real-world datasets with pre-trained diffusion models acting as the approximate score function.
arXiv Detail & Related papers (2023-11-30T03:05:14Z) - Symmetric Equilibrium Learning of VAEs [56.56929742714685]
We view variational autoencoders (VAEs) as decoder-encoder pairs, which map distributions in the data space to distributions in the latent space and vice versa.
We propose a Nash equilibrium learning approach, which is symmetric with respect to the encoder and decoder and allows learning VAEs in situations where both the data and the latent distributions are accessible only by sampling.
arXiv Detail & Related papers (2023-07-19T10:27:34Z) - Probabilistic Matching of Real and Generated Data Statistics in Generative Adversarial Networks [0.6906005491572401]
We propose a method to ensure that the distributions of certain generated data statistics coincide with the respective distributions of the real data.
We evaluate the method on a synthetic dataset and a real-world dataset and demonstrate improved performance of our approach.
arXiv Detail & Related papers (2023-06-19T14:03:27Z) - Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both basic data-reconstructing (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z) - Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data [68.62134204367668]
This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace.
We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated.
The generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution.
arXiv Detail & Related papers (2023-02-14T17:02:35Z) - Learning the joint distribution of two sequences using little or no paired data [16.189575655434844]
We present a noisy channel generative model of two sequences, for example text and speech.
We show that even a tiny amount of paired data is sufficient to learn to relate the two modalities when a massive amount of unpaired data is available.
arXiv Detail & Related papers (2022-12-06T18:56:15Z) - Convergent autoencoder approximation of low bending and low distortion manifold embeddings [5.5711773076846365]
We propose and analyze a novel regularization for learning the encoder component of an autoencoder.
The loss functional is computed via Monte Carlo integration with different sampling strategies for pairs of points on the input manifold.
Our main theorem identifies a loss functional of the embedding map as the $\Gamma$-limit of the sampling-dependent loss functionals.
arXiv Detail & Related papers (2022-08-22T10:31:31Z) - Discrete Key-Value Bottleneck [95.61236311369821]
Deep neural networks perform well on classification tasks where data streams are i.i.d. and labeled data is abundant.
One powerful approach that has addressed this challenge involves pre-training of large encoders on volumes of readily available data, followed by task-specific tuning.
Given a new task, however, updating the weights of these encoders is challenging as a large number of weights needs to be fine-tuned, and as a result, they forget information about the previous tasks.
We propose a model architecture to address this issue, building upon a discrete bottleneck containing pairs of separate and learnable key-value codes.
arXiv Detail & Related papers (2022-07-22T17:52:30Z) - Self-Conditioned Generative Adversarial Networks for Image Editing [61.50205580051405]
Generative Adversarial Networks (GANs) are susceptible to bias, learned from either the unbalanced data, or through mode collapse.
We argue that this bias is responsible not only for fairness concerns, but that it plays a key role in the collapse of latent-traversal editing methods when deviating away from the distribution's core.
arXiv Detail & Related papers (2022-02-08T18:08:24Z) - Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserves relative distances between data instances in the feature spaces of each layer.
With the use of an attentive set encoder, we propose to meta learn either diagonal or diagonal plus low-rank factors to efficiently construct task specific covariance matrices.
We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
arXiv Detail & Related papers (2021-10-12T22:04:19Z) - Neural Distributed Source Coding [59.630059301226474]
We present a framework for lossy DSC that is agnostic to the correlation structure and can scale to high dimensions.
We evaluate our method on multiple datasets and show that it can handle complex correlations and achieves state-of-the-art PSNR.
arXiv Detail & Related papers (2021-06-05T04:50:43Z) - A new framework for experimental design using Bayesian Evidential Learning: the case of wellhead protection area [0.0]
We predict the wellhead protection area (WHPA), the shape and extent of which is influenced by the distribution of hydraulic conductivity (K), from a small number of tracing experiments (predictors).
Our first objective is to make predictions of the WHPA within the Bayesian Evidential Learning framework, which aims to find a direct relationship between predictor and target using machine learning.
Our second objective is to extend BEL to identify the optimal design of data source locations that minimizes the posterior uncertainty of the WHPA.
arXiv Detail & Related papers (2021-05-12T09:40:28Z) - Representation Learning for Sequence Data with Deep Autoencoding Predictive Components [96.42805872177067]
We propose a self-supervised representation learning method for sequence data, based on the intuition that useful representations of sequence data should exhibit a simple structure in the latent space.
We encourage this latent structure by maximizing an estimate of predictive information of latent feature sequences, which is the mutual information between past and future windows at each time step.
We demonstrate that our method recovers the latent space of noisy dynamical systems, extracts predictive features for forecasting tasks, and improves automatic speech recognition when used to pretrain the encoder on large amounts of unlabeled data.
arXiv Detail & Related papers (2020-10-07T03:34:01Z) - Graph Embedding with Data Uncertainty [113.39838145450007]
Spectral-based subspace learning is a common data preprocessing step in many machine learning pipelines.
Most subspace learning methods do not take into consideration possible measurement inaccuracies or artifacts that can lead to data with high uncertainty.
arXiv Detail & Related papers (2020-09-01T15:08:23Z)