EMPEROR: Efficient Moment-Preserving Representation of Distributions
- URL: http://arxiv.org/abs/2509.16379v1
- Date: Fri, 19 Sep 2025 19:42:52 GMT
- Title: EMPEROR: Efficient Moment-Preserving Representation of Distributions
- Authors: Xinran Liu, Shansita D. Sharma, Soheil Kolouri
- Abstract summary: We introduce EMPEROR, a framework for representing high-dimensional probability measures arising in neural network representations. Unlike global pooling operations, EMPEROR encodes a feature distribution through its statistical moments.
- Score: 20.606870676059234
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce EMPEROR (Efficient Moment-Preserving Representation of Distributions), a mathematically rigorous and computationally efficient framework for representing high-dimensional probability measures arising in neural network representations. Unlike heuristic global pooling operations, EMPEROR encodes a feature distribution through its statistical moments. Our approach leverages the theory of sliced moments: features are projected onto multiple directions, lightweight univariate Gaussian mixture models (GMMs) are fit to each projection, and the resulting slice parameters are aggregated into a compact descriptor. We establish determinacy guarantees via Carleman's condition and the Cramér-Wold theorem, ensuring that the GMM is uniquely determined by its sliced moments, and we derive finite-sample error bounds that scale optimally with the number of slices and samples. Empirically, EMPEROR captures richer distributional information than common pooling schemes across various data modalities, while remaining computationally efficient and broadly applicable.
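The pipeline the abstract describes (project onto random directions, fit a univariate GMM per slice, concatenate the slice parameters) is concrete enough to sketch. Below is a minimal illustration using numpy and scikit-learn; the slice count, component count, and sort-by-mean aggregation are assumptions of the sketch, not the authors' released implementation.

```python
# A minimal sketch of a sliced-GMM descriptor, assuming random unit-vector
# slices and sort-by-mean aggregation (illustrative choices, not the paper's code).
import numpy as np
from sklearn.mixture import GaussianMixture

def emperor_descriptor(features, n_slices=16, n_components=3, seed=0):
    """Encode an (n_points, dim) feature set as concatenated sliced-GMM parameters."""
    rng = np.random.default_rng(seed)
    n, d = features.shape
    dirs = rng.standard_normal((n_slices, d))          # random slice directions
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    parts = []
    for u in dirs:
        proj = features @ u                            # 1-D projection of the features
        gmm = GaussianMixture(n_components=n_components, random_state=seed)
        gmm.fit(proj.reshape(-1, 1))
        # Sort components by mean so the descriptor is permutation-invariant.
        order = np.argsort(gmm.means_.ravel())
        parts.append(np.concatenate([
            gmm.weights_[order],
            gmm.means_.ravel()[order],
            gmm.covariances_.ravel()[order],           # variances of the 1-D components
        ]))
    return np.concatenate(parts)                       # length: n_slices * 3 * n_components

# Example: 500 points in 64-D compressed to a 144-dim descriptor (16 slices x 9 params).
X = np.random.default_rng(1).standard_normal((500, 64))
print(emperor_descriptor(X).shape)                     # (144,)
```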
Related papers
- Gaussian-Mixture-Model Q-Functions for Policy Iteration in Reinforcement Learning [7.056697401102689]
This paper introduces a novel function-approximation role for Gaussian mixture models (GMMs) as direct surrogates for Q-function losses. These parametric models, termed GMM-QFs, possess substantial representational capacity and are shown to be universal approximators over a broad class of functions.
arXiv Detail & Related papers (2025-12-21T15:00:32Z)
- Adaptive Symmetrization of the KL Divergence [10.632997610787207]
Many tasks in machine learning can be described as, or reduced to, learning a probability distribution given a finite set of samples. A common approach is to minimize a statistical divergence between the (empirical) data distribution and a parameterized distribution, e.g., a normalizing flow (NF) or an energy-based model (EBM).
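As a minimal illustration of the generic setup this entry describes, the sketch below fits a univariate Gaussian by minimizing the forward KL divergence to the data, which reduces to maximum likelihood; the paper's adaptive symmetrization itself is not reproduced here.

```python
# A minimal sketch of divergence minimization: forward KL to the empirical
# distribution equals negative log-likelihood up to a constant (the data entropy).
import numpy as np
from scipy.optimize import minimize

data = np.random.default_rng(0).normal(loc=2.0, scale=0.5, size=1000)

def nll(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)                      # parameterize sigma > 0
    return 0.5 * np.mean(((data - mu) / sigma) ** 2) + log_sigma

res = minimize(nll, x0=[0.0, 0.0])
print(res.x[0], np.exp(res.x[1]))                  # approx. 2.0 and 0.5
```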
arXiv Detail & Related papers (2025-11-14T10:41:59Z)
- Error Bounds and Optimal Schedules for Masked Diffusions with Factorized Approximations [3.595215303316358]
Recently proposed generative models for discrete data, such as Masked Diffusion Models (MDMs), exploit conditional independence approximations. We study the resulting computation-vs-accuracy trade-off, providing general error bounds (in relative entropy). We then investigate the gain obtained by using non-constant schedule sizes.
arXiv Detail & Related papers (2025-10-29T14:11:03Z)
- Generative Assignment Flows for Representing and Learning Joint Distributions of Discrete Data [2.6499018693213316]
We introduce a novel generative model for the representation of joint probability distributions of discrete random variables. The approach uses measure transport by randomized assignment flows on the statistical submanifold of factorizing distributions.
arXiv Detail & Related papers (2024-06-06T21:58:33Z)
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
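The sketch below illustrates the consensus-ADMM splitting structure on a distributed linear-regression task, i.e., the optimization backbone this entry builds on, not the sampling scheme itself. The shard sizes, penalty rho, and iteration count are illustrative assumptions.

```python
# A minimal consensus-ADMM sketch for distributed least squares over 4 workers.
import numpy as np

rng = np.random.default_rng(0)
w_true = rng.standard_normal(5)
shards = []
for _ in range(4):                                 # four "workers", each with local data
    A = rng.standard_normal((50, 5))
    shards.append((A, A @ w_true + 0.1 * rng.standard_normal(50)))

rho = 1.0
z = np.zeros(5)                                    # global consensus variable
w = [np.zeros(5) for _ in shards]                  # local estimates
u = [np.zeros(5) for _ in shards]                  # scaled dual variables
for _ in range(50):
    for i, (A, b) in enumerate(shards):
        # Local step: argmin (1/2)||A w - b||^2 + (rho/2)||w - z + u_i||^2
        w[i] = np.linalg.solve(A.T @ A + rho * np.eye(5),
                               A.T @ b + rho * (z - u[i]))
    z = np.mean([wi + ui for wi, ui in zip(w, u)], axis=0)  # consensus averaging
    u = [ui + wi - z for wi, ui in zip(w, u)]               # dual update
print(np.round(z - w_true, 3))                     # close to zero at convergence
```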
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
- Scalable Dynamic Mixture Model with Full Covariance for Probabilistic Traffic Forecasting [14.951166842027819]
We propose a dynamic mixture of zero-mean Gaussian distributions for the time-varying error process. The proposed method can be seamlessly integrated into existing deep-learning frameworks with only a few additional parameters to be learned. We evaluate the proposed method on a traffic speed forecasting task and find that it not only improves model performance but also provides interpretable spatiotemporal correlation structures.
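A minimal sketch of the core likelihood idea follows, simplified to univariate residuals with time-varying softmax weights over zero-mean Gaussian components; the paper's full-covariance, deep-learning-integrated parameterization is not shown, and the component scales are illustrative.

```python
# A minimal sketch of a time-varying mixture of zero-mean Gaussians as a
# residual likelihood (univariate simplification of the paper's model).
import numpy as np

def dynamic_mixture_logpdf(residuals, logits, sigmas):
    """residuals: (T,), logits: (T, K) time-varying weights, sigmas: (K,)."""
    w = np.exp(logits - logits.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)              # softmax over K components per step
    dens = (np.exp(-0.5 * (residuals[:, None] / sigmas) ** 2)
            / (sigmas * np.sqrt(2 * np.pi)))       # zero-mean Gaussian densities (T, K)
    return np.log((w * dens).sum(axis=1))          # (T,) mixture log-density

T, K = 100, 3
rng = np.random.default_rng(0)
r = rng.normal(scale=0.3, size=T)
print(dynamic_mixture_logpdf(r, rng.standard_normal((T, K)),
                             np.array([0.1, 0.3, 1.0])).mean())
```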
arXiv Detail & Related papers (2022-12-10T22:50:00Z)
- Algebraic Reduction of Hidden Markov Models [0.0]
We propose two algorithms that return models that exactly reproduce the single-time distribution of a given output process.
The reduction method exploits not only the structure of the observed output, but also its initial condition.
Optimal algorithms are derived for a specific class of HMMs.
arXiv Detail & Related papers (2022-08-11T02:46:05Z)
- Wrapped Distributions on homogeneous Riemannian manifolds [58.720142291102135]
Control over the distributions' properties, such as parameters, symmetry, and modality, yields a family of flexible distributions.
We empirically validate our approach by utilizing our proposed distributions within a variational autoencoder and a latent space network model.
arXiv Detail & Related papers (2022-04-20T21:25:21Z)
- Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
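For contrast, the sketch below implements the plain Monte Carlo baseline such methods improve upon: estimating the probability mass of a closed box region from model samples. A stand-in 2-D Gaussian replaces a trained flow; the region and sample count are illustrative.

```python
# A minimal Monte Carlo estimate of P(X in R) for a box region R, from samples.
import numpy as np

def mc_region_prob(sample_fn, lo, hi, n=100_000):
    x = sample_fn(n)                               # (n, d) samples from the model
    inside = np.all((x >= lo) & (x <= hi), axis=1) # indicator of the box region
    return inside.mean()                           # unbiased but high-variance

rng = np.random.default_rng(0)
p = mc_region_prob(lambda n: rng.standard_normal((n, 2)),
                   lo=np.array([-1.0, -1.0]), hi=np.array([1.0, 1.0]))
print(p)                                           # approx. 0.683**2 ~ 0.466
```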
arXiv Detail & Related papers (2022-02-23T06:11:49Z)
- Robust Estimation for Nonparametric Families via Generative Adversarial Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extends these to robust mean estimation, second-moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
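A minimal sketch of a smoothed Kolmogorov-Smirnov-style distance between two samples follows, softening the supremum over CDF gaps with a log-sum-exp; the temperature and this particular smoothing are assumptions for illustration, not the paper's loss.

```python
# A smoothed two-sample KS-style distance: soft maximum over empirical CDF gaps.
import numpy as np

def smoothed_ks(x, y, temp=100.0):
    grid = np.sort(np.concatenate([x, y]))
    fx = np.searchsorted(np.sort(x), grid, side="right") / len(x)  # empirical CDF of x
    fy = np.searchsorted(np.sort(y), grid, side="right") / len(y)  # empirical CDF of y
    gaps = np.abs(fx - fy)
    # Approaches the exact KS statistic as temp -> infinity.
    return np.log(np.exp(temp * gaps).sum()) / temp

rng = np.random.default_rng(0)
print(smoothed_ks(rng.normal(0, 1, 500), rng.normal(0.5, 1, 500)))
```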
arXiv Detail & Related papers (2022-02-02T20:11:33Z)
- Efficient semidefinite-programming-based inference for binary and multi-class MRFs [83.09715052229782]
We propose an efficient method for computing the partition function or MAP estimate in a pairwise MRF.
We extend semidefinite relaxations from the typical binary MRF to the full multi-class setting, and develop a compact semidefinite relaxation that can again be solved efficiently using the proposed solver.
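A minimal sketch of the classic binary SDP relaxation with randomized rounding (Goemans-Williamson style) follows, using cvxpy; the paper's compact multi-class relaxation and specialized solver are not reproduced, and the couplings are synthetic.

```python
# SDP relaxation of binary pairwise MAP: relax x x^T (x in {-1,+1}^n) to a PSD
# matrix with unit diagonal, then round via a random Gaussian projection.
import cvxpy as cp
import numpy as np

n = 6
rng = np.random.default_rng(0)
W = rng.standard_normal((n, n))
W = (W + W.T) / 2                                    # symmetric pairwise couplings

X = cp.Variable((n, n), PSD=True)
prob = cp.Problem(cp.Maximize(cp.trace(W @ X)), [cp.diag(X) == 1])
prob.solve()

evals, evecs = np.linalg.eigh(X.value)
V = evecs @ np.diag(np.sqrt(np.clip(evals, 0, None)))  # factor X = V V^T
x = np.sign(V @ rng.standard_normal(n))                # randomized rounding
print(x, x @ W @ x)                                    # rounded labeling and its score
```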
arXiv Detail & Related papers (2020-12-04T15:36:29Z)
- Slice Sampling for General Completely Random Measures [74.24975039689893]
We present a novel Markov chain Monte Carlo algorithm for posterior inference that adaptively sets the truncation level using auxiliary slice variables.
The efficacy of the proposed algorithm is evaluated on several popular nonparametric models.
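A minimal sketch of the univariate stepping-out/shrinkage slice sampler that such methods build on is shown below; the auxiliary-variable truncation scheme for completely random measures is not reproduced, and the step width is an illustrative choice.

```python
# A basic univariate slice sampler (stepping-out + shrinkage) for an
# unnormalized log-density.
import numpy as np

def slice_sample(logpdf, x0, n=1000, w=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x, out = x0, []
    for _ in range(n):
        log_u = logpdf(x) + np.log(rng.uniform())  # auxiliary slice height
        lo = x - w * rng.uniform()                 # randomly placed initial window
        hi = lo + w
        while logpdf(lo) > log_u:                  # step out to the left
            lo -= w
        while logpdf(hi) > log_u:                  # step out to the right
            hi += w
        while True:                                # shrink until a point is accepted
            x_new = rng.uniform(lo, hi)
            if logpdf(x_new) > log_u:
                x = x_new
                break
            if x_new < x:
                lo = x_new
            else:
                hi = x_new
        out.append(x)
    return np.array(out)

samples = slice_sample(lambda x: -0.5 * x * x, x0=0.0)  # standard normal target
print(samples.mean(), samples.std())                    # approx. 0 and 1
```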
arXiv Detail & Related papers (2020-06-24T17:53:53Z)
- Fitting Laplacian Regularized Stratified Gaussian Models [0.0]
We consider the problem of jointly estimating multiple related zero-mean Gaussian distributions from data.
We propose a distributed method that scales to large problems, and illustrate the efficacy of the method with examples in finance, radar signal processing, and weather forecasting.
arXiv Detail & Related papers (2020-05-04T18:00:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.