Related papers: Fourier Sliced-Wasserstein Embedding for Multisets and Measures

URL: http://arxiv.org/abs/2504.02544v2
Date: Mon, 14 Apr 2025 13:02:13 GMT
Title: Fourier Sliced-Wasserstein Embedding for Multisets and Measures
Authors: Tal Amir, Nadav Dym
Abstract summary: We present a novel method to embed multisets and measures over $\mathbb{R}^d$ into Euclidean space. We prove that it is impossible to embed distributions over $\mathbb{R}^d$ into Euclidean space in a bi-Lipschitz manner.
Score: 3.396731589928944
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present the Fourier Sliced-Wasserstein (FSW) embedding - a novel method to embed multisets and measures over $\mathbb{R}^d$ into Euclidean space. Our proposed embedding approximately preserves the sliced Wasserstein distance on distributions, thereby yielding geometrically meaningful representations that better capture the structure of the input. Moreover, it is injective on measures and bi-Lipschitz on multisets - a significant advantage over prevalent methods based on sum- or max-pooling, which are provably not bi-Lipschitz, and, in many cases, not even injective. The required output dimension for these guarantees is near-optimal: roughly $2 N d$, where $N$ is the maximal input multiset size. Furthermore, we prove that it is impossible to embed distributions over $\mathbb{R}^d$ into Euclidean space in a bi-Lipschitz manner. Thus, the metric properties of our embedding are, in a sense, the best possible. Through numerical experiments, we demonstrate that our method yields superior multiset representations that improve performance in practical learning tasks. Specifically, we show that (a) a simple combination of the FSW embedding with an MLP achieves state-of-the-art performance in learning the (non-sliced) Wasserstein distance; and (b) replacing max-pooling with the FSW embedding makes PointNet significantly more robust to parameter reduction, with only minor performance degradation even after a 40-fold reduction.
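
The sliced Wasserstein distance that the abstract refers to can be illustrated with a short sketch. This is not the FSW embedding itself: it is a plain Monte Carlo estimate for two equal-size multisets with uniform weights, and the function name `sliced_wasserstein`, its parameters, and the toy multisets `A` and `B` are assumptions made for illustration. The example also applies sum- and max-pooling directly to raw coordinates (no learned feature map, a simplification) to show two distinct multisets that both poolings fail to separate.

```python
import numpy as np


def sliced_wasserstein(X, Y, num_slices=256, p=2, seed=0):
    """Monte Carlo estimate of the sliced p-Wasserstein distance between two
    multisets X, Y of shape (n, d), each point carrying weight 1/n."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape != Y.shape:
        raise ValueError("this sketch assumes multisets of equal size and dimension")
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Draw random directions uniformly on the unit sphere in R^d.
    V = rng.normal(size=(num_slices, d))
    V /= np.linalg.norm(V, axis=1, keepdims=True)
    # Project both multisets onto every direction and sort each slice.
    PX = np.sort(X @ V.T, axis=0)  # shape (n, num_slices)
    PY = np.sort(Y @ V.T, axis=0)
    # In 1D, W_p^p between equal-size uniform samples is the mean of
    # |difference of sorted values|^p.
    wpp_per_slice = np.mean(np.abs(PX - PY) ** p, axis=0)
    # Average over slices, then take the p-th root.
    return float(np.mean(wpp_per_slice) ** (1.0 / p))


# Two different multisets in R^2 with identical coordinate-wise sum and max.
A = np.array([[0.0, 0.0], [1.0, 1.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0]])

print("sum-pool :", A.sum(axis=0), B.sum(axis=0))
print("max-pool :", A.max(axis=0), B.max(axis=0))
print("SW(A, B) :", sliced_wasserstein(A, B))
print("SW(A, A reordered):", sliced_wasserstein(A, A[::-1]))
```

The estimate is invariant to the order of the points because each slice is sorted before comparison, which is why the last line is (numerically) zero while the distance between `A` and `B` is strictly positive.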

Related papers

Approximating fixed size quantum correlations in polynomial time [8.099700053397278]
We show that $\varepsilon$-additive approximations of the optimal value of fixed-size two-player free games can be computed in polynomial time. Our main result is based on novel Bose-symmetric quantum de Finetti theorems tailored for constrained quantum separability problems.
arXiv Detail & Related papers (2025-07-16T15:01:45Z)
New advances in universal approximation with neural networks of minimal width [4.424170214926035]
We show that autoencoders with leaky ReLU activations are universal approximators of $L^p$ functions. We broaden our results to show that smooth invertible neural networks can approximate $L^p(\mathbb{R}^d, \mathbb{R}^d)$ on compacta.
arXiv Detail & Related papers (2024-11-13T16:17:16Z)
Constructive Universal Approximation and Finite Sample Memorization by Narrow Deep ReLU Networks [0.0]
We show that any dataset with $N$ distinct points in $\mathbb{R}^d$ and $M$ output classes can be exactly classified. We also prove a universal approximation theorem in $L^p(\Omega; \mathbb{R}^m)$ for any bounded domain. Our results offer a unified and interpretable framework connecting controllability, expressivity, and training dynamics in deep neural networks.
arXiv Detail & Related papers (2024-09-10T14:31:21Z)
Fourier Sliced-Wasserstein Embedding for Multisets and Measures [3.396731589928944]
We present a novel method to embed multisets and measures over $\mathbb{R}^d$ into Euclidean space. We show that our method yields superior representations of input multisets and offers practical advantage for learning on multiset data.
arXiv Detail & Related papers (2024-05-26T11:04:41Z)
Statistical and Computational Guarantees of Kernel Max-Sliced Wasserstein Distances [9.608373793625107]
The kernel max-sliced (KMS) Wasserstein distance is developed for this purpose. We show that computing the KMS $2$-Wasserstein distance is NP-hard.
arXiv Detail & Related papers (2024-05-24T11:14:56Z)
Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
We consider the problem of learning an $\varepsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators. Key to our solution is a novel projection technique based on ideas from harmonic analysis. Our result bridges the gap between two popular but conflicting perspectives on continuous-space MDPs.
arXiv Detail & Related papers (2024-05-10T09:58:47Z)
Polynomial Width is Sufficient for Set Representation with High-dimensional Features [69.65698500919869]
DeepSets is the most widely used neural network architecture for set representation. We present two set-element embedding layers: (a) linear + power activation (LP) and (b) linear + exponential activations (LE).
arXiv Detail & Related papers (2023-07-08T16:00:59Z)
A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove a Lipschitzness lower bound $\Omega(\sqrt{n/p})$ of robustness of interpolating neural network parameters on arbitrary distributions. We then show the potential benefit of overparametrization for smooth data when $n=\mathrm{poly}(d)$. We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n=\exp(\omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z)
Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms [59.724977092582535]
We consider the problem of quantizing a linear model learned from measurements. We derive an information-theoretic lower bound for the minimax risk under this setting. We show that our method and upper-bounds can be extended for two-layer ReLU neural networks.
arXiv Detail & Related papers (2022-02-23T02:39:04Z)
Nyström Kernel Mean Embeddings [92.10208929236826]
We propose an efficient approximation procedure based on the Nyström method. It yields sufficient conditions on the subsample size to obtain the standard $n^{-1/2}$ rate. We discuss applications of this result for the approximation of the maximum mean discrepancy and quadrature rules.
arXiv Detail & Related papers (2022-01-31T08:26:06Z)
Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes [61.11090361892306]
Reward-free reinforcement learning (RL) considers the setting where the agent does not have access to a reward function during exploration. We show that this separation does not exist in the setting of linear MDPs. We develop a computationally efficient algorithm for reward-free RL in a $d$-dimensional linear MDP.
arXiv Detail & Related papers (2022-01-26T22:09:59Z)
Multiscale regression on unknown manifolds [13.752772802705978]
We construct low-dimensional coordinates on $\mathcal{M}$ at multiple scales and perform multiscale regression by local fitting. We analyze the generalization error of our method by proving finite sample bounds in high probability on rich classes of priors. Our algorithm has quasilinear complexity in the sample size, with constants linear in $D$ and exponential in $d$.
arXiv Detail & Related papers (2021-01-13T15:14:31Z)
Riemannian Stochastic Proximal Gradient Methods for Nonsmooth Optimization over the Stiefel Manifold [7.257751371276488]
R-ProxSGD and R-ProxSPB are generalizations of proximal SGD and proximal SpiderBoost. The R-ProxSPB algorithm finds an $\epsilon$-stationary point with $O(\epsilon^{-3})$ IFOs in the online case, and $O(n+\sqrt{n}\epsilon^{-3})$ IFOs in the finite-sum case.
arXiv Detail & Related papers (2020-05-03T23:41:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.