MESS: Manifold Embedding Motivated Super Sampling
- URL: http://arxiv.org/abs/2107.06566v1
- Date: Wed, 14 Jul 2021 09:07:54 GMT
- Title: MESS: Manifold Embedding Motivated Super Sampling
- Authors: Erik Thordsen and Erich Schubert
- Abstract summary: We propose a framework to generate virtual data points that are faithful to an approximate embedding function underlying the manifold observable in the data.
For increasing intrinsic dimensionality of a data set the required data density introduces the need for very large data sets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many approaches in the field of machine learning and data analysis rely on
the assumption that the observed data lies on lower-dimensional manifolds. This
assumption has been verified empirically for many real data sets. To make use
of this manifold assumption one generally requires the manifold to be locally
sampled to a certain density such that features of the manifold can be
observed. However, for increasing intrinsic dimensionality of a data set the
required data density introduces the need for very large data sets, resulting
in one of the many faces of the curse of dimensionality. To combat the
increased requirement for local data density we propose a framework to generate
virtual data points that are faithful to an approximate embedding function
underlying the manifold observable in the data.
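The core idea above can be illustrated with a minimal sketch. This is not the authors' MESS algorithm, only a hedged toy version of the same principle: approximate the manifold locally (here with a local PCA in place of a learned embedding function) and draw virtual points within that local approximation. All function and parameter names are hypothetical.

```python
import numpy as np

def super_sample(X, k=8, d=2, n_new=200, seed=0):
    """Draw virtual points near the manifold underlying X by
    sampling inside local PCA (tangent-space) approximations."""
    rng = np.random.default_rng(seed)
    virtual = []
    for _ in range(n_new):
        anchor = X[rng.integers(len(X))]
        # k nearest neighbours of a random anchor (brute force)
        nbrs = X[np.argsort(np.linalg.norm(X - anchor, axis=1))[:k]]
        mu = nbrs.mean(axis=0)
        # local tangent basis of intrinsic dimension d via SVD
        _, s, Vt = np.linalg.svd(nbrs - mu, full_matrices=False)
        # bounded random coefficients keep samples near the patch
        coeff = rng.uniform(-1.0, 1.0, size=d) * s[:d] / np.sqrt(k)
        virtual.append(mu + coeff @ Vt[:d])
    return np.asarray(virtual)

# noise-free unit circle: intrinsic dimension 1 in ambient dimension 2
t = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
X = np.c_[np.cos(t), np.sin(t)]
V = super_sample(X, k=6, d=1, n_new=100)
# virtual points remain close to the circle (radius ~ 1)
```

Because the coefficients are bounded by the local singular values, the virtual points stay inside the neighbourhood they were sampled from rather than drifting off the manifold.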
Related papers
- Manifold Learning via Foliations and Knowledge Transfer [0.0]
We provide a natural geometric structure on the space of data employing a deep ReLU neural network trained as a classifier.
We show that the singular points of such foliation are contained in a measure zero set, and that a local regular foliation exists almost everywhere.
Experiments show that the data is correlated with leaves of such foliation.
arXiv Detail & Related papers (2024-09-11T16:53:53Z)
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
- Datacube segmentation via Deep Spectral Clustering [76.48544221010424]
Extended Vision techniques often pose a challenge in their interpretation.
The huge dimensionality of datacube spectra makes their statistical interpretation a complex task.
In this paper, we explore the possibility of applying unsupervised clustering methods in encoded space.
A statistical dimensional reduction is performed by an ad hoc trained (Variational) AutoEncoder, while the clustering process is performed by a (learnable) iterative K-Means clustering algorithm.
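A minimal sketch of that pipeline, with PCA standing in for the trained (Variational) AutoEncoder and a plain iterative K-Means in the encoded space. This is an illustrative assumption-laden toy, not the paper's implementation; all names are hypothetical.

```python
import numpy as np

def kmeans(Z, k=2, iters=20, seed=0):
    """Plain iterative (Lloyd's) K-Means in the encoded space."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest center
        labels = np.argmin(((Z[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

# toy "spectra": two groups of high-dimensional vectors
rng = np.random.default_rng(1)
A = rng.normal(0.0, 0.1, size=(30, 100))
B = rng.normal(1.0, 0.1, size=(30, 100))
X = np.vstack([A, B])

# statistical dimensional reduction (PCA stands in for the autoencoder)
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T  # 2-D encoded space

labels = kmeans(Z, k=2)
```

Clustering in the low-dimensional encoded space rather than on the raw 100-dimensional vectors is the design choice the paper's summary describes.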
arXiv Detail & Related papers (2024-01-31T09:31:28Z) - Manifold Learning with Sparse Regularised Optimal Transport [0.17205106391379024]
Real-world datasets are subject to noisy observations and sampling, so that distilling information about the underlying manifold is a major challenge.
We propose a method for manifold learning that utilises a symmetric version of optimal transport with a quadratic regularisation.
We prove that the resulting kernel is consistent with a Laplace-type operator in the continuous limit, establish robustness to heteroskedastic noise and exhibit these results in simulations.
arXiv Detail & Related papers (2023-07-19T08:05:46Z) - T1: Scaling Diffusion Probabilistic Fields to High-Resolution on Unified
Visual Modalities [69.16656086708291]
Diffusion Probabilistic Field (DPF) models the distribution of continuous functions defined over metric spaces.
We propose a new model comprising a view-wise sampling algorithm that focuses on local structure learning.
The model can be scaled to generate high-resolution data while unifying multiple modalities.
arXiv Detail & Related papers (2023-05-24T03:32:03Z)
- Convolutional Filtering on Sampled Manifolds [122.06927400759021]
We show that convolutional filtering on a sampled manifold converges to continuous manifold filtering.
Our findings are further demonstrated empirically on a problem of navigation control.
arXiv Detail & Related papers (2022-11-20T19:09:50Z)
- Data-Efficient Learning via Minimizing Hyperspherical Energy [48.47217827782576]
This paper considers the problem of data-efficient learning from scratch using a small amount of representative data.
We propose an MHE-based active learning (MHEAL) algorithm, and provide comprehensive theoretical guarantees for MHEAL.
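Hyperspherical energy itself is simple to illustrate. The sketch below is not the authors' MHEAL algorithm, only a hedged toy that greedily picks points whose summed inverse pairwise distances on the unit sphere stay small, i.e. a directionally well-spread representative subset; all names are hypothetical.

```python
import numpy as np

def greedy_low_energy(X, m, seed=0):
    """Greedily select m points keeping hyperspherical energy
    (sum of inverse pairwise distances after projection onto
    the unit sphere) low, yielding well-spread picks."""
    # project data onto the unit hypersphere
    U = X / np.linalg.norm(X, axis=1, keepdims=True)
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(U)))]
    while len(chosen) < m:
        best_i, best_e = -1, np.inf
        for i in range(len(U)):
            if i in chosen:
                continue
            # energy the candidate would add against current picks
            d = np.linalg.norm(U[chosen] - U[i], axis=1)
            e = float((1.0 / np.maximum(d, 1e-9)).sum())
            if e < best_e:
                best_i, best_e = i, e
        chosen.append(best_i)
    return chosen

# 100 points on a circle: the greedy picks end up ~90 degrees apart
t = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
X = np.c_[np.cos(t), np.sin(t)]
picks = greedy_low_energy(X, m=4)
```

Minimizing this energy pushes the selected points apart on the sphere, which is the intuition behind using it to choose a small amount of representative data.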
arXiv Detail & Related papers (2022-06-30T11:39:12Z)
- A graph representation based on fluid diffusion model for multimodal data analysis: theoretical aspects and enhanced community detection [14.601444144225875]
We introduce a novel model for graph definition based on fluid diffusion.
Our method is able to strongly outperform state-of-the-art schemes for community detection in multimodal data analysis.
arXiv Detail & Related papers (2021-12-07T16:30:03Z)
- Flow Based Models For Manifold Data [11.344428134774475]
Flow-based generative models typically define a latent space with dimensionality identical to the observational space.
In many problems, the data do not populate the full ambient space in which they reside, but rather a lower-dimensional manifold.
We propose to learn a manifold prior that affords benefits to both sample generation and representation quality.
arXiv Detail & Related papers (2021-09-29T06:48:01Z)
- Manifold Density Estimation via Generalized Dequantization [9.090451761951101]
Some kinds of data are not well-modeled by supposing that their underlying geometry is Euclidean.
For instance, some kinds of data may be known to lie on the surface of a sphere.
We propose a method, inspired by the literature on "dequantization," which we interpret through a coordinate transformation of an ambient Euclidean space.
arXiv Detail & Related papers (2021-02-14T12:40:41Z)
- Graph Embedding with Data Uncertainty [113.39838145450007]
Spectral-based subspace learning is a common data preprocessing step in many machine learning pipelines.
Most subspace learning methods do not take into consideration possible measurement inaccuracies or artifacts that can lead to data with high uncertainty.
arXiv Detail & Related papers (2020-09-01T15:08:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.