DLME: Deep Local-flatness Manifold Embedding
- URL: http://arxiv.org/abs/2207.03160v1
- Date: Thu, 7 Jul 2022 08:46:17 GMT
- Title: DLME: Deep Local-flatness Manifold Embedding
- Authors: Zelin Zang and Siyuan Li and Di Wu and Ge Wang and Lei Shang and
Baigui Sun and Hao Li and Stan Z. Li
- Abstract summary: Deep Local-flatness Manifold Embedding (DLME) is a novel ML framework to obtain reliable manifold embedding by reducing distortion.
Experiments on downstream classification, clustering, and visualization tasks show that DLME outperforms SOTA ML & contrastive learning (CL) methods.
- Score: 41.86924171938867
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Manifold learning~(ML) aims to find a low-dimensional embedding of
high-dimensional data. Previous works focus on handcrafted or easy datasets
with simple, idealized scenarios; however, we find that they perform poorly on
real-world datasets with under-sampled data. Generally, ML methods first model
the data structure and then produce a low-dimensional embedding, where the poor
local connectivity of under-sampled data in the former step and inappropriate
optimization objectives in the latter step lead to \emph{structural distortion}
and \emph{underconstrained embedding}. To solve these problems, we propose Deep
Local-flatness Manifold Embedding (DLME), a novel ML framework that obtains
reliable manifold embeddings by reducing distortion. DLME constructs semantic
manifolds via data augmentation and overcomes the \emph{structural distortion}
problem with the help of its smooth framework. To overcome
\emph{underconstrained embedding}, we design a specific loss for DLME and
mathematically demonstrate that it leads to a more suitable embedding based on
our proposed Local Flatness Assumption. Experiments on downstream
classification, clustering, and visualization tasks with three types of
datasets (toy, biological, and image) show that DLME outperforms SOTA ML \&
contrastive learning (CL) methods.
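To make the framework concrete, here is a minimal sketch (in PyTorch) of an
augmentation-driven deep manifold embedding of this kind. The MLP encoder, the
Gaussian similarity kernel, and the `dlme_style_loss` surrogate that matches
input-space and embedding-space neighborhood structure across two augmented
views are illustrative assumptions; the paper derives its own loss from the
Local Flatness Assumption rather than using this one.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """MLP mapping high-dimensional inputs to a low-dimensional embedding."""
    def __init__(self, in_dim, emb_dim=2, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, emb_dim),
        )

    def forward(self, x):
        return self.net(x)

def similarity(a, sigma):
    """Gaussian kernel on pairwise distances (an illustrative choice)."""
    return torch.exp(-torch.cdist(a, a).pow(2) / (2 * sigma ** 2))

def dlme_style_loss(x1, x2, encoder, sigma_x=4.0, sigma_z=1.0):
    """Surrogate loss (an assumption, not the paper's exact objective): make
    the pairwise similarities of the embeddings of two augmented views match
    the pairwise similarities of the inputs, so that a small neighborhood of
    the learned manifold behaves like a flat, locally Euclidean patch."""
    x = torch.cat([x1, x2])                 # two augmented views of one batch
    z = encoder(x)                          # low-dimensional embedding
    p = similarity(x.flatten(1), sigma_x)   # input-space neighborhood structure
    q = similarity(z, sigma_z)              # embedding-space neighborhood structure
    return F.mse_loss(q, p)

# Usage sketch: `augment` is any stochastic augmentation that produces
# semantically close views of the same sample.
# encoder = Encoder(in_dim=784)
# loss = dlme_style_loss(augment(x), augment(x), encoder)
# loss.backward()
```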
Related papers
- Language Models as Zero-shot Lossless Gradient Compressors: Towards
General Neural Parameter Prior Models [66.1595537904019]
Large language models (LLMs) can act as gradient priors in a zero-shot setting.
We introduce LM-GC, a novel method that integrates LLMs with arithmetic coding.
arXiv Detail & Related papers (2024-09-26T13:38:33Z)
- Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z)
- Simulation-Enhanced Data Augmentation for Machine Learning Pathloss Prediction [9.664420734674088]
This paper introduces a novel simulation-enhanced data augmentation method for machine learning pathloss prediction.
Our method integrates synthetic data generated from a cellular coverage simulator and independently collected real-world datasets.
The integration of synthetic data significantly improves the generalizability of the model in different environments.
arXiv Detail & Related papers (2024-02-03T00:38:08Z)
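As a rough illustration of this kind of simulation-enhanced augmentation (not
the paper's actual pipeline), one can pool simulator-generated samples with the
smaller measured set, optionally tagging each sample's origin, before fitting a
standard regressor; all array names and sizes below are placeholders.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
# Placeholder features/targets: many simulator samples, few real measurements.
X_sim, y_sim = rng.random((5000, 8)), 80 + 60 * rng.random(5000)
X_real, y_real = rng.random((400, 8)), 80 + 60 * rng.random(400)

# Pool both sources and add a source flag (0 = simulated, 1 = measured) so the
# model can absorb systematic offsets between simulation and reality.
X_aug = np.vstack([
    np.hstack([X_sim, np.zeros((len(X_sim), 1))]),
    np.hstack([X_real, np.ones((len(X_real), 1))]),
])
y_aug = np.concatenate([y_sim, y_real])

model = GradientBoostingRegressor().fit(X_aug, y_aug)
```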
- Scalable manifold learning by uniform landmark sampling and constrained locally linear embedding [0.6144680854063939]
We propose a scalable manifold learning (scML) method that can handle large-scale, high-dimensional data efficiently.
We empirically validated the effectiveness of scML on synthetic datasets and real-world benchmarks of different types.
scML scales well with increasing data sizes and embedding dimensions, and exhibits promising performance in preserving the global structure.
arXiv Detail & Related papers (2024-01-02T08:43:06Z)
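A generic way to approximate this idea with off-the-shelf tools (not scML's own
constrained formulation) is to fit locally linear embedding on a uniform
landmark subsample and place the remaining points with the out-of-sample
extension; the data and parameter choices below are placeholders.

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding

rng = np.random.default_rng(0)
X = rng.normal(size=(20_000, 50))       # placeholder high-dimensional data

# 1) Uniformly sample a small set of landmarks instead of embedding all points.
landmark_idx = rng.choice(len(X), size=1_000, replace=False)

# 2) Fit locally linear embedding on the landmarks only (cheap).
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2)
Z_landmarks = lle.fit_transform(X[landmark_idx])

# 3) Embed every other point via LLE's out-of-sample extension, which
#    reconstructs each new point from its landmark neighbours.
rest = np.setdiff1d(np.arange(len(X)), landmark_idx)
Z_rest = lle.transform(X[rest])
```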
- Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes [57.62036621319563]
We introduce CLLM, which leverages the prior knowledge of Large Language Models (LLMs) for data augmentation in the low-data regime.
We demonstrate the superior performance of CLLM in the low-data regime compared to conventional generators.
arXiv Detail & Related papers (2023-12-19T12:34:46Z)
- An Adaptive Plug-and-Play Network for Few-Shot Learning [12.023266104119289]
Few-shot learning requires a model to classify new samples after learning from only a few samples.
Deep networks and complex metrics tend to induce overfitting, making it difficult to further improve the performance.
We propose plug-and-play model-adaptive resizer (MAR) and adaptive similarity metric (ASM) without any other losses.
arXiv Detail & Related papers (2023-02-18T13:25:04Z)
- Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation [151.70234052015948]
We propose a novel approach that encourages the optimization algorithm to seek a flat trajectory.
We show that weights trained on synthetic data are robust against accumulated-error perturbations when regularized towards a flat trajectory.
Our method, called Flat Trajectory Distillation (FTD), is shown to boost the performance of gradient-matching methods by up to 4.7%.
arXiv Detail & Related papers (2022-11-20T15:49:11Z)
- DAS: Densely-Anchored Sampling for Deep Metric Learning [43.81322638018864]
We propose a Densely-Anchored Sampling (DAS) scheme that exploits the anchor's nearby embedding space to densely produce embeddings without data points.
Our method is effortlessly integrated into existing DML frameworks and improves them without bells and whistles.
arXiv Detail & Related papers (2022-07-30T02:07:46Z)
- Deep Recursive Embedding for High-Dimensional Data [9.611123249318126]
We propose to combine deep neural networks (DNN) with mathematics-guided embedding rules for high-dimensional data embedding.
We introduce a generic deep embedding network (DEN) framework, which is able to learn a parametric mapping from high-dimensional space to low-dimensional space.
arXiv Detail & Related papers (2021-10-31T23:22:33Z)
- Two-Dimensional Semi-Nonnegative Matrix Factorization for Clustering [50.43424130281065]
We propose a new Semi-Nonnegative Matrix Factorization method for 2-dimensional (2D) data, named TS-NMF.
It overcomes a drawback of existing methods, which seriously damage the spatial information of the data by converting 2D data to vectors in a preprocessing step.
arXiv Detail & Related papers (2020-05-19T05:54:14Z)