Data Augmentation with Manifold Barycenters
- URL: http://arxiv.org/abs/2104.00925v1
- Date: Fri, 2 Apr 2021 08:07:21 GMT
- Title: Data Augmentation with Manifold Barycenters
- Authors: Iaroslav Bespalov, Nazar Buzun, Oleg Kachan and Dmitry V. Dylov
- Abstract summary: We propose a new way of representing the available knowledge in the manifold of data barycenters.
We apply this approach to the problem of landmarks detection and augment the available landmarks data within the dataset.
Our approach reduces the overfitting and improves the quality metrics both beyond the original data outcome and beyond the result obtained with classical augmentation methods.
- Score: 8.201100713224003
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The training of Generative Adversarial Networks (GANs) requires a large
amount of data, stimulating the development of new data augmentation methods to
alleviate the challenge. Oftentimes, these methods either fail to produce
enough new data or expand the dataset beyond the original knowledge domain. In
this paper, we propose a new way of representing the available knowledge in the
manifold of data barycenters. Such a representation allows performing data
augmentation based on interpolation between the nearest data elements using
Wasserstein distance. The proposed method finds cliques in the
nearest-neighbors graph and, at each sampling iteration, randomly draws one
clique to compute the Wasserstein barycenter with random uniform weights. These
barycenters then become the new natural-looking elements that one could add to
the dataset. We apply this approach to the problem of landmarks detection and
augment the available landmarks data within the dataset. Additionally, the idea
is validated on cardiac data for the task of medical segmentation. Our approach
reduces the overfitting and improves the quality metrics both beyond the
original data outcome and beyond the result obtained with classical
augmentation methods.
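The sampling loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes each data element is a landmark set stored as an (n_points, dim) NumPy array, all with the same number of points and treated as uniform discrete measures, so that the Wasserstein-2 distance reduces to an optimal assignment. The function names (w2_distance, knn_cliques, barycenter, augment), the restriction to 3-node cliques, the symmetrised k-NN graph, the Dirichlet(1, ..., 1) draw standing in for "random uniform weights", and the fixed-point barycenter iteration are all hypothetical choices made for this example.

```python
import itertools

import numpy as np
from scipy.optimize import linear_sum_assignment


def w2_distance(a, b):
    """Wasserstein-2 distance between two equal-size, uniformly weighted point clouds."""
    cost = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    rows, cols = linear_sum_assignment(cost)  # optimal one-to-one matching
    return np.sqrt(cost[rows, cols].mean())


def knn_cliques(dist, k, size=3):
    """All `size`-node cliques of the symmetrised k-nearest-neighbour graph (assumed graph)."""
    n = dist.shape[0]
    d = dist.copy()
    np.fill_diagonal(d, np.inf)  # a node is never its own neighbour
    nearest = np.argsort(d, axis=1)[:, :k]
    adj = np.zeros((n, n), dtype=bool)
    adj[np.arange(n)[:, None], nearest] = True
    adj |= adj.T  # keep an edge if either endpoint lists the other
    return [c for c in itertools.combinations(range(n), size)
            if all(adj[i, j] for i, j in itertools.combinations(c, 2))]


def barycenter(clouds, weights, n_iter=10):
    """Approximate weighted W2 barycenter by alternating matching and averaging."""
    bary = clouds[0].copy()
    for _ in range(n_iter):
        aligned = []
        for c in clouds:
            cost = ((bary[:, None, :] - c[None, :, :]) ** 2).sum(-1)
            _, cols = linear_sum_assignment(cost)
            aligned.append(c[cols])  # reorder c to match the current estimate
        bary = np.tensordot(weights, np.stack(aligned), axes=1)
    return bary


def augment(clouds, n_new, k=3, seed=0):
    """Draw a random clique per iteration and return its randomly weighted barycenter."""
    rng = np.random.default_rng(seed)
    n = len(clouds)
    dist = np.array([[w2_distance(clouds[i], clouds[j]) for j in range(n)]
                     for i in range(n)])
    cliques = knn_cliques(dist, k)
    if not cliques:
        raise ValueError("no cliques found; try a larger k")
    new_elements = []
    for _ in range(n_new):
        clique = cliques[rng.integers(len(cliques))]
        weights = rng.dirichlet(np.ones(len(clique)))  # uniform over the simplex
        new_elements.append(barycenter([clouds[i] for i in clique], weights))
    return new_elements
```

Calling augment(clouds, n_new=100, k=5) on a list of landmark arrays would return 100 synthetic landmark sets, each a convex combination of matched points from neighbouring elements. The brute-force clique enumeration is only practical for small datasets; a dedicated clique-finding routine would be needed at scale.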
Related papers
- Personalized Federated Learning via Active Sampling [50.456464838807115]
This paper proposes a novel method for sequentially identifying similar (or relevant) data generators.
Our method evaluates the relevance of a data generator by evaluating the effect of a gradient step using its local dataset.
We extend this method to non-parametric models by a suitable generalization of the gradient step to update a hypothesis using the local dataset provided by a data generator.
arXiv Detail & Related papers (2024-09-03T17:12:21Z)
- Data Augmentations in Deep Weight Spaces [89.45272760013928]
We introduce a novel augmentation scheme based on the Mixup method.
We evaluate the performance of these techniques on existing benchmarks as well as new benchmarks we generate.
arXiv Detail & Related papers (2023-11-15T10:43:13Z)
- Exploring Data Redundancy in Real-world Image Classification through Data Selection [20.389636181891515]
Deep learning models often require large amounts of data for training, leading to increased costs.
We present two data valuation metrics based on Synaptic Intelligence and gradient norms, respectively, to study redundancy in real-world image data.
Online and offline data selection algorithms are then proposed via clustering and grouping based on the examined data values.
arXiv Detail & Related papers (2023-06-25T03:31:05Z)
- Dataset Distillation via Factorization [58.8114016318593]
We introduce a dataset factorization approach, termed HaBa, which is a plug-and-play strategy portable to any existing dataset distillation (DD) baseline.
HaBa explores decomposing a dataset into two components: data Hallucination networks and Bases.
Our method can yield significant improvement on downstream classification tasks compared with the previous state of the art, while reducing the total number of compressed parameters by up to 65%.
arXiv Detail & Related papers (2022-10-30T08:36:19Z)
- Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z)
- Feature transforms for image data augmentation [74.12025519234153]
In image classification, many augmentation approaches utilize simple image manipulation algorithms.
In this work, we build ensembles on the data level by adding images generated by combining fourteen augmentation approaches.
Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method.
arXiv Detail & Related papers (2022-01-24T14:12:29Z)
- Weakly Supervised Change Detection Using Guided Anisotropic Diffusion [97.43170678509478]
We propose original ideas that help us to leverage such datasets in the context of change detection.
First, we propose the guided anisotropic diffusion (GAD) algorithm, which improves semantic segmentation results.
We then show its potential in two weakly-supervised learning strategies tailored for change detection.
arXiv Detail & Related papers (2021-12-31T10:03:47Z)
- Uniform-in-Phase-Space Data Selection with Iterative Normalizing Flows [0.0]
A strategy is proposed to select data points such that they uniformly span the phase-space of the data.
An iterative method is used to accurately estimate the probability of the rare data points when only a small subset of the dataset is used to construct the probability map.
The proposed framework is demonstrated as a viable pathway to enable data-efficient machine learning when abundant data is available.
arXiv Detail & Related papers (2021-12-28T20:06:28Z)
- Complex Wavelet SSIM based Image Data Augmentation [0.0]
We look at the MNIST handwritten dataset, an image dataset used for digit recognition.
We take a detailed look into one of the most popular augmentation techniques used for this data set: elastic deformation.
We propose to use a similarity measure called Complex Wavelet Structural Similarity Index Measure (CWSSIM) to selectively filter out the irrelevant data.
arXiv Detail & Related papers (2020-07-11T21:11:46Z)