MGAug: Multimodal Geometric Augmentation in Latent Spaces of Image
Deformations
- URL: http://arxiv.org/abs/2312.13440v2
- Date: Thu, 25 Jan 2024 18:31:49 GMT
- Authors: Tonmoy Hossain and Miaomiao Zhang
- Abstract summary: We propose a novel model that generates augmenting transformations in a multimodal latent space of geometric deformations.
Experimental results show that our proposed approach outperforms all baselines with significantly improved prediction accuracy.
- Score: 2.711740183729759
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Geometric transformations have been widely used to augment the size of
training images. Existing methods often assume a unimodal distribution of the
underlying transformations between images, which limits their power when data
with multimodal distributions occur. In this paper, we propose a novel model,
Multimodal Geometric Augmentation (MGAug), that for the first time generates
augmenting transformations in a multimodal latent space of geometric
deformations. To achieve this, we first develop a deep network that embeds the
learning of latent geometric spaces of diffeomorphic transformations (a.k.a.
diffeomorphisms) in a variational autoencoder (VAE). A mixture of multivariate
Gaussians is formulated in the tangent space of diffeomorphisms and serves as a
prior to approximate the hidden distribution of image transformations. We then
augment the original training dataset by deforming images using randomly
sampled transformations from the learned multimodal latent space of VAE. To
validate the effectiveness of our model, we jointly learn the augmentation
strategy with two distinct domain-specific tasks: multi-class classification on
2D synthetic datasets and segmentation on real 3D brain magnetic resonance
images (MRIs). We also compare MGAug with state-of-the-art transformation-based
image augmentation algorithms. Experimental results show that our proposed
approach outperforms all baselines with significantly improved prediction
accuracy. Our code is publicly available at
https://github.com/tonmoy-hossain/MGAug.
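The core idea — sample a latent code from a mixture-of-Gaussians prior and decode it into a smooth deformation applied to a training image — can be sketched as follows. This is an illustrative toy, not the paper's implementation: the mixture parameters are hand-picked, and a smoothed random field stands in for the VAE decoder that would produce a velocity field and integrate it into a diffeomorphism.

```python
import numpy as np
from scipy.ndimage import map_coordinates, gaussian_filter

rng = np.random.default_rng(0)

# Mixture-of-Gaussians prior over a 2D latent code
# (weights/means/scales are illustrative, not learned values).
weights = np.array([0.5, 0.3, 0.2])
means = np.array([[-2.0, 0.0], [1.0, 1.0], [3.0, -1.0]])
scales = np.array([0.5, 0.8, 0.3])

def sample_latent():
    """Draw a latent code: pick a mixture component, then sample from it."""
    k = rng.choice(len(weights), p=weights)
    return means[k] + scales[k] * rng.standard_normal(2)

def decode_to_displacement(z, shape):
    # Stand-in for the VAE decoder: map the latent code to a smooth
    # displacement field (the real model decodes a velocity field and
    # integrates it into a diffeomorphic transformation).
    noise = rng.standard_normal((2, *shape))
    field = gaussian_filter(noise, sigma=8.0)  # smoothness keeps the warp gentle
    return np.linalg.norm(z) * field / (np.abs(field).max() + 1e-8)

def augment(image):
    """Warp `image` with a deformation sampled from the multimodal prior."""
    z = sample_latent()
    disp = decode_to_displacement(z, image.shape)
    yy, xx = np.meshgrid(np.arange(image.shape[0]),
                         np.arange(image.shape[1]), indexing="ij")
    coords = np.stack([yy + disp[0], xx + disp[1]])
    return map_coordinates(image, coords, order=1, mode="nearest")

image = rng.random((64, 64))
warped = augment(image)
```

Repeated calls to `augment` yield distinct warped copies of the same image, which is the augmentation step the abstract describes.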
Related papers
- Image-GS: Content-Adaptive Image Representation via 2D Gaussians [55.15950594752051]
We propose Image-GS, a content-adaptive image representation.
Using anisotropic 2D Gaussians as the basis, Image-GS shows high memory efficiency, supports fast random access, and offers a natural level of detail stack.
General efficiency and fidelity of Image-GS are validated against several recent neural image representations and industry-standard texture compressors.
We hope this research offers insights for developing new applications that require adaptive quality and resource control, such as machine perception, asset streaming, and content generation.
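The basic mechanism — summing anisotropic 2D Gaussians evaluated over the pixel grid — can be illustrated with a toy renderer. The positions, covariances, and amplitudes below are hand-picked for illustration, not fitted to a target image as Image-GS would do.

```python
import numpy as np

# Render a tiny image from a handful of anisotropic 2D Gaussians.
H, W = 32, 32
yy, xx = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")

def gaussian_splat(cy, cx, cov, amplitude):
    # Evaluate an anisotropic Gaussian with covariance `cov` at every pixel.
    inv = np.linalg.inv(cov)
    dy, dx = yy - cy, xx - cx
    quad = inv[0, 0]*dy*dy + 2*inv[0, 1]*dy*dx + inv[1, 1]*dx*dx
    return amplitude * np.exp(-0.5 * quad)

splats = [
    (10, 10, np.array([[9.0, 4.0], [4.0, 4.0]]), 1.0),   # tilted blob
    (22, 20, np.array([[2.0, 0.0], [0.0, 25.0]]), 0.6),  # horizontal streak
]
image = sum(gaussian_splat(*s) for s in splats)
image = np.clip(image, 0.0, 1.0)
```

A content-adaptive fit would place more, smaller Gaussians where the target image has fine detail and fewer, larger ones in flat regions, which is where the memory efficiency comes from.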
arXiv Detail & Related papers (2024-07-02T00:45:21Z)
- Cross-domain and Cross-dimension Learning for Image-to-Graph Transformers [50.576354045312115]
Direct image-to-graph transformation is a challenging task that solves object detection and relationship prediction in a single model.
We introduce a set of methods enabling cross-domain and cross-dimension transfer learning for image-to-graph transformers.
We demonstrate our method's utility in cross-domain and cross-dimension experiments, where we pretrain our models on 2D satellite images before applying them to vastly different target domains in 2D and 3D.
arXiv Detail & Related papers (2024-03-11T10:48:56Z)
- Self-Supervised Learning from Non-Object Centric Images with a Geometric Transformation Sensitive Architecture [7.825153552141346]
We propose a Geometric Transformation Sensitive Architecture to be sensitive to geometric transformations.
Our method encourages the student to be sensitive by predicting rotation and using targets that vary with those transformations.
Our approach demonstrates improved performance when using non-object-centric images as pretraining data.
arXiv Detail & Related papers (2023-04-17T06:32:37Z)
- Effective Data Augmentation With Diffusion Models [65.09758931804478]
We address the lack of diversity in data augmentation with image-to-image transformations parameterized by pre-trained text-to-image diffusion models.
Our method edits images to change their semantics using an off-the-shelf diffusion model, and generalizes to novel visual concepts from a few labelled examples.
We evaluate our approach on few-shot image classification tasks, and on a real-world weed recognition task, and observe an improvement in accuracy in tested domains.
arXiv Detail & Related papers (2023-02-07T20:42:28Z)
- Prediction of Geometric Transformation on Cardiac MRI via Convolutional Neural Network [13.01021780124613]
We propose to learn features in medical images by training ConvNets to recognize the geometric transformation applied to images.
We present a simple self-supervised task that can easily predict the geometric transformation.
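A pretext task of this kind can be sketched in a few lines: rotate each image by a random multiple of 90 degrees and use the rotation index as a free supervisory label. The ConvNet that would consume these pairs is omitted here; this only shows the label-free data generation.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_rotation_batch(images):
    """Return (rotated_images, labels) with labels in {0, 1, 2, 3}.

    Each label counts the number of 90-degree turns applied, so a
    network predicting it must learn orientation-sensitive features
    without any manual annotation.
    """
    rotated, labels = [], []
    for img in images:
        k = rng.integers(4)           # number of 90-degree turns
        rotated.append(np.rot90(img, k))
        labels.append(k)
    return np.stack(rotated), np.array(labels)

images = rng.random((8, 32, 32))
x, y = make_rotation_batch(images)
```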
arXiv Detail & Related papers (2022-11-12T11:29:14Z)
- Geo-SIC: Learning Deformable Geometric Shapes in Deep Image Classifiers [8.781861951759948]
This paper presents Geo-SIC, the first deep learning model to learn deformable shapes in a deformation space for an improved performance of image classification.
We introduce a newly designed framework that simultaneously derives features from both image and latent shape spaces with large intra-class variations.
We develop a boosted classification network, equipped with an unsupervised learning of geometric shape representations.
arXiv Detail & Related papers (2022-10-25T01:55:17Z)
- Orthonormal Convolutions for the Rotation Based Iterative Gaussianization [64.44661342486434]
This paper elaborates an extension of rotation-based iterative Gaussianization, RBIG, which makes image Gaussianization possible.
In images its application has been restricted to small image patches or isolated pixels, because rotation in RBIG is based on principal or independent component analysis.
We present *Convolutional RBIG*: an extension that alleviates this issue by imposing that the rotation in RBIG is a convolution.
arXiv Detail & Related papers (2022-06-08T12:56:34Z)
- Feature transforms for image data augmentation [74.12025519234153]
In image classification, many augmentation approaches utilize simple image manipulation algorithms.
In this work, we build ensembles on the data level by adding images generated by combining fourteen augmentation approaches.
Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method.
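The data-level ensembling described here — one model fine-tuned per augmentation method, predictions combined at test time — can be sketched by averaging class probabilities. The per-model outputs below are simulated stand-ins, not real ResNet50 predictions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_models, n_samples, n_classes = 3, 5, 4

# Simulated softmax outputs: one probability row per model per test sample.
# In the paper's setup, each model would be a ResNet50 fine-tuned on a
# training set expanded by a different augmentation method.
probs = rng.dirichlet(np.ones(n_classes), size=(n_models, n_samples))

ensemble = probs.mean(axis=0)          # average class probabilities over models
predictions = ensemble.argmax(axis=1)  # final class per test sample
```

Averaging probabilities (rather than hard votes) keeps the ensemble output a valid distribution and tends to be more robust when individual models disagree.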
arXiv Detail & Related papers (2022-01-24T14:12:29Z)
- Deep and Shallow Covariance Feature Quantization for 3D Facial Expression Recognition [7.773399781313892]
We propose a multi-modal 2D + 3D feature-based method for facial expression recognition.
We extract shallow features from the 3D images, and deep features using Convolutional Neural Networks (CNN) from the transformed 2D images.
High classification performances have been achieved on the BU-3DFE and Bosphorus datasets.
arXiv Detail & Related papers (2021-05-12T14:48:39Z)
- The Geometry of Deep Generative Image Models and its Applications [0.0]
Generative adversarial networks (GANs) have emerged as a powerful unsupervised method to model the statistical patterns of real-world data sets.
These networks are trained to map random inputs in their latent space to new samples representative of the learned data.
The structure of the latent space is hard to intuit due to its high dimensionality and the non-linearity of the generator.
arXiv Detail & Related papers (2021-01-15T07:57:33Z)
- FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning [64.32306537419498]
We propose a novel learned feature-based refinement and augmentation method that produces a varied set of complex transformations.
These transformations also use information from both within-class and across-class representations that we extract through clustering.
We demonstrate that our method is comparable to current state of art for smaller datasets while being able to scale up to larger datasets.
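The clustering step described above can be illustrated with a minimal sketch: cluster features to obtain prototypes, then refine each feature toward its nearest prototype. FeatMatch itself uses learned attention for the refinement; the fixed convex combination and the `alpha` value here are illustrative substitutes.

```python
import numpy as np

rng = np.random.default_rng(0)
feats = rng.standard_normal((100, 16))  # stand-in feature vectors

def kmeans_prototypes(x, k=4, iters=10):
    """Plain k-means returning `k` cluster centers (prototypes)."""
    centers = x[rng.choice(len(x), k, replace=False)]
    for _ in range(iters):
        d = ((x[:, None] - centers[None]) ** 2).sum(-1)  # squared distances
        assign = d.argmin(1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = x[assign == j].mean(0)
    return centers

protos = kmeans_prototypes(feats)
nearest = ((feats[:, None] - protos[None]) ** 2).sum(-1).argmin(1)
alpha = 0.3  # illustrative mixing weight, not from the paper
refined = (1 - alpha) * feats + alpha * protos[nearest]
```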
arXiv Detail & Related papers (2020-07-16T17:55:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.