Related papers: Random Field Augmentations for Self-Supervised Representation Learning

URL: http://arxiv.org/abs/2311.03629v1
Date: Tue, 7 Nov 2023 00:35:09 GMT
Title: Random Field Augmentations for Self-Supervised Representation Learning
Authors: Philip Andrew Mansfield, Arash Afkanpour, Warren Richard Morningstar, Karan Singhal
Abstract summary: We propose a new family of local transformations based on Gaussian random fields to generate image augmentations for self-supervised representation learning. We achieve a 1.7% top-1 accuracy improvement over baseline on ImageNet downstream classification, and a 3.6% improvement on out-of-distribution iNaturalist downstream classification. While mild transformations improve representations, we observe that strong transformations can degrade the structure of an image.
Score: 4.3543354293465155
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Self-supervised representation learning is heavily dependent on data augmentations to specify the invariances encoded in representations. Previous work has shown that applying diverse data augmentations is crucial to downstream performance, but augmentation techniques remain under-explored. In this work, we propose a new family of local transformations based on Gaussian random fields to generate image augmentations for self-supervised representation learning. These transformations generalize the well-established affine and color transformations (translation, rotation, color jitter, etc.) and greatly increase the space of augmentations by allowing transformation parameter values to vary from pixel to pixel. The parameters are treated as continuous functions of spatial coordinates, and modeled as independent Gaussian random fields. Empirical results show the effectiveness of the new transformations for self-supervised representation learning. Specifically, we achieve a 1.7% top-1 accuracy improvement over baseline on ImageNet downstream classification, and a 3.6% improvement on out-of-distribution iNaturalist downstream classification. However, due to the flexibility of the new transformations, learned representations are sensitive to hyperparameters. While mild transformations improve representations, we observe that strong transformations can degrade the structure of an image, indicating that balancing the diversity and strength of augmentations is important for improving generalization of learned representations.
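
The abstract describes transformation parameters that vary from pixel to pixel and are modeled as independent Gaussian random fields. The following is a minimal NumPy sketch of that idea, not the authors' implementation. It assumes a periodic field obtained by low-pass filtering white noise with a Gaussian kernel, nearest-neighbor resampling for the local translation, and image values in [0, 1]. The function names (gaussian_random_field, local_translation, local_brightness) and the parameters strength and length_scale are hypothetical, chosen only for illustration; the paper's actual kernel, parameterization, and interpolation may differ.

```python
import numpy as np


def gaussian_random_field(shape, length_scale, rng):
    """Sample a smooth random field on a pixel grid (illustrative assumption).

    White Gaussian noise is low-pass filtered in the Fourier domain with a
    Gaussian kernel whose standard deviation is `length_scale` pixels. The
    result is rescaled to zero sample mean and unit sample standard deviation.
    """
    h, w = shape
    noise = rng.standard_normal((h, w))
    fy = np.fft.fftfreq(h)[:, None]  # cycles per pixel along rows
    fx = np.fft.fftfreq(w)[None, :]  # cycles per pixel along columns
    # Fourier transform of a spatial Gaussian kernel with std `length_scale`.
    kernel = np.exp(-2.0 * (np.pi * length_scale) ** 2 * (fx ** 2 + fy ** 2))
    field = np.fft.ifft2(np.fft.fft2(noise) * kernel).real
    field -= field.mean()
    return field / (field.std() + 1e-12)


def local_translation(image, strength, length_scale, rng):
    """Shift each pixel by its own offset, drawn from two independent fields."""
    h, w = image.shape[:2]
    dy = strength * gaussian_random_field((h, w), length_scale, rng)
    dx = strength * gaussian_random_field((h, w), length_scale, rng)
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Nearest-neighbor lookup of the source pixel, clamped to the image border.
    src_r = np.clip(np.rint(rows + dy), 0, h - 1).astype(int)
    src_c = np.clip(np.rint(cols + dx), 0, w - 1).astype(int)
    return image[src_r, src_c]


def local_brightness(image, strength, length_scale, rng):
    """Add a spatially varying brightness offset to an image in [0, 1]."""
    shift = strength * gaussian_random_field(image.shape[:2], length_scale, rng)
    if image.ndim == 3:
        shift = shift[..., None]  # same offset for every channel
    return np.clip(image + shift, 0.0, 1.0)


rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))  # stand-in for a real image
warped = local_translation(image, strength=2.0, length_scale=8.0, rng=rng)
jittered = local_brightness(image, strength=0.1, length_scale=8.0, rng=rng)
```

In this sketch, strength sets the typical magnitude of the per-pixel parameter and length_scale sets how quickly it varies across the image. Setting strength to zero recovers the original image, and a very large length_scale approaches a single global transformation, which is one way to read the abstract's remark that mild transformations help while strong ones can degrade image structure.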

Related papers

Exploring Kernel Transformations for Implicit Neural Representations [57.2225355625268]
Implicit neural representations (INRs) leverage neural networks to represent signals by mapping coordinates to their corresponding attributes. This work pioneers the exploration of the effect of kernel transformation of input/output while keeping the model itself unchanged. A byproduct of our findings is a simple yet effective method that combines scale and shift to significantly boost INR with negligible overhead.
arXiv Detail & Related papers (2025-04-07T04:43:50Z)
Self-supervised Transformation Learning for Equivariant Representations [26.207358743969277]
Unsupervised representation learning has significantly advanced various machine learning tasks. We propose Self-supervised Transformation Learning (STL), replacing transformation labels with transformation representations derived from image pairs. We demonstrate the approach's effectiveness across diverse classification and detection tasks, outperforming existing methods in 7 out of 11 benchmarks.
arXiv Detail & Related papers (2025-01-15T10:54:21Z)
Steerable Equivariant Representation Learning [36.138305341173414]
In this paper, we propose a method of learning representations that are equivariant, rather than invariant, to data augmentations. We demonstrate that our resulting steerable and equivariant representations lead to better performance on transfer learning and robustness.
arXiv Detail & Related papers (2023-02-22T12:42:45Z)
Effective Data Augmentation With Diffusion Models [65.09758931804478]
We address the lack of diversity in data augmentation with image-to-image transformations parameterized by pre-trained text-to-image diffusion models. Our method edits images to change their semantics using an off-the-shelf diffusion model, and generalizes to novel visual concepts from a few labelled examples. We evaluate our approach on few-shot image classification tasks, and on a real-world weed recognition task, and observe an improvement in accuracy in tested domains.
arXiv Detail & Related papers (2023-02-07T20:42:28Z)
Deep Diversity-Enhanced Feature Representation of Hyperspectral Images [87.47202258194719]
We rectify 3D convolution by modifying its topology to enhance the rank upper-bound. We also propose a novel diversity-aware regularization (DA-Reg) term that acts on the feature maps to maximize independence among elements. To demonstrate the superiority of the proposed Re$^3$-ConvSet and DA-Reg, we apply them to various HS image processing and analysis tasks.
arXiv Detail & Related papers (2023-01-15T16:19:18Z)
Local Magnification for Data and Feature Augmentation [53.04028225837681]
We propose an easy-to-implement and model-free data augmentation method called Local Magnification (LOMA). LOMA generates additional training data by randomly magnifying a local area of the image. Experiments show that our proposed LOMA, though straightforward, can be combined with standard data augmentation to significantly improve the performance on image classification and object detection.
arXiv Detail & Related papers (2022-11-15T02:51:59Z)
Masked Autoencoders are Robust Data Augmentors [90.34825840657774]
Regularization techniques like image augmentation are necessary for deep neural networks to generalize well. We propose a novel perspective of augmentation to regularize the training process. We show that utilizing such model-based nonlinear transformation as data augmentation can improve high-level recognition tasks.
arXiv Detail & Related papers (2022-06-10T02:41:48Z)
Data augmentation with mixtures of max-entropy transformations for filling-level classification [88.14088768857242]
We address the problem of distribution shifts in test-time data with a principled data augmentation scheme for the task of content-level classification. We show that such a principled augmentation scheme, alone, can replace current approaches that use transfer learning or can be used in combination with transfer learning to improve its performance.
arXiv Detail & Related papers (2022-03-08T11:41:38Z)
Robust Training Using Natural Transformation [19.455666609149567]
We present NaTra, an adversarial training scheme to improve robustness of image classification algorithms. We target attributes of the input images that are independent of the class identification, and manipulate those attributes to mimic real-world natural transformations. We demonstrate the efficacy of our scheme by utilizing the disentangled latent representations derived from well-trained GANs.
arXiv Detail & Related papers (2021-05-10T01:56:03Z)
Invariant Deep Compressible Covariance Pooling for Aerial Scene Categorization [80.55951673479237]
We propose a novel invariant deep compressible covariance pooling (IDCCP) to solve nuisance variations in aerial scene categorization. We conduct extensive experiments on the publicly released aerial scene image data sets and demonstrate the superiority of this method compared with state-of-the-art methods.
arXiv Detail & Related papers (2020-11-11T11:13:07Z)
FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning [64.32306537419498]
We propose a novel learned feature-based refinement and augmentation method that produces a varied set of complex transformations. These transformations also use information from both within-class and across-class representations that we extract through clustering. We demonstrate that our method is comparable to the current state of the art for smaller datasets while being able to scale up to larger datasets.
arXiv Detail & Related papers (2020-07-16T17:55:31Z)
Group Equivariant Generative Adversarial Networks [7.734726150561089]
In this work, we explicitly incorporate inductive symmetry priors into the network architectures via group-equivariant convolutional networks. Group-convolutions have higher expressive power with fewer samples and lead to better gradient feedback between generator and discriminator.
arXiv Detail & Related papers (2020-05-04T17:38:49Z)
Probabilistic Spatial Transformer Networks [0.6999740786886537]
We propose a probabilistic extension that estimates a stochastic transformation rather than a deterministic one. We show that these two properties lead to improved classification performance, robustness and model calibration. We further demonstrate that the approach generalizes to non-visual domains by improving model performance on time-series data.
arXiv Detail & Related papers (2020-04-07T18:22:02Z)

This list is automatically generated from the titles and abstracts of the papers on this site.