Deep Augmentation: Self-Supervised Learning with Transformations in
Activation Space
- URL: http://arxiv.org/abs/2303.14537v2
- Date: Mon, 26 Feb 2024 19:42:20 GMT
- Title: Deep Augmentation: Self-Supervised Learning with Transformations in
Activation Space
- Authors: Rickard Brüel-Gabrielsson, Tongzhou Wang, Manel Baradad, Justin
Solomon
- Abstract summary: We introduce Deep Augmentation, an approach to implicit data augmentation using dropout or PCA to transform a targeted layer within a neural network to improve performance and generalization.
We demonstrate Deep Augmentation through extensive experiments on contrastive learning tasks in NLP, computer vision, and graph learning.
- Score: 18.655316096015937
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce Deep Augmentation, an approach to implicit data augmentation
using dropout or PCA to transform a targeted layer within a neural network to
improve performance and generalization. We demonstrate Deep Augmentation
through extensive experiments on contrastive learning tasks in NLP, computer
vision, and graph learning. We observe substantial performance gains with
Transformers, ResNets, and Graph Neural Networks as the underlying models in
contrastive learning, but observe inverse effects on the corresponding
supervised problems. Our analysis suggests that Deep Augmentation alleviates
co-adaptation between layers, a form of "collapse." We use this observation to
formulate a method for selecting which layer to target; in particular, our
experimentation reveals that targeting deeper layers with Deep Augmentation
outperforms augmenting the input data. The simple network- and
modality-agnostic nature of this approach enables its integration into various
machine learning pipelines.
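The idea in the abstract, transforming the activations of one targeted layer rather than the input, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the tiny ReLU MLP, the function names `encode` and `info_nce`, the layer sizes, the dropout rate, and the choice of pairing a clean pass with a dropout-augmented pass are all assumptions made for illustration. Only the dropout variant is shown; the PCA variant is omitted.

```python
import numpy as np

def encode(x, weights, target_layer=None, drop_p=0.5, rng=None):
    """Forward pass through a ReLU MLP (hypothetical encoder).

    If `target_layer` is given, dropout is applied to the activations
    produced by that layer (0-indexed) instead of to the input data.
    """
    if rng is None:
        rng = np.random.default_rng()
    h = x
    for i, w in enumerate(weights):
        h = h @ w
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)  # ReLU on hidden layers only
        if i == target_layer:
            # Inverted dropout: zero units with probability drop_p,
            # rescale the survivors so the expected activation is unchanged.
            mask = rng.random(h.shape) >= drop_p
            h = h * mask / (1.0 - drop_p)
    return h

def info_nce(z1, z2, temperature=0.5):
    """InfoNCE loss: row i of z1 is the positive for row i of z2."""
    z1 = z1 / (np.linalg.norm(z1, axis=1, keepdims=True) + 1e-12)
    z2 = z2 / (np.linalg.norm(z2, axis=1, keepdims=True) + 1e-12)
    logits = z1 @ z2.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

rng = np.random.default_rng(0)
# Three-layer encoder with illustrative sizes: 16 -> 32 -> 32 -> 8.
weights = [
    rng.normal(size=(16, 32)) / 4.0,
    rng.normal(size=(32, 32)) / 6.0,
    rng.normal(size=(32, 8)) / 6.0,
]
x = rng.normal(size=(4, 16))  # batch of 4 inputs

z_clean = encode(x, weights)                                     # unaugmented view
z_aug = encode(x, weights, target_layer=1, drop_p=0.5, rng=rng)  # view augmented at layer 1
loss = info_nce(z_clean, z_aug)
print(z_clean.shape, z_aug.shape, float(loss))
```

In this sketch, `target_layer` is the knob that corresponds to the layer-selection question the abstract raises; in a real pipeline the loss would be backpropagated through the encoder, which plain NumPy does not do.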
Related papers
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of its surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- Mechanism of feature learning in convolutional neural networks [14.612673151889615]
We identify the mechanism of how convolutional neural networks learn from image data.
We present empirical evidence for our ansatz, including identifying high correlation between covariances of filters and patch-based AGOPs.
We then demonstrate the generality of our result by using the patch-based AGOP to enable deep feature learning in convolutional kernel machines.
arXiv Detail & Related papers (2023-09-01T16:30:02Z)
- Deep Multi-Threshold Spiking-UNet for Image Processing [51.88730892920031]
This paper introduces the novel concept of Spiking-UNet for image processing, which combines the power of Spiking Neural Networks (SNNs) with the U-Net architecture.
To achieve an efficient Spiking-UNet, we face two primary challenges: ensuring high-fidelity information propagation through the network via spikes and formulating an effective training strategy.
Experimental results show that, on image segmentation and denoising, our Spiking-UNet achieves comparable performance to its non-spiking counterpart.
arXiv Detail & Related papers (2023-07-20T16:00:19Z)
- Regularization Through Simultaneous Learning: A Case Study on Plant Classification [0.0]
This paper introduces Simultaneous Learning, a regularization approach drawing on principles of Transfer Learning and Multi-task Learning.
We leverage auxiliary datasets with the target dataset, the UFOP-HVD, to facilitate simultaneous classification guided by a customized loss function.
Remarkably, our approach demonstrates superior performance over models without regularization.
arXiv Detail & Related papers (2023-05-22T19:44:57Z)
- Reparameterization through Spatial Gradient Scaling [69.27487006953852]
Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training.
We present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks.
arXiv Detail & Related papers (2023-03-05T17:57:33Z)
- Frozen Overparameterization: A Double Descent Perspective on Transfer Learning of Deep Neural Networks [27.17697714584768]
We study the generalization behavior of transfer learning of deep neural networks (DNNs).
We show that the test error evolution during the target training has a more significant double descent effect when the target training dataset is sufficiently large.
Also, we show that the double descent phenomenon may make a transfer from a less related source task better than a transfer from a more related source task.
arXiv Detail & Related papers (2022-11-20T20:26:23Z)
- Learnable Multi-level Frequency Decomposition and Hierarchical Attention Mechanism for Generalized Face Presentation Attack Detection [7.324459578044212]
Face presentation attack detection (PAD) is attracting a lot of attention and playing a key role in securing face recognition systems.
We propose a dual-stream convolutional neural network (CNN) framework to deal with unseen scenarios.
We successfully prove the design of our proposed PAD solution in a step-wise ablation study.
arXiv Detail & Related papers (2021-09-16T13:06:43Z)
- Improved Speech Emotion Recognition using Transfer Learning and Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation.
arXiv Detail & Related papers (2021-08-05T10:39:39Z)
- Rethinking Skip Connection with Layer Normalization in Transformers and ResNets [49.87919454950763]
Skip connection is a widely-used technique to improve the performance of deep neural networks.
In this work, we investigate how the scale factors into the effectiveness of the skip connection.
arXiv Detail & Related papers (2021-05-15T11:44:49Z)
- FG-Net: Fast Large-Scale LiDAR Point Clouds Understanding Network Leveraging Correlated Feature Mining and Geometric-Aware Modelling [15.059508985699575]
FG-Net is a general deep learning framework for large-scale point clouds understanding without voxelizations.
We propose a deep convolutional neural network leveraging correlated feature mining and deformable convolution based geometric-aware modelling.
Our approaches outperform state-of-the-art approaches in terms of accuracy and efficiency.
arXiv Detail & Related papers (2020-12-17T08:20:09Z)
- Solving Sparse Linear Inverse Problems in Communication Systems: A Deep Learning Approach With Adaptive Depth [51.40441097625201]
We propose an end-to-end trainable deep learning architecture for sparse signal recovery problems.
The proposed method learns how many layers to execute to emit an output, and the network depth is dynamically adjusted for each task in the inference phase.
arXiv Detail & Related papers (2020-10-29T06:32:53Z)