Exploiting Invariance in Training Deep Neural Networks
- URL: http://arxiv.org/abs/2103.16634v1
- Date: Tue, 30 Mar 2021 19:18:31 GMT
- Title: Exploiting Invariance in Training Deep Neural Networks
- Authors: Chengxi Ye, Xiong Zhou, Tristan McKinney, Yanfeng Liu, Qinggang Zhou,
Fedor Zhdanov
- Abstract summary: Inspired by two basic mechanisms in animal visual systems, we introduce a feature transform technique that imposes invariance properties in the training of deep neural networks.
The resulting algorithm requires less parameter tuning, trains well with an initial learning rate 1.0, and easily generalizes to different tasks.
Tested on ImageNet, MS COCO, and Cityscapes datasets, our proposed technique requires fewer iterations to train, surpasses all baselines by a large margin, seamlessly works on both small and large batch size training, and applies to different computer vision tasks of image classification, object detection, and semantic segmentation.
- Score: 4.169130102668252
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inspired by two basic mechanisms in animal visual systems, we introduce a
feature transform technique that imposes invariance properties in the training
of deep neural networks. The resulting algorithm requires less parameter
tuning, trains well with an initial learning rate 1.0, and easily generalizes
to different tasks. We enforce scale invariance with local statistics in the
data to align similar samples generated in diverse situations. To accelerate
convergence, we enforce a GL(n)-invariance property with global statistics
extracted from a batch, so that the gradient descent solution remains
invariant under a change of basis. Tested on ImageNet, MS COCO, and Cityscapes
datasets, our proposed technique requires fewer iterations to train, surpasses
all baselines by a large margin, seamlessly works on both small and large batch
size training, and applies to different computer vision tasks of image
classification, object detection, and semantic segmentation.
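The two invariances named in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm. It assumes that "local statistics" can be stood in for by a single patch's own L2 norm and "global statistics" by a batch covariance matrix, and every function name below is hypothetical. The sketch checks two facts: dividing a patch by its norm removes any positive rescaling of that patch, and inner products between samples whitened by the batch covariance do not change when the features are expressed in a different (invertible) basis.

```python
import math

def scale_normalize(patch, eps=1e-12):
    """Divide a patch by its own L2 norm (a local statistic)."""
    norm = math.sqrt(sum(v * v for v in patch))
    return [v / (norm + eps) for v in patch]

def center(rows):
    """Subtract the per-column batch mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance_2d(rows):
    """2x2 covariance of centered 2-D rows (a global batch statistic)."""
    n = len(rows)
    sxx = sum(r[0] * r[0] for r in rows) / n
    sxy = sum(r[0] * r[1] for r in rows) / n
    syy = sum(r[1] * r[1] for r in rows) / n
    return [[sxx, sxy], [sxy, syy]]

def inverse_2d(m):
    """Closed-form inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def whitened_gram(rows):
    """Inner products of whitened samples, computed as X Sigma^{-1} X^T."""
    x = center(rows)
    p = inverse_2d(covariance_2d(x))

    def inner(a, b):
        return sum(a[i] * p[i][j] * b[j] for i in range(2) for j in range(2))

    return [[inner(a, b) for b in x] for a in x]

def change_basis(rows, a):
    """Apply the 2x2 matrix `a` to every sample (x -> a x)."""
    return [[a[0][0] * r[0] + a[0][1] * r[1],
             a[1][0] * r[0] + a[1][1] * r[1]] for r in rows]

def max_abs_diff(g1, g2):
    """Largest entrywise difference between two equally sized matrices."""
    return max(abs(u - v) for r1, r2 in zip(g1, g2) for u, v in zip(r1, r2))

# Scale invariance: a patch and a rescaled copy normalize to the same vector.
patch = [0.2, -1.0, 0.7]
scaled_patch = [3.5 * v for v in patch]
scale_gap = max(abs(u - v) for u, v in
                zip(scale_normalize(patch), scale_normalize(scaled_patch)))

# Basis-change invariance: whitened inner products match before and after
# applying an arbitrary invertible matrix (illustrative values only).
batch = [[1.0, 2.0], [2.0, 0.5], [-1.0, 1.5], [0.0, -3.0]]
basis = [[2.0, 1.0], [0.5, 3.0]]
gram_gap = max_abs_diff(whitened_gram(batch),
                        whitened_gram(change_basis(batch, basis)))

print(scale_gap, gram_gap)  # both gaps should be at floating-point noise level
```

The Gram matrix is used here because whitened features themselves are only determined up to a rotation, while their pairwise inner products are fixed. Any model that depends on the data only through those inner products is therefore unaffected by the change of basis.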
Related papers
- Unsupervised convolutional neural network fusion approach for change
detection in remote sensing images [1.892026266421264]
We introduce a completely unsupervised shallow convolutional neural network (USCNN) fusion approach for change detection.
Our model has three features: the entire training process is conducted in an unsupervised manner, the network architecture is shallow, and the objective function is sparse.
Experimental results on four real remote sensing datasets indicate the feasibility and effectiveness of the proposed approach.
arXiv Detail & Related papers (2023-11-07T03:10:17Z) - Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z) - Learning Compact Features via In-Training Representation Alignment [19.273120635948363]
In each epoch, the true gradient of the loss function is estimated using a mini-batch sampled from the training set.
We propose In-Training Representation Alignment (ITRA) that explicitly aligns feature distributions of two different mini-batches with a matching loss.
We also provide a rigorous analysis of the desirable effects of the matching loss on feature representation learning.
arXiv Detail & Related papers (2022-11-23T22:23:22Z) - Multi-scale and Cross-scale Contrastive Learning for Semantic
Segmentation [5.281694565226513]
We apply contrastive learning to enhance the discriminative power of the multi-scale features extracted by semantic segmentation networks.
By first mapping the encoder's multi-scale representations to a common feature space, we instantiate a novel form of supervised local-global constraint.
arXiv Detail & Related papers (2022-03-25T01:24:24Z) - Improving the Sample-Complexity of Deep Classification Networks with
Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z) - Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
arXiv Detail & Related papers (2021-02-20T23:26:58Z) - Quasi-Global Momentum: Accelerating Decentralized Deep Learning on
Heterogeneous Data [77.88594632644347]
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
arXiv Detail & Related papers (2021-02-09T11:27:14Z) - Truly shift-invariant convolutional neural networks [0.0]
Recent works have shown that the output of a CNN can change significantly with small shifts in input.
We propose adaptive polyphase sampling (APS), a simple sub-sampling scheme that allows convolutional neural networks to achieve 100% consistency in classification performance under shifts.
arXiv Detail & Related papers (2020-11-28T20:57:35Z) - Regularizing Deep Networks with Semantic Data Augmentation [44.53483945155832]
We propose a novel semantic data augmentation algorithm to complement traditional approaches.
The proposed method is inspired by the intriguing property that deep networks are effective in learning linearized features.
We show that the proposed implicit semantic data augmentation (ISDA) algorithm amounts to minimizing a novel robust CE loss.
arXiv Detail & Related papers (2020-07-21T00:32:44Z) - On Robustness and Transferability of Convolutional Neural Networks [147.71743081671508]
Modern deep convolutional networks (CNNs) are often criticized for not generalizing under distributional shifts.
We study the interplay between out-of-distribution and transfer performance of modern image classification CNNs for the first time.
We find that increasing both the training set and model sizes significantly improves the distributional shift robustness.
arXiv Detail & Related papers (2020-07-16T18:39:04Z) - Learning What Makes a Difference from Counterfactual Examples and
Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.