Avoiding Overfitting: A Survey on Regularization Methods for
Convolutional Neural Networks
- URL: http://arxiv.org/abs/2201.03299v1
- Date: Mon, 10 Jan 2022 11:54:06 GMT
- Authors: Claudio Filipi Gonçalves dos Santos, João Paulo Papa
- Abstract summary: Several image processing tasks have been significantly improved using Convolutional Neural Networks (CNN).
A critical factor in training concerns the network's regularization, which prevents the structure from overfitting.
This work analyzes several regularization methods developed in the last few years, showing significant improvements for different CNN models.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Several image processing tasks, such as image classification and object
detection, have been significantly improved using Convolutional Neural Networks
(CNN). Many architectures, such as ResNet and EfficientNet, have achieved
outstanding results on at least one dataset at the time of their creation. A
critical factor in training concerns the network's regularization, which
prevents the structure from overfitting. This work analyzes several
regularization methods developed in the last few years, showing significant
improvements for different CNN models. The works are classified into three main
areas: the first one is called "data augmentation", where all the techniques
focus on changing the input data. The second, named "internal changes",
describes procedures that modify the feature maps generated by the neural
network or its kernels. The last one, called "label",
concerns transforming the labels of a given input. This work presents two main
differences compared to other available surveys on regularization: (i) the
first concerns the papers gathered in the manuscript, which are no older than
five years, and (ii) the second concerns reproducibility, i.e., all
works referred to here have their code available in public repositories or they
have been directly implemented in some framework, such as TensorFlow or Torch.
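The three areas the survey identifies can be illustrated with minimal, framework-free sketches. The function names and parameter values below are illustrative choices, not taken from the survey or any specific paper it covers:

```python
import random

def horizontal_flip(pixels):
    # "Data augmentation": transform the input itself
    # (here, mirror a row of pixel values).
    return pixels[::-1]

def dropout(features, p=0.5, seed=0):
    # "Internal changes": randomly zero activations during training,
    # scaling survivors by 1/(1-p) (inverted dropout).
    rng = random.Random(seed)
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in features]

def label_smoothing(one_hot, eps=0.1):
    # "Label": soften a one-hot target toward the uniform distribution,
    # discouraging overconfident predictions.
    k = len(one_hot)
    return [v * (1.0 - eps) + eps / k for v in one_hot]
```

Each function touches a different part of the training pipeline (input, feature maps, targets), which is exactly the axis along which the survey groups the methods.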
Related papers
- Dual-Path Adversarial Lifting for Domain Shift Correction in Online Test-time Adaptation [59.18151483767509]
We introduce a dual-path token lifting for domain shift correction in test time adaptation.
We then perform dual-path lifting with interleaved token prediction and update between the path of domain shift tokens and the path of class tokens.
Experimental results on the benchmark datasets demonstrate that our proposed method significantly improves the online fully test-time domain adaptation performance.
arXiv Detail & Related papers (2024-08-26T02:33:47Z)
- Cross-codex Learning for Reliable Scribe Identification in Medieval Manuscripts [0.0]
We demonstrate the importance of cross-codex training data for CNN based text-independent off-line scribe identification.
We trained different neural networks on our complex data, validating time and accuracy differences in order to define the most reliable network architecture.
We present the results on our large-scale open-source dataset -- the Codex Claustroneoburgensis database (CCl-DB).
arXiv Detail & Related papers (2023-12-07T13:40:20Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- Towards Better Out-of-Distribution Generalization of Neural Algorithmic Reasoning Tasks [51.8723187709964]
We study the OOD generalization of neural algorithmic reasoning tasks.
The goal is to learn an algorithm from input-output pairs using deep neural networks.
arXiv Detail & Related papers (2022-11-01T18:33:20Z)
- Deep vanishing point detection: Geometric priors make dataset variations vanish [24.348651041697114]
Deep learning has improved vanishing point detection in images.
Yet, deep networks require expensive annotated datasets trained on costly hardware.
Here, we address these issues by injecting deep vanishing point detection networks with prior knowledge.
arXiv Detail & Related papers (2022-03-16T12:34:27Z)
- Deep ensembles in bioimage segmentation [74.01883650587321]
In this work, we propose an ensemble of convolutional neural networks (CNNs).
In ensemble methods, many different models are trained and then used for classification; the ensemble aggregates the outputs of the individual classifiers.
The proposed ensemble is implemented by combining different backbone networks using the DeepLabV3+ and HarDNet environment.
arXiv Detail & Related papers (2021-12-24T05:54:21Z)
- Neural Pruning via Growing Regularization [82.9322109208353]
We extend regularization to tackle two central problems of pruning: pruning schedule and weight importance scoring.
Specifically, we propose an L2 regularization variant with rising penalty factors and show it can bring significant accuracy gains.
The proposed algorithms are easy to implement and scalable to large datasets and networks in both structured and unstructured pruning.
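The rising-penalty idea can be sketched as follows; the linear schedule and the `base`/`growth` constants are illustrative assumptions, not the paper's actual settings:

```python
def growing_l2_penalty(weights, step, base=1e-4, growth=1e-6):
    # L2 penalty whose coefficient rises with the training step,
    # gradually pushing less important weights toward zero
    # before they are pruned.
    coeff = base + growth * step
    return coeff * sum(w * w for w in weights)
```

Because the penalty grows over time, weights that the task does not need are driven to ever smaller magnitudes, which makes the subsequent pruning decision easier.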
arXiv Detail & Related papers (2020-12-16T20:16:28Z)
- Deep Convolutional Neural Networks: A survey of the foundations, selected improvements, and some current applications [0.0]
This paper seeks to present and discuss one such method, namely Convolutional Neural Networks (CNNs).
CNNs are deep neural networks that use a special linear operation called convolution.
This paper discusses two applications of convolution that have proven to be very effective in practice.
arXiv Detail & Related papers (2020-11-25T19:03:23Z)
- Attentive WaveBlock: Complementarity-enhanced Mutual Networks for Unsupervised Domain Adaptation in Person Re-identification and Beyond [97.25179345878443]
This paper proposes a novel lightweight module, the Attentive WaveBlock (AWB).
AWB can be integrated into the dual networks of mutual learning to enhance the complementarity and further depress noise in the pseudo-labels.
Experiments demonstrate that the proposed method achieves state-of-the-art performance with significant improvements on multiple UDA person re-identification tasks.
arXiv Detail & Related papers (2020-06-11T15:40:40Z)
- On the Texture Bias for Few-Shot CNN Segmentation [21.349705243254423]
Convolutional Neural Networks (CNNs) are driven by shapes to perform visual recognition tasks.
Recent evidence suggests texture bias in CNNs provides higher performing models when learning on large labeled training datasets.
We propose a novel architecture that integrates a set of Difference of Gaussians (DoG) to attenuate high-frequency local components in the feature space.
arXiv Detail & Related papers (2020-03-09T11:55:47Z)
- Fully Convolutional Neural Networks for Raw Eye Tracking Data Segmentation, Generation, and Reconstruction [15.279153483132179]
We use fully convolutional neural networks for semantic segmentation of eye tracking data.
We also use these networks for reconstruction, and in conjunction with a variational auto-encoder to generate eye movement data.
arXiv Detail & Related papers (2020-02-17T06:57:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.