Avoiding Overfitting: A Survey on Regularization Methods for
Convolutional Neural Networks
- URL: http://arxiv.org/abs/2201.03299v1
- Date: Mon, 10 Jan 2022 11:54:06 GMT
- Authors: Claudio Filipi Gonçalves dos Santos, João Paulo Papa
- Abstract summary: Several image processing tasks have been significantly improved using Convolutional Neural Networks (CNN).
A critical factor in training concerns the network's regularization, which prevents the structure from overfitting.
This work analyzes several regularization methods developed in the last few years, showing significant improvements for different CNN models.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Several image processing tasks, such as image classification and object
detection, have been significantly improved using Convolutional Neural Networks
(CNN). Many architectures, such as ResNet and EfficientNet, have achieved
outstanding results on at least one dataset at the time of their creation. A
critical factor in training concerns the network's regularization, which
prevents the structure from overfitting. This work analyzes several
regularization methods developed in the last few years, showing significant
improvements for different CNN models. The works are classified into three main
areas: the first one is called "data augmentation", where all the techniques
focus on changing the input data. The second, named "internal changes",
describes procedures that modify the feature maps generated by the neural
network or its kernels. The last one, called "label",
concerns transforming the labels of a given input. This work presents two main
differences compared to other available surveys on regularization: (i) the
first concerns the papers gathered in the manuscript, which are no older than
five years, and (ii) the second concerns reproducibility, i.e., all
works referred to here have their code available in public repositories or they
have been directly implemented in some framework, such as TensorFlow or Torch.
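The three areas the survey identifies can be illustrated with minimal, framework-free sketches. The function names and parameter values below are illustrative choices, not taken from the survey or any specific paper it covers:

```python
import random

def horizontal_flip(pixels):
    # "Data augmentation": transform the input itself
    # (here, mirror a row of pixel values).
    return pixels[::-1]

def dropout(features, p=0.5, seed=0):
    # "Internal changes": randomly zero activations during training,
    # scaling survivors by 1/(1-p) (inverted dropout).
    rng = random.Random(seed)
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in features]

def label_smoothing(one_hot, eps=0.1):
    # "Label": soften a one-hot target toward the uniform distribution,
    # discouraging overconfident predictions.
    k = len(one_hot)
    return [v * (1.0 - eps) + eps / k for v in one_hot]
```

Each function touches a different part of the training pipeline (input, feature maps, targets), which is exactly the axis along which the survey groups the methods.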
Related papers
- Dual-Path Adversarial Lifting for Domain Shift Correction in Online Test-time Adaptation [59.18151483767509]
We introduce a dual-path token lifting for domain shift correction in test time adaptation.
We then perform dual-path lifting with interleaved token prediction and update between the path of domain shift tokens and the path of class tokens.
Experimental results on the benchmark datasets demonstrate that our proposed method significantly improves the online fully test-time domain adaptation performance.
arXiv Detail & Related papers (2024-08-26T02:33:47Z)
- Cross-codex Learning for Reliable Scribe Identification in Medieval Manuscripts [0.0]
We demonstrate the importance of cross-codex training data for CNN based text-independent off-line scribe identification.
We trained different neural networks on our complex data, validating time and accuracy differences in order to define the most reliable network architecture.
We present the results on our large-scale open-source dataset -- the Codex Claustroneoburgensis database (CCl-DB).
arXiv Detail & Related papers (2023-12-07T13:40:20Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- Towards Better Out-of-Distribution Generalization of Neural Algorithmic Reasoning Tasks [51.8723187709964]
We study the OOD generalization of neural algorithmic reasoning tasks.
The goal is to learn an algorithm from input-output pairs using deep neural networks.
arXiv Detail & Related papers (2022-11-01T18:33:20Z)
- Deep vanishing point detection: Geometric priors make dataset variations vanish [24.348651041697114]
Deep learning has improved vanishing point detection in images.
Yet, deep networks require expensive annotated datasets trained on costly hardware.
Here, we address these issues by injecting deep vanishing point detection networks with prior knowledge.
arXiv Detail & Related papers (2022-03-16T12:34:27Z)
- Deep ensembles in bioimage segmentation [74.01883650587321]
In this work, we propose an ensemble of convolutional neural networks (CNNs).
In ensemble methods, many different models are trained and then used for classification; the ensemble aggregates the outputs of the individual classifiers.
The proposed ensemble is implemented by combining different backbone networks using the DeepLabV3+ and HarDNet environment.
arXiv Detail & Related papers (2021-12-24T05:54:21Z)
- Neural Pruning via Growing Regularization [82.9322109208353]
We extend regularization to tackle two central problems of pruning: pruning schedule and weight importance scoring.
Specifically, we propose an L2 regularization variant with rising penalty factors and show it can bring significant accuracy gains.
The proposed algorithms are easy to implement and scalable to large datasets and networks in both structured and unstructured pruning.
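The rising-penalty idea can be sketched as follows; the linear schedule and the `base`/`growth` constants are illustrative assumptions, not the paper's actual settings:

```python
def growing_l2_penalty(weights, step, base=1e-4, growth=1e-6):
    # L2 penalty whose coefficient rises with the training step,
    # gradually pushing less important weights toward zero
    # before they are pruned.
    coeff = base + growth * step
    return coeff * sum(w * w for w in weights)
```

Because the penalty grows over time, weights that the task does not need are driven to ever smaller magnitudes, which makes the subsequent pruning decision easier.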
arXiv Detail & Related papers (2020-12-16T20:16:28Z)
- Deep Convolutional Neural Networks: A survey of the foundations, selected improvements, and some current applications [0.0]
This paper seeks to present and discuss one such method, namely Convolutional Neural Networks (CNNs).
CNNs are deep neural networks that use a special linear operation called convolution.
This paper discusses two applications of convolution that have proven to be very effective in practice.
arXiv Detail & Related papers (2020-11-25T19:03:23Z)
- Attentive WaveBlock: Complementarity-enhanced Mutual Networks for Unsupervised Domain Adaptation in Person Re-identification and Beyond [97.25179345878443]
This paper proposes a novel lightweight module, the Attentive WaveBlock (AWB).
AWB can be integrated into the dual networks of mutual learning to enhance the complementarity and further depress noise in the pseudo-labels.
Experiments demonstrate that the proposed method achieves state-of-the-art performance with significant improvements on multiple UDA person re-identification tasks.
arXiv Detail & Related papers (2020-06-11T15:40:40Z)
- On the Texture Bias for Few-Shot CNN Segmentation [21.349705243254423]
Convolutional Neural Networks (CNNs) are driven by shapes to perform visual recognition tasks.
Recent evidence suggests texture bias in CNNs provides higher performing models when learning on large labeled training datasets.
We propose a novel architecture that integrates a set of Difference of Gaussians (DoG) to attenuate high-frequency local components in the feature space.
arXiv Detail & Related papers (2020-03-09T11:55:47Z)
- Fully Convolutional Neural Networks for Raw Eye Tracking Data Segmentation, Generation, and Reconstruction [15.279153483132179]
We use fully convolutional neural networks for semantic segmentation of eye tracking data.
We also use these networks for reconstruction, and in conjunction with a variational auto-encoder to generate eye movement data.
arXiv Detail & Related papers (2020-02-17T06:57:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.