DropCluster: A structured dropout for convolutional networks
- URL: http://arxiv.org/abs/2002.02997v1
- Date: Fri, 7 Feb 2020 20:02:47 GMT
- Title: DropCluster: A structured dropout for convolutional networks
- Authors: Liyan Chen, Philip Gautier, Sergul Aydore
- Abstract summary: Dropout as a regularizer in deep neural networks has been less effective in convolutional layers than in fully connected layers.
We introduce a novel structured regularization for convolutional layers, which we call DropCluster.
Our approach achieves better performance than DropBlock or other existing structured dropout variants.
- Score: 0.7489179288638513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dropout as a regularizer in deep neural networks has been less effective in
convolutional layers than in fully connected layers. This is due to the fact
that dropout drops features randomly. When features are spatially correlated as
in the case of convolutional layers, information about the dropped pixels can
still propagate to the next layers via neighboring pixels. In order to address
this problem, more structured forms of dropout have been proposed. A drawback
of these methods is that they do not adapt to the data. In this work, we
introduce a novel structured regularization for convolutional layers, which we
call DropCluster. Our regularizer relies on data-driven structure. It finds
clusters of correlated features in convolutional layer outputs and drops the
clusters randomly at each iteration. The clusters are learned and updated
during model training so that they adapt both to the data and to the model
weights. Our experiments on the ResNet-50 architecture demonstrate that our
approach achieves better performance than DropBlock or other existing
structured dropout variants. We also demonstrate the robustness of our approach
when the size of training data is limited and when there is corruption in the
data at test time.
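To make the dropping step concrete, here is a minimal PyTorch-style sketch, not the authors' implementation. It assumes a precomputed cluster assignment over feature-map channels; in the paper the clusters of correlated features are learned from the convolutional outputs and updated during training, which is not reproduced here, and the function name and signature are illustrative.

```python
import torch


def drop_clusters(features, clusters, drop_prob=0.1, training=True):
    """Randomly drop whole clusters of correlated features (sketch only).

    features: (N, C, H, W) output of a convolutional layer.
    clusters: LongTensor of length C assigning each channel to a cluster id,
              assumed to be re-estimated periodically during training.
    """
    if not training or drop_prob == 0.0:
        return features

    clusters = clusters.to(features.device)
    num_clusters = int(clusters.max().item()) + 1
    # Decide per cluster, not per feature, whether it is dropped this iteration.
    keep = (torch.rand(num_clusters, device=features.device) > drop_prob).float()
    # Broadcast the cluster-level decision to a per-channel mask.
    mask = keep[clusters].view(1, -1, 1, 1)
    # Inverted-dropout rescaling keeps the expected activation magnitude unchanged.
    return features * mask / (1.0 - drop_prob)
```

Unlike DropBlock's fixed square blocks, the dropped regions in such a scheme follow whatever correlation structure the clustering step has found.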
Related papers
- R-Block: Regularized Block of Dropout for convolutional networks [0.0]
Dropout as a regularization technique is widely used in fully connected layers but is less effective in convolutional layers.
In this paper, we apply a mutual learning training strategy for convolutional layer regularization, namely R-Block.
We show that R-Block achieves better performance than other existing structured dropout variants.
arXiv Detail & Related papers (2023-07-27T18:53:14Z)
- Hard Regularization to Prevent Deep Online Clustering Collapse without Data Augmentation [65.268245109828]
Online deep clustering refers to the joint use of a feature extraction network and a clustering model to assign cluster labels to each new data point or batch as it is processed.
While faster and more versatile than offline methods, online clustering can easily reach a collapsed solution in which the encoder maps all inputs to the same point and every sample is assigned to a single cluster.
We propose a method that does not require data augmentation and that, unlike existing methods, regularizes the hard assignments.
arXiv Detail & Related papers (2023-03-29T08:23:26Z)
- Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth 2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with a quadratic loss function, a fully connected feedforward architecture, ReLU activations, Gaussian data instances, and adversarial labels.
Our results strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the NTK regime.
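One way to make this setting concrete is the following illustrative formulation (our notation, not necessarily the paper's): a one-hidden-layer ReLU network trained on Gaussian inputs with a squared loss.

```latex
% Illustrative formulation; notation is ours, not necessarily the paper's.
f(x; W, a) = \sum_{j=1}^{m} a_j \,\mathrm{ReLU}\!\left(w_j^{\top} x\right),
\qquad x_i \sim \mathcal{N}(0, I_d),
\qquad
L(W, a) = \frac{1}{2n} \sum_{i=1}^{n} \bigl( f(x_i; W, a) - y_i \bigr)^2 .
```

The labels y_i may be chosen adversarially, and gradient descent on L is shown to reach a global minimum in this regime.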
arXiv Detail & Related papers (2022-12-05T14:47:52Z)
- Revisiting Structured Dropout [11.011268090482577]
ProbDropBlock drops contiguous blocks from feature maps with a probability given by the normalized feature salience values.
We find that, with a simple scheduling strategy, the proposed approach to structured dropout consistently improves model performance compared to baselines.
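A hedged sketch of salience-weighted block dropping follows; the salience definition (normalized mean absolute activation per location), block size, and rescaling are our assumptions, and the paper's exact formulation and scheduling are not reproduced.

```python
import torch
import torch.nn.functional as F


def prob_dropblock(features, block_size=3, drop_rate=0.1, training=True):
    """Drop contiguous blocks with probability weighted by feature salience.

    features: (N, C, H, W) feature maps; block_size is assumed odd.
    """
    if not training or drop_rate == 0.0:
        return features

    _, _, h, w = features.shape
    # Per-location salience, normalized to sum to 1 over each feature map.
    salience = features.abs().mean(dim=1, keepdim=True)                  # (N, 1, H, W)
    salience = salience / salience.sum(dim=(2, 3), keepdim=True).clamp_min(1e-12)

    # Sample block centers: more salient locations are dropped more often.
    center_prob = (salience * drop_rate * h * w).clamp(max=1.0)
    centers = torch.bernoulli(center_prob)                               # (N, 1, H, W)

    # Expand each sampled center into a block_size x block_size dropped block.
    dropped = F.max_pool2d(centers, kernel_size=block_size, stride=1,
                           padding=block_size // 2)
    keep_mask = 1.0 - dropped

    # Rescale so the expected activation magnitude stays roughly unchanged.
    scale = keep_mask.numel() / keep_mask.sum().clamp_min(1.0)
    return features * keep_mask * scale
```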
arXiv Detail & Related papers (2022-10-05T21:26:57Z)
- Linear Connectivity Reveals Generalization Strategies [54.947772002394736]
Some pairs of finetuned models have large barriers of increasing loss on the linear paths between them.
We find distinct clusters of models which are linearly connected on the test loss surface, but are disconnected from models outside the cluster.
Our work demonstrates how the geometry of the loss surface can guide models towards different functions.
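For reference, a common way to quantify such a barrier along the linear path between two fine-tuned parameter vectors is given below (our notation; the paper's exact definition may differ).

```latex
% Illustrative definition; notation is ours, not necessarily the paper's.
\theta(\alpha) = (1-\alpha)\,\theta_1 + \alpha\,\theta_2, \qquad \alpha \in [0,1],
\qquad
\mathrm{barrier}(\theta_1, \theta_2)
  = \max_{\alpha \in [0,1]}
    \Bigl[ L\bigl(\theta(\alpha)\bigr)
           - \bigl( (1-\alpha)\,L(\theta_1) + \alpha\,L(\theta_2) \bigr) \Bigr].
```

Models in the same cluster would then have near-zero barriers between them, while paths to models outside the cluster cross regions of high loss.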
arXiv Detail & Related papers (2022-05-24T23:43:02Z)
- Mitigating Generation Shifts for Generalized Zero-Shot Learning [52.98182124310114]
Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize the seen and unseen samples, where unseen classes are not observable during training.
We propose a novel Generation Shifts Mitigating Flow (GSMFlow) framework for synthesizing unseen data efficiently and effectively.
Experimental results demonstrate that GSMFlow achieves state-of-the-art recognition performance in both conventional and generalized zero-shot settings.
arXiv Detail & Related papers (2021-07-07T11:43:59Z)
- Robustness to Missing Features using Hierarchical Clustering with Split Neural Networks [39.29536042476913]
We propose a simple yet effective approach that clusters similar input features together using hierarchical clustering.
We evaluate this approach on a series of benchmark datasets and show promising improvements even with simple imputation techniques.
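A minimal sketch of the feature-clustering step is shown below; the distance choice (1 - |correlation|) and the linkage method are our assumptions, and the split sub-networks and imputation are not shown.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def cluster_input_features(X, num_clusters=4):
    """Group correlated input features via hierarchical clustering.

    X: (n_samples, n_features) data matrix. Returns an array of length
    n_features mapping each feature to a cluster id; each cluster could
    then feed its own sub-network.
    """
    # Distance between features: 1 - |correlation|, so strongly
    # (anti-)correlated features end up in the same cluster.
    corr = np.corrcoef(X, rowvar=False)
    dist = 1.0 - np.abs(corr)
    # Condensed upper-triangular distance vector expected by linkage().
    iu = np.triu_indices_from(dist, k=1)
    Z = linkage(dist[iu], method="average")
    return fcluster(Z, t=num_clusters, criterion="maxclust")
```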
arXiv Detail & Related papers (2020-11-19T00:35:08Z)
- Advanced Dropout: A Model-free Methodology for Bayesian Dropout Optimization [62.8384110757689]
Overfitting ubiquitously exists in real-world applications of deep neural networks (DNNs).
The advanced dropout technique applies a model-free and easily implemented distribution with a parametric prior, and adaptively adjusts the dropout rate.
We evaluate the effectiveness of the advanced dropout against nine dropout techniques on seven computer vision datasets.
arXiv Detail & Related papers (2020-10-11T13:19:58Z)
- Online Deep Clustering for Unsupervised Representation Learning [108.33534231219464]
Online Deep Clustering (ODC) performs clustering and network updates simultaneously rather than alternately.
We design and maintain two dynamic memory modules: a samples memory to store sample labels and features, and a centroids memory for centroid evolution.
In this way, the labels and the network evolve shoulder-to-shoulder rather than alternately.
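A toy sketch of the two memory modules is given below, under our own simplifications (EMA feature updates, nearest-centroid relabeling); it is not the authors' implementation.

```python
import torch
import torch.nn.functional as F


class OnlineClusteringMemory:
    """Simplified samples memory plus centroids memory (sketch only)."""

    def __init__(self, num_samples, feat_dim, num_clusters, momentum=0.5):
        self.features = torch.zeros(num_samples, feat_dim)    # samples memory
        self.labels = torch.zeros(num_samples, dtype=torch.long)
        self.centroids = torch.randn(num_clusters, feat_dim)  # centroids memory
        self.momentum = momentum

    def update(self, indices, batch_features):
        # Samples memory: exponential moving average of per-sample features.
        old = self.features[indices]
        new = self.momentum * old + (1.0 - self.momentum) * batch_features
        self.features[indices] = F.normalize(new, dim=1)
        # Relabel the batch by its nearest centroid (new pseudo-labels).
        dists = torch.cdist(self.features[indices], self.centroids)
        batch_labels = dists.argmin(dim=1)
        self.labels[indices] = batch_labels
        # Centroids memory: nudge each assigned centroid toward its batch members.
        for k in batch_labels.unique():
            members = self.features[indices][batch_labels == k]
            self.centroids[k] = 0.9 * self.centroids[k] + 0.1 * members.mean(dim=0)
        return batch_labels
```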
arXiv Detail & Related papers (2020-06-18T16:15:46Z)
- Reusing Trained Layers of Convolutional Neural Networks to Shorten Hyperparameters Tuning Time [1.160208922584163]
This paper describes a proposal to reuse the weights of hidden (convolutional) layers across different trainings in order to shorten hyperparameter tuning.
The experiments compare the training time and the validation loss when reusing and not reusing convolutional layers.
They confirm that this strategy reduces training time while even increasing the accuracy of the resulting neural network.
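A hedged sketch of the reuse step follows; the name-based matching rule and the optional freezing are our assumptions, not necessarily the paper's recipe.

```python
import torch


def reuse_conv_layers(prev_state_dict, new_model, freeze=True):
    """Copy convolutional weights from a previous run into a new model."""
    own_state = new_model.state_dict()
    # Reuse only convolution tensors that exist in both models with equal shapes.
    reused = {
        name: tensor
        for name, tensor in prev_state_dict.items()
        if "conv" in name and name in own_state and own_state[name].shape == tensor.shape
    }
    own_state.update(reused)
    new_model.load_state_dict(own_state)
    if freeze:
        # Optionally freeze the reused layers so only the remaining ones are trained.
        for name, param in new_model.named_parameters():
            if name in reused:
                param.requires_grad = False
    return new_model


# Illustrative usage:
#   prev_state_dict = torch.load("earlier_run_checkpoint.pt", map_location="cpu")
#   model = reuse_conv_layers(prev_state_dict, model)
```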
arXiv Detail & Related papers (2020-06-16T11:39:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.