Channel-Wise Early Stopping without a Validation Set via NNK Polytope Interpolation
- URL: http://arxiv.org/abs/2107.12972v1
- Date: Tue, 27 Jul 2021 17:33:30 GMT
- Title: Channel-Wise Early Stopping without a Validation Set via NNK Polytope Interpolation
- Authors: David Bonet, Antonio Ortega, Javier Ruiz-Hidalgo, Sarath Shekkizhar
- Abstract summary: Convolutional neural networks (ConvNets) comprise high-dimensional feature spaces formed by the aggregation of multiple channels.
We present channel-wise DeepNNK (CW-DeepNNK), a novel generalization estimate based on non-negative kernel regression (NNK) graphs.
- Score: 36.479195100553085
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: State-of-the-art neural network architectures continue to scale in size and
deliver impressive generalization results, although this comes at the expense
of limited interpretability. In particular, a key challenge is to determine
when to stop training the model, as this has a significant impact on
generalization. Convolutional neural networks (ConvNets) comprise
high-dimensional feature spaces formed by the aggregation of multiple channels,
where analyzing intermediate data representations and the model's evolution can
be challenging owing to the curse of dimensionality. We present channel-wise
DeepNNK (CW-DeepNNK), a novel channel-wise generalization estimate based on
non-negative kernel regression (NNK) graphs with which we perform local
polytope interpolation on low-dimensional channels. This method leads to
instance-based interpretability of both the learned data representations and
the relationship between channels. Motivated by our observations, we use
CW-DeepNNK to propose a novel early stopping criterion that (i) does not
require a validation set, (ii) is based on a task performance metric, and (iii)
allows stopping to be reached at different points for each channel. Our
experiments demonstrate that our proposed method has advantages as compared to
the standard criterion based on validation set performance.
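As a rough illustration of the approach, the sketch below estimates a per-channel leave-one-out (LOO) classification error using non-negative interpolation weights over each point's nearest neighbors; training can then be stopped per channel once that error plateaus. This is a minimal sketch, not the authors' released implementation: the Gaussian kernel, the neighborhood size, and the use of scipy's NNLS as a stand-in for the exact NNK solver are all assumptions, and the function names are hypothetical.

```python
import numpy as np
from scipy.optimize import nnls

def nnk_weights(K_nn, k_ni, reg=1e-8):
    """Non-negative interpolation weights for one query point.

    Solves min_{theta >= 0} 0.5 theta^T K_nn theta - k_ni^T theta
    via NNLS on a Cholesky factor (a simplified stand-in for the
    exact NNK solver in the paper).
    """
    L = np.linalg.cholesky(K_nn + reg * np.eye(len(k_ni)))
    b = np.linalg.solve(L, k_ni)   # chosen so ||L^T theta - b||^2 matches the objective
    theta, _ = nnls(L.T, b)
    return theta

def cw_loo_error(feats, labels, k=15, sigma=1.0):
    """Leave-one-out NNK interpolation error, one value per channel.

    feats:  (n, C, d) per-channel activations for n training points
    labels: (n,) integer class labels
    """
    n, C, _ = feats.shape
    onehot = np.eye(labels.max() + 1)[labels]
    errors = np.zeros(C)
    for c in range(C):
        X = feats[:, c, :]
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        K = np.exp(-d2 / (2 * sigma ** 2))          # Gaussian kernel (assumption)
        wrong = 0
        for i in range(n):
            nbrs = np.argsort(-K[i])[1:k + 1]       # k nearest points, excluding i
            theta = nnk_weights(K[np.ix_(nbrs, nbrs)], K[nbrs, i])
            if theta.sum() == 0:
                wrong += 1                          # query has no polytope support
                continue
            pred = (theta / theta.sum()) @ onehot[nbrs]  # local polytope interpolation
            wrong += int(pred.argmax() != labels[i])
        errors[c] = wrong / n
    return errors

# Channel-wise early stopping: stop monitoring a channel once its LOO error
# stops improving between epochs, and stop training when all channels have
# stopped. Only training data is used -- no validation set is required.
```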
Related papers
- Information-Theoretic Generalization Bounds for Deep Neural Networks [22.87479366196215]
Deep neural networks (DNNs) exhibit an exceptional capacity for generalization in practical applications.
This work aims to capture the effect and benefits of depth for supervised learning via information-theoretic generalization bounds.
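For context, a representative bound from this line of work is the Xu-Raginsky mutual-information bound (shown here for orientation; it is not necessarily the exact depth-dependent bound derived in the cited paper):

$$
\left|\,\mathbb{E}\!\left[L_\mu(W) - L_S(W)\right]\right| \;\le\; \sqrt{\frac{2\sigma^2\, I(W; S)}{n}},
$$

where $S$ is the training set of $n$ samples, $W$ the learned weights, $I(W;S)$ their mutual information, and the loss is assumed $\sigma$-sub-Gaussian.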
arXiv Detail & Related papers (2024-04-04T03:20:35Z)
- Revealing Decurve Flows for Generalized Graph Propagation [108.80758541147418]
This study addresses the limitations of the traditional analysis of message-passing, central to graph learning, by defining generalized propagation with directed and weighted graphs.
We include a preliminary exploration of learned propagation patterns in datasets, a first in the field.
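As a hedged sketch of what propagation over a directed, weighted graph looks like in its simplest form (the cited paper's unified scheme is considerably richer), one step can be written as follows:

```python
import numpy as np

def propagate(A, X, steps=2):
    """Row-normalized message passing X <- D^{-1} A X over a directed,
    weighted graph. A is a non-negative, possibly asymmetric weight
    matrix; a simplified illustration, not the paper's learned scheme."""
    deg = A.sum(axis=1, keepdims=True)
    P = A / np.maximum(deg, 1e-12)   # row-stochastic transition matrix
    for _ in range(steps):
        X = P @ X
    return X
```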
arXiv Detail & Related papers (2024-02-13T14:13:17Z)
- Interpolation-based Correlation Reduction Network for Semi-Supervised Graph Learning [49.94816548023729]
We propose a novel graph contrastive learning method, termed Interpolation-based Correlation Reduction Network (ICRN).
In our method, we improve the discriminative capability of the latent features by enlarging the margin of decision boundaries.
By combining the two settings, we extract rich supervision information from both the abundant unlabeled nodes and the rare yet valuable labeled nodes for discriminative representation learning.
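A minimal sketch of the correlation-reduction idea, in the spirit of decorrelating two augmented views of the same nodes (the loss below is a generic cross-correlation objective, an assumption rather than ICRN's published formulation):

```python
import torch

def correlation_reduction_loss(z1, z2, off_weight=5e-3, eps=1e-8):
    """Push the cross-view correlation matrix toward the identity:
    matched feature dimensions should agree, mismatched ones should
    decorrelate. z1, z2: (n, d) embeddings of two views."""
    z1 = (z1 - z1.mean(0)) / (z1.std(0) + eps)
    z2 = (z2 - z2.mean(0)) / (z2.std(0) + eps)
    c = (z1.T @ z2) / z1.shape[0]                    # (d, d) cross-correlation
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()
    return on_diag + off_weight * off_diag
```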
arXiv Detail & Related papers (2022-06-06T14:26:34Z)
- On Feature Learning in Neural Networks with Global Convergence Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
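Concretely, gradient flow is the continuous-time limit of gradient descent, and a linear rate means the training loss decays exponentially in time; in hedged notation, with $\lambda > 0$ a problem-dependent constant:

$$
\dot{\theta}(t) = -\nabla L(\theta(t)), \qquad L(\theta(t)) \le L(\theta(0))\, e^{-\lambda t}.
$$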
arXiv Detail & Related papers (2022-04-22T15:56:43Z)
- Low Complexity Channel estimation with Neural Network Solutions [1.0499453838486013]
We deploy a general residual convolutional neural network to achieve channel estimation in a downlink scenario.
Compared with other deep learning methods for channel estimation, our results suggest an improvement in mean squared error.
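A minimal sketch of a residual ConvNet channel estimator of this kind (the layer sizes and the treatment of the complex channel grid as two real input planes are assumptions, not the cited architecture):

```python
import torch
import torch.nn as nn

class ResidualChannelEstimator(nn.Module):
    """Maps a coarse (e.g., pilot-interpolated) channel estimate to a
    refined one; the global residual connection means the network only
    has to learn the correction. A simplified illustration."""
    def __init__(self, channels=2, width=64, depth=4):
        super().__init__()
        layers = [nn.Conv2d(channels, width, 3, padding=1), nn.ReLU()]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(width, width, 3, padding=1), nn.ReLU()]
        layers += [nn.Conv2d(width, channels, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, h_coarse):                 # (B, 2, subcarriers, symbols)
        return h_coarse + self.body(h_coarse)    # learn only the residual
```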
arXiv Detail & Related papers (2022-01-24T19:55:10Z)
- Exploring Gradient Flow Based Saliency for DNN Model Compression [21.993801817422572]
Model pruning aims to reduce the deep neural network (DNN) model size or computational overhead.
Traditional model pruning methods that evaluate channel significance for DNNs pay too much attention to the local analysis of each channel.
In this paper, we propose a new model pruning method from the perspective of gradient flow.
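A hedged sketch of gradient-based channel saliency (a common first-order Taylor-style score, not necessarily the exact criterion of the cited paper): rank channels by the magnitude of loss gradient times activation, then prune the lowest-scoring ones.

```python
import torch

def channel_saliency(activations, loss):
    """First-order saliency per channel: |dL/da * a| summed over batch
    and spatial dims. activations: (B, C, H, W) tensor that is part of
    the autograd graph. A generic score, assumed for illustration."""
    grads, = torch.autograd.grad(loss, activations, retain_graph=True)
    return (grads * activations).abs().sum(dim=(0, 2, 3))  # (C,)

# Pruning step: keep the highest-scoring channels, e.g.
#   keep = torch.topk(scores, k=int(0.7 * scores.numel())).indices
```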
arXiv Detail & Related papers (2021-10-24T16:09:40Z)
- Channel redundancy and overlap in convolutional neural networks with channel-wise NNK graphs [36.479195100553085]
Feature spaces in the deep layers of convolutional neural networks (CNNs) are often very high-dimensional and difficult to interpret.
We theoretically analyze channel-wise non-negative kernel (CW-NNK) regression graphs to quantify the overlap between channels.
We find that redundancy between channels is significant and varies with the layer depth and the level of regularization.
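A rough way to quantify overlap between two channels is sketched below, using k-nearest-neighbor sets as a simplified stand-in for the NNK neighbor sets analyzed in the paper:

```python
import numpy as np

def channel_overlap(Xa, Xb, k=15):
    """Average fraction of shared neighbors between two channels'
    feature spaces. Xa, Xb: (n, d) per-channel features. kNN is a
    simplified proxy for NNK polytope neighbors."""
    def knn_sets(X):
        d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        return [set(np.argsort(row)[1:k + 1]) for row in d2]
    na, nb = knn_sets(Xa), knn_sets(Xb)
    return np.mean([len(a & b) / k for a, b in zip(na, nb)])
```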
arXiv Detail & Related papers (2021-10-18T22:50:07Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
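For intuition, in the simplest (two-layer) mean-field picture a network is represented by a probability measure $\rho$ over its features rather than by individual neurons; the deep framework in the cited paper generalizes this layer by layer:

$$
f(x) = \int a\, \sigma(w^\top x)\, \mathrm{d}\rho(a, w),
$$

the continuous limit of $f(x) = \frac{1}{m}\sum_{j=1}^{m} a_j\, \sigma(w_j^\top x)$ as the width $m \to \infty$.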
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- Bayesian Graph Neural Networks with Adaptive Connection Sampling [62.51689735630133]
We propose a unified framework for adaptive connection sampling in graph neural networks (GNNs).
The proposed framework not only alleviates over-smoothing and over-fitting tendencies of deep GNNs, but also enables learning with uncertainty in graph analytic tasks with GNNs.
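A minimal sketch of connection sampling in a GNN layer (a fixed Bernoulli edge-drop is shown here; the cited framework makes the drop rates adaptive and learned):

```python
import torch

def sample_edges(edge_index, drop_prob=0.3, training=True):
    """Randomly drop graph edges during training (DropEdge-style).
    edge_index: (2, E) tensor of source/target node ids."""
    if not training:
        return edge_index
    keep = torch.rand(edge_index.shape[1]) > drop_prob
    return edge_index[:, keep]
```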
arXiv Detail & Related papers (2020-06-07T07:06:35Z)
- Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection.
The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
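A hedged sketch of a depthwise non-local operation, where attention is computed within each channel over spatial positions so channels never mix (a simplified illustration of the general idea, not the exact module in the paper):

```python
import torch

def depthwise_nonlocal(x):
    """x: (B, C, H, W). Each channel attends over its own spatial
    positions (the 'depthwise' part), keeping parameters low. The
    dense n x n attention here is a simplification for clarity."""
    B, C, H, W = x.shape
    n = H * W
    seq = x.reshape(B * C, n, 1)                    # one channel = one sequence
    attn = torch.softmax(seq @ seq.transpose(1, 2) / n ** 0.5, dim=-1)
    out = attn @ seq                                # per-channel aggregation
    return x + out.reshape(B, C, H, W)              # residual connection
```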
arXiv Detail & Related papers (2020-01-22T15:23:48Z)