Inducing Early Neural Collapse in Deep Neural Networks for Improved
Out-of-Distribution Detection
- URL: http://arxiv.org/abs/2209.08378v1
- Date: Sat, 17 Sep 2022 17:46:06 GMT
- Title: Inducing Early Neural Collapse in Deep Neural Networks for Improved
Out-of-Distribution Detection
- Authors: Jarrod Haas, William Yolland, Bernhard Rabus
- Abstract summary: We propose a simple modification to standard ResNet architectures--L2 regularization over feature space--that substantially improves out-of-distribution (OoD) performance.
This change also induces early Neural Collapse (NC), which we show is an effect under which better OoD performance is more probable.
- Score: 0.9558392439655015
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We propose a simple modification to standard ResNet architectures--L2
regularization over feature space--that substantially improves
out-of-distribution (OoD) performance on the previously proposed Deep
Deterministic Uncertainty (DDU) benchmark. This change also induces early
Neural Collapse (NC), which we show is an effect under which better OoD
performance is more probable. Our method achieves comparable or superior OoD
detection scores and classification accuracy in a small fraction of the
training time of the benchmark. Additionally, it substantially improves worst
case OoD performance over multiple, randomly initialized models. Though we do
not suggest that NC is the sole mechanism or comprehensive explanation for OoD
behaviour in deep neural networks (DNNs), we believe NC's simple mathematical
and geometric structure can provide a framework for analysis of this complex
phenomenon in future work.
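As a rough illustration of the modification the abstract describes, feature-space L2 regularization can be realized as a penalty on the penultimate-layer activations added to the task loss. This is a minimal numpy sketch only; the default `weight` value and the exact placement within the ResNet are assumptions, not details from the paper.

```python
import numpy as np

def feature_l2_penalty(features, weight=0.01):
    """Mean squared L2 norm of penultimate-layer features.

    Added to the classification loss, this penalizes feature
    magnitude ("L2 regularization over feature space"). The
    'weight' hyperparameter is hypothetical, not from the paper.
    """
    # features: (batch, dim) activations before the final linear layer
    return weight * float(np.mean(np.sum(features ** 2, axis=1)))

# toy usage: total loss = task loss + feature penalty
feats = np.array([[3.0, 4.0], [0.0, 0.0]])
penalty = feature_l2_penalty(feats, weight=1.0)  # mean of 25.0 and 0.0 -> 12.5
```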
Related papers
- Feature Attenuation of Defective Representation Can Resolve Incomplete Masking on Anomaly Detection [1.0358639819750703]
In unsupervised anomaly detection (UAD) research, it is necessary to develop a computationally efficient and scalable solution.
We revisit the reconstruction-by-inpainting approach and improve it by analyzing its strengths and weaknesses.
We propose Feature Attenuation of Defective Representation (FADeR), which employs only two layers to attenuate the feature information of anomaly reconstructions.
arXiv Detail & Related papers (2024-07-05T15:44:53Z) - A New PHO-rmula for Improved Performance of Semi-Structured Networks [0.0]
We show that techniques to properly identify the contributions of the different model components in SSNs lead to suboptimal network estimation.
We propose a non-invasive post-hocization (PHO) that guarantees identifiability of model components and provides better estimation and prediction quality.
Our theoretical findings are supported by numerical experiments, a benchmark comparison as well as a real-world application to COVID-19 infections.
arXiv Detail & Related papers (2023-06-01T10:23:28Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Deep Neural Collapse Is Provably Optimal for the Deep Unconstrained
Features Model [21.79259092920587]
We show that in a deep unconstrained features model, the unique global optimum for binary classification exhibits all the properties typical of deep neural collapse (DNC).
We also empirically show that (i) by optimizing deep unconstrained features models via gradient descent, the resulting solution agrees well with our theory, and (ii) trained networks recover the unconstrained features suitable for DNC.
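The within-class variability collapse at the heart of neural collapse can be checked with a simple diagnostic: the ratio of within-class to between-class feature scatter, which collapse drives toward zero. This is a minimal numpy sketch of such a diagnostic, not the exact metric used in either paper.

```python
import numpy as np

def within_class_collapse(features, labels):
    """Ratio of within-class to between-class feature scatter.

    Neural collapse (the NC1 property) predicts this ratio tends to
    zero as features of each class concentrate at their class mean.
    """
    global_mean = features.mean(axis=0)
    within = 0.0
    between = 0.0
    for c in np.unique(labels):
        cls = features[labels == c]          # features of one class
        mu = cls.mean(axis=0)                # class mean
        within += np.sum((cls - mu) ** 2)    # scatter around class mean
        between += len(cls) * np.sum((mu - global_mean) ** 2)
    return within / between

# fully collapsed toy features: every sample sits at its class mean
collapsed = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
ratio = within_class_collapse(collapsed, np.array([0, 0, 1, 1]))  # -> 0.0
```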
arXiv Detail & Related papers (2023-05-22T15:51:28Z) - A Kernel-Expanded Stochastic Neural Network [10.837308632004644]
Deep neural networks often get trapped in local minima during training.
The kernel-expanded stochastic neural network (K-StoNet) model reformulates the network as a latent variable model.
The model can be easily trained using the imputation-regularized optimization (IRO) algorithm.
arXiv Detail & Related papers (2022-01-14T06:42:42Z) - Robust lEarned Shrinkage-Thresholding (REST): Robust unrolling for
sparse recovery [87.28082715343896]
We consider deep neural networks for solving inverse problems that are robust to forward model mis-specifications.
We design a new robust deep neural network architecture by applying algorithm unfolding techniques to a robust version of the underlying recovery problem.
The proposed REST network is shown to outperform state-of-the-art model-based and data-driven algorithms in both compressive sensing and radar imaging problems.
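Algorithm unfolding of the kind REST builds on starts from a classical iterative solver whose steps become network layers. For context (this is the classical base iteration, not the REST architecture itself), here is plain ISTA for sparse recovery; the problem sizes and step-size rule are illustrative assumptions.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the l1 norm (the nonlinearity in unrolled nets)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(A, y, lam=0.1, step=None, iters=200):
    """Plain ISTA for min_x 0.5*||Ax - y||^2 + lam*||x||_1.

    Unrolling methods turn each such iteration into a network layer
    with learned weights; this sketch is the un-learned starting point.
    """
    if step is None:
        # standard choice: 1/L with L the Lipschitz constant ||A||_2^2
        step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = soft_threshold(x - step * A.T @ (A @ x - y), step * lam)
    return x
```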
arXiv Detail & Related papers (2021-10-20T06:15:45Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - Iterative Network for Image Super-Resolution [69.07361550998318]
Single image super-resolution (SISR) has been greatly revitalized by the recent development of convolutional neural networks (CNNs).
This paper provides new insight into the conventional SISR algorithm and proposes a substantially different approach relying on iterative optimization.
A novel iterative super-resolution network (ISRN) is proposed on top of the iterative optimization.
arXiv Detail & Related papers (2020-05-20T11:11:47Z) - Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
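Tracking a Hessian norm at DNN scale is typically done without forming the Hessian, via Hessian-vector products and power iteration. A minimal sketch under that assumption (the paper's exact estimator may differ):

```python
import numpy as np

def hessian_spectral_norm(hvp, dim, iters=100, seed=0):
    """Estimate the largest Hessian eigenvalue by power iteration.

    hvp(v) returns the Hessian-vector product; this avoids building
    the full Hessian, which is what makes such diagnostics tractable
    for deep networks.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        hv = hvp(v)
        v = hv / np.linalg.norm(hv)   # renormalize each step
    return float(v @ hvp(v))          # Rayleigh quotient at convergence

# toy quadratic loss f(w) = 0.5 * w^T A w, whose Hessian is simply A
A = np.diag([3.0, 1.0])
est = hessian_spectral_norm(lambda v: A @ v, dim=2)
```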
arXiv Detail & Related papers (2020-04-20T18:12:56Z) - Rectified Linear Postsynaptic Potential Function for Backpropagation in
Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective on the design of future deep SNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z) - Interpretable Deep Recurrent Neural Networks via Unfolding Reweighted
$\ell_1$-$\ell_1$ Minimization: Architecture Design and Generalization
Analysis [19.706363403596196]
This paper develops a novel deep recurrent neural network (coined reweighted-RNN) by the unfolding of a reweighted minimization algorithm.
To the best of our knowledge, this is the first deep unfolding method that explores reweighted minimization.
The experimental results on the moving MNIST dataset demonstrate that the proposed deep reweighted-RNN significantly outperforms existing RNN models.
arXiv Detail & Related papers (2020-03-18T17:02:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.