Low-Dimensional Gradient Helps Out-of-Distribution Detection
- URL: http://arxiv.org/abs/2310.17163v2
- Date: Mon, 22 Jul 2024 02:59:23 GMT
- Title: Low-Dimensional Gradient Helps Out-of-Distribution Detection
- Authors: Yingwen Wu, Tao Li, Xinwen Cheng, Jie Yang, Xiaolin Huang
- Abstract summary: We conduct a comprehensive investigation into leveraging the entirety of gradient information for OOD detection.
The primary challenge arises from the high dimensionality of gradients due to the large number of network parameters.
We propose performing linear dimension reduction on the gradient using a designated subspace.
This innovative technique enables us to obtain a low-dimensional representation of the gradient with minimal information loss.
- Score: 26.237034426573523
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting out-of-distribution (OOD) samples is essential for ensuring the reliability of deep neural networks (DNNs) in real-world scenarios. While previous research has predominantly investigated the disparity between in-distribution (ID) and OOD data through forward information analysis, the discrepancy in parameter gradients during the backward process of DNNs has received insufficient attention. Existing studies on gradient disparities mainly focus on the utilization of gradient norms, neglecting the wealth of information embedded in gradient directions. To bridge this gap, in this paper, we conduct a comprehensive investigation into leveraging the entirety of gradient information for OOD detection. The primary challenge arises from the high dimensionality of gradients due to the large number of network parameters. To solve this problem, we propose performing linear dimension reduction on the gradient using a designated subspace that comprises principal components. This technique enables us to obtain a low-dimensional representation of the gradient with minimal information loss. Subsequently, by integrating the reduced gradient with various existing detection score functions, our approach demonstrates superior performance across a wide range of detection tasks. For instance, on the ImageNet benchmark with a ResNet50 model, our method achieves an average reduction of 11.15$\%$ in the false positive rate at 95$\%$ recall (FPR95) compared to the current state-of-the-art approach. The code will be released.
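Below is a minimal sketch, in PyTorch, of the general recipe the abstract describes: estimate a principal subspace from in-distribution gradients, project each test-time gradient into that subspace, and feed the low-dimensional gradient to a detection score. This is not the authors' released implementation; the KL-to-uniform loss, the restriction to the final linear layer's weights, the use of `torch.pca_lowrank`, and the norm-based score are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): project per-sample gradients onto a
# principal subspace estimated from ID data, then score the reduced gradient.
import torch
import torch.nn.functional as F


def last_layer_gradient(model, x, last_layer):
    """Per-sample gradient of KL(uniform || softmax(f(x))) w.r.t. the last layer's weights."""
    logits = model(x.unsqueeze(0))
    uniform = torch.full_like(logits, 1.0 / logits.shape[-1])
    loss = F.kl_div(logits.log_softmax(dim=-1), uniform, reduction="batchmean")
    return torch.autograd.grad(loss, last_layer.weight)[0].flatten()


def fit_gradient_subspace(model, last_layer, id_loader, k=64, device="cpu"):
    """Stack ID gradients (N, D) and keep the top-k principal directions (D, k)."""
    grads = []
    for x, _ in id_loader:
        for xi in x.to(device):
            grads.append(last_layer_gradient(model, xi, last_layer))
    G = torch.stack(grads)
    mean = G.mean(dim=0)
    # k must not exceed the number of collected gradients.
    _, _, V = torch.pca_lowrank(G - mean, q=k)   # principal components of the gradients
    return mean, V


def reduced_gradient_score(model, last_layer, x, mean, V):
    """Score one input by the L1 norm of its low-dimensional gradient
    (the choice of norm / score function is an assumption, not the paper's)."""
    g = last_layer_gradient(model, x, last_layer)
    z = (g - mean) @ V                            # low-dimensional gradient representation
    return z.norm(p=1).item()
```

In practice the reduced gradient `z` could instead be passed to any existing detector (e.g. a Mahalanobis- or norm-based score); thresholding the resulting scores then separates ID from OOD inputs.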
Related papers
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- Can Forward Gradient Match Backpropagation? [2.875726839945885]
Forward gradients have been shown to be usable for neural network training.
We propose to strongly bias our gradient guesses in directions that are much more promising, such as feedback obtained from small, local auxiliary networks.
We find that using gradients obtained from a local loss as a candidate direction drastically improves on random noise in Forward Gradient methods.
arXiv Detail & Related papers (2023-06-12T08:53:41Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been demonstrated to be effective in solving forward and inverse differential equation problems.
PINNs suffer from training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data [63.34506218832164]
In this work, we investigate the implicit bias of gradient flow and gradient descent in two-layer fully-connected neural networks with leaky ReLU activations.
For gradient flow, we leverage recent work on the implicit bias for homogeneous neural networks to show that, asymptotically, gradient flow produces a neural network with rank at most two.
For gradient descent, provided the random initialization variance is small enough, we show that a single step of gradient descent suffices to drastically reduce the rank of the network, and that the rank remains small throughout training.
arXiv Detail & Related papers (2022-10-13T15:09:54Z)
- Subspace Modeling for Fast Out-Of-Distribution and Anomaly Detection [5.672132510411465]
This paper presents a principled approach for detecting anomalous and out-of-distribution (OOD) samples in deep neural networks (DNNs).
We propose the application of linear statistical dimensionality reduction techniques on the semantic features produced by a DNN.
We show that the "feature reconstruction error" (FRE), which is the $ell$-norm of the difference between the original feature in the high-dimensional space and the pre-image of its low-dimensional reduced embedding, is highly effective for OOD and anomaly detection.
arXiv Detail & Related papers (2022-03-20T00:55:20Z)
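For concreteness, here is a minimal sketch of an FRE-style score, assuming PCA as the linear dimensionality reduction; the component count and the use of scikit-learn are illustrative assumptions, not details taken from that paper.

```python
# Minimal FRE-style sketch: score = l2 norm of (feature - PCA reconstruction of feature).
import numpy as np
from sklearn.decomposition import PCA


def fit_fre(id_features: np.ndarray, n_components: int = 128) -> PCA:
    """Fit PCA on in-distribution DNN features of shape (N, D); n_components <= min(N, D)."""
    return PCA(n_components=n_components).fit(id_features)


def fre_score(pca: PCA, features: np.ndarray) -> np.ndarray:
    """Reconstruction error of each feature w.r.t. the learned subspace; larger = more anomalous."""
    recon = pca.inverse_transform(pca.transform(features))
    return np.linalg.norm(features - recon, axis=1)
```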
- On the Importance of Gradients for Detecting Distributional Shifts in the Wild [15.548068221414384]
We present GradNorm, a simple and effective approach for detecting OOD inputs by utilizing information extracted from the gradient space.
GradNorm demonstrates superior performance, reducing the average FPR95 by up to 10.89% compared to the previous best method.
arXiv Detail & Related papers (2021-10-01T05:19:32Z)
- Densely Nested Top-Down Flows for Salient Object Detection [137.74130900326833]
This paper revisits the role of top-down modeling in salient object detection.
It designs a novel densely nested top-down flows (DNTDF)-based framework.
In every stage of DNTDF, features from higher levels are read in via the progressive compression shortcut paths (PCSP).
arXiv Detail & Related papers (2021-02-18T03:14:02Z)
- Dynamically Sampled Nonlocal Gradients for Stronger Adversarial Attacks [3.055601224691843]
The vulnerability of deep neural networks to small and even imperceptible perturbations has become a central topic in deep learning research.
We propose Dynamically Sampled Nonlocal Gradient Descent (DSNGD) for computing stronger adversarial attacks.
We show that DSNGD-based attacks are on average 35% faster while achieving 0.9% to 27.1% higher success rates compared to their gradient descent-based counterparts.
arXiv Detail & Related papers (2020-11-05T08:55:24Z)
- Solving Sparse Linear Inverse Problems in Communication Systems: A Deep Learning Approach With Adaptive Depth [51.40441097625201]
We propose an end-to-end trainable deep learning architecture for sparse signal recovery problems.
The proposed method learns how many layers to execute to emit an output, and the network depth is dynamically adjusted for each task in the inference phase.
arXiv Detail & Related papers (2020-10-29T06:32:53Z)
- Progressively Guided Alternate Refinement Network for RGB-D Salient Object Detection [63.18846475183332]
We aim to develop an efficient and compact deep network for RGB-D salient object detection.
We propose a progressively guided alternate refinement network to refine it.
Our model outperforms existing state-of-the-art approaches by a large margin.
arXiv Detail & Related papers (2020-08-17T02:55:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.