Approximations to the Fisher Information Metric of Deep Generative Models for Out-Of-Distribution Detection
- URL: http://arxiv.org/abs/2403.01485v2
- Date: Sat, 25 May 2024 21:47:13 GMT
- Title: Approximations to the Fisher Information Metric of Deep Generative Models for Out-Of-Distribution Detection
- Authors: Sam Dauncey, Chris Holmes, Christopher Williams, Fabian Falck,
- Abstract summary: We show that deep generative models consistently infer higher log-likelihoods for OOD data than data they were trained on.
We use the gradient of a data point with respect to the parameters of the deep generative model for OOD detection, based on the simple intuition that OOD data should have larger gradient norms than training data.
Our empirical results indicate that this method outperforms the Typicality test for most deep generative models and image dataset pairings.
- Score: 2.3749120526936465
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Likelihood-based deep generative models such as score-based diffusion models and variational autoencoders are state-of-the-art machine learning models approximating high-dimensional distributions of data such as images, text, or audio. One of many downstream tasks they can be naturally applied to is out-of-distribution (OOD) detection. However, seminal work by Nalisnick et al. which we reproduce showed that deep generative models consistently infer higher log-likelihoods for OOD data than data they were trained on, marking an open problem. In this work, we analyse using the gradient of a data point with respect to the parameters of the deep generative model for OOD detection, based on the simple intuition that OOD data should have larger gradient norms than training data. We formalise measuring the size of the gradient as approximating the Fisher information metric. We show that the Fisher information matrix (FIM) has large absolute diagonal values, motivating the use of chi-square distributed, layer-wise gradient norms as features. We combine these features to make a simple, model-agnostic and hyperparameter-free method for OOD detection which estimates the joint density of the layer-wise gradient norms for a given data point. We find that these layer-wise gradient norms are weakly correlated, rendering their combined usage informative, and prove that the layer-wise gradient norms satisfy the principle of (data representation) invariance. Our empirical results indicate that this method outperforms the Typicality test for most deep generative models and image dataset pairings.
Related papers
- Deep Metric Learning-Based Out-of-Distribution Detection with Synthetic Outlier Exposure [0.0]
We propose a label-mixup approach to generate synthetic OOD data using Denoising Diffusion Probabilistic Models (DDPMs)
In the experiments, we found that metric learning-based loss functions perform better than the softmax.
Our approach outperforms strong baselines in conventional OOD detection metrics.
arXiv Detail & Related papers (2024-05-01T16:58:22Z) - Optimizing OOD Detection in Molecular Graphs: A Novel Approach with Diffusion Models [71.39421638547164]
We propose to detect OOD molecules by adopting an auxiliary diffusion model-based framework, which compares similarities between input molecules and reconstructed graphs.
Due to the generative bias towards reconstructing ID training samples, the similarity scores of OOD molecules will be much lower to facilitate detection.
Our research pioneers an approach of Prototypical Graph Reconstruction for Molecular OOD Detection, dubbed as PGR-MOOD and hinges on three innovations.
arXiv Detail & Related papers (2024-04-24T03:25:53Z) - GOODAT: Towards Test-time Graph Out-of-Distribution Detection [103.40396427724667]
Graph neural networks (GNNs) have found widespread application in modeling graph data across diverse domains.
Recent studies have explored graph OOD detection, often focusing on training a specific model or modifying the data on top of a well-trained GNN.
This paper introduces a data-centric, unsupervised, and plug-and-play solution that operates independently of training data and modifications of GNN architecture.
arXiv Detail & Related papers (2024-01-10T08:37:39Z) - GradOrth: A Simple yet Efficient Out-of-Distribution Detection with
Orthogonal Projection of Gradients [24.50445616970387]
out-of-distribution (OOD) data is crucial for ensuring the safe deployment of machine learning models in real-world applications.
We propose a novel approach called GradOrth to facilitate OOD detection based on one intriguing observation.
This simple yet effective method exhibits outstanding performance, showcasing a notable reduction in the average false positive rate at a 95% true positive rate (FPR95) of up to 8% when compared to the current state-of-the-art methods.
arXiv Detail & Related papers (2023-08-01T06:12:12Z) - Energy-based Out-of-Distribution Detection for Graph Neural Networks [76.0242218180483]
We propose a simple, powerful and efficient OOD detection model for GNN-based learning on graphs, which we call GNNSafe.
GNNSafe achieves up to $17.0%$ AUROC improvement over state-of-the-arts and it could serve as simple yet strong baselines in such an under-developed area.
arXiv Detail & Related papers (2023-02-06T16:38:43Z) - Watermarking for Out-of-distribution Detection [76.20630986010114]
Out-of-distribution (OOD) detection aims to identify OOD data based on representations extracted from well-trained deep models.
We propose a general methodology named watermarking in this paper.
We learn a unified pattern that is superimposed onto features of original data, and the model's detection capability is largely boosted after watermarking.
arXiv Detail & Related papers (2022-10-27T06:12:32Z) - How Useful are Gradients for OOD Detection Really? [5.459639971144757]
Out of distribution (OOD) detection is a critical challenge in deploying highly performant machine learning models in real-life applications.
We provide an in-depth analysis and comparison of gradient based methods for OOD detection.
We propose a general, non-gradient based method of OOD detection which improves over previous baselines in both performance and computational efficiency.
arXiv Detail & Related papers (2022-05-20T21:10:05Z) - On the Importance of Gradients for Detecting Distributional Shifts in
the Wild [15.548068221414384]
We present GradNorm, a simple and effective approach for detecting OOD inputs by utilizing information extracted from the gradient space.
GradNorm demonstrates superior performance, reducing the average FPR95 by up to 10.89% compared to the previous best method.
arXiv Detail & Related papers (2021-10-01T05:19:32Z) - Evaluating State-of-the-Art Classification Models Against Bayes
Optimality [106.50867011164584]
We show that we can compute the exact Bayes error of generative models learned using normalizing flows.
We use our approach to conduct a thorough investigation of state-of-the-art classification models.
arXiv Detail & Related papers (2021-06-07T06:21:20Z) - Why Normalizing Flows Fail to Detect Out-of-Distribution Data [51.552870594221865]
Normalizing flows fail to distinguish between in- and out-of-distribution data.
We demonstrate that flows learn local pixel correlations and generic image-to-latent-space transformations.
We show that by modifying the architecture of flow coupling layers we can bias the flow towards learning the semantic structure of the target data.
arXiv Detail & Related papers (2020-06-15T17:00:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.