Feature Shift Detection: Localizing Which Features Have Shifted via Conditional Distribution Tests
- URL: http://arxiv.org/abs/2107.06929v1
- Date: Wed, 14 Jul 2021 18:23:24 GMT
- Title: Feature Shift Detection: Localizing Which Features Have Shifted via Conditional Distribution Tests
- Authors: Sean Kulinski, Saurabh Bagchi, David I. Inouye
- Abstract summary: In military sensor networks, users will want to detect when one or more of the sensors has been compromised.
We first define a formalization of this problem as multiple conditional distribution hypothesis tests.
For both efficiency and flexibility, we propose a test statistic based on the density model score function.
- Score: 12.468665026043382
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While previous distribution shift detection approaches can identify if a
shift has occurred, these approaches cannot localize which specific features
have caused a distribution shift -- a critical step in diagnosing or fixing any
underlying issue. For example, in military sensor networks, users will want to
detect when one or more of the sensors has been compromised, and critically,
they will want to know which specific sensors might be compromised. Thus, we
first define a formalization of this problem as multiple conditional
distribution hypothesis tests and propose both non-parametric and parametric
statistical tests. For both efficiency and flexibility, we then propose to use
a test statistic based on the density model score function (i.e., the gradient
of the log density with respect to the input), which yields test statistics for
all dimensions from a single forward and backward pass. Any density model could be
used for computing the necessary statistics including deep density models such
as normalizing flows or autoregressive models. We additionally develop methods
for identifying when and where a shift occurs in multivariate time-series data
and show results for multiple scenarios using realistic attack models on both
simulated and real-world data.
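To make the score-based statistic concrete, here is a minimal sketch, assuming a PyTorch density model that exposes log_prob; the full-covariance Gaussian stand-in, the mean-absolute-score aggregation per feature, and the bootstrap threshold are illustrative assumptions, not the paper's exact test. A deep density model such as a normalizing flow could be substituted wherever log_prob is evaluated.

```python
import torch

def fit_gaussian(x):
    """Fit a full-covariance Gaussian to reference data (a stand-in for a
    deep density model such as a normalizing flow)."""
    mean = x.mean(dim=0)
    cov = torch.cov(x.T) + 1e-4 * torch.eye(x.shape[1])
    return torch.distributions.MultivariateNormal(mean, cov)

def feature_scores(density, x):
    """Per-feature score statistic: mean |d log p(x) / d x_j| over a window.
    One log_prob evaluation (forward) and one backward pass give the gradient
    for every dimension at once."""
    x = x.clone().requires_grad_(True)
    log_p = density.log_prob(x).sum()
    (grad,) = torch.autograd.grad(log_p, x)
    return grad.abs().mean(dim=0)            # shape: (n_features,)

def detect_shifted_features(x_ref, x_query, n_boot=200, alpha=0.05):
    """Flag features whose query-window statistic is extreme relative to a
    bootstrap of same-sized windows drawn from the reference data."""
    density = fit_gaussian(x_ref)
    boot = torch.stack([
        feature_scores(density, x_ref[torch.randint(len(x_ref), (len(x_query),))])
        for _ in range(n_boot)
    ])
    query = feature_scores(density, x_query)
    p = (boot >= query).float().mean(dim=0)  # one-sided empirical p-value
    p = torch.minimum(p, 1.0 - p) * 2.0      # two-sided
    return p < alpha, p

if __name__ == "__main__":
    torch.manual_seed(0)
    x_ref = torch.randn(2000, 5)
    x_query = torch.randn(200, 5)
    x_query[:, 2] += 2.0                     # simulate a compromised feature
    flagged, pvals = detect_shifted_features(x_ref, x_query)
    print("flagged features:", flagged.nonzero().flatten().tolist())
```

The point mirrored from the abstract is that a single log_prob evaluation and one backward pass produce the gradient for every feature simultaneously, so the per-dimension statistics come essentially for free once the density model is trained.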
Related papers
- Unsupervised Anomaly and Change Detection with Multivariate Gaussianization [8.508880949780893]
Anomaly detection is a challenging problem given the high-dimensionality of the data.
We propose an unsupervised method for detecting anomalies and changes in remote sensing images.
We show the efficiency of the method in experiments involving both anomaly detection and change detection.
arXiv Detail & Related papers (2022-04-12T10:52:33Z)
- Model-agnostic out-of-distribution detection using combined statistical tests [15.27980070479021]
We present simple methods for out-of-distribution detection using a trained generative model.
We combine a classical parametric test (Rao's score test) with the recently introduced typicality test.
Despite their simplicity and generality, these methods can be competitive with model-specific out-of-distribution detection algorithms.
arXiv Detail & Related papers (2022-03-02T13:32:09Z)
- Uncertainty Modeling for Out-of-Distribution Generalization [56.957731893992495]
We argue that the feature statistics can be properly manipulated to improve the generalization ability of deep learning models.
Common methods often consider the feature statistics as deterministic values measured from the learned features.
We improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training.
arXiv Detail & Related papers (2022-02-08T16:09:12Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Tracking the risk of a deployed model and detecting harmful distribution shifts [105.27463615756733]
In practice, it may make sense to ignore benign shifts, under which the performance of a deployed model does not degrade substantially.
We argue that a sensible method for firing off a warning has to both (a) detect harmful shifts while ignoring benign ones, and (b) allow continuous monitoring of model performance without increasing the false alarm rate.
arXiv Detail & Related papers (2021-10-12T17:21:41Z)
- Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
However, they are often overconfident, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Nonlinear Distribution Regression for Remote Sensing Applications [6.664736150040092]
In many remote sensing applications one wants to estimate variables or parameters of interest from observations.
Standard algorithms such as neural networks, random forests, or Gaussian processes are readily available to relate the two.
This paper introduces a nonlinear (kernel-based) method for distribution regression that solves the previous problems without making any assumption on the statistics of the grouped data.
arXiv Detail & Related papers (2020-12-07T22:04:43Z)
- Density of States Estimation for Out-of-Distribution Detection [69.90130863160384]
DoSE, the density of states estimator, is an unsupervised approach to out-of-distribution detection.
We demonstrate DoSE's state-of-the-art performance against other unsupervised OOD detectors.
arXiv Detail & Related papers (2020-06-16T16:06:25Z)
- A Causal Direction Test for Heterogeneous Populations [10.653162005300608]
Most causal models assume a single homogeneous population, an assumption that may fail to hold in many applications.
We show that when the homogeneity assumption is violated, causal models developed based on such assumption can fail to identify the correct causal direction.
We propose an adjustment to a commonly used causal direction test statistic by using a $k$-means type clustering algorithm.
arXiv Detail & Related papers (2020-06-08T18:59:14Z)
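As referenced in the Leveraging Unlabeled Data entry above, below is a minimal sketch of the thresholded-confidence idea, assuming NumPy and toy synthetic confidences; the calibration rule and the names used here are illustrative, not the authors' released code.

```python
import numpy as np

def learn_threshold(source_conf, source_correct):
    """Pick a threshold so that the fraction of labeled source points with
    confidence above it matches the observed source accuracy."""
    acc = source_correct.mean()
    return np.quantile(source_conf, 1.0 - acc)

def predict_target_accuracy(threshold, target_conf):
    """Predicted accuracy = fraction of unlabeled target points whose
    confidence exceeds the learned threshold."""
    return (target_conf > threshold).mean()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source_conf = rng.uniform(0.5, 1.0, 5000)               # toy confidences
    source_correct = rng.random(5000) < source_conf         # toy correctness labels
    target_conf = rng.uniform(0.4, 1.0, 2000)               # shifted target confidences
    t = learn_threshold(source_conf, source_correct)
    print(f"estimated target accuracy: {predict_target_accuracy(t, target_conf):.3f}")
```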