R2-AD2: Detecting Anomalies by Analysing the Raw Gradient
- URL: http://arxiv.org/abs/2206.10259v1
- Date: Tue, 21 Jun 2022 11:13:33 GMT
- Title: R2-AD2: Detecting Anomalies by Analysing the Raw Gradient
- Authors: Jan-Philipp Schulze, Philip Sperl, Ana Răduțoiu, Carla Sagebiel, Konstantin Böttinger
- Abstract summary: We propose a novel semi-supervised anomaly detection method called R2-AD2.
By analysing the temporal distribution of the gradient over multiple training steps, we reliably detect point anomalies.
R2-AD2 works in a purely data-driven way, thus is readily applicable in a variety of important use cases of anomaly detection.
- Score: 0.6299766708197883
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks follow a gradient-based learning scheme, adapting their
mapping parameters by back-propagating the output loss. Samples unlike the ones
seen during training cause a different gradient distribution. Based on this
intuition, we design a novel semi-supervised anomaly detection method called
R2-AD2. By analysing the temporal distribution of the gradient over multiple
training steps, we reliably detect point anomalies in strict semi-supervised
settings. Instead of domain dependent features, we input the raw gradient
caused by the sample under test to an end-to-end recurrent neural network
architecture. R2-AD2 works in a purely data-driven way, thus is readily
applicable in a variety of important use cases of anomaly detection.
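
The abstract describes the method only at a high level; the sketch below illustrates the underlying idea of feeding raw per-sample gradients, collected over multiple training steps, to a recurrent anomaly scorer. The toy autoencoder target, the restriction of the gradient to the decoder weights, and the single-GRU scorer are assumptions made for brevity, not the architecture used in the paper.

```python
# Minimal sketch of the gradient-as-input idea, assuming: (1) the monitored
# network is a small autoencoder, (2) only the decoder-weight gradient is used,
# and (3) the anomaly scorer is a single GRU. None of these details come from
# the abstract; they are placeholders to make the sketch runnable.
import torch
import torch.nn as nn

class TargetAE(nn.Module):
    """Toy target network whose per-sample gradients serve as anomaly features."""
    def __init__(self, dim=20, hidden=8):
        super().__init__()
        self.enc = nn.Linear(dim, hidden)
        self.dec = nn.Linear(hidden, dim)

    def forward(self, x):
        return self.dec(torch.relu(self.enc(x)))

def raw_gradient(model, x):
    """Gradient of the per-sample reconstruction loss w.r.t. the decoder weights."""
    loss = nn.functional.mse_loss(model(x), x)
    grad, = torch.autograd.grad(loss, model.dec.weight)
    return grad.flatten()

def gradient_sequence(checkpoints, x):
    """Gradients at several training checkpoints, mimicking the temporal
    distribution of the gradient over multiple training steps."""
    return torch.stack([raw_gradient(m, x) for m in checkpoints])   # (T, grad_dim)

class GRUScorer(nn.Module):
    """Recurrent scorer mapping a gradient sequence to an anomaly score in [0, 1]."""
    def __init__(self, grad_dim, hidden=32):
        super().__init__()
        self.rnn = nn.GRU(grad_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, grad_seq):             # grad_seq: (batch, T, grad_dim)
        _, h = self.rnn(grad_seq)
        return torch.sigmoid(self.head(h[-1])).squeeze(-1)

# Usage: score one test sample against three hypothetical checkpoints.
checkpoints = [TargetAE() for _ in range(3)]
x = torch.randn(20)
seq = gradient_sequence(checkpoints, x).unsqueeze(0)    # (1, T, grad_dim)
score = GRUScorer(grad_dim=seq.shape[-1])(seq)
```

In this reading, the scorer never sees domain-specific features, only gradients, which matches the abstract's claim that the method is purely data-driven.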
Related papers
- Domain Adaptive Synapse Detection with Weak Point Annotations [63.97144211520869]
We present AdaSyn, a framework for domain adaptive synapse detection with weak point annotations.
In the WASPSYN challenge at ISBI 2023, our method ranked first.
arXiv Detail & Related papers (2023-08-31T05:05:53Z)
- Learning Compact Features via In-Training Representation Alignment [19.273120635948363]
In each epoch, the true gradient of the loss function is estimated using a mini-batch sampled from the training set.
We propose In-Training Representation Alignment (ITRA) that explicitly aligns feature distributions of two different mini-batches with a matching loss.
We also provide a rigorous analysis of the desirable effects of the matching loss on feature representation learning.
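
As a rough illustration of such a matching loss, the snippet below aligns the feature distributions of two mini-batches with an RBF-kernel MMD term added to a standard classification loss. The kernel choice, the weight `lam`, and the assumed model interface returning `(features, logits)` are illustrative; the summary does not specify the exact loss used by ITRA.

```python
# Illustrative mini-batch feature alignment, not ITRA's exact formulation.
import torch

def rbf_mmd(feat_a, feat_b, sigma=1.0):
    """Squared maximum mean discrepancy between two feature batches (RBF kernel)."""
    def kernel(x, y):
        return torch.exp(-torch.cdist(x, y).pow(2) / (2 * sigma ** 2))
    return (kernel(feat_a, feat_a).mean()
            + kernel(feat_b, feat_b).mean()
            - 2 * kernel(feat_a, feat_b).mean())

def matching_loss_step(model, batch_a, labels_a, batch_b, lam=0.1):
    """Classification loss on batch_a plus a term matching its features to batch_b's."""
    feats_a, logits_a = model(batch_a)   # assumed interface: model returns (features, logits)
    feats_b, _ = model(batch_b)
    ce = torch.nn.functional.cross_entropy(logits_a, labels_a)
    return ce + lam * rbf_mmd(feats_a, feats_b)
```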
arXiv Detail & Related papers (2022-11-23T22:23:22Z)
- Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data [63.34506218832164]
In this work, we investigate the implicit bias of gradient flow and gradient descent in two-layer fully-connected neural networks with leaky ReLU activations.
For gradient flow, we leverage recent work on the implicit bias of homogeneous neural networks to show that, asymptotically, gradient flow produces a neural network with rank at most two.
For gradient descent, provided the random initialization variance is small enough, we show that a single step of gradient descent suffices to drastically reduce the rank of the network, and that the rank remains small throughout training.
arXiv Detail & Related papers (2022-10-13T15:09:54Z)
- A Two-Block RNN-based Trajectory Prediction from Incomplete Trajectory [14.725386295605666]
We introduce a two-block RNN model that approximates the inference steps of the Bayesian filtering framework.
We show that the proposed model improves prediction accuracy compared to three baseline imputation methods.
We also show that our proposed method can achieve better prediction than the baselines when there is no misdetection.
arXiv Detail & Related papers (2022-03-14T13:39:44Z)
- Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks [83.58049517083138]
We consider a two-layer ReLU network trained via stochastic gradient descent (SGD).
We show that SGD is biased towards a simple solution.
We also provide empirical evidence that knots at locations distinct from the data points might occur.
arXiv Detail & Related papers (2021-11-03T15:14:20Z)
- DIFFnet: Diffusion parameter mapping network generalized for input diffusion gradient schemes and bvalues [6.7487278071108525]
A new deep neural network, referred to as DIFFnet, is developed to function as a generalized reconstruction tool of the diffusion-weighted signals.
DIFFnet is evaluated for diffusion tensor imaging (DIFFnetDTI) and for neurite orientation dispersion and density imaging (DIFFnetNODDI).
The results demonstrate accurate reconstruction of the diffusion parameters at substantially reduced processing time.
arXiv Detail & Related papers (2021-02-04T07:45:36Z)
- Uncertainty Inspired RGB-D Saliency Detection [70.50583438784571]
We propose the first framework to employ uncertainty for RGB-D saliency detection by learning from the data labeling process.
Inspired by the saliency data labeling process, we propose a generative architecture to achieve probabilistic RGB-D saliency detection.
Results on six challenging RGB-D benchmark datasets show our approach's superior performance in learning the distribution of saliency maps.
arXiv Detail & Related papers (2020-09-07T13:01:45Z)
- Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [78.76880041670904]
In neural networks with binary activations and/or binary weights, training by gradient descent is complicated.
We propose a new method for this estimation problem combining sampling and analytic approximation steps.
We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models.
arXiv Detail & Related papers (2020-06-04T21:51:21Z)
- The Impact of the Mini-batch Size on the Variance of Gradients in Stochastic Gradient Descent [28.148743710421932]
The mini-batch stochastic gradient descent (SGD) algorithm is widely used in training machine learning models.
We study SGD dynamics under linear regression and two-layer linear networks, with an easy extension to deeper linear networks.
arXiv Detail & Related papers (2020-04-27T20:06:11Z)
- Simple and Effective Prevention of Mode Collapse in Deep One-Class Classification [93.2334223970488]
We propose two regularizers to prevent hypersphere collapse in deep SVDD.
The first regularizer is based on injecting random noise via the standard cross-entropy loss.
The second regularizer penalizes the minibatch variance when it becomes too small.
arXiv Detail & Related papers (2020-01-24T03:44:47Z)
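
A minimal sketch of the second regularizer described in the entry above, assuming a hinge-style penalty that activates once the mini-batch embedding variance drops below a threshold; the threshold `eps` and the exact functional form are assumptions, as the summary does not state them.

```python
# Hinge-style variance penalty (assumed form) for discouraging hypersphere collapse.
import torch

def variance_regularizer(embeddings, eps=1e-2):
    """Penalty that grows as the mean per-dimension mini-batch variance falls below eps."""
    var = embeddings.var(dim=0).mean()
    return torch.clamp(eps - var, min=0.0) / eps

# Hypothetical use inside a deep SVDD training step:
#   loss = ((embeddings - center) ** 2).sum(dim=1).mean() \
#          + lam * variance_regularizer(embeddings)
```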