Sub-Adjacent Transformer: Improving Time Series Anomaly Detection with Reconstruction Error from Sub-Adjacent Neighborhoods
- URL: http://arxiv.org/abs/2404.18948v1
- Date: Sat, 27 Apr 2024 08:08:17 GMT
- Title: Sub-Adjacent Transformer: Improving Time Series Anomaly Detection with Reconstruction Error from Sub-Adjacent Neighborhoods
- Authors: Wenzhen Yue, Xianghua Ying, Ruohao Guo, DongDong Chen, Ji Shi, Bowei Xing, Yuqing Zhu, Taiyan Chen
- Abstract summary: We present the Sub-Adjacent Transformer with a novel attention mechanism for unsupervised time series anomaly detection.
By focusing the attention on the sub-adjacent areas, we make the reconstruction of anomalies more challenging.
The Sub-Adjacent Transformer achieves state-of-the-art performance across six real-world anomaly detection benchmarks.
- Score: 22.49176231245093
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present the Sub-Adjacent Transformer with a novel attention mechanism for unsupervised time series anomaly detection. Unlike previous approaches that rely on all the points within some neighborhood for time point reconstruction, our method restricts the attention to regions not immediately adjacent to the target points, termed sub-adjacent neighborhoods. Our key observation is that owing to the rarity of anomalies, they typically exhibit more pronounced differences from their sub-adjacent neighborhoods than from their immediate vicinities. By focusing the attention on the sub-adjacent areas, we make the reconstruction of anomalies more challenging, thereby enhancing their detectability. Technically, our approach concentrates attention on the non-diagonal areas of the attention matrix by enlarging the corresponding elements in the training stage. To facilitate the implementation of the desired attention matrix pattern, we adopt linear attention because of its flexibility and adaptability. Moreover, a learnable mapping function is proposed to improve the performance of linear attention. Empirically, the Sub-Adjacent Transformer achieves state-of-the-art performance across six real-world anomaly detection benchmarks, covering diverse fields such as server monitoring, space exploration, and water treatment.
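The abstract describes two core mechanisms: attention restricted to a sub-adjacent (non-diagonal) band of the attention matrix, and linear attention with a learnable mapping function. The snippet below is a minimal PyTorch sketch of how such a layer could look; it is not the authors' implementation. The neighborhood sizes (`exclude=5`, `span=20`), the Softplus-based mapping `phi`, and the hard masking of the attention matrix are illustrative assumptions: the paper enlarges non-diagonal elements during training rather than masking them outright.

```python
import torch
import torch.nn as nn


def sub_adjacent_mask(seq_len: int, exclude: int, span: int) -> torch.Tensor:
    """Boolean mask keeping only positions whose distance from the query index
    lies in (exclude, span], i.e. the sub-adjacent neighborhood."""
    idx = torch.arange(seq_len)
    dist = (idx[:, None] - idx[None, :]).abs()
    return (dist > exclude) & (dist <= span)


class SubAdjacentLinearAttention(nn.Module):
    """Single-head linear attention whose weights are confined to the
    non-diagonal (sub-adjacent) band of the attention matrix."""

    def __init__(self, d_model: int, exclude: int = 5, span: int = 20):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # Learnable non-negative mapping applied to queries and keys before the
        # linear-attention product (a stand-in for the paper's learnable
        # mapping function, whose exact form is not given in the abstract).
        self.phi = nn.Sequential(nn.Linear(d_model, d_model), nn.Softplus())
        self.exclude, self.span = exclude, span

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q = self.phi(self.q_proj(x))
        k = self.phi(self.k_proj(x))
        v = self.v_proj(x)

        # The attention matrix is materialised here only to make the
        # sub-adjacent pattern explicit; a real linear-attention kernel
        # would avoid the quadratic product.
        scores = torch.einsum("bqd,bkd->bqk", q, k)
        mask = sub_adjacent_mask(x.size(1), self.exclude, self.span).to(x.device)
        scores = scores.masked_fill(~mask, 0.0)
        attn = scores / scores.sum(dim=-1, keepdim=True).clamp(min=1e-6)
        return torch.einsum("bqk,bkd->bqd", attn, v)


if __name__ == "__main__":
    model = SubAdjacentLinearAttention(d_model=64)
    x = torch.randn(2, 100, 64)                        # two windows of 100 time points
    recon = model(x)
    anomaly_score = (x - recon).pow(2).mean(dim=-1)    # point-wise reconstruction error
    print(anomaly_score.shape)                         # torch.Size([2, 100])
```

Under this sketch, a point that its sub-adjacent neighbors cannot reconstruct well receives a high score, which is the detectability argument made in the abstract.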
Related papers
- MAAT: Mamba Adaptive Anomaly Transformer with association discrepancy for time series [5.924110046959179]
Anomaly detection in time series is essential for industrial monitoring and environmental sensing.
Existing methods face limitations such as sensitivity to short-term contexts and inefficiency in noisy, non-stationary environments.
We introduce MAAT, an improved architecture that enhances association discrepancy modeling and reconstruction quality.
arXiv Detail & Related papers (2025-02-11T16:22:06Z)
- Breaking the Bias: Recalibrating the Attention of Industrial Anomaly Detection [20.651257973799527]
Recalibrating Attention of Industrial Anomaly Detection (RAAD) is a framework that systematically decomposes and recalibrates attention maps.
HQS dynamically adjusts bit-widths based on the hierarchical nature of attention maps.
We validate the effectiveness of RAAD on 32 datasets using a single RTX 3090 Ti GPU.
arXiv Detail & Related papers (2024-12-11T08:31:47Z)
- GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features [68.14842693208465]
GeneralAD is an anomaly detection framework designed to operate in semantic, near-distribution, and industrial settings.
We propose a novel self-supervised anomaly generation module that employs straightforward operations like noise addition and shuffling to patch features.
We extensively evaluated our approach on ten datasets, achieving state-of-the-art results on six and on-par performance on the remaining four.
arXiv Detail & Related papers (2024-07-17T09:27:41Z)
- Toward Motion Robustness: A masked attention regularization framework in remote photoplethysmography [5.743550396843244]
MAR-r is a framework that integrates the impact of ROI localization and complex motion artifacts.
MAR-r incorporates a masked attention regularization mechanism into the rPPG field to capture the semantic consistency of facial clips.
It also employs a masking technique to prevent the model from overfitting on inaccurate ROIs and subsequently degrading its performance.
arXiv Detail & Related papers (2024-07-09T08:25:30Z)
- Mitigating Undisciplined Over-Smoothing in Transformer for Weakly Supervised Semantic Segmentation [41.826919704238556]
We propose an adaptive re-activation mechanism (AReAM) that alleviates the issue of incomplete attention within the object and the unbounded background noise.
AReAM accomplishes this by supervising high-level attention with shallow affinity matrices, yielding promising results.
arXiv Detail & Related papers (2023-05-04T19:11:33Z)
- Exploring Consistency in Cross-Domain Transformer for Domain Adaptive Semantic Segmentation [51.10389829070684]
The domain gap can cause discrepancies in self-attention.
Due to this gap, the transformer attends to spurious regions or pixels, which deteriorates accuracy on the target domain.
We propose adaptation on attention maps with cross-domain attention layers.
arXiv Detail & Related papers (2022-11-27T02:40:33Z)
- The Devil in Linear Transformer [42.232886799710215]
Linear transformers aim to reduce the quadratic space-time complexity of vanilla transformers.
They usually suffer from degraded performance on various tasks and corpora.
In this paper, we identify two key issues that lead to such performance gaps.
arXiv Detail & Related papers (2022-10-19T07:15:35Z)
- Usage of specific attention improves change point detection [1.0723143072368782]
We investigate different attention mechanisms for the change point detection task and propose a specific form of attention tailored to the task at hand.
We show that this task-specific form of attention outperforms state-of-the-art results.
arXiv Detail & Related papers (2022-04-18T06:05:50Z)
- Boosting Crowd Counting via Multifaceted Attention [109.89185492364386]
Large-scale variations often exist within crowd images.
Neither the fixed-size convolution kernels of CNNs nor the fixed-size attention of recent vision transformers can handle this kind of variation.
We propose a Multifaceted Attention Network (MAN) to improve transformer models in local spatial relation encoding.
arXiv Detail & Related papers (2022-03-05T01:36:43Z)
- Point-Level Region Contrast for Object Detection Pre-Training [147.47349344401806]
We present point-level region contrast, a self-supervised pre-training approach for the task of object detection.
Our approach performs contrastive learning by directly sampling individual point pairs from different regions.
Compared to an aggregated representation per region, our approach is more robust to the change in input region quality.
arXiv Detail & Related papers (2022-02-09T18:56:41Z)
- Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy [68.86835407617778]
Anomaly Transformer achieves state-of-the-art performance on six unsupervised time series anomaly detection benchmarks.
arXiv Detail & Related papers (2021-10-06T10:33:55Z)
- Ripple Attention for Visual Perception with Sub-quadratic Complexity [7.425337104538644]
Transformer architectures are now central to modeling in natural language processing tasks.
We propose ripple attention, a sub-quadratic attention mechanism for visual perception.
In ripple attention, contributions of different tokens to a query are weighted with respect to their relative spatial distances in the 2D space.
arXiv Detail & Related papers (2021-10-06T02:00:38Z)
- Transformer Interpretability Beyond Attention Visualization [87.96102461221415]
Self-attention techniques, and specifically Transformers, are dominating the field of text processing.
In this work, we propose a novel way to compute relevancy for Transformer networks.
arXiv Detail & Related papers (2020-12-17T18:56:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.