Rethinking Attention Mechanism in Time Series Classification
- URL: http://arxiv.org/abs/2207.07564v1
- Date: Thu, 14 Jul 2022 07:15:06 GMT
- Title: Rethinking Attention Mechanism in Time Series Classification
- Authors: Bowen Zhao, Huanlai Xing, Xinhan Wang, Fuhong Song, Zhiwen Xiao
- Abstract summary: We improve the efficiency and performance of the attention mechanism by proposing a flexible multi-head linear attention (FMLA).
We propose a simple but effective mask mechanism that helps reduce the noise influence in time series and decrease the redundancy of the proposed FMLA.
We conduct extensive experiments on 85 UCR2018 datasets to compare our algorithm with 11 well-known ones and the results show that our algorithm has comparable performance in terms of top-1 accuracy.
- Score: 6.014777261874646
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Attention-based models have been widely used in many areas, such as computer
vision and natural language processing. However, relevant applications in time
series classification (TSC) have not been explored deeply yet, so a significant
number of TSC algorithms still suffer from general problems of the attention
mechanism, such as quadratic complexity. In this paper, we improve the
efficiency and performance of the attention mechanism by proposing our flexible
multi-head linear attention (FMLA), which enhances locality awareness by
layer-wise interactions with deformable convolutional blocks and online
knowledge distillation. Moreover, we propose a simple but effective mask
mechanism that helps reduce the noise influence in time series and decrease the
redundancy of the proposed FMLA by masking some positions of each given series
proportionally. To stabilize this mechanism, samples are forwarded through the
model with random mask layers several times and their outputs are aggregated to
teach the same model with regular mask layers. We conduct extensive experiments
on 85 UCR2018 datasets to compare our algorithm with 11 well-known ones and the
results show that our algorithm has comparable performance in terms of top-1
accuracy. We also compare our model with three Transformer-based models with
respect to the floating-point operations per second and number of parameters
and find that our algorithm achieves significantly better efficiency with lower
complexity.
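The abstract does not give the FMLA formulation itself, so the following is only a generic sketch of multi-head linear attention in the spirit of kernelized attention (Katharopoulos et al., 2020): the softmax is replaced by a positive feature map so keys and values can be aggregated once, bringing the cost down from quadratic to linear in the series length. The elu-based feature map, tensor shapes, and variable names are assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    """Kernelized (linear-complexity) attention sketch.
    q, k: (batch, heads, length, dim); v: (batch, heads, length, dim_v)."""
    q = F.elu(q) + 1.0            # positive feature map replacing the softmax
    k = F.elu(k) + 1.0
    kv = torch.einsum('bhnd,bhne->bhde', k, v)          # aggregate keys/values once: O(N)
    z = 1.0 / (torch.einsum('bhnd,bhd->bhn', q, k.sum(dim=2)) + eps)  # normalizer
    return torch.einsum('bhnd,bhde,bhn->bhne', q, kv, z)
```

Because the (dim x dim_v) summary `kv` is shared by every query position, the per-layer cost grows linearly rather than quadratically with the number of time steps.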
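For the masking scheme, the abstract only states that positions of each series are masked proportionally, that samples are forwarded through random mask layers several times, and that the aggregated outputs teach the same model with regular mask layers. Below is a minimal sketch of that training step, assuming a classifier that returns logits and a simple zero-masking of a fixed proportion of time steps; the mask ratio, number of views, and consistency-loss weight are assumptions.

```python
import torch
import torch.nn.functional as F

def random_mask(x, ratio=0.15):
    """Zero out a proportion of time steps in each series x of shape (batch, channels, length)."""
    keep = (torch.rand(x.size(0), 1, x.size(-1), device=x.device) > ratio).float()
    return x * keep

def self_teaching_step(model, x, y, n_views=3, alpha=0.5):
    """Aggregate predictions from several randomly masked views and use them
    as a soft target for the regular forward pass (illustrative only)."""
    with torch.no_grad():
        teacher = torch.stack(
            [model(random_mask(x)).softmax(dim=-1) for _ in range(n_views)]
        ).mean(dim=0)
    logits = model(x)                                   # regular (non-random) mask path
    loss = F.cross_entropy(logits, y) + alpha * F.kl_div(
        logits.log_softmax(dim=-1), teacher, reduction='batchmean')
    return loss
```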
Related papers
- LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
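To illustrate the member-specific low-rank idea described above, the sketch below wraps a frozen shared projection with per-member LoRA factors; the class name, rank, initialization, and forward signature are assumptions rather than the paper's implementation.

```python
import torch
import torch.nn as nn

class LoRAEnsembleProjection(nn.Module):
    """Shared frozen projection plus a per-member low-rank update (illustrative)."""
    def __init__(self, dim, rank, n_members):
        super().__init__()
        self.base = nn.Linear(dim, dim, bias=False)
        self.base.weight.requires_grad_(False)             # shared, pre-trained weights
        self.A = nn.Parameter(torch.randn(n_members, rank, dim) * 0.01)
        self.B = nn.Parameter(torch.zeros(n_members, dim, rank))

    def forward(self, x, member):
        # Only the low-rank factors differ between ensemble members.
        delta = x @ self.A[member].t() @ self.B[member].t()
        return self.base(x) + delta
```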
arXiv Detail & Related papers (2024-05-23T11:10:32Z) - Hyperparameter Estimation for Sparse Bayesian Learning Models [1.0172874946490507]
Sparse Bayesian Learning (SBL) models are extensively used in signal processing and machine learning for promoting sparsity through hierarchical priors.
This paper presents a framework for improving SBL models under various objective functions.
A novel algorithm is introduced that shows enhanced efficiency, especially at low signal-to-noise ratios.
arXiv Detail & Related papers (2024-01-04T21:24:01Z) - Correlated Attention in Transformers for Multivariate Time Series [22.542109523780333]
We propose a novel correlated attention mechanism, which efficiently captures feature-wise dependencies, and can be seamlessly integrated within the encoder blocks of existing Transformers.
In particular, correlated attention operates across feature channels to compute cross-covariance matrices between queries and keys with different lag values, and selectively aggregates representations at the sub-series level.
This architecture facilitates automated discovery and representation learning of not only instantaneous but also lagged cross-correlations, while inherently capturing time series auto-correlation.
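A minimal sketch of the lagged, feature-wise idea, assuming inputs of shape (batch, time, features): cross-covariance attention is computed between queries and keys rolled by several lags, and the results are averaged. This is an illustration of capturing lagged cross-correlations, not the paper's exact operator.

```python
import torch
import torch.nn.functional as F

def correlated_attention(q, k, v, lags=(0, 1, 2)):
    """Feature-wise (cross-covariance) attention over several key lags.
    q, k, v: (batch, time, features)."""
    outputs = []
    for lag in lags:
        k_lag = torch.roll(k, shifts=lag, dims=1)                      # lag the keys in time
        cov = torch.einsum('btd,bte->bde', q, k_lag) / q.size(1)       # (batch, d, d) cross-covariance
        attn = F.softmax(cov, dim=-1)                                  # attention over feature channels
        outputs.append(torch.einsum('bde,bte->btd', attn, v))
    return torch.stack(outputs).mean(dim=0)
```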
arXiv Detail & Related papers (2023-11-20T17:35:44Z) - Perceiver-based CDF Modeling for Time Series Forecasting [25.26713741799865]
We propose a new architecture, called perceiver-CDF, for modeling cumulative distribution functions (CDF) of time series data.
Our approach combines the perceiver architecture with a copula-based attention mechanism tailored for multimodal time series prediction.
Experiments on the unimodal and multimodal benchmarks consistently demonstrate a 20% improvement over state-of-the-art methods.
arXiv Detail & Related papers (2023-10-03T01:13:17Z) - An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the need to explicitly compute the covariance matrices involved.
arXiv Detail & Related papers (2023-09-30T15:57:14Z) - Sparse Binary Transformers for Multivariate Time Series Modeling [1.3965477771846404]
We show that lightweight compressed neural networks can achieve accuracy comparable to dense floating-point Transformers.
Our model achieves favorable results across three time series learning tasks: classification, anomaly detection, and single-step forecasting.
We measure the computational savings of our approach over a range of metrics including parameter count, bit size, and floating-point operation (FLOP) count.
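As a rough illustration of sparse-binary weight compression (not the paper's exact scheme), the sketch below keeps only the largest-magnitude fraction of a weight matrix and replaces the surviving entries with a single scaled sign value; the sparsity level and scaling rule are assumptions.

```python
import torch

def sparse_binarize(weight, sparsity=0.9):
    """Keep the top (1 - sparsity) fraction of weights by magnitude and binarize them."""
    n_keep = max(1, int(weight.numel() * (1.0 - sparsity)))
    # Threshold equal to the n_keep-th largest magnitude.
    threshold = weight.abs().flatten().kthvalue(weight.numel() - n_keep + 1).values
    mask = (weight.abs() >= threshold).float()
    scale = (weight.abs() * mask).sum() / mask.sum().clamp(min=1)      # per-tensor scale
    return torch.sign(weight) * scale * mask
```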
arXiv Detail & Related papers (2023-08-09T00:23:04Z) - Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
Gait recognition in the wild is a more practical problem that has attracted the attention of the multimedia and computer vision communities.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z) - Adaptive Multi-Resolution Attention with Linear Complexity [18.64163036371161]
We propose a novel structure named Adaptive Multi-Resolution Attention (AdaMRA for short).
We leverage a multi-resolution multi-head attention mechanism, enabling attention heads to capture long-range contextual information in a coarse-to-fine fashion.
To facilitate adoption of AdaMRA by the scientific community, the code implementation will be made publicly available.
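The blurb only names the coarse-to-fine idea, so the sketch below is an illustration rather than AdaMRA's actual linear-complexity formulation: each head attends over keys and values average-pooled at a different temporal resolution. Shapes (batch, time, dim) and pool sizes are assumptions.

```python
import torch
import torch.nn.functional as F

def multi_resolution_attention(q, k, v, pool_sizes=(1, 2, 4)):
    """One head per pooling scale: coarser scales capture longer-range context.
    q, k, v: (batch, time, dim)."""
    heads = []
    for p in pool_sizes:
        k_p = F.avg_pool1d(k.transpose(1, 2), p).transpose(1, 2)       # (batch, time//p, dim)
        v_p = F.avg_pool1d(v.transpose(1, 2), p).transpose(1, 2)
        attn = torch.softmax(q @ k_p.transpose(1, 2) / k.size(-1) ** 0.5, dim=-1)
        heads.append(attn @ v_p)
    return torch.cat(heads, dim=-1)                                    # concatenate per-scale heads
```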
arXiv Detail & Related papers (2021-08-10T23:17:16Z) - Covert Model Poisoning Against Federated Learning: Algorithm Design and
Optimization [76.51980153902774]
Federated learning (FL) is vulnerable to external attacks on FL models during parameter transmission.
In this paper, we propose effective covert model poisoning (CMP) algorithms to combat state-of-the-art defensive aggregation mechanisms.
Our experimental results demonstrate that the proposed CMP algorithms are effective and substantially outperform existing attack mechanisms.
arXiv Detail & Related papers (2021-01-28T03:28:18Z) - Temporal Attention-Augmented Graph Convolutional Network for Efficient
Skeleton-Based Human Action Recognition [97.14064057840089]
Graph convolutional networks (GCNs) have been very successful in modeling non-Euclidean data structures.
Most GCN-based action recognition methods use deep feed-forward networks with high computational complexity to process all skeletons in an action.
We propose a temporal attention module (TAM) for increasing the efficiency in skeleton-based action recognition.
arXiv Detail & Related papers (2020-10-23T08:01:55Z) - Optimization-driven Machine Learning for Intelligent Reflecting Surfaces
Assisted Wireless Networks [82.33619654835348]
Intelligent reflecting surface (IRS) has been employed to reshape wireless channels by controlling the phase shifts of individual scattering elements.
Due to the large number of scattering elements, passive beamforming is typically challenged by high computational complexity.
In this article, we focus on machine learning (ML) approaches for improving performance in IRS-assisted wireless networks.
arXiv Detail & Related papers (2020-08-29T08:39:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.