FormerTime: Hierarchical Multi-Scale Representations for Multivariate
Time Series Classification
- URL: http://arxiv.org/abs/2302.09818v1
- Date: Mon, 20 Feb 2023 07:46:14 GMT
- Title: FormerTime: Hierarchical Multi-Scale Representations for Multivariate
Time Series Classification
- Authors: Mingyue Cheng, Qi Liu, Zhiding Liu, Zhi Li, Yucong Luo, Enhong Chen
- Abstract summary: FormerTime is a hierarchical representation model for improving classification capacity on the multivariate time series classification (MTSC) task.
It exhibits three merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strengths of both transformers and convolutional networks, and (3) tackling the efficiency challenges incurred by the self-attention mechanism.
- Score: 53.55504611255664
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning-based algorithms, e.g., convolutional networks, have
significantly advanced the multivariate time series classification (MTSC) task.
Nevertheless, they struggle to model long-range dependencies due to the local
nature of convolution operations. Recent advancements have shown the potential
of transformers to capture long-range dependencies. However, directly applying
transformers to the MTSC task incurs severe issues, such as fixed-scale
representations, temporal invariance, and quadratic time complexity, because of
the distinct properties of time series data. To tackle these issues, we propose
FormerTime, a hierarchical representation model for improving classification
capacity on the MTSC task. In the proposed FormerTime, we employ a hierarchical
network architecture to produce multi-scale feature maps. In addition, a novel
transformer encoder is designed, in which an efficient temporal reduction
attention layer and a well-informed contextual positional encoding generation
strategy are developed. To sum up, FormerTime exhibits three merits: (1)
learning hierarchical multi-scale representations from time series data, (2)
inheriting the strengths of both transformers and convolutional networks, and
(3) tackling the efficiency challenges incurred by the self-attention
mechanism. Extensive experiments on $10$ publicly available datasets from the
UEA archive verify the superiority of FormerTime over previous competitive
baselines.
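To make the architecture described in the abstract more concrete, here is a minimal PyTorch sketch of a hierarchical encoder with a temporal-reduction-style attention layer. It assumes that "temporal reduction attention" shortens the key/value sequence before self-attention and that each stage uses a strided convolution to build the next, coarser feature map; all class and parameter names are illustrative and do not reflect FormerTime's actual implementation.

```python
# Minimal sketch (PyTorch) of the two ideas named in the abstract, under the
# assumption that "temporal reduction attention" downsamples keys/values
# before self-attention and that the hierarchy is built by strided
# convolutional patching between stages. Names are illustrative only.
import torch
import torch.nn as nn


class TemporalReductionAttention(nn.Module):
    """Self-attention whose keys/values are temporally downsampled.

    Queries keep the full length T; keys/values are reduced to roughly T/r,
    so attention costs O(T * T/r) instead of O(T^2).
    """

    def __init__(self, d_model: int, n_heads: int, reduction: int):
        super().__init__()
        self.reduce = nn.Conv1d(d_model, d_model, kernel_size=reduction,
                                stride=reduction)  # assumed reduction operator
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, T, d_model)
        kv = self.reduce(x.transpose(1, 2)).transpose(1, 2)  # (B, T//r, d_model)
        out, _ = self.attn(query=x, key=kv, value=kv)
        return self.norm(x + out)  # residual connection


class HierarchicalEncoder(nn.Module):
    """Stacks stages that halve the temporal resolution and widen channels,
    yielding one feature map per scale of the input series."""

    def __init__(self, n_channels: int, dims=(64, 128, 256), n_heads: int = 4,
                 reduction: int = 4):
        super().__init__()
        self.stages = nn.ModuleList()
        in_dim = n_channels
        for d in dims:
            self.stages.append(nn.ModuleDict({
                # strided "patching" convolution builds the next, coarser scale
                "patch": nn.Conv1d(in_dim, d, kernel_size=3, stride=2, padding=1),
                "attn": TemporalReductionAttention(d, n_heads, reduction),
            }))
            in_dim = d

    def forward(self, x: torch.Tensor):  # x: (B, T, n_channels)
        feats = []
        h = x
        for stage in self.stages:
            h = stage["patch"](h.transpose(1, 2)).transpose(1, 2)
            h = stage["attn"](h)
            feats.append(h)  # one feature map per scale
        return feats
```

A classification head would typically pool the coarsest feature map (or pooled features from every scale) and pass the result through a linear layer.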
Related papers
- PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting [82.03373838627606]
The self-attention mechanism in the Transformer architecture requires positional embeddings to encode temporal order in time series prediction.
We argue that this reliance on positional embeddings restricts the Transformer's ability to effectively represent temporal sequences.
We present a model integrating PRE with a standard Transformer encoder, demonstrating state-of-the-art performance on various real-world datasets.
arXiv Detail & Related papers (2024-08-20T01:56:07Z)
- Rough Transformers: Lightweight Continuous-Time Sequence Modelling with Path Signatures [46.58170057001437]
We introduce the Rough Transformer, a variation of the Transformer model that operates on continuous-time representations of input sequences.
We find that, on a variety of time-series-related tasks, Rough Transformers consistently outperform their vanilla attention counterparts.
arXiv Detail & Related papers (2024-05-31T14:00:44Z)
- ConvTimeNet: A Deep Hierarchical Fully Convolutional Model for Multivariate Time Series Analysis [8.560776357590088]
ConvTimeNet is a novel deep hierarchical fully convolutional network designed to serve as a general-purpose model for time series analysis.
In experiments, ConvTimeNet outperformed strong baselines in most settings in terms of effectiveness.
arXiv Detail & Related papers (2024-03-03T12:05:49Z)
- Transformer-based Video Saliency Prediction with High Temporal Dimension Decoding [12.595019348741042]
We propose a transformer-based video saliency prediction approach with high temporal dimension network decoding (THTDNet).
This architecture yields comparable performance to multi-branch and over-complicated models on common benchmarks such as DHF1K, UCF-sports and Hollywood-2.
arXiv Detail & Related papers (2024-01-15T20:09:56Z)
- Rethinking Urban Mobility Prediction: A Super-Multivariate Time Series Forecasting Approach [71.67506068703314]
Long-term urban mobility predictions play a crucial role in the effective management of urban facilities and services.
Traditionally, urban mobility data has been structured as videos, treating longitude and latitude as fundamental pixels.
In our research, we introduce a fresh perspective on urban mobility prediction.
Instead of oversimplifying urban mobility data as traditional video data, we regard it as a complex time series.
arXiv Detail & Related papers (2023-12-04T07:39:05Z)
- TimeMAE: Self-Supervised Representations of Time Series with Decoupled Masked Autoencoders [55.00904795497786]
We propose TimeMAE, a novel self-supervised paradigm for learning transferrable time series representations based on transformer networks.
The TimeMAE learns enriched contextual representations of time series with a bidirectional encoding scheme.
To solve the discrepancy issue incurred by newly injected masked embeddings, we design a decoupled autoencoder architecture.
arXiv Detail & Related papers (2023-03-01T08:33:16Z)
- Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity, but their self-attention mechanism is computationally expensive.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
- Infomaxformer: Maximum Entropy Transformer for Long Time-Series Forecasting Problem [6.497816402045097]
The Transformer architecture yields state-of-the-art results in many tasks such as natural language processing (NLP) and computer vision (CV).
Despite this advanced capability, however, the quadratic time complexity and high memory usage prevent the Transformer from dealing with the long time-series forecasting problem.
We propose a method that combines the encoder-decoder architecture with seasonal-trend decomposition to capture more specific seasonal parts.
arXiv Detail & Related papers (2023-01-04T14:08:21Z)
- Cluster-Former: Clustering-based Sparse Transformer for Long-Range Dependency Encoding [90.77031668988661]
Cluster-Former is a novel clustering-based sparse Transformer that performs attention across chunked sequences.
The proposed framework is pivoted on two unique types of Transformer layer: Sliding-Window Layer and Cluster-Former Layer.
Experiments show that Cluster-Former achieves state-of-the-art performance on several major QA benchmarks.
arXiv Detail & Related papers (2020-09-13T22:09:30Z)
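As a rough illustration of the clustering-based sparse attention idea summarized in the Cluster-Former entry above, the sketch below groups hidden states by nearest-centroid assignment and computes full attention only within each group, so cost grows with cluster size rather than total sequence length. The fixed random centroids and the per-cluster Python loop are simplifications for readability, not the paper's actual procedure.

```python
# Rough illustration of clustering-based sparse attention: positions are
# assigned to their nearest centroid and attend only within their cluster.
# Frozen random centroids and the explicit loops are assumptions for brevity.
import torch
import torch.nn as nn


class ClusteredSparseAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int, n_clusters: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # fixed random centroids stand in for learned/updated ones
        self.register_buffer("centroids", torch.randn(n_clusters, d_model))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, T, d)
        B, T, d = x.shape
        out = torch.zeros_like(x)
        # assign every position to its nearest centroid: (B, T)
        assign = torch.cdist(x, self.centroids.expand(B, -1, -1)).argmin(dim=-1)
        for b in range(B):
            for c in range(self.centroids.size(0)):
                idx = (assign[b] == c).nonzero(as_tuple=True)[0]
                if idx.numel() == 0:
                    continue
                chunk = x[b, idx].unsqueeze(0)          # (1, |cluster|, d)
                attended, _ = self.attn(chunk, chunk, chunk)
                out[b, idx] = attended.squeeze(0)
        return out
```

In practice the centroids would be updated during training (e.g., in a k-means style) and the per-cluster attention batched for efficiency.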
This list is automatically generated from the titles and abstracts of the papers in this site.