TCCT: Tightly-Coupled Convolutional Transformer on Time Series Forecasting
- URL: http://arxiv.org/abs/2108.12784v1
- Date: Sun, 29 Aug 2021 08:49:31 GMT
- Title: TCCT: Tightly-Coupled Convolutional Transformer on Time Series Forecasting
- Authors: Li Shen and Yangzhu Wang
- Abstract summary: We propose the concept of the tightly-coupled convolutional Transformer (TCCT) and three TCCT architectures.
Our experiments on real-world datasets show that our TCCT architectures can greatly improve the performance of existing state-of-the-art Transformer models.
- Score: 6.393659160890665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time series forecasting is essential for a wide range of real-world
applications. Recent studies have shown the superiority of the Transformer in
dealing with such problems, especially long sequence time series input (LSTI)
and long sequence time series forecasting (LSTF) problems. To improve the
efficiency and enhance the locality of the Transformer, these studies combine
the Transformer with CNNs in varying degrees. However, their combinations are
loosely coupled and do not make full use of CNNs. To address this issue, we
propose the concept of the tightly-coupled convolutional Transformer (TCCT) and
three TCCT architectures that apply transformed CNN architectures to the
Transformer: (1) CSPAttention: by fusing CSPNet with the self-attention
mechanism, the computation cost of self-attention is reduced by 30% and its
memory usage by 50%, while achieving equivalent or better prediction accuracy.
(2) Dilated causal convolution: this method modifies the distilling operation
proposed by Informer by replacing canonical convolutional layers with dilated
causal convolutional layers to gain exponential receptive field growth.
(3) Passthrough mechanism: applying the passthrough mechanism to a stack of
self-attention blocks helps Transformer-like models obtain more fine-grained
information at negligible extra computation cost. Our experiments on real-world
datasets show that our TCCT architectures can greatly improve the performance
of existing state-of-the-art Transformer models on time series forecasting,
including the canonical Transformer, LogTrans and Informer, with much lower
computation and memory costs.
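
To make the first component more concrete, below is a minimal sketch of a CSPNet-style "partial" self-attention block in the spirit of CSPAttention, based only on the abstract: part of the model dimension is routed through self-attention while the rest bypasses it and is concatenated back. The split ratio, the fusing linear layer, and the use of `nn.MultiheadAttention` are illustrative assumptions, not the paper's exact design.

```python
# Sketch of a CSPNet-style partial self-attention block (CSPAttention-like).
# Split ratio, fusion layer, and attention implementation are assumptions.
import torch
import torch.nn as nn


class CSPSelfAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int = 4, split: float = 0.5):
        super().__init__()
        self.d_attn = int(d_model * split)       # channels routed through attention
        self.d_skip = d_model - self.d_attn      # channels that bypass attention
        # d_attn must be divisible by n_heads for MultiheadAttention.
        self.attn = nn.MultiheadAttention(self.d_attn, n_heads, batch_first=True)
        self.fuse = nn.Linear(d_model, d_model)  # re-mix the two partitions

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        x_attn, x_skip = x.split([self.d_attn, self.d_skip], dim=-1)
        y_attn, _ = self.attn(x_attn, x_attn, x_attn)  # attention over a channel subset
        return self.fuse(torch.cat([y_attn, x_skip], dim=-1))
```

Because only a fraction of the channels participates in attention, the quadratic attention cost and the attention memory footprint shrink proportionally, which matches the abstract's claim of reduced computation and memory.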
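
The second component replaces the canonical convolution in Informer's distilling step with a dilated causal convolution. The sketch below shows one way to do this; the kernel size, dilation, and the ELU/max-pool pairing are assumptions modeled on Informer's published distilling layer, not necessarily the paper's exact configuration.

```python
# Sketch of a distilling step using a dilated causal convolution.
# Hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DilatedCausalDistill(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 3, dilation: int = 2):
        super().__init__()
        # Left-only padding keeps the convolution causal: output at t sees only inputs <= t.
        self.left_pad = (kernel_size - 1) * dilation
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.act = nn.ELU()
        self.pool = nn.MaxPool1d(kernel_size=3, stride=2, padding=1)  # halves the sequence

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, channels) -> Conv1d expects (batch, channels, seq_len)
        x = x.transpose(1, 2)
        x = F.pad(x, (self.left_pad, 0))  # pad only on the past side
        x = self.pool(self.act(self.conv(x)))
        return x.transpose(1, 2)
```

Stacking such layers with increasing dilation grows the receptive field exponentially with depth rather than linearly, which is the property the abstract highlights.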
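
The third component feeds finer-grained feature maps from earlier self-attention blocks forward to the final, coarser one. Below is a minimal sketch of such a passthrough connection, assuming a YOLO-style fold of sequence length into channels followed by a projection; both choices are illustrative and may differ from the paper's design.

```python
# Sketch of a passthrough connection across a stack of self-attention blocks.
# The fold factor and the final projection are illustrative assumptions.
import torch
import torch.nn as nn


def passthrough_1d(feat: torch.Tensor, target_len: int) -> torch.Tensor:
    """Fold a (batch, seq_len, channels) map down to target_len by moving
    adjacent time steps into the channel dimension (1D space-to-depth)."""
    b, seq_len, c = feat.shape
    factor = seq_len // target_len
    return feat[:, : target_len * factor].reshape(b, target_len, c * factor)


class PassthroughStack(nn.Module):
    def __init__(self, blocks: nn.ModuleList, d_model: int):
        super().__init__()
        self.blocks = blocks              # each block may shorten the sequence
        self.proj = nn.LazyLinear(d_model)  # re-project the concatenated fine+coarse features

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = []
        for block in self.blocks:
            x = block(x)
            feats.append(x)
        target_len = feats[-1].size(1)
        fused = torch.cat([passthrough_1d(f, target_len) for f in feats], dim=-1)
        return self.proj(fused)
```

The extra cost is one concatenation and one linear projection, consistent with the abstract's claim that the passthrough mechanism adds fine-grained information at negligible overhead.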
Related papers
- PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting [82.03373838627606]
The self-attention mechanism in the Transformer architecture requires positional embeddings to encode temporal order in time series prediction.
We argue that this reliance on positional embeddings restricts the Transformer's ability to effectively represent temporal sequences.
We present a model integrating PRE with a standard Transformer encoder, demonstrating state-of-the-art performance on various real-world datasets.
arXiv Detail & Related papers (2024-08-20T01:56:07Z)
- PointMT: Efficient Point Cloud Analysis with Hybrid MLP-Transformer Architecture [46.266960248570086]
This study tackles the quadratic complexity of the self-attention mechanism by introducing a lower-complexity local attention mechanism for effective feature aggregation.
We also introduce a parameter-free channel temperature adaptation mechanism that adaptively adjusts the attention weight distribution in each channel.
We show that PointMT achieves performance comparable to state-of-the-art methods while maintaining an optimal balance between accuracy and efficiency.
arXiv Detail & Related papers (2024-08-10T10:16:03Z)
- CARD: Channel Aligned Robust Blend Transformer for Time Series Forecasting [50.23240107430597]
We design a special Transformer, i.e., Channel Aligned Robust Blend Transformer (CARD for short), that addresses key shortcomings of channel-independent (CI) Transformers in time series forecasting.
First, CARD introduces a channel-aligned attention structure that allows it to capture temporal correlations among signals.
Second, in order to efficiently utilize the multi-scale knowledge, we design a token blend module to generate tokens with different resolutions.
Third, we introduce a robust loss function for time series forecasting to alleviate the potential overfitting issue.
arXiv Detail & Related papers (2023-05-20T05:16:31Z)
- Full Stack Optimization of Transformer Inference: a Survey [58.55475772110702]
Transformer models achieve superior accuracy across a wide range of applications.
The amount of compute and bandwidth required for inference of recent Transformer models is growing at a significant rate.
There has been an increased focus on making Transformer models more efficient.
arXiv Detail & Related papers (2023-02-27T18:18:13Z)
- FormerTime: Hierarchical Multi-Scale Representations for Multivariate Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model for improving the classification capacity for the multivariate time series classification task.
It exhibits three aspects of merit: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strengths of both transformers and convolutional networks, and (3) tackling the efficiency challenges incurred by the self-attention mechanism.
arXiv Detail & Related papers (2023-02-20T07:46:14Z)
- Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity thanks to the self-attention mechanism, despite its high computational cost.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
- Infomaxformer: Maximum Entropy Transformer for Long Time-Series Forecasting Problem [6.497816402045097]
The Transformer architecture yields state-of-the-art results in many tasks such as natural language processing (NLP) and computer vision (CV).
Despite this advanced capability, however, the quadratic time complexity and high memory usage prevent the Transformer from dealing with long time-series forecasting problems.
We propose a method that combines the encoder-decoder architecture with seasonal-trend decomposition to capture more specific seasonal parts.
arXiv Detail & Related papers (2023-01-04T14:08:21Z)
- Transformers Solve the Limited Receptive Field for Monocular Depth Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper that applies transformers to pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z)