TCCT: Tightly-Coupled Convolutional Transformer on Time Series
Forecasting
- URL: http://arxiv.org/abs/2108.12784v1
- Date: Sun, 29 Aug 2021 08:49:31 GMT
- Title: TCCT: Tightly-Coupled Convolutional Transformer on Time Series
Forecasting
- Authors: Li Shen and Yangzhu Wang
- Abstract summary: We propose the concept of tightly-coupled convolutional Transformer(TCCT) and three TCCT architectures.
Our experiments on real-world datasets show that our TCCT architectures could greatly improve the performance of existing state-of-art Transformer models.
- Score: 6.393659160890665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time series forecasting is essential for a wide range of real-world
applications. Recent studies have shown the superiority of Transformer in
dealing with such problems, especially long sequence time series input(LSTI)
and long sequence time series forecasting(LSTF) problems. To improve the
efficiency and enhance the locality of Transformer, these studies combine
Transformer with CNN in varying degrees. However, their combinations are
loosely-coupled and do not make full use of CNN. To address this issue, we
propose the concept of tightly-coupled convolutional Transformer(TCCT) and
three TCCT architectures which apply transformed CNN architectures into
Transformer: (1) CSPAttention: through fusing CSPNet with self-attention
mechanism, the computation cost of self-attention mechanism is reduced by 30%
and the memory usage is reduced by 50% while achieving equivalent or beyond
prediction accuracy. (2) Dilated causal convolution: this method is to modify
the distilling operation proposed by Informer through replacing canonical
convolutional layers with dilated causal convolutional layers to gain
exponentially receptive field growth. (3) Passthrough mechanism: the
application of passthrough mechanism to stack of self-attention blocks helps
Transformer-like models get more fine-grained information with negligible extra
computation costs. Our experiments on real-world datasets show that our TCCT
architectures could greatly improve the performance of existing state-of-art
Transformer models on time series forecasting with much lower computation and
memory costs, including canonical Transformer, LogTrans and Informer.
Related papers
- iTransformer: Inverted Transformers Are Effective for Time Series Forecasting [62.40166958002558]
We propose iTransformer, which simply applies the attention and feed-forward network on the inverted dimensions.
The iTransformer model achieves state-of-the-art on challenging real-world datasets.
arXiv Detail & Related papers (2023-10-10T13:44:09Z) - CARD: Channel Aligned Robust Blend Transformer for Time Series
Forecasting [50.23240107430597]
We design a special Transformer, i.e., Channel Aligned Robust Blend Transformer (CARD for short), that addresses key shortcomings of CI type Transformer in time series forecasting.
First, CARD introduces a channel-aligned attention structure that allows it to capture both temporal correlations among signals.
Second, in order to efficiently utilize the multi-scale knowledge, we design a token blend module to generate tokens with different resolutions.
Third, we introduce a robust loss function for time series forecasting to alleviate the potential overfitting issue.
arXiv Detail & Related papers (2023-05-20T05:16:31Z) - Full Stack Optimization of Transformer Inference: a Survey [58.55475772110702]
Transformer models achieve superior accuracy across a wide range of applications.
The amount of compute and bandwidth required for inference of recent Transformer models is growing at a significant rate.
There has been an increased focus on making Transformer models more efficient.
arXiv Detail & Related papers (2023-02-27T18:18:13Z) - FormerTime: Hierarchical Multi-Scale Representations for Multivariate
Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model for improving the classification capacity for the multivariate time series classification task.
It exhibits three aspects of merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strength of both transformers and convolutional networks, and (3) tacking the efficiency challenges incurred by the self-attention mechanism.
arXiv Detail & Related papers (2023-02-20T07:46:14Z) - Towards Long-Term Time-Series Forecasting: Feature, Pattern, and
Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity because of the high computational self-attention mechanism.
We propose an efficient Transformerbased model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z) - Infomaxformer: Maximum Entropy Transformer for Long Time-Series
Forecasting Problem [6.497816402045097]
The Transformer architecture yields state-of-the-art results in many tasks such as natural language processing (NLP) and computer vision (CV)
With this advanced capability, however, the quadratic time complexity and high memory usage prevents the Transformer from dealing with long time-series forecasting problem.
We propose a method that combines the encoder-decoder architecture with seasonal-trend decomposition to capture more specific seasonal parts.
arXiv Detail & Related papers (2023-01-04T14:08:21Z) - Robust representations of oil wells' intervals via sparse attention
mechanism [2.604557228169423]
We introduce the class of efficient Transformers named Regularized Transformers (Reguformers)
The focus in our experiments is on oil&gas data, namely, well logs.
To evaluate our models for such problems, we work with an industry-scale open dataset consisting of well logs of more than 20 wells.
arXiv Detail & Related papers (2022-12-29T09:56:33Z) - Transformers Solve the Limited Receptive Field for Monocular Depth
Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper which applies transformers into pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z) - Informer: Beyond Efficient Transformer for Long Sequence Time-Series
Forecasting [25.417560221400347]
Long sequence time-series forecasting (LSTF) demands a high prediction capacity.
Recent studies have shown the potential of Transformer to increase the prediction capacity.
We design an efficient transformer-based model for LSTF, named Informer, with three distinctive characteristics.
arXiv Detail & Related papers (2020-12-14T11:43:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.