Learning to Embed Time Series Patches Independently
- URL: http://arxiv.org/abs/2312.16427v4
- Date: Thu, 2 May 2024 13:38:59 GMT
- Title: Learning to Embed Time Series Patches Independently
- Authors: Seunghan Lee, Taeyoung Park, Kibok Lee
- Abstract summary: Masked time series modeling has recently gained much attention as a self-supervised representation learning strategy for time series.
We argue that capturing dependencies between patches might not be an optimal strategy for time series representation learning.
We propose to use 1) the simple patch reconstruction task, which autoencodes each patch without looking at other patches, and 2) the simple patch-wise MLP that embeds each patch independently.
- Score: 5.752266579415516
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Masked time series modeling has recently gained much attention as a self-supervised representation learning strategy for time series. Inspired by masked image modeling in computer vision, recent works first patchify and partially mask out time series, and then train Transformers to capture the dependencies between patches by predicting masked patches from unmasked patches. However, we argue that capturing such patch dependencies might not be an optimal strategy for time series representation learning; rather, learning to embed patches independently results in better time series representations. Specifically, we propose to use 1) the simple patch reconstruction task, which autoencodes each patch without looking at other patches, and 2) the simple patch-wise MLP that embeds each patch independently. In addition, we introduce complementary contrastive learning to hierarchically capture adjacent time series information efficiently. Our proposed method improves time series forecasting and classification performance compared to state-of-the-art Transformer-based models, while it is more efficient in terms of the number of parameters and training/inference time. Code is available at this repository: https://github.com/seunghan96/pits.
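The two ingredients named in the abstract, patch reconstruction and a patch-wise MLP, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (their code is in the repository linked above); the class name `PatchMLPAutoencoder`, the helper `patchify`, the layer sizes, and the sine-wave input are all hypothetical choices made for illustration. Training and the complementary contrastive learning are omitted.

```python
import numpy as np


def patchify(series, patch_len):
    """Split a 1-D series into non-overlapping patches, dropping any ragged tail."""
    n_patches = len(series) // patch_len
    return series[: n_patches * patch_len].reshape(n_patches, patch_len)


class PatchMLPAutoencoder:
    """Illustrative patch-wise MLP: each patch is encoded and decoded on its own."""

    def __init__(self, patch_len, d_model, seed=0):
        rng = np.random.default_rng(seed)
        # Randomly initialised weights; a real model would learn these.
        self.W_enc = rng.normal(0.0, 0.1, (patch_len, d_model))
        self.b_enc = np.zeros(d_model)
        self.W_dec = rng.normal(0.0, 0.1, (d_model, patch_len))
        self.b_dec = np.zeros(patch_len)

    def embed(self, patches):
        # Row i of the output depends only on row i of the input:
        # there is no attention or mixing across patches.
        return np.maximum(patches @ self.W_enc + self.b_enc, 0.0)

    def reconstruct(self, patches):
        # Decode every embedding back to its own patch.
        return self.embed(patches) @ self.W_dec + self.b_dec

    def loss(self, patches):
        # Mean squared reconstruction error over all patches.
        return float(np.mean((self.reconstruct(patches) - patches) ** 2))


# Toy input: 96 time steps split into 8 patches of length 12.
series = np.sin(np.linspace(0.0, 8.0 * np.pi, 96))
patches = patchify(series, patch_len=12)

model = PatchMLPAutoencoder(patch_len=12, d_model=16)
z = model.embed(patches)

# Perturb one patch and re-embed: the other patches' embeddings are unaffected.
perturbed = patches.copy()
perturbed[3] += 10.0
z_perturbed = model.embed(perturbed)
```

In this sketch the embedding of patch 0 is the same before and after patch 3 is perturbed, which is the sense in which patches are embedded independently; a Transformer encoder with self-attention would not have this property.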
Related papers
- Downstream Task Guided Masking Learning in Masked Autoencoders Using Multi-Level Optimization [42.82742477950748]
Masked Autoencoder (MAE) is a notable method for self-supervised pretraining in visual representation learning.
We introduce the Multi-level Optimized Mask Autoencoder (MLO-MAE), a novel framework that learns an optimal masking strategy during pretraining.
Our experimental findings highlight MLO-MAE's significant advancements in visual representation learning.
arXiv Detail & Related papers (2024-02-28T07:37:26Z) - Multi-Patch Prediction: Adapting LLMs for Time Series Representation Learning [22.28251586213348]
aLLM4TS is an innovative framework that adapts Large Language Models (LLMs) for time-series representation learning.
A distinctive element of our framework is the patch-wise decoding layer, which departs from previous methods reliant on sequence-level decoding.
arXiv Detail & Related papers (2024-02-07T13:51:26Z) - Patch-CLIP: A Patch-Text Pre-Trained Model [6.838615442552715]
Patch representation learning has emerged as a necessary research direction for exploiting the capabilities of machine learning in software generation.
We introduce Patch-CLIP, a novel pre-training framework for patches and natural language text.
We show that Patch-CLIP sets new state-of-the-art performance, consistently outperforming prior work on metrics like BLEU, ROUGE-L, METEOR, and Recall.
arXiv Detail & Related papers (2023-10-19T14:00:19Z) - Learning to Mask and Permute Visual Tokens for Vision Transformer Pre-Training [59.923672191632065]
We propose a new self-supervised pre-training approach, named Masked and Permuted Vision Transformer (MaPeT).
MaPeT employs autoregressive and permuted predictions to capture intra-patch dependencies.
Our results demonstrate that MaPeT achieves competitive performance on ImageNet.
arXiv Detail & Related papers (2023-06-12T18:12:19Z) - PATS: Patch Area Transportation with Subdivision for Local Feature Matching [78.67559513308787]
Local feature matching aims at establishing sparse correspondences between a pair of images.
We propose Patch Area Transportation with Subdivision (PATS) to tackle this issue.
PATS improves both matching accuracy and coverage, and shows superior performance in downstream tasks.
arXiv Detail & Related papers (2023-03-14T08:28:36Z) - Improving Masked Autoencoders by Learning Where to Mask [65.89510231743692]
Masked image modeling is a promising self-supervised learning method for visual data.
We present AutoMAE, a framework that uses Gumbel-Softmax to interlink an adversarially-trained mask generator and a mask-guided image modeling process.
In our experiments, AutoMAE is shown to provide effective pretraining models on standard self-supervised benchmarks and downstream tasks.
arXiv Detail & Related papers (2023-03-12T05:28:55Z) - TimeMAE: Self-Supervised Representations of Time Series with Decoupled Masked Autoencoders [55.00904795497786]
We propose TimeMAE, a novel self-supervised paradigm for learning transferrable time series representations based on transformer networks.
TimeMAE learns enriched contextual representations of time series with a bidirectional encoding scheme.
To solve the discrepancy issue incurred by newly injected masked embeddings, we design a decoupled autoencoder architecture.
arXiv Detail & Related papers (2023-03-01T08:33:16Z) - Patch-level Representation Learning for Self-supervised Vision Transformers [68.8862419248863]
Vision Transformers (ViTs) have gained much attention recently as a better architectural choice, often outperforming convolutional networks for various visual tasks.
Inspired by this, we design a simple yet effective visual pretext task, coined SelfPatch, for learning better patch-level representations.
We demonstrate that SelfPatch can significantly improve the performance of existing SSL methods for various visual tasks.
arXiv Detail & Related papers (2022-06-16T08:01:19Z) - SWAT: Spatial Structure Within and Among Tokens [53.525469741515884]
We argue that models can have significant gains when spatial structure is preserved during tokenization.
We propose two key contributions: (1) Structure-aware Tokenization and (2) Structure-aware Mixing.
arXiv Detail & Related papers (2021-11-26T18:59:38Z) - SimPatch: A Nearest Neighbor Similarity Match between Image Patches [0.0]
We use large patches instead of relatively small patches so that each patch contains more information.
We use different feature extraction mechanisms to extract the features of each individual image patch, which together form a feature matrix.
For a query patch in a given image, the nearest patches are computed using two different nearest neighbor algorithms.
arXiv Detail & Related papers (2020-08-07T10:51:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.