Related papers: Learning to Embed Time Series Patches Independently

URL: http://arxiv.org/abs/2312.16427v4
Date: Thu, 2 May 2024 13:38:59 GMT
Title: Learning to Embed Time Series Patches Independently
Authors: Seunghan Lee, Taeyoung Park, Kibok Lee
Abstract summary: Masked time series modeling has recently gained much attention as a self-supervised representation learning strategy for time series. We argue that capturing dependencies between patches might not be an optimal strategy for time series representation learning. We propose to use 1) the simple patch reconstruction task, which autoencodes each patch without looking at other patches, and 2) the simple patch-wise MLP that embeds each patch independently.
Score: 5.752266579415516
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Masked time series modeling has recently gained much attention as a self-supervised representation learning strategy for time series. Inspired by masked image modeling in computer vision, recent works first patchify and partially mask out time series, and then train Transformers to capture the dependencies between patches by predicting masked patches from unmasked patches. However, we argue that capturing such patch dependencies might not be an optimal strategy for time series representation learning; rather, learning to embed patches independently results in better time series representations. Specifically, we propose to use 1) the simple patch reconstruction task, which autoencodes each patch without looking at other patches, and 2) the simple patch-wise MLP that embeds each patch independently. In addition, we introduce complementary contrastive learning to hierarchically capture adjacent time series information efficiently. Our proposed method improves time series forecasting and classification performance compared to state-of-the-art Transformer-based models, while it is more efficient in terms of the number of parameters and training/inference time. Code is available at this repository: https://github.com/seunghan96/pits.
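
Illustrative sketch: the abstract's two ingredients, per-patch reconstruction and a patch-wise MLP, can be illustrated with a few lines of NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation (which lives in the repository linked above). The names `patchify`, `PatchwiseMLP`, and `reconstruction_loss`, the non-overlapping patches, the single hidden layer with ReLU, the random untrained weights, and all sizes are hypothetical choices made for illustration. The final check shows the property the abstract emphasizes: changing one patch leaves the embeddings of all other patches untouched.

```python
import numpy as np


def patchify(series, patch_len):
    """Split a 1-D series into non-overlapping patches (any remainder is dropped)."""
    n_patches = len(series) // patch_len
    return series[: n_patches * patch_len].reshape(n_patches, patch_len)


class PatchwiseMLP:
    """Hypothetical patch-wise autoencoder: the same weights are applied to
    every patch separately, so no information flows between patches."""

    def __init__(self, patch_len, d_model, seed=0):
        rng = np.random.default_rng(seed)
        # Random, untrained weights; a real model would learn these.
        self.w_enc = rng.normal(scale=0.1, size=(patch_len, d_model))
        self.b_enc = np.zeros(d_model)
        self.w_dec = rng.normal(scale=0.1, size=(d_model, patch_len))
        self.b_dec = np.zeros(patch_len)

    def embed(self, patches):
        # Row i of the output depends only on row i of the input.
        return np.maximum(patches @ self.w_enc + self.b_enc, 0.0)

    def reconstruct(self, patches):
        # Each patch is reconstructed from its own embedding only.
        return self.embed(patches) @ self.w_dec + self.b_dec


def reconstruction_loss(model, patches):
    """Mean squared error between each patch and its own reconstruction."""
    return float(np.mean((model.reconstruct(patches) - patches) ** 2))


series = np.sin(np.linspace(0.0, 8.0 * np.pi, 96))
patches = patchify(series, patch_len=12)          # shape (8, 12)
model = PatchwiseMLP(patch_len=12, d_model=16)
z = model.embed(patches)                          # shape (8, 16)

# Perturb a single patch and confirm the other embeddings do not move.
perturbed = patches.copy()
perturbed[3] += 1.0
z_perturbed = model.embed(perturbed)
unchanged = np.allclose(np.delete(z, 3, axis=0),
                        np.delete(z_perturbed, 3, axis=0))
```

A Transformer encoder with self-attention over patches would not pass this independence check, since every output position attends to every input patch; that is the architectural contrast the abstract draws.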

Related papers

Enhancing Masked Time-Series Modeling via Dropping Patches [10.715930488118582]
This paper explores how to enhance existing masked time-series modeling by randomly dropping sub-sequence level patches of time series. The method named DropPatch is proposed, which improves the pre-training efficiency by a square-level advantage. It provides additional advantages for modeling in scenarios such as in-domain, cross-domain, few-shot learning and cold start.
arXiv Detail & Related papers (2024-12-19T17:21:34Z)
Downstream Task Guided Masking Learning in Masked Autoencoders Using Multi-Level Optimization [42.82742477950748]
Masked Autoencoder (MAE) is a notable method for self-supervised pretraining in visual representation learning. We introduce the Multi-level Optimized Mask Autoencoder (MLO-MAE), a novel framework that learns an optimal masking strategy during pretraining. Our experimental findings highlight MLO-MAE's significant advancements in visual representation learning.
arXiv Detail & Related papers (2024-02-28T07:37:26Z)
Multi-Patch Prediction: Adapting LLMs for Time Series Representation Learning [22.28251586213348]
aLLM4TS is an innovative framework that adapts Large Language Models (LLMs) for time-series representation learning. A distinctive element of our framework is the patch-wise decoding layer, which departs from previous methods reliant on sequence-level decoding.
arXiv Detail & Related papers (2024-02-07T13:51:26Z)
Patch-CLIP: A Patch-Text Pre-Trained Model [6.838615442552715]
Patch representation learning has emerged as a necessary research direction for exploiting the capabilities of machine learning in software generation. We introduce Patch-CLIP, a novel pre-training framework for patches and natural language text. We show that Patch-CLIP sets a new state of the art, consistently outperforming prior work on metrics like BLEU, ROUGE-L, METEOR, and Recall.
arXiv Detail & Related papers (2023-10-19T14:00:19Z)
Learning to Mask and Permute Visual Tokens for Vision Transformer Pre-Training [59.923672191632065]
We propose a new self-supervised pre-training approach, named Masked and Permuted Vision Transformer (MaPeT). MaPeT employs autoregressive and permuted predictions to capture intra-patch dependencies. Our results demonstrate that MaPeT achieves competitive performance on ImageNet.
arXiv Detail & Related papers (2023-06-12T18:12:19Z)
PATS: Patch Area Transportation with Subdivision for Local Feature Matching [78.67559513308787]
Local feature matching aims at establishing sparse correspondences between a pair of images. We propose Patch Area Transportation with Subdivision (PATS) to tackle this issue. PATS improves both matching accuracy and coverage, and shows superior performance in downstream tasks.
arXiv Detail & Related papers (2023-03-14T08:28:36Z)
Improving Masked Autoencoders by Learning Where to Mask [65.89510231743692]
Masked image modeling is a promising self-supervised learning method for visual data. We present AutoMAE, a framework that uses Gumbel-Softmax to interlink an adversarially-trained mask generator and a mask-guided image modeling process. In our experiments, AutoMAE is shown to provide effective pretraining models on standard self-supervised benchmarks and downstream tasks.
arXiv Detail & Related papers (2023-03-12T05:28:55Z)
TimeMAE: Self-Supervised Representations of Time Series with Decoupled Masked Autoencoders [55.00904795497786]
We propose TimeMAE, a novel self-supervised paradigm for learning transferable time series representations based on transformer networks. TimeMAE learns enriched contextual representations of time series with a bidirectional encoding scheme. To solve the discrepancy issue incurred by newly injected masked embeddings, we design a decoupled autoencoder architecture.
arXiv Detail & Related papers (2023-03-01T08:33:16Z)
Patch-level Representation Learning for Self-supervised Vision Transformers [68.8862419248863]
Vision Transformers (ViTs) have gained much attention recently as a better architectural choice, often outperforming convolutional networks for various visual tasks. Inspired by this, we design a simple yet effective visual pretext task, coined SelfPatch, for learning better patch-level representations. We demonstrate that SelfPatch can significantly improve the performance of existing SSL methods for various visual tasks.
arXiv Detail & Related papers (2022-06-16T08:01:19Z)
SWAT: Spatial Structure Within and Among Tokens [53.525469741515884]
We argue that models can have significant gains when spatial structure is preserved during tokenization. We propose two key contributions: (1) Structure-aware Tokenization and (2) Structure-aware Mixing.
arXiv Detail & Related papers (2021-11-26T18:59:38Z)
SimPatch: A Nearest Neighbor Similarity Match between Image Patches [0.0]
We try to use large patches instead of relatively small patches so that each patch contains more information. We use different feature extraction mechanisms to extract the features of each individual image patch, which together form a feature matrix. For a given image and query patch, the nearest patches are computed using two different nearest neighbor algorithms.
arXiv Detail & Related papers (2020-08-07T10:51:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.