Joint Embeddings Go Temporal
- URL: http://arxiv.org/abs/2509.25449v1
- Date: Mon, 29 Sep 2025 19:57:37 GMT
- Title: Joint Embeddings Go Temporal
- Authors: Sofiane Ennadir, Siavash Golkar, Leopoldo Sarra,
- Abstract summary: Joint-Embedding Predictive Architectures (JEPA) have been introduced with the aim of performing self-supervised learning in the latent space. Time Series JEPA (TS-JEPA) is an architecture specifically adapted for time series representation learning. We show that TS-JEPA can match or surpass current state-of-the-art baselines on several standard datasets.
- Score: 5.2741154046624255
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised learning has recently seen great success in unsupervised representation learning, enabling breakthroughs in natural language and image processing. However, these methods often rely on autoregressive and masked modeling, which aims to reproduce masked information in the input and can therefore be vulnerable to noise or confounding variables. To address this problem, Joint-Embedding Predictive Architectures (JEPA) have been introduced with the aim of performing self-supervised learning in the latent space. To leverage these advancements in the domain of time series, we introduce Time Series JEPA (TS-JEPA), an architecture specifically adapted for time series representation learning. We validate TS-JEPA on both classification and forecasting, showing that it can match or surpass current state-of-the-art baselines on several standard datasets. Notably, our approach demonstrates a strong performance balance across diverse tasks, indicating its potential as a robust foundation for learning general representations. This work thus lays the groundwork for developing future time series foundation models based on joint embeddings.
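To make the latent-space prediction idea concrete, here is a minimal, self-contained sketch of a JEPA-style training step for time series: a context encoder embeds the visible part of a series, an EMA-updated target encoder embeds the future segment with stop-gradient, and a predictor is trained to match the target latents. All module names, dimensions, and the mean-pooling choice are illustrative assumptions, not details taken from the TS-JEPA paper.

```python
# Hypothetical JEPA-style sketch for time series (not the paper's actual model).
import copy
import torch
import torch.nn as nn

class ToyJEPA(nn.Module):
    def __init__(self, dim=16, hidden=32):
        super().__init__()
        # Context encoder: maps the visible segment to latent vectors.
        self.encoder = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
        # Target encoder: an EMA copy of the context encoder, never backpropagated.
        self.target_encoder = copy.deepcopy(self.encoder)
        for p in self.target_encoder.parameters():
            p.requires_grad = False
        # Predictor: maps context latents to predicted target latents.
        self.predictor = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, hidden))

    def loss(self, context, target):
        # Mean-pool over time so both branches yield one latent per series.
        z_ctx = self.encoder(context).mean(dim=1)
        with torch.no_grad():  # stop-gradient on the target branch
            z_tgt = self.target_encoder(target).mean(dim=1)
        # Predict target latents from context latents; regress in latent space.
        return nn.functional.mse_loss(self.predictor(z_ctx), z_tgt)

    @torch.no_grad()
    def momentum_update(self, tau=0.99):
        # Exponential moving average update of the target encoder.
        for p, p_t in zip(self.encoder.parameters(),
                          self.target_encoder.parameters()):
            p_t.mul_(tau).add_((1 - tau) * p)

model = ToyJEPA()
x = torch.randn(8, 32, 16)        # (batch, time steps, channels)
ctx, tgt = x[:, :24], x[:, 24:]   # past as context, future as target
l = model.loss(ctx, tgt)
l.backward()
model.momentum_update()
```

Because the loss is computed between latent representations rather than raw values, the encoder is not forced to reconstruct noisy input details, which is the motivation the abstract gives for moving away from masked reconstruction.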
Related papers
- UTICA: Multi-Objective Self-Distillation Foundation Model Pretraining for Time Series Classification [5.071106490524274]
We adapt DINOv2-style self-distillation to pretrain a time series foundation model. We build on the Mantis tokenizer and transformer encoder architecture as our backbone. Our method achieves state-of-the-art classification performance on both the UCR and UEA benchmarks.
arXiv Detail & Related papers (2026-03-02T01:02:09Z) - UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines [64.84631333071728]
We introduce UniSTD, a unified Transformer-based framework for spatiotemporal modeling. Our work demonstrates that a task-specific vision-text model can build a generalizable model for spatiotemporal learning. We also introduce a temporal module to incorporate temporal dynamics explicitly.
arXiv Detail & Related papers (2025-03-26T17:33:23Z) - ACT-JEPA: Novel Joint-Embedding Predictive Architecture for Efficient Policy Representation Learning [90.41852663775086]
ACT-JEPA is a novel architecture that integrates imitation learning and self-supervised learning. We train a policy to predict action sequences and abstract observation sequences. Our experiments show that ACT-JEPA improves the quality of representations by learning temporal environment dynamics.
arXiv Detail & Related papers (2025-01-24T16:41:41Z) - Towards Generalisable Time Series Understanding Across Domains [10.350643783811174]
We introduce a novel pre-training paradigm specifically designed to handle time series heterogeneity. We propose a tokeniser with learnable domain signatures, a dual masking strategy, and a normalised cross-correlation loss. Our code and pre-trained weights are available at https://www.oetu.com/oetu/otis.
arXiv Detail & Related papers (2024-10-09T17:09:30Z) - LaT-PFN: A Joint Embedding Predictive Architecture for In-context Time-series Forecasting [0.0]
We introduce LatentTimePFN, a foundational Time Series model with a strong embedding space that enables zero-shot forecasting.
We perform in-context learning in latent space utilizing a novel integration of the Prior-data Fitted Networks (PFN) and Joint Embedding Predictive Architecture (JEPA) frameworks.
arXiv Detail & Related papers (2024-05-16T13:44:56Z) - TimeSiam: A Pre-Training Framework for Siamese Time-Series Modeling [67.02157180089573]
Time series pre-training has recently garnered wide attention for its potential to reduce labeling expenses and benefit various downstream tasks.
This paper proposes TimeSiam as a simple but effective self-supervised pre-training framework for Time series based on Siamese networks.
arXiv Detail & Related papers (2024-02-04T13:10:51Z) - Self-Distilled Representation Learning for Time Series [45.51976109748732]
Self-supervised learning for time-series data holds potential similar to that recently unleashed in Natural Language Processing and Computer Vision.
We propose a conceptually simple yet powerful non-contrastive approach, based on the data2vec self-distillation framework.
We demonstrate the competitiveness of our approach for classification and forecasting as downstream tasks, comparing with state-of-the-art self-supervised learning methods on the UCR and UEA archives as well as the ETT and Electricity datasets.
arXiv Detail & Related papers (2023-11-19T14:34:01Z) - TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting [24.834846119163885]
We propose a novel framework, TEMPO, that can effectively learn time series representations.
TEMPO expands the capability for dynamically modeling real-world temporal phenomena from data within diverse domains.
arXiv Detail & Related papers (2023-10-08T00:02:25Z) - Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition [88.34182299496074]
Action labels are available only on the source dataset and unavailable on the target dataset during training.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z) - Temporal Predictive Coding For Model-Based Planning In Latent Space [80.99554006174093]
We present an information-theoretic approach that employs temporal predictive coding to encode elements in the environment that can be predicted across time.
We evaluate our model on a challenging modification of standard DMControl tasks where the background is replaced with natural videos that contain complex but irrelevant information to the planning task.
arXiv Detail & Related papers (2021-06-14T04:31:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.