A theoretical framework for self-supervised contrastive learning for continuous dependent data
- URL: http://arxiv.org/abs/2506.09785v4
- Date: Tue, 30 Sep 2025 07:56:25 GMT
- Title: A theoretical framework for self-supervised contrastive learning for continuous dependent data
- Authors: Alexander Marusov, Aleksandr Yugay, Alexey Zaytsev
- Abstract summary: Self-supervised learning (SSL) has emerged as a powerful approach to learning representations, particularly in the field of computer vision. We propose a novel theoretical framework for contrastive SSL tailored to continuous dependent data, relaxing the usual assumption of semantic independence between samples. Specifically, we outperform TS2Vec on the standard UEA and UCR benchmarks, with accuracy improvements of 4.17% and 2.08%, respectively.
- Score: 79.62732169706054
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning (SSL) has emerged as a powerful approach to learning representations, particularly in the field of computer vision. However, its application to dependent data, such as temporal and spatio-temporal domains, remains underexplored. Moreover, traditional contrastive SSL methods often assume \emph{semantic independence between samples}, which does not hold for dependent data exhibiting complex correlations. We propose a novel theoretical framework for contrastive SSL tailored to \emph{continuous dependent data}, which allows the nearest samples to be semantically close to each other. In particular, we propose two possible \textit{ground truth similarity measures} between objects -- \emph{hard} and \emph{soft} closeness. Under these measures, we derive an analytical form for the \textit{estimated similarity matrix} that accommodates both types of closeness between samples, thereby introducing dependency-aware loss functions. We validate our approach, \emph{Dependent TS2Vec}, on temporal and spatio-temporal downstream problems. Given the dependency patterns present in the data, our approach surpasses modern methods for dependent data, highlighting the effectiveness of our theoretically grounded loss functions for SSL in capturing spatio-temporal dependencies. Specifically, we outperform TS2Vec on the standard UEA and UCR benchmarks, with accuracy improvements of $4.17$\% and $2.08$\%, respectively. Furthermore, on the drought classification task, which involves complex spatio-temporal patterns, our method achieves a $7$\% higher ROC-AUC score.
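The hard/soft closeness idea in the abstract can be illustrated with a minimal sketch. This is our own construction, not the paper's exact formulation: the soft ground-truth similarity below decays exponentially with temporal distance (an assumed form), and the loss is a soft-target InfoNCE over the in-batch similarity matrix. All function names and parameters are illustrative.

```python
import numpy as np

def soft_closeness(n, tau_close=2.0):
    # Hypothetical soft ground-truth similarity: weight decays with
    # temporal distance |i - j|; the paper's exact measure may differ.
    idx = np.arange(n)
    dist = np.abs(idx[:, None] - idx[None, :])
    return np.exp(-dist / tau_close)

def dependency_aware_loss(z, tau_sim=0.5, tau_close=2.0):
    # z: [n, d] batch of embeddings, ordered in time.
    n = z.shape[0]
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    logits = z @ z.T / tau_sim
    np.fill_diagonal(logits, -np.inf)          # exclude self-pairs
    log_p = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    np.fill_diagonal(log_p, 0.0)               # avoid 0 * (-inf) below
    w = soft_closeness(n, tau_close)
    np.fill_diagonal(w, 0.0)
    w = w / w.sum(axis=1, keepdims=True)       # row-normalised soft targets
    return -np.mean((w * log_p).sum(axis=1))   # soft cross-entropy
```

Setting `w` to a 0/1 mask over the k nearest time steps instead of the exponential decay would give a "hard closeness" variant of the same loss.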
Related papers
- Self-Supervised Learning from Structural Invariance [6.07374214141791]
We study the one-to-many mapping problem in joint-embedding self-supervised learning (SSL). We show that existing methods struggle to flexibly capture this conditional uncertainty. We empirically show its versatility in causal representation learning, fine-grained image understanding, and world modeling on videos.
arXiv Detail & Related papers (2026-02-02T17:44:44Z) - How Different from the Past? Spatio-Temporal Time Series Forecasting with Self-Supervised Deviation Learning [15.102926671713668]
We propose ST-SSDL, a spatio-temporal time series forecasting framework. It discretizes the latent space using learnable prototypes that represent typical spatio-temporal patterns. Experiments show that ST-SSDL consistently outperforms state-of-the-art baselines across multiple metrics.
arXiv Detail & Related papers (2025-10-06T15:21:13Z) - Hierarchical Self-Supervised Representation Learning for Depression Detection from Speech [51.14752758616364]
Speech-based depression detection (SDD) is a promising, non-invasive alternative to traditional clinical assessments. We propose HAREN-CTC, a novel architecture that integrates multi-layer SSL features using cross-attention within a multitask learning framework. The model achieves state-of-the-art macro F1-scores of 0.81 on DAIC-WOZ and 0.82 on MODMA, outperforming prior methods across both evaluation scenarios.
arXiv Detail & Related papers (2025-10-05T09:32:12Z) - Learning Time-Series Representations by Hierarchical Uniformity-Tolerance Latent Balancing [31.568247637126035]
TimeHUT is a novel method for learning time-series representations by hierarchical uniformity-tolerance balancing of contrastive representations. Our method uses two distinct losses to learn strong representations, with the aim of striking an effective balance between uniformity and tolerance in the embedding space.
arXiv Detail & Related papers (2025-10-02T04:30:13Z) - Causal Discovery in Multivariate Time Series through Mutual Information Featurization [0.1657441317977376]
Temporal Dependency to Causality (TD2C) learns to recognize complex causal signatures from a rich set of information-theoretic and statistical descriptors. Our results show that TD2C achieves state-of-the-art performance, consistently outperforming established methods.
arXiv Detail & Related papers (2025-08-03T17:03:13Z) - Diffeomorphic Temporal Alignment Nets for Time-series Joint Alignment and Averaging [8.14908648005543]
In time-series analysis, nonlinear temporal misalignment remains a pivotal challenge that forestalls even simple averaging. DTAN predicts and applies diffeomorphic transformations in an input-dependent manner, thus facilitating the joint alignment (JA) and averaging of time-series ensembles. We extend our framework to incorporate multi-task learning (MT-DTAN), enabling simultaneous time-series alignment and classification.
arXiv Detail & Related papers (2025-02-10T15:55:08Z) - STD-PLM: Understanding Both Spatial and Temporal Properties of Spatial-Temporal Data with PLM [18.56267873980915]
STD-PLM is capable of implementing both spatial-temporal forecasting and imputation tasks. STD-PLM understands spatial-temporal correlations via explicitly designed spatial and temporal tokenizers. STD-PLM exhibits competitive performance and generalization capabilities across the forecasting and imputation tasks.
arXiv Detail & Related papers (2024-07-12T08:48:16Z) - Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion [57.232688209606515]
We present HTCL, a novel Hierarchical Temporal Context Learning paradigm for improving camera-based semantic scene completion.
Our method ranks 1st on the SemanticKITTI benchmark and even surpasses LiDAR-based methods in terms of mIoU.
arXiv Detail & Related papers (2024-07-02T09:11:17Z) - On the Identification of Temporally Causal Representation with Instantaneous Dependence [50.14432597910128]
Temporally causal representation learning aims to identify the latent causal process from time series observations.
Most methods require the assumption that the latent causal processes do not have instantaneous relations.
We propose IDOL, an IDentification framework for instantaneOus Latent dynamics.
arXiv Detail & Related papers (2024-05-24T08:08:05Z) - OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning [67.07363529640784]
We propose OpenSTL to categorize prevalent approaches into recurrent-based and recurrent-free models.
We conduct standard evaluations on datasets across various domains, including synthetic moving object trajectory, human motion, driving scenes, traffic flow and forecasting weather.
We find that recurrent-free models achieve a better balance between efficiency and performance than recurrent models.
arXiv Detail & Related papers (2023-06-20T03:02:14Z) - RelationMatch: Matching In-batch Relationships for Semi-supervised Learning [11.423755495373907]
Semi-supervised learning has emerged as a pivotal approach for leveraging scarce labeled data alongside abundant unlabeled data. We present RelationMatch, a novel SSL framework that explicitly enforces in-batch relational consistency through a Matrix Cross-Entropy (MCE) loss function.
arXiv Detail & Related papers (2023-05-17T17:37:48Z) - Spatiotemporal Self-supervised Learning for Point Clouds in the Wild [65.56679416475943]
We introduce an SSL strategy that leverages positive pairs in both the spatial and temporal domains.
We demonstrate the benefits of our approach via extensive experiments performed by self-supervised training on two large-scale LiDAR datasets.
arXiv Detail & Related papers (2023-03-28T18:06:22Z) - The Challenges of Continuous Self-Supervised Learning [40.941767578622745]
Self-supervised learning (SSL) aims to eliminate one of the major bottlenecks in representation learning - the need for human annotations.
We show that a direct application of current methods to such a continuous setup is inefficient both computationally and in the amount of data required.
We propose the use of replay buffers as an approach to alleviate the issues of inefficiency and temporal correlations.
arXiv Detail & Related papers (2022-03-23T20:05:06Z) - Interpretable Time-series Representation Learning With Multi-Level Disentanglement [56.38489708031278]
Disentangle Time Series (DTS) is a novel disentanglement enhancement framework for sequential data.
DTS generates hierarchical semantic concepts as the interpretable and disentangled representation of time-series.
DTS achieves superior performance in downstream applications, with high interpretability of semantic concepts.
arXiv Detail & Related papers (2021-05-17T22:02:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.