Unsupervised Multi-modal Feature Alignment for Time Series
Representation Learning
- URL: http://arxiv.org/abs/2312.05698v1
- Date: Sat, 9 Dec 2023 22:31:20 GMT
- Title: Unsupervised Multi-modal Feature Alignment for Time Series
Representation Learning
- Authors: Chen Liang, Donghua Yang, Zhiyu Liang, Hongzhi Wang, Zheng Liang,
Xiyang Zhang, Jianfeng Huang
- Abstract summary: We introduce an innovative approach that focuses on aligning and binding time series representations encoded from different modalities.
In contrast to conventional methods that fuse features from multiple modalities, our proposed approach simplifies the neural architecture by retaining a single time series encoder.
Our approach outperforms existing state-of-the-art URL methods across diverse downstream tasks.
- Score: 20.655943795843037
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent times, the field of unsupervised representation learning (URL) for
time series data has garnered significant interest due to its remarkable
adaptability across diverse downstream applications. Because unsupervised
learning objectives differ from those of downstream tasks, focusing solely on
temporal feature characterization makes it difficult to guarantee downstream
utility. To bridge this gap, researchers have proposed multiple transformations
that extract the discriminative patterns implied in informative time series.
Despite the variety of feature engineering techniques introduced (e.g.,
spectral-domain features, wavelet-transformed features, image-form features,
and symbolic features), reliance on intricate feature fusion methods and on
heterogeneous features at inference time hampers the scalability of these
solutions. To address
this, our study introduces an innovative approach that focuses on aligning and
binding time series representations encoded from different modalities, inspired
by spectral graph theory, thereby guiding the neural encoder to uncover latent
pattern associations among these multi-modal features. In contrast to
conventional methods that fuse features from multiple modalities, our proposed
approach simplifies the neural architecture by retaining a single time series
encoder, thereby preserving scalability. We further demonstrate and prove
mechanisms that allow the encoder to maintain a better inductive bias. In our
experimental evaluation, we validate the proposed method on a diverse set of
time series datasets from various domains, where it outperforms existing
state-of-the-art URL methods across diverse downstream tasks.
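The core idea — encoding the raw series with a single encoder and aligning that encoding to representations of other modalities (e.g., a spectral view) — can be sketched as follows. This is a minimal illustrative sketch only: the linear "encoders", the FFT-magnitude second modality, and the cosine-alignment objective are assumptions for exposition, not the paper's actual formulation.

```python
import numpy as np

def spectral_view(x):
    """Hypothetical second modality: magnitude spectrum of the series."""
    return np.abs(np.fft.rfft(x))

def encode(x, W):
    """Stand-in linear 'encoder' producing a fixed-size embedding."""
    return np.tanh(W @ x)

def alignment_loss(z_time, z_spec):
    """Negative cosine similarity: pulls the two views' embeddings together."""
    z1 = z_time / np.linalg.norm(z_time)
    z2 = z_spec / np.linalg.norm(z_spec)
    return 1.0 - float(z1 @ z2)

rng = np.random.default_rng(0)
x = rng.standard_normal(64)           # one univariate series of length 64
s = spectral_view(x)                  # 33-dim spectral features (rfft of 64)
W_t = rng.standard_normal((16, 64))   # time-domain encoder weights
W_s = rng.standard_normal((16, 33))   # projection for the spectral view
loss = alignment_loss(encode(x, W_t), encode(s, W_s))
print(round(loss, 4))                 # alignment loss in [0, 2]; minimized during training
```

At inference only the time-series encoder would be kept, which is what lets the approach avoid the multi-branch fusion architectures it contrasts itself with.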
Related papers
- DRFormer: Multi-Scale Transformer Utilizing Diverse Receptive Fields for Long Time-Series Forecasting [3.420673126033772]
We propose a dynamic tokenizer with a dynamic sparse learning algorithm to capture diverse receptive fields and sparse patterns of time series data.
Our proposed model, named DRFormer, is evaluated on various real-world datasets, and experimental results demonstrate its superiority compared to existing methods.
arXiv Detail & Related papers (2024-08-05T07:26:47Z)
- HIERVAR: A Hierarchical Feature Selection Method for Time Series Analysis [22.285570102169356]
Time series classification stands as a pivotal and intricate challenge across various domains.
We propose a novel hierarchical feature selection method aided by ANOVA variance analysis.
Our method substantially reduces features by over 94% while preserving accuracy.
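The ANOVA scoring that this kind of feature selection builds on can be illustrated in a few lines; the hierarchical part of HIERVAR is not reproduced here, and the data, dimensions, and top-5 cutoff are arbitrary assumptions for the sketch.

```python
import numpy as np

def anova_f_scores(X, y):
    """One-way ANOVA F-statistic per feature (larger = more class-dependent)."""
    classes = np.unique(y)
    grand_mean = X.mean(axis=0)
    ss_between = np.zeros(X.shape[1])
    ss_within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        ss_between += len(Xc) * (Xc.mean(axis=0) - grand_mean) ** 2
        ss_within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    df_between = len(classes) - 1
    df_within = len(y) - len(classes)
    return (ss_between / df_between) / (ss_within / df_within)

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 50))   # 200 samples, 50 candidate features
y = rng.integers(0, 2, size=200)     # binary class labels
X[:, 0] += 3.0 * y                   # make feature 0 strongly class-dependent
scores = anova_f_scores(X, y)
top5 = np.argsort(scores)[::-1][:5]  # keep only the highest-scoring features
```

Ranking features by F-score and keeping a small prefix is what allows the reported >94% feature reduction while preserving the class-discriminative signal.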
arXiv Detail & Related papers (2024-07-22T20:55:13Z)
- Asynchronous Multimodal Video Sequence Fusion via Learning Modality-Exclusive and -Agnostic Representations [19.731611716111566]
We propose a Multimodal fusion approach for learning modality-Exclusive and modality-Agnostic representations.
We introduce a predictive self-attention module to capture reliable context dynamics within modalities.
A hierarchical cross-modal attention module is designed to explore valuable element correlations among modalities.
A double-discriminator strategy is presented to ensure the production of distinct representations in an adversarial manner.
arXiv Detail & Related papers (2024-07-06T04:36:48Z)
- UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series Forecasting [59.11817101030137]
This research advocates for a unified model paradigm that transcends domain boundaries.
Learning an effective cross-domain model presents the following challenges.
We propose UniTime for effective cross-domain time series learning.
arXiv Detail & Related papers (2023-10-15T06:30:22Z)
- A Shapelet-based Framework for Unsupervised Multivariate Time Series Representation Learning [29.511632089649552]
We propose a novel URL framework for multivariate time series by learning time-series-specific shapelet-based representation.
To the best of our knowledge, this is the first work that explores the shapelet-based embedding in the unsupervised general-purpose representation learning.
A unified shapelet-based encoder and a novel learning objective with multi-grained contrasting and multi-scale alignment are particularly designed to achieve our goal.
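The basic shapelet-transform feature underlying such an encoder is the minimum distance between a candidate subsequence and any window of a series. A toy sketch (the shapelet values and series here are invented for illustration, not taken from the paper):

```python
import numpy as np

def shapelet_distance(x, shapelet):
    """Minimum Euclidean distance between the shapelet and any window of x:
    the basic feature computed by a shapelet transform."""
    L = len(shapelet)
    return min(
        float(np.linalg.norm(x[i:i + L] - shapelet))
        for i in range(len(x) - L + 1)
    )

shapelet = np.array([0.0, 1.0, 0.0, -1.0, 0.0])  # a candidate discriminative shape
rng = np.random.default_rng(2)
x_with = rng.standard_normal(100)
x_with[40:45] = shapelet                         # embed the shape verbatim
d = shapelet_distance(x_with, shapelet)
print(d)  # 0.0: the shapelet occurs exactly somewhere in the series
```

A vector of such distances against a bank of learned shapelets gives a fixed-size, phase-invariant representation of a variable-length series.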
arXiv Detail & Related papers (2023-05-30T09:31:57Z)
- Gait Recognition in the Wild with Multi-hop Temporal Switch [81.35245014397759]
Gait recognition in the wild is a more practical problem that has attracted attention from the multimedia and computer vision communities.
This paper presents a novel multi-hop temporal switch method to achieve effective temporal modeling of gait patterns in real-world scenes.
arXiv Detail & Related papers (2022-09-01T10:46:09Z)
- An Unsupervised Short- and Long-Term Mask Representation for Multivariate Time Series Anomaly Detection [2.387411589813086]
This paper proposes an anomaly detection method based on unsupervised Short- and Long-term Mask Representation learning (SLMR).
Experiments show that the performance of our method outperforms other state-of-the-art models on three real-world datasets.
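The mask-and-reconstruct idea behind such detectors — hide a segment, reconstruct it from context, and flag segments that reconstruct poorly — can be sketched with a deliberately crude reconstructor. Everything here (linear interpolation in place of a learned model, the window size, the injected spike) is a toy assumption, not the SLMR architecture.

```python
import numpy as np

def masked_reconstruction_scores(x, window=8):
    """Mask each window, 'reconstruct' it by linear interpolation from its
    boundary values, and score the window by its reconstruction error."""
    scores = []
    for start in range(0, len(x) - window, window):
        left = x[start - 1] if start > 0 else x[start]
        right = x[start + window]
        recon = np.linspace(left, right, window + 2)[1:-1]
        scores.append(float(np.mean((x[start:start + window] - recon) ** 2)))
    return np.array(scores)

t = np.linspace(0, 4 * np.pi, 128)
x = np.sin(t)                 # smooth series reconstructs well everywhere
x[60:68] += 5.0               # inject an anomalous spike
scores = masked_reconstruction_scores(x)
print(int(np.argmax(scores))) # index of the most anomalous window
```

A learned short- and long-term representation replaces the interpolation step in practice, but the anomaly score is still a reconstruction error over masked spans.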
arXiv Detail & Related papers (2022-08-19T09:34:11Z)
- Consistency and Diversity induced Human Motion Segmentation [231.36289425663702]
We propose a novel Consistency and Diversity induced human Motion (CDMS) algorithm.
Our model factorizes the source and target data into distinct multi-layer feature spaces.
A multi-mutual learning strategy is carried out to reduce the domain gap between the source and target data.
arXiv Detail & Related papers (2022-02-10T06:23:56Z)
- Self-Attention Neural Bag-of-Features [103.70855797025689]
We build on the recently introduced 2D-Attention and reformulate the attention learning methodology.
We propose a joint feature-temporal attention mechanism that learns a joint 2D attention mask highlighting relevant information.
arXiv Detail & Related papers (2022-01-26T17:54:14Z)
- Efficient Modelling Across Time of Human Actions and Interactions [92.39082696657874]
We argue that the current fixed-size temporal kernels in 3D convolutional neural networks (3D CNNs) can be improved to better deal with temporal variations in the input.
We study how to better distinguish between classes of actions by enhancing their feature differences across different layers of the architecture.
The proposed approaches are evaluated on several benchmark action recognition datasets and show competitive results.
arXiv Detail & Related papers (2021-10-05T15:39:11Z)
- CCVS: Context-aware Controllable Video Synthesis [95.22008742695772]
This paper introduces a self-supervised learning approach to the synthesis of new video clips from old ones.
It conditions the synthesis process on contextual information for temporal continuity and ancillary information for fine control.
arXiv Detail & Related papers (2021-07-16T17:57:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.