Related papers: How Can Time Series Analysis Benefit From Multiple Modalities? A Survey and Outlook

URL: http://arxiv.org/abs/2503.11835v4
Date: Thu, 02 Oct 2025 01:27:21 GMT
Title: How Can Time Series Analysis Benefit From Multiple Modalities? A Survey and Outlook
Authors: Haoxin Liu, Harshavardhan Kamarthi, Zhiyuan Zhao, Shangqing Xu, Shiyu Wang, Qingsong Wen, Tom Hartvigsen, Fei Wang, B. Aditya Prakash
Abstract summary: Time series analysis (TSA) is a longstanding research topic in the data mining community and has wide real-world significance. Recent TSA works have formed a new research field, i.e., Multiple Modalities for TSA (MM4TSA). This survey is the first to offer a comprehensive review and a detailed outlook for this emerging field.
Score: 50.94159389998148
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Time series analysis (TSA) is a longstanding research topic in the data mining community and has wide real-world significance. Compared to "richer" modalities such as language and vision, which have recently experienced explosive development and are densely connected, the time-series modality remains relatively underexplored and isolated. We notice that many recent TSA works have formed a new research field, i.e., Multiple Modalities for TSA (MM4TSA). In general, these MM4TSA works follow a common motivation: how TSA can benefit from multiple modalities. This survey is the first to offer a comprehensive review and a detailed outlook for this emerging field. Specifically, we systematically discuss three benefits: (1) reusing foundation models of other modalities for efficient TSA, (2) multimodal extension for enhanced TSA, and (3) cross-modality interaction for advanced TSA. We further group the works by the introduced modality type, including text, images, audio, tables, and others, within each perspective. Finally, we identify the gaps with future opportunities, including reused modality selection, heterogeneous modality combinations, and unseen task generalization, corresponding to the three benefits. We release an up-to-date GitHub repository that includes key papers and resources.

Related papers

DimABSA: Building Multilingual and Multidomain Datasets for Dimensional Aspect-Based Sentiment Analysis [57.70022214686838]
DimABSA is the first multilingual, dimensional ABSA resource annotated with both traditional ABSA elements and VA scores. This resource contains 76,958 aspect instances across 42,590 sentences, spanning six languages and four domains.
arXiv Detail & Related papers (2026-01-30T14:30:35Z)
Unlocking Financial Insights: An advanced Multimodal Summarization with Multimodal Output Framework for Financial Advisory Videos [11.550322270589952]
FASTER (Financial Advisory Summariser with Textual Embedded Relevant images) is a framework that produces optimized, concise summaries. FASTER employs BLIP for semantic visual descriptions, OCR for textual patterns, and Whisper-based transcription with Speaker diarization as BOS features. A modified Direct Preference Optimization (DPO)-based loss function, equipped with BOS-specific fact-checking, ensures precision, relevance, and factual consistency.
arXiv Detail & Related papers (2025-09-25T09:54:19Z)
Multimodality Helps Few-shot 3D Point Cloud Semantic Segmentation [61.91492500828508]
Few-shot 3D point cloud segmentation (FS-PCS) aims at generalizing models to segment novel categories with minimal support samples. We introduce a multimodal FS-PCS setup, utilizing textual labels and the potentially available 2D image modality. We propose a simple yet effective Test-time Adaptive Cross-modal (TACC) technique to mitigate training bias.
arXiv Detail & Related papers (2024-10-29T19:28:41Z)
PanoSent: A Panoptic Sextuple Extraction Benchmark for Multimodal Conversational Aspect-based Sentiment Analysis [74.41260927676747]
This paper bridges the gaps by introducing multimodal conversational Aspect-based Sentiment Analysis (ABSA). To benchmark the tasks, we construct PanoSent, a dataset annotated both manually and automatically, featuring high quality, large scale, multimodality, multilingualism, multi-scenarios, and covering both implicit and explicit sentiment elements. To effectively address the tasks, we devise a novel Chain-of-Sentiment reasoning framework, together with a novel multimodal large language model (namely Sentica) and a paraphrase-based verification mechanism.
arXiv Detail & Related papers (2024-08-18T13:51:01Z)
Heuristic-enhanced Candidates Selection strategy for GPTs tackle Few-Shot Aspect-Based Sentiment Analysis [1.5020330976600738]
The paper designs a Heuristic-enhanced Candidates Selection strategy and further proposes an All in One (AiO) model based on it. The model works in two stages, simultaneously accommodating the accuracy of PLMs and the capability for generalization. The experimental results demonstrate that the proposed model can better adapt to multiple sub-tasks, and also outperforms the methods that directly utilize GPTs.
arXiv Detail & Related papers (2024-04-09T07:02:14Z)
RethinkingTMSC: An Empirical Study for Target-Oriented Multimodal Sentiment Classification [70.9087014537896]
Target-oriented Multimodal Sentiment Classification (TMSC) has gained significant attention among scholars. To investigate the causes of this problem, we perform extensive empirical evaluation and in-depth analysis of the datasets.
arXiv Detail & Related papers (2023-10-14T14:52:37Z)
Exploring Progress in Multivariate Time Series Forecasting: Comprehensive Benchmarking and Heterogeneity Analysis [70.78170766633039]
We address the need for means of assessing MTS forecasting proposals reliably and fairly. BasicTS+ is a benchmark designed to enable fair, comprehensive, and reproducible comparison of MTS forecasting solutions. We apply BasicTS+ along with rich datasets to assess the capabilities of more than 45 MTS forecasting solutions.
arXiv Detail & Related papers (2023-10-09T19:52:22Z)
TSA-Net: Tube Self-Attention Network for Action Quality Assessment [4.220843694492582]
We propose a Tube Self-Attention Network (TSA-Net) for action quality assessment (AQA). TSA-Net has the following merits: 1) high computational efficiency, 2) high flexibility, and 3) state-of-the-art performance.
arXiv Detail & Related papers (2022-01-11T02:25:27Z)
Transformer-based Multi-Aspect Modeling for Multi-Aspect Multi-Sentiment Analysis [56.893393134328996]
We propose a novel Transformer-based Multi-aspect Modeling scheme (TMM), which can capture potential relations between multiple aspects and simultaneously detect the sentiment of all aspects in a sentence. Our method achieves noticeable improvements compared with strong baselines such as BERT and RoBERTa.
arXiv Detail & Related papers (2020-11-01T11:06:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.