On the Feasibility of Vision-Language Models for Time-Series Classification
- URL: http://arxiv.org/abs/2412.17304v2
- Date: Fri, 17 Jan 2025 23:02:52 GMT
- Title: On the Feasibility of Vision-Language Models for Time-Series Classification
- Authors: Vinay Prithyani, Mohsin Mohammed, Richa Gadgil, Ricardo Buitrago, Vinija Jain, Aman Chadha,
- Abstract summary: We build upon time-series classification by leveraging the capabilities of Vision Language Models (VLMs)
We develop a novel approach that incorporates graphical data representations as images in conjunction with numerical data.
- Score: 0.7421845364041001
- License:
- Abstract: We build upon time-series classification by leveraging the capabilities of Vision Language Models (VLMs). We find that VLMs produce competitive results after two or less epochs of fine-tuning. We develop a novel approach that incorporates graphical data representations as images in conjunction with numerical data. This approach is rooted in the hypothesis that graphical representations can provide additional contextual information that numerical data alone may not capture. Additionally, providing a graphical representation can circumvent issues such as limited context length faced by LLMs. To further advance this work, we implemented a scalable end-to-end pipeline for training on different scenarios, allowing us to isolate the most effective strategies for transferring learning capabilities from LLMs to Time Series Classification (TSC) tasks. Our approach works with univariate and multivariate time-series data. In addition, we conduct extensive and practical experiments to show how this approach works for time-series classification and generative labels.
Related papers
- Large Language Models are Few-shot Multivariate Time Series Classifiers [23.045734479292356]
Large Language Models (LLMs) have been extensively applied in time series analysis.
Yet, their utility in the few-shot classification (i.e., a crucial training scenario) is underexplored.
We aim to leverage the extensive pre-trained knowledge in LLMs to overcome the data scarcity problem.
arXiv Detail & Related papers (2025-01-30T03:59:59Z) - Integrate Temporal Graph Learning into LLM-based Temporal Knowledge Graph Model [48.15492235240126]
Temporal Knowledge Graph Forecasting aims to predict future events based on the observed events in history.
Existing methods have integrated retrieved historical facts or static graph representations into Large Language Models (LLMs)
We propose a novel framework TGL-LLM to integrate temporal graph learning into LLM-based temporal knowledge graph model.
arXiv Detail & Related papers (2025-01-21T06:12:49Z) - Vision Language Models are In-Context Value Learners [89.29486557646624]
We present Generative Value Learning (GVL), a universal value function estimator that leverages the world knowledge embedded in vision-language models (VLMs) to predict task progress.
Without any robot or task specific training, GVL can in-context zero-shot and few-shot predict effective values for more than 300 distinct real-world tasks.
arXiv Detail & Related papers (2024-11-07T09:17:50Z) - Hierarchical Multimodal LLMs with Semantic Space Alignment for Enhanced Time Series Classification [4.5939667818289385]
HiTime is a hierarchical multi-modal model that seamlessly integrates temporal information into large language models.
Our findings highlight the potential of integrating temporal features into LLMs, paving the way for advanced time series analysis.
arXiv Detail & Related papers (2024-10-24T12:32:19Z) - Advancing Multimodal Large Language Models in Chart Question Answering with Visualization-Referenced Instruction Tuning [1.6570772838074355]
multimodal large language models (MLLMs) exhibit great potential for chart question answering (CQA)
Recent efforts primarily focus on scaling up training datasets through data collection and synthesis.
We propose a visualization-referenced instruction tuning approach to guide the training dataset enhancement and model development.
arXiv Detail & Related papers (2024-07-29T17:04:34Z) - CLAMP: Contrastive LAnguage Model Prompt-tuning [89.96914454453791]
We show that large language models can achieve good image classification performance when adapted this way.
Our approach beats state-of-the-art mLLMs by 13% and slightly outperforms contrastive learning with a custom text model.
arXiv Detail & Related papers (2023-12-04T05:13:59Z) - Time-LLM: Time Series Forecasting by Reprogramming Large Language Models [110.20279343734548]
Time series forecasting holds significant importance in many real-world dynamic systems.
We present Time-LLM, a reprogramming framework to repurpose large language models for time series forecasting.
Time-LLM is a powerful time series learner that outperforms state-of-the-art, specialized forecasting models.
arXiv Detail & Related papers (2023-10-03T01:31:25Z) - Vision-Language Modelling For Radiological Imaging and Reports In The
Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space.
We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains.
Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with variable sized training datasets of paired chest X-rays and radiological reports.
arXiv Detail & Related papers (2023-03-30T18:20:00Z) - Learning summary features of time series for likelihood free inference [93.08098361687722]
We present a data-driven strategy for automatically learning summary features from time series data.
Our results indicate that learning summary features from data can compete and even outperform LFI methods based on hand-crafted values.
arXiv Detail & Related papers (2020-12-04T19:21:37Z) - Multimodal Prototypical Networks for Few-shot Learning [20.100480009813953]
Cross-modal feature generation framework is used to enrich the low populated embedding space in few-shot scenarios.
We show that in such cases nearest neighbor classification is a viable approach and outperform state-of-the-art single-modal and multimodal few-shot learning methods.
arXiv Detail & Related papers (2020-11-17T19:32:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.