Time-MMD: A New Multi-Domain Multimodal Dataset for Time Series Analysis
- URL: http://arxiv.org/abs/2406.08627v1
- Date: Wed, 12 Jun 2024 20:20:09 GMT
- Title: Time-MMD: A New Multi-Domain Multimodal Dataset for Time Series Analysis
- Authors: Haoxin Liu, Shangqing Xu, Zhiyuan Zhao, Lingkai Kong, Harshavardhan Kamarthi, Aditya B. Sasanur, Megha Sharma, Jiaming Cui, Qingsong Wen, Chao Zhang, B. Aditya Prakash
- Abstract summary: Time-MMD is the first multi-domain, multimodal time series dataset.
MM-TSFlib is the first multimodal time-series forecasting library.
- Score: 40.44013652777716
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Time series data are ubiquitous across a wide range of real-world domains. While real-world time series analysis (TSA) requires human experts to integrate numerical series data with multimodal domain-specific knowledge, most existing TSA models rely solely on numerical data, overlooking the significance of information beyond numerical series. This oversight is due to the untapped potential of textual series data and the absence of a comprehensive, high-quality multimodal dataset. To overcome this obstacle, we introduce Time-MMD, the first multi-domain, multimodal time series dataset covering 9 primary data domains. Time-MMD ensures fine-grained modality alignment, eliminates data contamination, and provides high usability. Additionally, we develop MM-TSFlib, the first multimodal time-series forecasting (TSF) library, seamlessly pipelining multimodal TSF evaluations based on Time-MMD for in-depth analyses. Extensive experiments conducted on Time-MMD through MM-TSFlib demonstrate significant performance enhancements by extending unimodal TSF to multimodality, evidenced by over 15% mean squared error reduction in general, and up to 40% in domains with rich textual data. More importantly, our datasets and library revolutionize broader applications, impacts, and research topics to advance TSA. The dataset and library are available at https://github.com/AdityaLab/Time-MMD and https://github.com/AdityaLab/MM-TSFlib.
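To make the reported metric concrete, the relative mean squared error (MSE) reduction quoted in the abstract can be computed as sketched below. This is a minimal illustration, not code from MM-TSFlib: the function names `mse` and `relative_mse_reduction` are hypothetical, and the toy numbers are made up and are not results from the paper.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_mse_reduction(mse_unimodal: float, mse_multimodal: float) -> float:
    """Percentage by which the multimodal MSE is lower than the unimodal MSE."""
    if mse_unimodal <= 0:
        raise ValueError("baseline MSE must be positive")
    return 100.0 * (mse_unimodal - mse_multimodal) / mse_unimodal


# Toy values, made up purely for illustration (not results from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
unimodal_pred = [1.0, 2.0, 3.0, 6.0]    # squared errors: 0, 0, 0, 4 -> MSE = 1.0
multimodal_pred = [1.0, 2.0, 3.0, 5.0]  # squared errors: 0, 0, 0, 1 -> MSE = 0.25

reduction = relative_mse_reduction(
    mse(y_true, unimodal_pred), mse(y_true, multimodal_pred)
)
print(f"{reduction:.1f}% MSE reduction")  # prints "75.0% MSE reduction"
```

A positive value means the multimodal forecast has lower error than the unimodal baseline; a negative value would mean the added modality hurt.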
Related papers
- MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens [113.9621845919304]
We release MINT-1T, the most extensive and diverse open-source Multimodal INTerleaved dataset to date.
MINT-1T comprises one trillion text tokens and 3.4 billion images, a 10x scale-up from existing open-source datasets.
Our experiments show that LMMs trained on MINT-1T rival the performance of models trained on the previous leading dataset, OBELICS.
arXiv Detail & Related papers (2024-06-17T07:21:36Z) - PeFAD: A Parameter-Efficient Federated Framework for Time Series Anomaly Detection [51.20479454379662]
Motivated by increasing privacy concerns, we propose PeFAD, a Parameter-efficient Federated Anomaly Detection framework.
We conduct extensive evaluations on four real datasets, where PeFAD outperforms existing state-of-the-art baselines by up to 28.74%.
arXiv Detail & Related papers (2024-06-04T13:51:08Z) - Advancing multivariate time series similarity assessment: an integrated computational approach [0.0]
We propose an integrated computational approach for assessing the similarity of multivariate time series data.
MTASA is built upon a hybrid methodology designed to optimize time series alignment, complemented by a multiprocessing engine.
Results from this study highlight MTASA's superiority, achieving approximately 1.5 times greater accuracy and twice the speed compared to existing state-of-the-art integrated frameworks.
arXiv Detail & Related papers (2024-03-16T23:52:25Z) - Temporal Treasure Hunt: Content-based Time Series Retrieval System for Discovering Insights [34.1973242428317]
Time series data is ubiquitous across various domains such as finance, healthcare, and manufacturing.
The ability to perform Content-based Time Series Retrieval (CTSR) is crucial for identifying unknown time series examples.
We introduce a CTSR benchmark dataset that comprises time series data from a variety of domains.
arXiv Detail & Related papers (2023-11-05T04:12:13Z) - Fully-Connected Spatial-Temporal Graph for Multivariate Time-Series Data [50.84488941336865]
We propose a novel method called Fully-Connected Spatial-Temporal Graph Neural Network (FC-STGNN).
For graph construction, we design a decay graph to connect sensors across all timestamps based on their temporal distances.
For graph convolution, we devise FC graph convolution with a moving-pooling GNN layer to effectively capture the ST dependencies for learning effective representations.
arXiv Detail & Related papers (2023-09-11T08:44:07Z) - Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models [61.10851158749843]
Key insights can be obtained by discovering lead-lag relationships inherent in the data.
We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models.
arXiv Detail & Related papers (2023-05-11T10:30:35Z) - TFAD: A Decomposition Time Series Anomaly Detection Architecture with Time-Frequency Analysis [12.867257563413972]
Time series anomaly detection is a challenging problem due to the complex temporal dependencies and the limited label data.
We propose a Time-Frequency analysis based time series Anomaly Detection model, or TFAD, to exploit both time and frequency domains for performance improvement.
arXiv Detail & Related papers (2022-10-18T09:08:57Z) - Learning summary features of time series for likelihood free inference [93.08098361687722]
We present a data-driven strategy for automatically learning summary features from time series data.
Our results indicate that learning summary features from data can compete with and even outperform LFI methods based on hand-crafted values.
arXiv Detail & Related papers (2020-12-04T19:21:37Z) - TimeAutoML: Autonomous Representation Learning for Multivariate Irregularly Sampled Time Series [27.0506649441212]
We propose an autonomous representation learning approach for multivariate time series (TimeAutoML) with irregular sampling rates and variable lengths.
Extensive empirical studies on real-world datasets demonstrate that the proposed TimeAutoML outperforms competing approaches on various tasks by a large margin.
arXiv Detail & Related papers (2020-10-04T15:01:46Z) - MTS-CycleGAN: An Adversarial-based Deep Mapping Learning Network for Multivariate Time Series Domain Adaptation Applied to the Ironmaking Industry [0.0]
This research focuses on translating the specific asset-based historical data (source domain) into data corresponding to one reference asset (target domain).
We propose MTS-CycleGAN, an algorithm for Multivariate Time Series data based on CycleGAN.
Our contribution is the integration in the CycleGAN architecture of a Long Short-Term Memory (LSTM)-based AutoEncoder (AE) for the generator and a stacked LSTM-based discriminator.
arXiv Detail & Related papers (2020-07-15T07:33:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.