ImputeFormer: Low Rankness-Induced Transformers for Generalizable Spatiotemporal Imputation
- URL: http://arxiv.org/abs/2312.01728v3
- Date: Wed, 29 May 2024 01:39:55 GMT
- Title: ImputeFormer: Low Rankness-Induced Transformers for Generalizable Spatiotemporal Imputation
- Authors: Tong Nie, Guoyang Qin, Wei Ma, Yuewen Mei, Jian Sun
- Abstract summary: Existing imputation solutions mainly include low-rank models and deep learning models.
We demonstrate a low rankness-induced Transformer that balances strong inductive bias with high model expressivity.
We demonstrate its superiority in terms of accuracy, efficiency, and versatility on heterogeneous datasets, including traffic flow, solar energy, smart meters, and air quality.
- Score: 43.684035409535696
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Missing data is a pervasive issue in both scientific and engineering tasks, especially for the modeling of spatiotemporal data. This problem has attracted many studies seeking data-driven solutions. Existing imputation solutions mainly include low-rank models and deep learning models. The former assumes general structural priors but has limited model capacity. The latter possesses salient features of expressivity but lacks prior knowledge of the underlying spatiotemporal structures. Leveraging the strengths of both paradigms, we demonstrate a low rankness-induced Transformer that achieves a balance between strong inductive bias and high model expressivity. The exploitation of the inherent structures of spatiotemporal data enables our model to learn balanced signal-noise representations, making it generalizable to a variety of imputation problems. We demonstrate its superiority in terms of accuracy, efficiency, and versatility on heterogeneous datasets, including traffic flow, solar energy, smart meters, and air quality. Promising empirical results provide strong evidence that incorporating time series primitives, such as low-rankness, can substantially facilitate the development of a generalizable model for a wide range of spatiotemporal imputation problems.
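Where this low-rankness bias might live in code: the sketch below routes attention through a small set of learned proxy vectors, which caps the rank of the token-mixing map. Class and parameter names are illustrative assumptions, not the authors' implementation.
```python
import torch
import torch.nn as nn

class ProjectedAttention(nn.Module):
    """Attention through k learned proxies, so the attention map has rank <= k."""

    def __init__(self, d_model: int, n_proxies: int = 8, n_heads: int = 4):
        super().__init__()
        # Learned proxy (inducing) vectors shared across the batch.
        self.proxies = nn.Parameter(torch.randn(n_proxies, d_model))
        self.compress = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.expand = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, d_model), with missing entries already embedded/masked.
        p = self.proxies.unsqueeze(0).expand(x.size(0), -1, -1)
        z, _ = self.compress(p, x, x)   # summarize the noisy series into k slots
        out, _ = self.expand(x, z, z)   # reconstruct every step from the k slots
        return out

x = torch.randn(2, 24, 64)              # toy batch: 2 series, 24 steps, 64 channels
print(ProjectedAttention(64)(x).shape)  # torch.Size([2, 24, 64])
```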
Related papers
- Powerformer: A Transformer with Weighted Causal Attention for Time-series Forecasting [50.298817606660826]
We introduce Powerformer, a novel Transformer variant that replaces noncausal attention weights with causal weights that are reweighted according to a smooth heavy-tailed decay.
Our empirical results demonstrate that Powerformer achieves state-of-the-art accuracy on public time-series benchmarks.
Our analyses show that the model's locality bias is amplified during training, demonstrating an interplay between time-series data and power-law-based attention.
arXiv Detail & Related papers (2025-02-10T04:42:11Z)
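One plausible reading of Powerformer's reweighting, sketched in numpy with an assumed decay kernel (i - j + 1)^(-alpha) on past positions (the paper's exact kernel is not reproduced here):
```python
import numpy as np

def powerlaw_causal_attention(scores: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    # Mask future positions and reweight past ones by a heavy-tailed
    # power-law decay (i - j + 1)^(-alpha).
    n = scores.shape[-1]
    i, j = np.indices((n, n))
    gap = np.maximum(i - j + 1, 1).astype(float)   # clamp to avoid 0 ** -alpha
    decay = np.where(j <= i, gap ** -alpha, 0.0)
    w = np.exp(scores - scores.max(-1, keepdims=True)) * decay
    return w / w.sum(-1, keepdims=True)

attn = powerlaw_causal_attention(np.random.randn(6, 6))
print(np.triu(attn, k=1).sum())  # 0.0: no attention mass on future positions
```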
- Synthetic Feature Augmentation Improves Generalization Performance of Language Models [8.463273762997398]
Training and fine-tuning deep learning models on limited and imbalanced datasets poses substantial challenges.
We propose augmenting features in the embedding space by generating synthetic samples using a range of techniques.
We validate the effectiveness of this approach across multiple open-source text classification benchmarks.
arXiv Detail & Related papers (2025-01-11T04:31:18Z)
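A toy version of this embedding-space augmentation: interpolate between pairs of same-class embeddings, a SMOTE-like scheme standing in for the paper's broader range of techniques.
```python
import numpy as np

def interpolate_features(emb: np.ndarray, n_new: int, seed: int = 0) -> np.ndarray:
    # Mix random same-class pairs: e_new = l * e_i + (1 - l) * e_j, l ~ U(0, 1).
    rng = np.random.default_rng(seed)
    i, j = rng.integers(0, len(emb), size=(2, n_new))
    lam = rng.uniform(0.0, 1.0, size=(n_new, 1))
    return lam * emb[i] + (1.0 - lam) * emb[j]

minority = np.random.randn(16, 768)                # 16 embeddings of one class
augmented = np.vstack([minority, interpolate_features(minority, 64)])
print(augmented.shape)                             # (80, 768)
```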
- TAEN: A Model-Constrained Tikhonov Autoencoder Network for Forward and Inverse Problems [0.6144680854063939]
Real-time solvers for forward and inverse problems are essential in engineering and science applications.
Machine learning surrogate models have emerged as promising alternatives to traditional methods, offering substantially reduced computational time.
These models typically demand extensive training datasets to achieve robust generalization across diverse scenarios.
We propose a novel Tikhonov autoencoder model-constrained framework, called TAEN, capable of learning both forward and inverse surrogate models using a single arbitrary observation sample.
arXiv Detail & Related papers (2024-12-09T21:36:42Z)
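For background, the classical Tikhonov-regularized inverse that the TAEN framework builds on solves min_x ||Ax - y||^2 + alpha * ||x||^2; a worked linear example follows (this is textbook material, not the paper's network).
```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 100))            # under-determined forward operator
x_true = rng.normal(size=100)
y = A @ x_true + 0.01 * rng.normal(size=50)

alpha = 0.1                                # Tikhonov weight: trades data fit
x_hat = np.linalg.solve(                   # for stability of the inverse
    A.T @ A + alpha * np.eye(100), A.T @ y
)
print(np.linalg.norm(A @ x_hat - y))       # small residual despite ill-posedness
```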
- Quantized Prompt for Efficient Generalization of Vision-Language Models [27.98205540768322]
Large-scale pre-trained vision-language models like CLIP have achieved tremendous success in various fields.
During downstream adaptation, the most challenging problems are overfitting and catastrophic forgetting.
In this paper, we explore quantization for regularizing vision-language models, which proves quite efficient and effective.
arXiv Detail & Related papers (2024-07-15T13:19:56Z)
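A bare-bones illustration of quantization as a regularizer: snap learned soft-prompt parameters onto a few uniform levels. The function name and the 2-bit setting are assumptions for the sketch.
```python
import torch

def quantize_uniform(t: torch.Tensor, n_bits: int = 2) -> torch.Tensor:
    # Map each value to the nearest of 2**n_bits uniformly spaced levels,
    # shrinking the effective capacity of the prompt parameters.
    levels = 2 ** n_bits - 1
    lo, hi = t.min(), t.max()
    scale = (hi - lo) / levels
    return torch.round((t - lo) / scale) * scale + lo

prompt = torch.randn(4, 512)                       # 4 learnable prompt tokens
print(quantize_uniform(prompt).unique().numel())   # at most 4 distinct values
```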
- A Temporally Disentangled Contrastive Diffusion Model for Spatiotemporal Imputation [35.46631415365955]
We introduce a conditional diffusion framework called C²TSD, which incorporates disentangled temporal (trend and seasonality) representations as conditional information.
Our experiments on three real-world datasets demonstrate the superior performance of our approach compared to a number of state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-18T11:59:04Z)
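A toy stand-in for C²TSD's disentangled conditioning: split a series into a moving-average trend and a residual seasonal part (the window length is an arbitrary choice for illustration).
```python
import numpy as np

def decompose(x: np.ndarray, window: int = 12):
    # Moving-average trend; the residual is treated as seasonality + noise.
    trend = np.convolve(x, np.ones(window) / window, mode="same")
    return trend, x - trend

t = np.arange(200.0)
series = 0.05 * t + np.sin(2 * np.pi * t / 12) + 0.1 * np.random.randn(200)
trend, seasonal = decompose(series)
print(trend.shape, seasonal.shape)  # (200,) (200,)
```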
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z)
- Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z)
- Supercharging Imbalanced Data Learning With Energy-based Contrastive Representation Transfer [72.5190560787569]
In computer vision, learning from long-tailed datasets is a recurring theme, especially for natural image datasets.
Our proposal posits a meta-distributional scenario, where the data generating mechanism is invariant across the label-conditional feature distributions.
This allows us to leverage a causal data inflation procedure to enlarge the representation of minority classes.
arXiv Detail & Related papers (2020-11-25T00:13:11Z)
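As a crude proxy for the data-inflation step above, one can fit a simple class-conditional distribution to minority-class features and sample extra points, a diagonal Gaussian here purely for illustration.
```python
import numpy as np

def inflate(feats: np.ndarray, n_new: int, seed: int = 0) -> np.ndarray:
    # Fit a diagonal Gaussian to the minority features and draw new samples.
    rng = np.random.default_rng(seed)
    mu, sd = feats.mean(axis=0), feats.std(axis=0) + 1e-8
    return rng.normal(mu, sd, size=(n_new, feats.shape[1]))

minority = np.random.randn(20, 64) + 3.0           # 20 minority-class features
balanced = np.vstack([minority, inflate(minority, 80)])
print(balanced.shape)                               # (100, 64)
```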
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.