ImputeFormer: Low Rankness-Induced Transformers for Generalizable Spatiotemporal Imputation
- URL: http://arxiv.org/abs/2312.01728v3
- Date: Wed, 29 May 2024 01:39:55 GMT
- Title: ImputeFormer: Low Rankness-Induced Transformers for Generalizable Spatiotemporal Imputation
- Authors: Tong Nie, Guoyang Qin, Wei Ma, Yuewen Mei, Jian Sun
- Abstract summary: Existing imputation solutions mainly include low-rank models and deep learning models.
We demonstrate a low rankness-induced Transformer that balances strong inductive bias with high model expressivity.
We demonstrate its superiority in terms of accuracy, efficiency, and versatility on heterogeneous datasets, including traffic flow, solar energy, smart meters, and air quality.
- Score: 43.684035409535696
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Missing data is a pervasive issue in both scientific and engineering tasks, especially for the modeling of spatiotemporal data. This problem has attracted many studies seeking data-driven solutions. Existing imputation solutions mainly include low-rank models and deep learning models. The former assumes general structural priors but has limited model capacity. The latter possesses salient features of expressivity but lacks prior knowledge of the underlying spatiotemporal structures. Leveraging the strengths of both paradigms, we demonstrate a low rankness-induced Transformer that achieves a balance between strong inductive bias and high model expressivity. The exploitation of the inherent structures of spatiotemporal data enables our model to learn balanced signal-noise representations, making it generalizable to a variety of imputation problems. We demonstrate its superiority in terms of accuracy, efficiency, and versatility on heterogeneous datasets, including traffic flow, solar energy, smart meters, and air quality. Promising empirical results provide strong evidence that incorporating time series primitives, such as low-rankness, can substantially facilitate the development of a generalizable model for a wide range of spatiotemporal imputation problems.
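Where this low-rankness bias might live in code: the sketch below routes attention through a small set of learned proxy vectors, which caps the rank of the token-mixing map. Class and parameter names are illustrative assumptions, not the authors' implementation.
```python
import torch
import torch.nn as nn

class ProjectedAttention(nn.Module):
    """Attention through k learned proxies, so the attention map has rank <= k."""

    def __init__(self, d_model: int, n_proxies: int = 8, n_heads: int = 4):
        super().__init__()
        # Learned proxy (inducing) vectors shared across the batch.
        self.proxies = nn.Parameter(torch.randn(n_proxies, d_model))
        self.compress = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.expand = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, d_model), with missing entries already embedded/masked.
        p = self.proxies.unsqueeze(0).expand(x.size(0), -1, -1)
        z, _ = self.compress(p, x, x)   # summarize the noisy series into k slots
        out, _ = self.expand(x, z, z)   # reconstruct every step from the k slots
        return out

x = torch.randn(2, 24, 64)              # toy batch: 2 series, 24 steps, 64 channels
print(ProjectedAttention(64)(x).shape)  # torch.Size([2, 24, 64])
```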
Related papers
- Powerformer: A Transformer with Weighted Causal Attention for Time-series Forecasting [50.298817606660826]
We introduce Powerformer, a novel Transformer variant that replaces noncausal attention weights with causal weights that are reweighted according to a smooth heavy-tailed decay.
Our empirical results demonstrate that Powerformer achieves state-of-the-art accuracy on public time-series benchmarks.
Our analyses show that the model's locality bias is amplified during training, demonstrating an interplay between time-series data and power-law-based attention.
arXiv Detail & Related papers (2025-02-10T04:42:11Z)
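One plausible reading of Powerformer's reweighting, sketched in numpy with an assumed decay kernel (i - j + 1)^(-alpha) on past positions (the paper's exact kernel is not reproduced here):
```python
import numpy as np

def powerlaw_causal_attention(scores: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    # Mask future positions and reweight past ones by a heavy-tailed
    # power-law decay (i - j + 1)^(-alpha).
    n = scores.shape[-1]
    i, j = np.indices((n, n))
    gap = np.maximum(i - j + 1, 1).astype(float)   # clamp to avoid 0 ** -alpha
    decay = np.where(j <= i, gap ** -alpha, 0.0)
    w = np.exp(scores - scores.max(-1, keepdims=True)) * decay
    return w / w.sum(-1, keepdims=True)

attn = powerlaw_causal_attention(np.random.randn(6, 6))
print(np.triu(attn, k=1).sum())  # 0.0: no attention mass on future positions
```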
- Synthetic Feature Augmentation Improves Generalization Performance of Language Models [8.463273762997398]
Training and fine-tuning deep learning models on limited and imbalanced datasets poses substantial challenges.
We propose augmenting features in the embedding space by generating synthetic samples using a range of techniques.
We validate the effectiveness of this approach across multiple open-source text classification benchmarks.
arXiv Detail & Related papers (2025-01-11T04:31:18Z)
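A toy version of this embedding-space augmentation: interpolate between pairs of same-class embeddings, a SMOTE-like scheme standing in for the paper's broader range of techniques.
```python
import numpy as np

def interpolate_features(emb: np.ndarray, n_new: int, seed: int = 0) -> np.ndarray:
    # Mix random same-class pairs: e_new = l * e_i + (1 - l) * e_j, l ~ U(0, 1).
    rng = np.random.default_rng(seed)
    i, j = rng.integers(0, len(emb), size=(2, n_new))
    lam = rng.uniform(0.0, 1.0, size=(n_new, 1))
    return lam * emb[i] + (1.0 - lam) * emb[j]

minority = np.random.randn(16, 768)                # 16 embeddings of one class
augmented = np.vstack([minority, interpolate_features(minority, 64)])
print(augmented.shape)                             # (80, 768)
```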
- TAEN: A Model-Constrained Tikhonov Autoencoder Network for Forward and Inverse Problems [0.6144680854063939]
Real-time solvers for forward and inverse problems are essential in engineering and science applications.
Machine learning surrogate models have emerged as promising alternatives to traditional methods, offering substantially reduced computational time.
These models typically demand extensive training datasets to achieve robust generalization across diverse scenarios.
We propose a novel Tikhonov autoencoder model-constrained framework, called TAEN, capable of learning both forward and inverse surrogate models using a single arbitrary observation sample.
arXiv Detail & Related papers (2024-12-09T21:36:42Z)
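For background, the classical Tikhonov-regularized inverse that the TAEN framework builds on solves min_x ||Ax - y||^2 + alpha * ||x||^2; a worked linear example follows (this is textbook material, not the paper's network).
```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 100))            # under-determined forward operator
x_true = rng.normal(size=100)
y = A @ x_true + 0.01 * rng.normal(size=50)

alpha = 0.1                                # Tikhonov weight: trades data fit
x_hat = np.linalg.solve(                   # for stability of the inverse
    A.T @ A + alpha * np.eye(100), A.T @ y
)
print(np.linalg.norm(A @ x_hat - y))       # small residual despite ill-posedness
```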
- Quantized Prompt for Efficient Generalization of Vision-Language Models [27.98205540768322]
Large-scale pre-trained vision-language models like CLIP have achieved tremendous success in various fields.
During downstream adaptation, the most challenging problems are overfitting and catastrophic forgetting.
In this paper, we explore quantization for regularizing vision-language models, which proves quite efficient and effective.
arXiv Detail & Related papers (2024-07-15T13:19:56Z)
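A bare-bones illustration of quantization as a regularizer: snap learned soft-prompt parameters onto a few uniform levels. The function name and the 2-bit setting are assumptions for the sketch.
```python
import torch

def quantize_uniform(t: torch.Tensor, n_bits: int = 2) -> torch.Tensor:
    # Map each value to the nearest of 2**n_bits uniformly spaced levels,
    # shrinking the effective capacity of the prompt parameters.
    levels = 2 ** n_bits - 1
    lo, hi = t.min(), t.max()
    scale = (hi - lo) / levels
    return torch.round((t - lo) / scale) * scale + lo

prompt = torch.randn(4, 512)                       # 4 learnable prompt tokens
print(quantize_uniform(prompt).unique().numel())   # at most 4 distinct values
```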
- A Temporally Disentangled Contrastive Diffusion Model for Spatiotemporal Imputation [35.46631415365955]
We introduce a conditional diffusion framework called C²TSD, which incorporates disentangled temporal (trend and seasonality) representations as conditional information.
Our experiments on three real-world datasets demonstrate the superior performance of our approach compared to a number of state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-18T11:59:04Z)
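A toy stand-in for C²TSD's disentangled conditioning: split a series into a moving-average trend and a residual seasonal part (the window length is an arbitrary choice for illustration).
```python
import numpy as np

def decompose(x: np.ndarray, window: int = 12):
    # Moving-average trend; the residual is treated as seasonality + noise.
    trend = np.convolve(x, np.ones(window) / window, mode="same")
    return trend, x - trend

t = np.arange(200.0)
series = 0.05 * t + np.sin(2 * np.pi * t / 12) + 0.1 * np.random.randn(200)
trend, seasonal = decompose(series)
print(trend.shape, seasonal.shape)  # (200,) (200,)
```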
- Fantastic Gains and Where to Find Them: On the Existence and Prospect of General Knowledge Transfer between Any Pretrained Model [74.62272538148245]
We show that for arbitrary pairings of pretrained models, one model extracts significant data context unavailable in the other.
We investigate if it is possible to transfer such "complementary" knowledge from one model to another without performance degradation.
arXiv Detail & Related papers (2023-10-26T17:59:46Z)
- Leveraging the structure of dynamical systems for data-driven modeling [111.45324708884813]
We consider the impact of the training set and its structure on the quality of the long-term prediction.
We show how an informed design of the training set, based on invariants of the system and the structure of the underlying attractor, significantly improves the resulting models.
arXiv Detail & Related papers (2021-12-15T20:09:20Z)
- Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z)
- Supercharging Imbalanced Data Learning With Energy-based Contrastive Representation Transfer [72.5190560787569]
In computer vision, learning from long-tailed datasets is a recurring theme, especially for natural image datasets.
Our proposal posits a meta-distributional scenario, where the data generating mechanism is invariant across the label-conditional feature distributions.
This allows us to leverage a causal data inflation procedure to enlarge the representation of minority classes.
arXiv Detail & Related papers (2020-11-25T00:13:11Z)
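As a crude proxy for the data-inflation step above, one can fit a simple class-conditional distribution to minority-class features and sample extra points, a diagonal Gaussian here purely for illustration.
```python
import numpy as np

def inflate(feats: np.ndarray, n_new: int, seed: int = 0) -> np.ndarray:
    # Fit a diagonal Gaussian to the minority features and draw new samples.
    rng = np.random.default_rng(seed)
    mu, sd = feats.mean(axis=0), feats.std(axis=0) + 1e-8
    return rng.normal(mu, sd, size=(n_new, feats.shape[1]))

minority = np.random.randn(20, 64) + 3.0           # 20 minority-class features
balanced = np.vstack([minority, inflate(minority, 80)])
print(balanced.shape)                               # (100, 64)
```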
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.