Self-Supervised Deconfounding Against Spatio-Temporal Shifts: Theory and
Modeling
- URL: http://arxiv.org/abs/2311.12472v2
- Date: Wed, 6 Mar 2024 12:57:27 GMT
- Title: Self-Supervised Deconfounding Against Spatio-Temporal Shifts: Theory and
Modeling
- Authors: Jiahao Ji, Wentao Zhang, Jingyuan Wang, Yue He and Chao Huang
- Abstract summary: In this work, we formalize the problem by constructing a causal graph of past traffic data, future traffic data, and external ST contexts.
We show that the failure of prior arts in OOD traffic data is due to ST contexts acting as a confounder, i.e., the common cause for past data and future ones.
We devise a Spatio-Temporal sElf-superVised dEconfounding (STEVE) framework to encode traffic data into two disentangled representations for associating invariant and variant ST contexts.
- Score: 48.09863133371918
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As an important application of spatio-temporal (ST) data, ST traffic
forecasting plays a crucial role in improving urban travel efficiency and
promoting sustainable development. In practice, the dynamics of traffic data
frequently undergo distributional shifts attributed to external factors such as
time evolution and spatial differences. This requires forecasting models to
handle the out-of-distribution (OOD) issue where test data is distributed
differently from training data. In this work, we first formalize the problem by
constructing a causal graph of past traffic data, future traffic data, and
external ST contexts. We reveal that the failure of prior arts in OOD traffic
data is due to ST contexts acting as a confounder, i.e., the common cause for
past data and future ones. Then, we propose a theoretical solution named
Disentangled Contextual Adjustment (DCA) from a causal lens. It differentiates
invariant causal correlations against variant spurious ones and deconfounds the
effect of ST contexts. On top of that, we devise a Spatio-Temporal
sElf-superVised dEconfounding (STEVE) framework. It first encodes traffic data
into two disentangled representations for associating invariant and variant ST
contexts. Then, we use representative ST contexts from three conceptually
different perspectives (i.e., temporal, spatial, and semantic) as
self-supervised signals to inject context information into both
representations. In this way, we improve the generalization ability of the
learned context-oriented representations to OOD ST traffic forecasting.
Comprehensive experiments on four large-scale benchmark datasets demonstrate
that our STEVE consistently outperforms the state-of-the-art baselines across
various ST OOD scenarios.
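The confounding argument in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' DCA or STEVE; it applies the standard backdoor adjustment, P(Y | do(X)) = sum_c P(Y | X, c) P(c), to a hypothetical binary context C that influences both past traffic X and future traffic Y. All probabilities and function names are assumptions chosen for illustration.

```python
# Toy confounding example. Every number here is made up for illustration;
# C is a binary ST context, X is past traffic (low/high), Y is future traffic.
P_C = {0: 0.5, 1: 0.5}           # P(C=c) in the training distribution
P_X1_GIVEN_C = {0: 0.2, 1: 0.8}  # P(X=1 | C=c): context drives past traffic


def p_y1(x, c):
    """P(Y=1 | X=x, C=c): X has a true effect of 0.2, C adds 0.5."""
    return 0.1 + 0.2 * x + 0.5 * c


def p_x(x, c):
    """P(X=x | C=c) for binary x."""
    return P_X1_GIVEN_C[c] if x == 1 else 1.0 - P_X1_GIVEN_C[c]


def observational(x, p_c=P_C):
    """P(Y=1 | X=x): contexts are weighted by P(C=c | X=x)."""
    joint = {c: p_x(x, c) * p_c[c] for c in p_c}
    total = sum(joint.values())
    return sum(p_y1(x, c) * joint[c] / total for c in p_c)


def backdoor(x, p_c=P_C):
    """P(Y=1 | do(X=x)) = sum_c P(Y=1 | x, c) P(c)."""
    return sum(p_y1(x, c) * p_c[c] for c in p_c)


naive_effect = observational(1) - observational(0)
adjusted_effect = backdoor(1) - backdoor(0)
print(f"naive effect: {naive_effect:.2f}, adjusted effect: {adjusted_effect:.2f}")

# A shift in the context distribution changes P(Y=1 | X=1) even though
# P(Y=1 | X, C) is unchanged, which is the OOD failure mode described above.
shifted = {0: 0.9, 1: 0.1}
print(f"P(Y=1 | X=1) train: {observational(1):.2f}, shifted: {observational(1, shifted):.2f}")
```

In this toy setting the naive contrast overstates the effect of X on Y because C raises both, while stratifying on C recovers the 0.2 effect built into `p_y1`.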
Related papers
- MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
We propose Meet-In-The-Middle based MITA, which introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
arXiv Detail & Related papers (2024-10-12T07:02:33Z)
- FASTopic: Pretrained Transformer is a Fast, Adaptive, Stable, and Transferable Topic Model [76.509837704596]
We propose FASTopic, a fast, adaptive, stable, and transferable topic model.
We use Dual Semantic-relation Reconstruction (DSR) to model latent topics.
We also propose Embedding Transport Plan (ETP) to regularize semantic relations as optimal transport plans.
arXiv Detail & Related papers (2024-05-28T09:06:38Z)
- FlashST: A Simple and Universal Prompt-Tuning Framework for Traffic Prediction [22.265095967530296]
FlashST is a framework that adapts pre-trained models to the specific characteristics of diverse datasets.
It captures the distribution shift between pre-training and downstream data, facilitating effective adaptation to diverse scenarios.
Empirical evaluations demonstrate the effectiveness of FlashST across different scenarios.
arXiv Detail & Related papers (2024-05-28T07:18:52Z)
- Multi-Factor Spatio-Temporal Prediction based on Graph Decomposition Learning [31.812810009108684]
We propose a multi-factor ST prediction task that predicts partial ST data evolution under different factors.
We instantiate a novel model-agnostic framework, named spatio-temporal graph decomposition learning (STGDL), for multi-factor ST prediction.
Results show that our framework reduces prediction errors of various ST models by 9.41% on average.
arXiv Detail & Related papers (2023-10-16T13:12:27Z)
- Spatio-Temporal Contrastive Self-Supervised Learning for POI-level Crowd Flow Inference [23.8192952068949]
We present a novel Contrastive Self-learning framework for Spatio-Temporal data (CSST).
Our approach begins with the construction of a spatial adjacency graph based on the Points of Interest (POIs) and their respective distances.
We adopt a swapped prediction approach to anticipate the representation of the target subgraph from similar instances.
Our experiments, conducted on two real-world datasets, demonstrate that the CSST pre-trained on extensive noisy data consistently outperforms models trained from scratch.
arXiv Detail & Related papers (2023-09-06T02:51:24Z)
- Spatio-Temporal Self-Supervised Learning for Traffic Flow Prediction [36.77135502344546]
We propose a novel Spatio-Temporal Self-Supervised Learning (ST-SSL) traffic prediction framework.
Our ST-SSL is built over an integrated module with temporal and spatial convolutions for encoding the information across space and time.
Experiments on four benchmark datasets demonstrate that ST-SSL consistently outperforms various state-of-the-art baselines.
arXiv Detail & Related papers (2022-12-07T10:02:01Z)
- Handling Distribution Shifts on Graphs: An Invariance Perspective [78.31180235269035]
We formulate the OOD problem on graphs and develop a new invariant learning approach, Explore-to-Extrapolate Risk Minimization (EERM).
EERM resorts to multiple context explorers that are adversarially trained to maximize the variance of risks from multiple virtual environments.
We prove the validity of our method by theoretically showing its guarantee of a valid OOD solution.
arXiv Detail & Related papers (2022-02-05T02:31:01Z)
- Towards Robust and Adaptive Motion Forecasting: A Causal Representation Perspective [72.55093886515824]
We introduce a causal formalism of motion forecasting, which casts the problem as a dynamic process with three groups of latent variables.
We devise a modular architecture that factorizes the representations of invariant mechanisms and style confounders to approximate a causal graph.
Experiment results on synthetic and real datasets show that our three proposed components significantly improve the robustness and reusability of the learned motion representations.
arXiv Detail & Related papers (2021-11-29T18:59:09Z)
- ProSTformer: Pre-trained Progressive Space-Time Self-attention Model for Traffic Flow Forecasting [6.35012051925346]
Two issues prevent the approach from being effectively applied in traffic flow forecasting.
We first factorize the dependencies and then design a space-time self-attention mechanism named ProSTformer.
ProSTformer performs as well as or better than six state-of-the-art methods on the large-scale datasets, as measured by RMSE.
arXiv Detail & Related papers (2021-11-03T12:20:08Z)
- Interpretable Time-series Representation Learning With Multi-Level Disentanglement [56.38489708031278]
Disentangle Time Series (DTS) is a novel disentanglement enhancement framework for sequential data.
DTS generates hierarchical semantic concepts as the interpretable and disentangled representation of time-series.
DTS achieves superior performance in downstream applications, with high interpretability of semantic concepts.
arXiv Detail & Related papers (2021-05-17T22:02:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.