Self-Supervised Deconfounding Against Spatio-Temporal Shifts: Theory and
Modeling
- URL: http://arxiv.org/abs/2311.12472v2
- Date: Wed, 6 Mar 2024 12:57:27 GMT
- Title: Self-Supervised Deconfounding Against Spatio-Temporal Shifts: Theory and
Modeling
- Authors: Jiahao Ji, Wentao Zhang, Jingyuan Wang, Yue He and Chao Huang
- Abstract summary: In this work, we formalize the problem by constructing a causal graph of past traffic data, future traffic data, and external ST contexts.
We show that the failure of prior art on OOD traffic data is due to ST contexts acting as a confounder, i.e., a common cause of both past and future data.
We devise a Spatio-Temporal sElf-superVised dEconfounding (STEVE) framework to encode traffic data into two disentangled representations for associating invariant and variant ST contexts.
- Score: 48.09863133371918
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As an important application of spatio-temporal (ST) data, ST traffic
forecasting plays a crucial role in improving urban travel efficiency and
promoting sustainable development. In practice, the dynamics of traffic data
frequently undergo distributional shifts attributed to external factors such as
time evolution and spatial differences. This requires forecasting models to
handle the out-of-distribution (OOD) issue, where test data is distributed
differently from training data. In this work, we first formalize the problem by
constructing a causal graph of past traffic data, future traffic data, and
external ST contexts. We reveal that the failure of prior art on OOD traffic
data is due to ST contexts acting as a confounder, i.e., a common cause of both
past and future data. Then, we propose a theoretical solution named
Disentangled Contextual Adjustment (DCA) from a causal lens. It differentiates
invariant causal correlations against variant spurious ones and deconfounds the
effect of ST contexts. On top of that, we devise a Spatio-Temporal
sElf-superVised dEconfounding (STEVE) framework. It first encodes traffic data
into two disentangled representations for associating invariant and variant ST
contexts. Then, we use representative ST contexts from three conceptually
different perspectives (i.e., temporal, spatial, and semantic) as
self-supervised signals to inject context information into both
representations. In this way, we improve the generalization ability of the
learned context-oriented representations to OOD ST traffic forecasting.
Comprehensive experiments on four large-scale benchmark datasets demonstrate
that our STEVE consistently outperforms the state-of-the-art baselines across
various ST OOD scenarios.
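To make the causal story concrete: under a backdoor-style reading (illustrative notation, not necessarily the paper's exact derivation), deconfounding the ST context C amounts to adjusting over its values when predicting future traffic Y from past traffic X:

```latex
% Illustrative backdoor adjustment over ST contexts C:
% X = past traffic, Y = future traffic.
P(Y \mid do(X)) = \sum_{c} P(Y \mid X, C = c)\, P(C = c)
```

And a minimal sketch, assuming a PyTorch setting, of the two-branch encoding with context-based self-supervision that the abstract describes; all class and variable names here are hypothetical, and the actual STEVE architecture is considerably richer:

```python
# Hypothetical sketch of the STEVE idea: two disentangled representations,
# with surrogate context labels injected as a self-supervised signal.
import torch
import torch.nn as nn

class DisentangledEncoder(nn.Module):
    def __init__(self, d_in, d_rep, n_contexts):
        super().__init__()
        self.invariant = nn.Sequential(nn.Linear(d_in, d_rep), nn.ReLU())
        self.variant = nn.Sequential(nn.Linear(d_in, d_rep), nn.ReLU())
        # head predicting temporal/spatial/semantic context ids
        self.context_head = nn.Linear(d_rep, n_contexts)
        self.forecaster = nn.Linear(2 * d_rep, d_in)

    def forward(self, x_past, context_id):
        z_inv, z_var = self.invariant(x_past), self.variant(x_past)
        ctx_logits = self.context_head(z_var)   # variant branch carries context
        ssl_loss = nn.functional.cross_entropy(ctx_logits, context_id)
        y_hat = self.forecaster(torch.cat([z_inv, z_var], dim=-1))
        return y_hat, ssl_loss
```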
Related papers
- Data Scaling Laws for End-to-End Autonomous Driving [83.85463296830743]
We evaluate the performance of a simple end-to-end driving architecture on internal driving datasets ranging in size from 16 to 8192 hours.
Specifically, we investigate how much additional training data is needed to achieve a target performance gain.
arXiv Detail & Related papers (2025-04-06T03:23:48Z)
- MITA: Bridging the Gap between Model and Data for Test-time Adaptation [68.62509948690698]
Test-Time Adaptation (TTA) has emerged as a promising paradigm for enhancing the generalizability of models.
We propose MITA, a Meet-In-The-Middle approach that introduces energy-based optimization to encourage mutual adaptation of the model and data from opposing directions.
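One generic way to read "mutual adaptation from opposing directions" is a shared energy that both the model parameters and the test batch descend; this is a hedged sketch in PyTorch with illustrative names and hyperparameters, not MITA's actual objective:

```python
# Hedged sketch: jointly nudge model parameters and test inputs to lower
# a shared logit-based energy (one adaptation step).
import torch

def meet_in_the_middle_step(model, x, lr_model=1e-4, lr_data=1e-2):
    x = x.clone().requires_grad_(True)
    energy = -model(x).logsumexp(dim=-1).mean()    # standard energy score
    g_x, = torch.autograd.grad(energy, x, retain_graph=True)
    energy.backward()                              # grads for model params
    with torch.no_grad():
        x -= lr_data * g_x                         # adapt data toward model
        for p in model.parameters():
            if p.grad is not None:
                p -= lr_model * p.grad             # adapt model toward data
                p.grad = None
    return x.detach()
```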
arXiv Detail & Related papers (2024-10-12T07:02:33Z)
- A Time Series is Worth Five Experts: Heterogeneous Mixture of Experts for Traffic Flow Prediction [9.273632869779929]
We propose a Heterogeneous Mixture of Experts (TITAN) model for traffic flow prediction.
Experiments on two public traffic network datasets, METR-LA and PEMS-BAY, demonstrate that TITAN effectively captures variable-centric dependencies.
It achieves improvements in all evaluation metrics, ranging from approximately 4.37% to 11.53%, compared to previous state-of-the-art (SOTA) models.
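A heterogeneous mixture of experts can be pictured as a gate mixing structurally different sub-networks; a minimal PyTorch sketch, with generic expert types standing in for TITAN's five (which the summary does not specify):

```python
# Illustrative heterogeneous MoE layer: a softmax gate mixes experts of
# different architectures (here, linear layers and GRUs).
import torch
import torch.nn as nn

class HeteroMoE(nn.Module):
    def __init__(self, d, n_experts=5):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Linear(d, d) if i % 2 == 0 else nn.GRU(d, d, batch_first=True)
             for i in range(n_experts)])
        self.gate = nn.Linear(d, n_experts)

    def forward(self, x):                       # x: (batch, time, d)
        weights = torch.softmax(self.gate(x.mean(dim=1)), dim=-1)
        outs = [e(x)[0] if isinstance(e, nn.GRU) else e(x)
                for e in self.experts]
        outs = torch.stack(outs, dim=-1)        # (batch, time, d, n_experts)
        return (outs * weights[:, None, None, :]).sum(dim=-1)
```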
arXiv Detail & Related papers (2024-09-26T00:26:47Z)
- FASTopic: Pretrained Transformer is a Fast, Adaptive, Stable, and Transferable Topic Model [76.509837704596]
We propose FASTopic, a fast, adaptive, stable, and transferable topic model.
We use Dual Semantic-relation Reconstruction (DSR) to model latent topics.
We also propose Embedding Transport Plan (ETP) to regularize semantic relations as optimal transport plans.
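Optimal transport plans of the kind ETP regularizes are commonly computed with Sinkhorn iterations; a generic sketch follows (FASTopic's exact formulation may differ):

```python
# Entropy-regularized optimal transport via Sinkhorn iterations.
import torch

def sinkhorn(cost, eps=0.05, n_iters=50):
    m, n = cost.shape
    K = torch.exp(-cost / eps)              # Gibbs kernel from the cost matrix
    a = torch.full((m,), 1.0 / m)           # uniform row marginal
    b = torch.full((n,), 1.0 / n)           # uniform column marginal
    u, v = torch.ones(m), torch.ones(n)
    for _ in range(n_iters):                # alternate marginal projections
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]      # transport plan
```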
arXiv Detail & Related papers (2024-05-28T09:06:38Z)
- FlashST: A Simple and Universal Prompt-Tuning Framework for Traffic Prediction [22.265095967530296]
FlashST is a framework that adapts pre-trained models to the specific characteristics of diverse datasets.
It captures the distribution shift between pre-training and downstream data, facilitating effective adaptation to diverse scenarios.
Empirical evaluations demonstrate the effectiveness of FlashST across different scenarios.
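Prompt-tuning in this setting typically means freezing the pre-trained backbone and learning only a small prompt per downstream dataset; a hedged sketch with hypothetical names (not FlashST's actual API):

```python
# Minimal prompt-tuning wrapper: the backbone stays frozen, and only the
# prepended prompt tensor is trained on the downstream dataset.
import torch
import torch.nn as nn

class PromptTuner(nn.Module):
    def __init__(self, pretrained: nn.Module, n_prompt: int, d: int):
        super().__init__()
        self.backbone = pretrained
        for p in self.backbone.parameters():
            p.requires_grad = False            # keep pre-trained weights fixed
        self.prompt = nn.Parameter(torch.randn(n_prompt, d) * 0.02)

    def forward(self, x):                      # x: (batch, time, d)
        prompt = self.prompt.unsqueeze(0).expand(x.size(0), -1, -1)
        return self.backbone(torch.cat([prompt, x], dim=1))
```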
arXiv Detail & Related papers (2024-05-28T07:18:52Z)
- Multi-Factor Spatio-Temporal Prediction based on Graph Decomposition Learning [31.812810009108684]
We propose a multi-factor ST prediction task that predicts partial ST data evolution under different factors.
We instantiate STGDL, a novel model-agnostic graph decomposition learning framework for multi-factor ST prediction.
Results show that our framework reduces prediction errors of various ST models by 9.41% on average.
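The model-agnostic part can be read as: decompose one graph into factor-specific subgraphs, run any base ST model on each, and sum the partial predictions; a rough sketch (STGDL's decomposition is learned, not fixed as here):

```python
# Illustrative multi-factor prediction over a decomposed graph.
import torch

def multi_factor_predict(models, adj, factor_masks, x):
    # models: one base ST predictor per factor (any architecture);
    # factor_masks: 0/1 matrices carving the adjacency into subgraphs.
    preds = [m(x, adj * mask) for m, mask in zip(models, factor_masks)]
    return torch.stack(preds).sum(dim=0)    # partial evolutions add up
```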
arXiv Detail & Related papers (2023-10-16T13:12:27Z)
- Spatio-Temporal Contrastive Self-Supervised Learning for POI-level Crowd Flow Inference [23.8192952068949]
We present CSST, a novel contrastive self-supervised learning framework for spatio-temporal data.
Our approach starts by constructing a spatial adjacency graph based on the Points of Interest (POIs) and their respective distances.
We adopt a swapped prediction approach to anticipate the representation of the target subgraph from similar instances.
Our experiments, conducted on two real-world datasets, demonstrate that the CSST pre-trained on extensive noisy data consistently outperforms models trained from scratch.
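Swapped prediction usually means each view's cluster assignment supervises the other view; a minimal sketch in the SwAV style, with assignments stubbed as temperature-scaled softmaxes (CSST may compute them differently):

```python
# Swapped-prediction loss: predict view 2's soft assignment from view 1's
# prototype scores, and vice versa.
import torch
import torch.nn.functional as F

def swapped_prediction_loss(z1, z2, prototypes, temp=0.1):
    p1, p2 = z1 @ prototypes.T, z2 @ prototypes.T   # scores vs. prototypes
    with torch.no_grad():                           # targets without gradients
        q1, q2 = F.softmax(p1 / temp, -1), F.softmax(p2 / temp, -1)
    loss = -(q2 * F.log_softmax(p1 / temp, -1)).sum(-1).mean()
    loss = loss - (q1 * F.log_softmax(p2 / temp, -1)).sum(-1).mean()
    return loss / 2
```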
arXiv Detail & Related papers (2023-09-06T02:51:24Z)
- OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning [67.07363529640784]
We propose OpenSTL to categorize prevalent approaches into recurrent-based and recurrent-free models.
We conduct standard evaluations on datasets across various domains, including synthetic moving object trajectories, human motion, driving scenes, traffic flow, and weather forecasting.
We find that recurrent-free models achieve a better balance between efficiency and performance than recurrent models.
arXiv Detail & Related papers (2023-06-20T03:02:14Z)
- Semantic-Fused Multi-Granularity Cross-City Traffic Prediction [17.020546413647708]
We propose a Semantic-Fused Multi-Granularity Transfer Learning model to achieve knowledge transfer across cities with fused semantics at different granularities.
In detail, we design a semantic fusion module to fuse various semantics while conserving static spatial dependencies.
We conduct extensive experiments on six real-world datasets to verify the effectiveness of our model.
arXiv Detail & Related papers (2023-02-23T04:26:34Z)
- Spatio-Temporal Self-Supervised Learning for Traffic Flow Prediction [36.77135502344546]
We propose a novel Spatio-Temporal Self-Supervised Learning (ST-SSL) traffic prediction framework.
Our ST-SSL is built over an integrated module with temporal and spatial convolutions for encoding the information across space and time.
Experiments on four benchmark datasets demonstrate that ST-SSL consistently outperforms various state-of-the-art baselines.
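An integrated spatio-temporal convolution block of this kind can be sketched as a temporal convolution followed by graph message passing; shapes and names below are illustrative, not ST-SSL's actual layers:

```python
# Sketch of one spatio-temporal block: 1-D convolution along time, then
# aggregation over a normalized adjacency for the spatial step.
import torch
import torch.nn as nn

class STBlock(nn.Module):
    def __init__(self, d, kernel=3):
        super().__init__()
        self.temporal = nn.Conv2d(d, d, (1, kernel), padding=(0, kernel // 2))
        self.spatial = nn.Linear(d, d)

    def forward(self, x, adj_norm):            # x: (batch, d, nodes, time)
        h = torch.relu(self.temporal(x))       # convolve along the time axis
        h = torch.einsum("nm,bdmt->bdnt", adj_norm, h)  # spatial aggregation
        h = self.spatial(h.permute(0, 2, 3, 1))         # mix channels
        return torch.relu(h).permute(0, 3, 1, 2)
```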
arXiv Detail & Related papers (2022-12-07T10:02:01Z)
- Enhancing the Robustness via Adversarial Learning and Joint Spatial-Temporal Embeddings in Traffic Forecasting [11.680589359294972]
We propose TrendGCN to address the challenge of balancing dynamics and robustness.
Our model simultaneously incorporates spatial (node-wise) embeddings and temporal (time-wise) embeddings to account for heterogeneous space-and-time convolutions.
Compared with traditional approaches that handle step-wise predictive errors independently, our approach can produce more realistic and robust forecasts.
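Joint node-wise and time-wise embeddings are typically added to the input features; a minimal illustration (the adversarial component is omitted, and the names are hypothetical):

```python
# Spatial identity and time-of-day embeddings added to the inputs.
import torch
import torch.nn as nn

class STEmbedding(nn.Module):
    def __init__(self, n_nodes, n_slots, d):
        super().__init__()
        self.node_emb = nn.Embedding(n_nodes, d)   # node-wise (spatial)
        self.time_emb = nn.Embedding(n_slots, d)   # time-wise (temporal)

    def forward(self, x, node_ids, slot_ids):
        # x: (batch, time, nodes, d); node_ids: (nodes,); slot_ids: (batch, time)
        return (x + self.node_emb(node_ids)[None, None]
                  + self.time_emb(slot_ids)[:, :, None])
```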
arXiv Detail & Related papers (2022-08-05T09:36:55Z)
- Continuous-Time and Multi-Level Graph Representation Learning for Origin-Destination Demand Prediction [52.0977259978343]
This paper proposes a Continuous-time and Multi-level dynamic graph representation learning method for Origin-Destination demand prediction (CMOD).
The state vectors keep historical transaction information and are continuously updated according to the most recent transactions.
Experiments are conducted on two real-world datasets from Beijing Subway and New York Taxi, and the results demonstrate the superiority of our model against the state-of-the-art approaches.
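The continuously updated state vectors can be sketched as a per-station memory refreshed by each transaction; a hedged PyTorch sketch with illustrative names:

```python
# Each arriving transaction updates the origin and destination station
# states with a GRU cell, so states stay current in continuous time.
import torch
import torch.nn as nn

class StationMemory(nn.Module):
    def __init__(self, n_stations, d):
        super().__init__()
        self.register_buffer("state", torch.zeros(n_stations, d))
        self.cell = nn.GRUCell(2 * d, d)

    @torch.no_grad()
    def update(self, origin, dest):            # station ids of one transaction
        msg = torch.cat([self.state[origin], self.state[dest]]).unsqueeze(0)
        self.state[origin] = self.cell(msg, self.state[origin].unsqueeze(0))[0]
        self.state[dest] = self.cell(msg, self.state[dest].unsqueeze(0))[0]
```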
arXiv Detail & Related papers (2022-06-30T03:37:50Z)
- Handling Distribution Shifts on Graphs: An Invariance Perspective [78.31180235269035]
We formulate the OOD problem on graphs and develop a new invariant learning approach, Explore-to-Extrapolate Risk Minimization (EERM).
EERM resorts to multiple context explorers that are adversarially trained to maximize the variance of risks from multiple virtual environments.
We prove the validity of our method by theoretically showing its guarantee of a valid OOD solution.
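The objective behind penalizing risk variance can be sketched as follows; here the virtual environments are taken as given data splits, whereas EERM generates them with adversarially trained explorers:

```python
# Mean risk plus a variance-of-risks penalty across virtual environments.
import torch
import torch.nn.functional as F

def variance_risk_objective(model, envs, beta=1.0):
    # envs: list of (x, y) batches, one per virtual environment
    risks = torch.stack([F.mse_loss(model(x), y) for x, y in envs])
    return risks.mean() + beta * risks.var()
```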
arXiv Detail & Related papers (2022-02-05T02:31:01Z)
- Detecting Owner-member Relationship with Graph Convolution Network in Fisheye Camera System [9.665475078766017]
We propose DeepWORD, an innovative relationship prediction method built on a graph convolutional network (GCN).
Experiments show that the proposed method achieves state-of-the-art accuracy and real-time performance.
arXiv Detail & Related papers (2022-01-28T13:12:27Z)
- Towards Robust and Adaptive Motion Forecasting: A Causal Representation Perspective [72.55093886515824]
We introduce a causal formalism of motion forecasting, which casts the problem as a dynamic process with three groups of latent variables.
We devise a modular architecture that factorizes the representations of invariant mechanisms and style confounders to approximate a causal graph.
Experiment results on synthetic and real datasets show that our three proposed components significantly improve the robustness and reusability of the learned motion representations.
arXiv Detail & Related papers (2021-11-29T18:59:09Z)
- ProSTformer: Pre-trained Progressive Space-Time Self-attention Model for Traffic Flow Forecasting [6.35012051925346]
Two issues prevent the self-attention approach from being effectively applied to traffic flow forecasting.
We first factorize the dependencies and then design a space-time self-attention mechanism named ProSTformer.
In terms of RMSE, ProSTformer performs as well as or better than six state-of-the-art methods on large-scale datasets.
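One common way to factorize space-time dependencies is to attend over nodes and over time steps separately; a generic sketch (ProSTformer's progressive scheme is richer than this):

```python
# Factorized space-time self-attention: spatial attention per time step,
# then temporal attention per node.
import torch
import torch.nn as nn

class FactorizedAttention(nn.Module):
    def __init__(self, d, heads=4):
        super().__init__()
        self.spatial = nn.MultiheadAttention(d, heads, batch_first=True)
        self.temporal = nn.MultiheadAttention(d, heads, batch_first=True)

    def forward(self, x):                      # x: (batch, time, nodes, d)
        b, t, n, d = x.shape
        s = x.reshape(b * t, n, d)             # attend across nodes
        s = self.spatial(s, s, s)[0].reshape(b, t, n, d)
        u = s.permute(0, 2, 1, 3).reshape(b * n, t, d)  # attend across time
        u = self.temporal(u, u, u)[0].reshape(b, n, t, d)
        return u.permute(0, 2, 1, 3)
```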
arXiv Detail & Related papers (2021-11-03T12:20:08Z)
- Interpretable Time-series Representation Learning With Multi-Level Disentanglement [56.38489708031278]
Disentangle Time Series (DTS) is a novel disentanglement enhancement framework for sequential data.
DTS generates hierarchical semantic concepts as the interpretable and disentangled representation of time-series.
DTS achieves superior performance in downstream applications, with high interpretability of semantic concepts.
arXiv Detail & Related papers (2021-05-17T22:02:24Z)
- Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework faithfully preserves the relations between samples.
By seeking to embed samples into a subspace, we show that our method can address the large-scale and out-of-sample problems.
arXiv Detail & Related papers (2020-07-11T10:57:45Z)
- Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks [91.65637773358347]
We propose a general graph neural network framework designed specifically for multivariate time series data.
Our approach automatically extracts the uni-directed relations among variables through a graph learning module.
Our proposed model outperforms the state-of-the-art baseline methods on 3 of 4 benchmark datasets.
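A graph learning module of this kind typically learns a directed adjacency from asymmetric node embeddings; a sketch in the MTGNN style, with illustrative hyperparameters:

```python
# Learned uni-directed adjacency: asymmetric embeddings give a directed
# graph, and a top-k mask keeps it sparse.
import torch
import torch.nn as nn

class GraphLearner(nn.Module):
    def __init__(self, n_nodes, d, k=8):
        super().__init__()
        self.e1 = nn.Parameter(torch.randn(n_nodes, d))
        self.e2 = nn.Parameter(torch.randn(n_nodes, d))
        self.k = k                              # neighbors kept per node

    def forward(self):
        a = torch.relu(torch.tanh(self.e1 @ self.e2.T - self.e2 @ self.e1.T))
        mask = torch.zeros_like(a)
        mask.scatter_(-1, a.topk(self.k, dim=-1).indices, 1.0)
        return a * mask                         # sparse directed adjacency
```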
arXiv Detail & Related papers (2020-05-24T04:02:18Z)