Contextualizing MLP-Mixers Spatiotemporally for Urban Data Forecast at
Scale
- URL: http://arxiv.org/abs/2307.01482v5
- Date: Thu, 8 Feb 2024 03:31:04 GMT
- Title: Contextualizing MLP-Mixers Spatiotemporally for Urban Data Forecast at
Scale
- Authors: Tong Nie, Guoyang Qin, Lijun Sun, Wei Ma, Yu Mei, Jian Sun
- Abstract summary: Spatiotemporal urban data (STUD) displays complex correlational patterns.
Because STUD is often massive in scale, practitioners need to strike a balance between effectiveness and efficiency.
An alternative paradigm called MLP-Mixer has the potential for both simplicity and effectiveness.
- Score: 57.38373754100004
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Spatiotemporal urban data (STUD) displays complex correlational patterns.
Extensive advanced techniques have been designed to capture these patterns for
effective forecasting. However, because STUD is often massive in scale,
practitioners need to strike a balance between effectiveness and efficiency by
choosing computationally efficient models. An alternative paradigm called
MLP-Mixer has the potential for both simplicity and effectiveness. Taking
inspiration from its success in other domains, we propose an adapted version,
named NexuSQN, for STUD forecast at scale. We identify the challenges faced
when directly applying MLP-Mixers as series- and window-wise multivaluedness
and propose the ST-contextualization to distinguish between spatial and
temporal patterns. Experimental results surprisingly demonstrate that
MLP-Mixers with ST-contextualization can rival SOTA performance when tested on
several urban benchmarks. Furthermore, it was deployed in a collaborative urban
congestion project with Baidu, specifically evaluating its ability to forecast
traffic states in megacities like Beijing and Shanghai. Our findings contribute
to the exploration of simple yet effective models for real-world STUD
forecasting.
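As a rough illustration of the abstract's core idea, the sketch below adds learned spatial (per-node) and temporal (per-step) context embeddings to a toy MLP-Mixer block, so that otherwise indistinguishable series and windows become separable. This is a minimal numpy sketch of the general "ST-contextualization" concept, not NexuSQN's actual architecture; all shapes, names, and initializations are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w1, w2):
    """Two-layer MLP with ReLU, applied along the last axis."""
    return np.maximum(x @ w1, 0.0) @ w2

# Hypothetical shapes: N nodes (sensors), T input steps, D hidden channels.
N, T, D = 8, 12, 16
x = rng.normal(size=(N, T))                 # one traffic series per node

# ST-contextualization (sketch): learned spatial (per-node) and temporal
# (per-step) embeddings let a permutation-agnostic mixer distinguish
# series and windows that would otherwise look identical.
node_emb = rng.normal(size=(N, D))          # spatial context
time_emb = rng.normal(size=(T, D))          # temporal context

w_in = rng.normal(size=(1, D)) * 0.1
h = x[..., None] @ w_in + node_emb[:, None, :] + time_emb[None, :, :]

# One mixer block: token-mixing over time, then channel-mixing over D.
w_t1, w_t2 = rng.normal(size=(T, T)) * 0.1, rng.normal(size=(T, T)) * 0.1
w_c1, w_c2 = rng.normal(size=(D, D)) * 0.1, rng.normal(size=(D, D)) * 0.1
h = h + np.transpose(mlp(np.transpose(h, (0, 2, 1)), w_t1, w_t2), (0, 2, 1))
h = h + mlp(h, w_c1, w_c2)

# Readout: forecast H future steps per node from time-pooled features.
H = 3
w_out = rng.normal(size=(D, H)) * 0.1
forecast = h.mean(axis=1) @ w_out           # shape (N, H)
```

Because mixing happens with plain matrix multiplications rather than attention or graph convolutions, the per-step cost stays linear in the number of nodes and channels, which is the efficiency argument made in the abstract.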
Related papers
- Advancing the Robustness of Large Language Models through Self-Denoised Smoothing [50.54276872204319]
Large language models (LLMs) have achieved significant success, but their vulnerability to adversarial perturbations has raised considerable concerns.
We propose to leverage the multitasking nature of LLMs to first denoise the noisy inputs and then to make predictions based on these denoised versions.
Unlike previous denoised smoothing techniques in computer vision, which require training a separate model to enhance the robustness of LLMs, our method offers significantly better efficiency and flexibility.
arXiv Detail & Related papers (2024-04-18T15:47:00Z) - Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance [55.872926690722714]
We study the predictability of model performance as a function of the data mixture proportions.
We propose nested use of the scaling laws of training steps, model sizes, and our data mixing law.
Our method effectively optimizes the training mixture of a 1B model trained for 100B tokens in RedPajama.
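The "mixing law" idea above can be illustrated by fitting a simple parametric loss curve to a few observed mixtures and extrapolating to an untried one. The exponential form, the toy data, and the grid-search fit below are all assumptions for illustration, not the paper's exact law or method.

```python
import numpy as np

# Hypothetical illustration: fit L(r) = c + k * exp(b * r) to validation
# losses observed at a few mixture proportions r, then predict the loss
# at an untried proportion. Functional form and data are assumptions.

def fit_mixing_law(r, loss):
    """Fit c, k, b via a coarse grid over b plus linear least squares."""
    best = None
    for b in np.linspace(-5.0, 5.0, 201):
        X = np.stack([np.ones_like(r), np.exp(b * r)], axis=1)
        coef, *_ = np.linalg.lstsq(X, loss, rcond=None)
        err = np.sum((loss - X @ coef) ** 2)
        if best is None or err < best[0]:
            best = (err, coef[0], coef[1], b)
    _, c, k, b = best
    return c, k, b

def predict(r, c, k, b):
    return c + k * np.exp(b * r)

# Toy observations: loss falls as the proportion r of one source rises.
r_obs = np.array([0.1, 0.3, 0.5, 0.7])
loss_obs = 2.0 + 1.5 * np.exp(-3.0 * r_obs)
c, k, b = fit_mixing_law(r_obs, loss_obs)
```

Nesting such fits (over steps, model sizes, and mixtures, as the summary describes) would let one pick a mixture before committing to a full-scale run.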
arXiv Detail & Related papers (2024-03-25T17:14:00Z) - FairSTG: Countering performance heterogeneity via collaborative sample-level optimization [11.332049332977396]
We propose a model-independent Fairness-aware framework for smart Spatiotemporal Graph learning (FairSTG).
Our work can potentially alleviate the risks of spatiotemporal resource allocation for underrepresented urban regions.
arXiv Detail & Related papers (2024-03-19T02:59:50Z) - Supervised Contrastive Learning based Dual-Mixer Model for Remaining
Useful Life Prediction [3.081898819471624]
Remaining Useful Life (RUL) prediction aims to provide an accurate estimate of the remaining time from the current moment to the complete failure of the device.
To overcome the shortcomings of rigid combination for temporal and spatial features in most existing RUL prediction approaches, a spatial-temporal homogeneous feature extractor, named Dual-Mixer model, is proposed.
The effectiveness of the proposed method is validated through comparisons with other latest research works on the C-MAPSS dataset.
arXiv Detail & Related papers (2024-01-29T14:38:44Z) - TiMix: Text-aware Image Mixing for Effective Vision-Language
Pre-training [42.142924806184425]
Mixed data samples for cross-modal contrastive learning implicitly serve as a regularizer for the contrastive loss.
TiMix exhibits a comparable performance on downstream tasks, even with a reduced amount of training data and shorter training time, when benchmarked against existing methods.
arXiv Detail & Related papers (2023-12-14T12:02:24Z) - Exploring Progress in Multivariate Time Series Forecasting:
Comprehensive Benchmarking and Heterogeneity Analysis [72.18987459587682]
We introduce BasicTS, a benchmark designed for fair comparisons in MTS forecasting.
We highlight the heterogeneity among MTS datasets and classify them based on temporal and spatial characteristics.
arXiv Detail & Related papers (2023-10-09T19:52:22Z) - MLPST: MLP is All You Need for Spatio-Temporal Prediction [40.65579041549435]
Traffic prediction is a typical spatio-temporal prediction task.
We propose a pure multi-layer perceptron architecture for traffic prediction.
arXiv Detail & Related papers (2023-09-23T12:58:16Z) - Spatial-Temporal Identity: A Simple yet Effective Baseline for
Multivariate Time Series Forecasting [17.84296081495185]
We explore the critical factors of MTS forecasting and design a model that is as powerful as STGNNs, but more concise and efficient.
We identify the indistinguishability of samples in both spatial and temporal dimensions as a key bottleneck, and propose a simple yet effective baseline for MTS forecasting.
These results suggest that we can design efficient and effective models as long as they solve the indistinguishability of samples, without being limited to STGNNs.
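The "indistinguishability" bottleneck described above has a very small illustration: two series can present identical input windows yet require different forecasts, and attaching a learned per-series identity vector makes such samples distinguishable even to a plain linear readout. This is an assumed reading of the idea, not the authors' code; all names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

window = np.array([1.0, 2.0, 3.0, 4.0])     # same recent history...
series_id = {"road_a": 0, "road_b": 1}      # ...from two different sensors

identity = rng.normal(size=(2, 4))          # learned identity embeddings
w = rng.normal(size=(8,))                   # toy linear readout weights

def forecast(window, sid):
    # Concatenating the identity features breaks the tie between
    # samples that are identical in the raw input space.
    feats = np.concatenate([window, identity[sid]])
    return feats @ w

ya = forecast(window, series_id["road_a"])
yb = forecast(window, series_id["road_b"])
# Without the identity features, ya and yb would be forced to coincide.
```

Adding an analogous time-of-day/day-of-week embedding plays the same role along the temporal dimension.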
arXiv Detail & Related papers (2022-08-10T09:25:43Z) - Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z) - A Generative Learning Approach for Spatio-temporal Modeling in Connected
Vehicular Network [55.852401381113786]
This paper proposes LaMI (Latency Model Inpainting), a novel framework to generate a comprehensive spatio-temporal quality map of the wireless access latency of connected vehicles.
LaMI adopts the idea from image inpainting and synthesizing and can reconstruct the missing latency samples by a two-step procedure.
In particular, it first discovers the spatial correlation between samples collected in various regions using a patching-based approach, and then feeds the original and highly correlated samples into a Variational Autoencoder (VAE).
arXiv Detail & Related papers (2020-03-16T03:43:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.