Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation
- URL: http://arxiv.org/abs/2402.11922v3
- Date: Mon, 25 Mar 2024 11:39:57 GMT
- Title: Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation
- Authors: Yuan Yuan, Chenyang Shao, Jingtao Ding, Depeng Jin, Yong Li
- Abstract summary: We propose a novel generative pre-training framework, GPD, for spatio-temporal few-shot learning with urban knowledge transfer.
We recast spatio-temporal few-shot learning as pre-training a generative diffusion model, which generates tailored neural networks guided by prompts.
GPD consistently outperforms state-of-the-art baselines on datasets for tasks such as traffic speed prediction and crowd flow prediction.
- Score: 25.916891462152044
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spatio-temporal modeling is foundational for smart city applications, yet it is often hindered by data scarcity in many cities and regions. To bridge this gap, we propose a novel generative pre-training framework, GPD, for spatio-temporal few-shot learning with urban knowledge transfer. Unlike conventional approaches that heavily rely on common feature extraction or intricate few-shot learning designs, our solution takes a novel approach by performing generative pre-training on a collection of neural network parameters optimized with data from source cities. We recast spatio-temporal few-shot learning as pre-training a generative diffusion model, which generates tailored neural networks guided by prompts, allowing for adaptability to diverse data distributions and city-specific characteristics. GPD employs a Transformer-based denoising diffusion model, which is model-agnostic to integrate with powerful spatio-temporal neural networks. By addressing challenges arising from data gaps and the complexity of generalizing knowledge across cities, our framework consistently outperforms state-of-the-art baselines on multiple real-world datasets for tasks such as traffic speed prediction and crowd flow prediction. The implementation of our approach is available: https://github.com/tsinghua-fib-lab/GPD.
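The abstract describes running a denoising diffusion model over neural network parameters. As a rough illustration only, the sketch below applies the standard DDPM forward (noising) step to a flattened parameter vector. It is not taken from the GPD repository: the function names, the linear noise schedule, and its default values are assumptions made for this example, and the prompt conditioning and Transformer denoiser mentioned in the abstract are omitted.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed, commonly used schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product alpha_bar_t = prod_{s <= t} (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def noise_parameters(theta0, t, alpha_bars, rng):
    """Sample theta_t = sqrt(alpha_bar_t) * theta0 + sqrt(1 - alpha_bar_t) * eps."""
    ab = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in theta0]
    theta_t = [math.sqrt(ab) * w + math.sqrt(1.0 - ab) * e
               for w, e in zip(theta0, eps)]
    return theta_t, eps


def recover_parameters(theta_t, eps_hat, t, alpha_bars):
    """Invert the forward step given a noise estimate eps_hat.

    In a trained model, eps_hat would come from the denoising network;
    here it is simply passed in by the caller.
    """
    ab = alpha_bars[t]
    return [(x - math.sqrt(1.0 - ab) * e) / math.sqrt(ab)
            for x, e in zip(theta_t, eps_hat)]


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
    theta0 = [0.5, -1.25, 2.0, 0.0]  # stand-in for flattened model weights
    theta_t, eps = noise_parameters(theta0, 500, alpha_bars, rng)
    print(recover_parameters(theta_t, eps, 500, alpha_bars))
```

Passing the true noise back into `recover_parameters` reproduces the original vector up to floating-point error, which is the identity a denoiser is trained to approximate when it predicts the noise from the noisy parameters.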
Related papers
- Diffusion-based Neural Network Weights Generation [85.6725307453325]
We propose an efficient and adaptive transfer learning scheme through dataset-conditioned pretrained weights sampling.
Specifically, we use a latent diffusion model with a variational autoencoder that can reconstruct the neural network weights.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [67.25782152459851]
We introduce a novel data-model co-design perspective to promote superior weight sparsity.
Specifically, customized Visual Prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
- Graph-based Multi-ODE Neural Networks for Spatio-Temporal Traffic Forecasting [8.832864937330722]
Long-range traffic forecasting remains a challenging task due to the intricate and extensive spatio-temporal correlations observed in traffic networks.
In this paper, we propose an architecture called Graph-based Multi-ODE Neural Networks (GRAM-ODE), which is designed with multiple connective ODE-GNN modules to learn better representations.
Our extensive set of experiments conducted on six real-world datasets demonstrates the superior performance of GRAM-ODE compared with state-of-the-art baselines.
arXiv Detail & Related papers (2023-05-30T02:10:42Z)
- Automated Spatio-Temporal Graph Contrastive Learning [18.245433428868775]
We develop an automated spatio-temporal augmentation scheme with a parameterized contrastive view generator.
AutoST can adapt to the heterogeneous graph with multi-view semantics well preserved.
Experiments for three downstream spatio-temporal mining tasks on several real-world datasets demonstrate the significant performance gain.
arXiv Detail & Related papers (2023-05-06T03:52:33Z)
- Semantic-Fused Multi-Granularity Cross-City Traffic Prediction [17.020546413647708]
We propose a Semantic-Fused Multi-Granularity Transfer Learning model to achieve knowledge transfer across cities with fused semantics at different granularities.
In detail, we design a semantic fusion module to fuse various semantics while conserving static spatial dependencies.
We conduct extensive experiments on six real-world datasets to verify the effectiveness of our STL model.
arXiv Detail & Related papers (2023-02-23T04:26:34Z)
- Online Evolutionary Neural Architecture Search for Multivariate Non-Stationary Time Series Forecasting [72.89994745876086]
This work presents the Online Neuro-Evolution-based Neural Architecture Search (ONE-NAS) algorithm.
ONE-NAS is a novel neural architecture search method capable of automatically designing and dynamically training recurrent neural networks (RNNs) for online forecasting tasks.
Results demonstrate that ONE-NAS outperforms traditional statistical time series forecasting methods.
arXiv Detail & Related papers (2023-02-20T22:25:47Z)
- Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters.
We find that our approach successfully generates parameters for a wide range of loss prompts.
We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
- Spatio-Temporal Graph Few-Shot Learning with Cross-City Knowledge Transfer [58.6106391721944]
Cross-city knowledge transfer has shown its promise, where the model learned from data-sufficient cities is leveraged to benefit the learning process of data-scarce cities.
We propose a model-agnostic few-shot learning framework for spatio-temporal graphs called ST-GFSL.
We conduct comprehensive experiments on four traffic speed prediction benchmarks and the results demonstrate the effectiveness of ST-GFSL compared with state-of-the-art methods.
arXiv Detail & Related papers (2022-05-27T12:46:52Z)
- ONE-NAS: An Online NeuroEvolution based Neural Architecture Search for Time Series Forecasting [3.3758186776249928]
This work presents the Online NeuroEvolution based Neural Architecture Search (ONE-NAS) algorithm.
ONE-NAS is the first neural architecture search algorithm capable of automatically designing and training new recurrent neural networks (RNNs) in an online setting.
It is shown to outperform traditional statistical time series forecasting, including naive, moving average, and exponential smoothing methods.
arXiv Detail & Related papers (2022-02-27T22:58:32Z)
- Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
- Industrial Forecasting with Exponentially Smoothed Recurrent Neural Networks [0.0]
We present a class of exponentially smoothed recurrent neural networks (RNNs) which are well suited to modeling non-stationary dynamical systems arising in industrial applications.
Application of exponentially smoothed RNNs to forecasting electricity load, weather data, and stock prices highlights the efficacy of exponential smoothing of the hidden state for multi-step time series forecasting.
arXiv Detail & Related papers (2020-04-09T17:53:49Z)