Related papers: Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation

URL: http://arxiv.org/abs/2402.11922v3
Date: Mon, 25 Mar 2024 11:39:57 GMT
Title: Spatio-Temporal Few-Shot Learning via Diffusive Neural Network Generation
Authors: Yuan Yuan, Chenyang Shao, Jingtao Ding, Depeng Jin, Yong Li
Abstract summary: We propose a novel generative pre-training framework, GPD, for spatio-temporal few-shot learning with urban knowledge transfer. We recast spatio-temporal few-shot learning as pre-training a generative diffusion model, which generates tailored neural networks guided by prompts. GPD consistently outperforms state-of-the-art baselines on multiple real-world datasets for tasks such as traffic speed prediction and crowd flow prediction.
Score: 25.916891462152044
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Spatio-temporal modeling is foundational for smart city applications, yet it is often hindered by data scarcity in many cities and regions. To bridge this gap, we propose a novel generative pre-training framework, GPD, for spatio-temporal few-shot learning with urban knowledge transfer. Unlike conventional approaches that heavily rely on common feature extraction or intricate few-shot learning designs, our solution takes a novel approach by performing generative pre-training on a collection of neural network parameters optimized with data from source cities. We recast spatio-temporal few-shot learning as pre-training a generative diffusion model, which generates tailored neural networks guided by prompts, allowing for adaptability to diverse data distributions and city-specific characteristics. GPD employs a Transformer-based denoising diffusion model, which is model-agnostic to integrate with powerful spatio-temporal neural networks. By addressing challenges arising from data gaps and the complexity of generalizing knowledge across cities, our framework consistently outperforms state-of-the-art baselines on multiple real-world datasets for tasks such as traffic speed prediction and crowd flow prediction. The implementation of our approach is available: https://github.com/tsinghua-fib-lab/GPD.
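Illustrative sketch (not taken from the GPD repository): the abstract describes a denoising diffusion model trained on neural network parameters. Assuming the standard DDPM forward process and a linear noise schedule, neither of which the abstract specifies, the noising step applied to a flattened parameter vector could look like the following. All function names and the example weights are hypothetical, and the prompt-conditioned Transformer denoiser is omitted.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (a common DDPM default, assumed here)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) for s <= t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def noise_parameters(flat_params, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) for a flattened parameter vector x_0.

    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in flat_params]
    noisy = [
        math.sqrt(a) * w + math.sqrt(1.0 - a) * e
        for w, e in zip(flat_params, eps)
    ]
    return noisy, eps


rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

# Hypothetical flattened weights of a small model trained on one source city.
weights = [0.5, -1.2, 0.03, 2.0]
noisy, eps = noise_parameters(weights, t=500, alpha_bars=alpha_bars, rng=rng)
```

In a setup like this, a denoiser would be trained to predict `eps` from `noisy`, the step index, and a conditioning prompt; at inference, parameters for a target city would be generated by iteratively denoising from Gaussian noise.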

Related papers

Collaborative Imputation of Urban Time Series through Cross-city Meta-learning [54.438991949772145]
We propose a novel collaborative imputation paradigm leveraging meta-learned implicit neural representations (INRs). We then introduce a cross-city collaborative learning scheme through model-agnostic meta-learning. Experiments on a diverse urban dataset from 20 global cities demonstrate our model's superior imputation performance and generalizability.
arXiv Detail & Related papers (2025-01-20T07:12:40Z)
ST-FiT: Inductive Spatial-Temporal Forecasting with Limited Training Data [59.78770412981611]
In real-world applications, most nodes may not possess any available temporal data during training. We propose a principled framework named ST-FiT to handle this problem.
arXiv Detail & Related papers (2024-12-14T17:51:29Z)
Expand and Compress: Exploring Tuning Principles for Continual Spatio-Temporal Graph Forecasting [17.530885640317372]
We propose a novel prompt tuning-based continual forecasting method. Specifically, we integrate the base spatio-temporal graph neural network with a continuous prompt pool stored in memory. This method ensures that the model sequentially learns from the spatio-temporal data stream to accomplish tasks for corresponding periods.
arXiv Detail & Related papers (2024-10-16T14:12:11Z)
Navigating Spatio-Temporal Heterogeneity: A Graph Transformer Approach for Traffic Forecasting [13.309018047313801]
Traffic forecasting has emerged as a crucial research area in the development of smart cities. Recent advancements in network modeling for spatio-temporal correlations are starting to see diminishing returns in performance. To tackle these challenges, we introduce the Spatio-Temporal Graph Transformer (STGormer). We design two straightforward yet effective spatial encoding methods based on the structure and integrate time position into the vanilla transformer to capture spatio-temporal traffic patterns.
arXiv Detail & Related papers (2024-08-20T13:18:21Z)
Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning. Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation. Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective [64.04617968947697]
We introduce a novel data-model co-design perspective to promote superior weight sparsity. Specifically, customized Visual Prompts are mounted to upgrade neural network sparsification in our proposed VPNs framework.
arXiv Detail & Related papers (2023-12-03T13:50:24Z)
Graph-based Multi-ODE Neural Networks for Spatio-Temporal Traffic Forecasting [8.832864937330722]
Long-range traffic forecasting remains a challenging task due to the intricate and extensive spatio-temporal correlations observed in traffic networks. In this paper, we propose an architecture called Graph-based Multi-ODE Neural Networks (GRAM-ODE), which is designed with multiple connective ODE-GNN modules to learn better representations. Our extensive set of experiments conducted on six real-world datasets demonstrates the superior performance of GRAM-ODE compared with state-of-the-art baselines.
arXiv Detail & Related papers (2023-05-30T02:10:42Z)
Online Evolutionary Neural Architecture Search for Multivariate Non-Stationary Time Series Forecasting [72.89994745876086]
This work presents the Online Neuro-Evolution-based Neural Architecture Search (ONE-NAS) algorithm. ONE-NAS is a novel neural architecture search method capable of automatically designing and dynamically training recurrent neural networks (RNNs) for online forecasting tasks. Results demonstrate that ONE-NAS outperforms traditional statistical time series forecasting methods.
arXiv Detail & Related papers (2023-02-20T22:25:47Z)
Learning to Learn with Generative Models of Neural Network Checkpoints [71.06722933442956]
We construct a dataset of neural network checkpoints and train a generative model on the parameters. We find that our approach successfully generates parameters for a wide range of loss prompts. We apply our method to different neural network architectures and tasks in supervised and reinforcement learning.
arXiv Detail & Related papers (2022-09-26T17:59:58Z)
Spatio-Temporal Graph Few-Shot Learning with Cross-City Knowledge Transfer [58.6106391721944]
Cross-city knowledge has shown its promise, where the model learned from data-sufficient cities is leveraged to benefit the learning process of data-scarce cities. We propose a model-agnostic few-shot learning framework for spatio-temporal graphs called ST-GFSL. We conduct comprehensive experiments on four traffic speed prediction benchmarks and the results demonstrate the effectiveness of ST-GFSL compared with state-of-the-art methods.
arXiv Detail & Related papers (2022-05-27T12:46:52Z)
ONE-NAS: An Online NeuroEvolution based Neural Architecture Search for Time Series Forecasting [3.3758186776249928]
This work presents the Online NeuroEvolution based Neural Architecture Search (ONE-NAS) algorithm. ONE-NAS is the first neural architecture search algorithm capable of automatically designing and training new recurrent neural networks (RNNs) in an online setting. It is shown to outperform traditional statistical time series forecasting, including naive, moving average, and exponential smoothing methods.
arXiv Detail & Related papers (2022-02-27T22:58:32Z)
Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques. Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance. We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
Industrial Forecasting with Exponentially Smoothed Recurrent Neural Networks [0.0]
We present a class of exponentially smoothed recurrent neural networks (RNNs) which are well suited to modeling non-stationary dynamical systems arising in industrial applications. Application of exponentially smoothed RNNs to forecasting electricity load, weather data, and stock prices highlights the efficacy of exponential smoothing of the hidden state for multi-step time series forecasting.
arXiv Detail & Related papers (2020-04-09T17:53:49Z)

This list is automatically generated from the titles and abstracts of the papers on this site.