Related papers: RainDiff: End-to-end Precipitation Nowcasting Via Token-wise Attention Diffusion

URL: http://arxiv.org/abs/2510.14962v1
Date: Thu, 16 Oct 2025 17:59:13 GMT
Title: RainDiff: End-to-end Precipitation Nowcasting Via Token-wise Attention Diffusion
Authors: Thao Nguyen, Jiaqi Ma, Fahad Shahbaz Khan, Souhaib Ben Taieb, Salman Khan
Abstract summary: We propose a Token-wise Attention integrated into not only the U-Net diffusion model but also the radar-temporal encoder. Unlike prior approaches, our method integrates attention into the architecture without incurring the high resource cost typical of pixel-space diffusion. Our experiments and evaluations demonstrate that the proposed method significantly outperforms state-of-the-art approaches, yielding superior local fidelity, generalization, and robustness in complex precipitation forecasting scenarios.
Score: 64.49056527678606
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Precipitation nowcasting, predicting future radar echo sequences from current observations, is a critical yet challenging task due to the inherently chaotic and tightly coupled spatio-temporal dynamics of the atmosphere. While recent advances in diffusion-based models attempt to capture both large-scale motion and fine-grained stochastic variability, they often suffer from scalability issues: latent-space approaches require a separately trained autoencoder, adding complexity and limiting generalization, while pixel-space approaches are computationally intensive and often omit attention mechanisms, reducing their ability to model long-range spatio-temporal dependencies. To address these limitations, we propose a Token-wise Attention integrated into not only the U-Net diffusion model but also the spatio-temporal encoder that dynamically captures multi-scale spatial interactions and temporal evolution. Unlike prior approaches, our method natively integrates attention into the architecture without incurring the high resource cost typical of pixel-space diffusion, thereby eliminating the need for separate latent modules. Our extensive experiments and visual evaluations across diverse datasets demonstrate that the proposed method significantly outperforms state-of-the-art approaches, yielding superior local fidelity, generalization, and robustness in complex precipitation forecasting scenarios.

Related papers

Learning Multi-Modal Mobility Dynamics for Generalized Next Location Recommendation [51.00494428978262]
We leverage multi-modal spatial-temporal knowledge to characterize mobility dynamics for the location recommendation task. First, we construct a unified spatial-temporal relational graph (STRG) for multi-modal representation. Second, we design a gating mechanism to fuse spatial-temporal graph representations of different modalities.
arXiv Detail & Related papers (2025-12-27T14:23:04Z)
Breaking the Discretization Barrier of Continuous Physics Simulation Learning [16.740327071700268]
We propose a purely data-driven method to model continuous physics simulation from partial observations. Specifically, we employ a multiplicative filter network to fuse and encode spatial information with the corresponding observations. We customize geometric grids and use a message-passing mechanism to map features from the original spatial domain to the customized grids.
arXiv Detail & Related papers (2025-09-22T16:10:58Z)
ST-GS: Vision-Based 3D Semantic Occupancy Prediction with Spatial-Temporal Gaussian Splatting [21.87807066521776]
3D occupancy prediction is critical for comprehensive scene understanding in vision-centric autonomous driving. Recent advances have explored utilizing 3D semantic Gaussians to model occupancy while reducing computational overhead. We propose a novel Spatial-Temporal Gaussian Splatting (ST-GS) framework to enhance both spatial and temporal modeling.
arXiv Detail & Related papers (2025-09-20T06:36:30Z)
StateSpaceDiffuser: Bringing Long Context to Diffusion World Models [53.05314852577144]
We introduce StateSpaceDiffuser, where a diffusion model is enabled to perform long-context tasks by integrating features from a state-space model. This design restores long-term memory while preserving the high-fidelity synthesis of diffusion models. Experiments show that StateSpaceDiffuser significantly outperforms a strong diffusion-only baseline.
arXiv Detail & Related papers (2025-05-28T11:27:54Z)
Dynamical Diffusion: Learning Temporal Dynamics with Diffusion Models [71.63194926457119]
We introduce Dynamical Diffusion (DyDiff), a theoretically sound framework that incorporates temporally aware forward and reverse processes. Experiments across scientific spatiotemporal forecasting, video prediction, and time series forecasting demonstrate that Dynamical Diffusion consistently improves performance in temporal predictive tasks.
arXiv Detail & Related papers (2025-03-02T16:10:32Z)
Multi-Agent Path Finding in Continuous Spaces with Projected Diffusion Models [57.45019514036948]
Multi-Agent Path Finding (MAPF) is a fundamental problem in robotics. This work proposes a novel approach that integrates constrained optimization with diffusion models for MAPF in continuous spaces.
arXiv Detail & Related papers (2024-12-23T21:27:19Z)
ST-ReP: Learning Predictive Representations Efficiently for Spatial-Temporal Forecasting [7.637123047745445]
Self-supervised methods are increasingly adapted to learn spatial-temporal representations. Current value reconstruction and future value prediction are integrated into the pre-training framework. Multi-time scale analysis is incorporated into the self-supervised loss to enhance predictive capability.
arXiv Detail & Related papers (2024-12-19T05:33:55Z)
SFANet: Spatial-Frequency Attention Network for Weather Forecasting [54.470205739015434]
Weather forecasting plays a critical role in various sectors, driving decision-making and risk management. Traditional methods often struggle to capture the complex dynamics of meteorological systems. We propose a novel framework designed to address these challenges and enhance the accuracy of weather prediction.
arXiv Detail & Related papers (2024-05-29T08:00:15Z)
Triplet Attention Transformer for Spatiotemporal Predictive Learning [9.059462850026216]
We propose an innovative triplet attention transformer designed to capture both inter-frame dynamics and intra-frame static features. The model incorporates the Triplet Attention Module (TAM), which replaces traditional recurrent units by exploring self-attention mechanisms in temporal, spatial, and channel dimensions.
arXiv Detail & Related papers (2023-10-28T12:49:33Z)
A Spatial-Temporal Attentive Network with Spatial Continuity for Trajectory Prediction [74.00750936752418]
We propose a novel model named spatial-temporal attentive network with spatial continuity (STAN-SC). First, a spatial-temporal attention mechanism is presented to explore the most useful and important information. Second, we construct a joint feature sequence based on the sequence and instant state information so that the generated trajectories keep spatial continuity.
arXiv Detail & Related papers (2020-03-13T04:35:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.
