Related papers: USF-Net: A Unified Spatiotemporal Fusion Network for Ground-Based Remote Sensing Cloud Image Sequence Extrapolation

USF-Net: A Unified Spatiotemporal Fusion Network for Ground-Based Remote Sensing Cloud Image Sequence Extrapolation

URL: http://arxiv.org/abs/2511.09045v1
Date: Thu, 13 Nov 2025 01:28:12 GMT
Title: USF-Net: A Unified Spatiotemporal Fusion Network for Ground-Based Remote Sensing Cloud Image Sequence Extrapolation
Authors: Penghui Niu, Taotao Cai, Jiashuai She, Yajuan Zhang, Junhua Gua, Ping Zhanga, Jungong Hane, Jianxin Li,
Abstract summary: Ground-based remote sensing cloud image sequence extrapolation is a key research area in the development of photovoltaic power systems.<n>We propose USF-Net, a Unified Stemporal Fusion that integrates adaptive large- kernel convolutions and a low-complexity attention mechanism.<n>As a key contribution, we also introduce and release the ASI-CIS dataset.
Score: 7.868367798549883
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Ground-based remote sensing cloud image sequence extrapolation is a key research area in the development of photovoltaic power systems. However, existing approaches exhibit several limitations:(1)they primarily rely on static kernels to augment feature information, lacking adaptive mechanisms to extract features at varying resolutions dynamically;(2)temporal guidance is insufficient, leading to suboptimal modeling of long-range spatiotemporal dependencies; and(3)the quadratic computational cost of attention mechanisms is often overlooked, limiting efficiency in practical deployment. To address these challenges, we propose USF-Net, a Unified Spatiotemporal Fusion Network that integrates adaptive large-kernel convolutions and a low-complexity attention mechanism, combining temporal flow information within an encoder-decoder framework. Specifically, the encoder employs three basic layers to extract features. Followed by the USTM, which comprises:(1)a SiB equipped with a SSM that dynamically captures multi-scale contextual information, and(2)a TiB featuring a TAM that effectively models long-range temporal dependencies while maintaining computational efficiency. In addition, a DSM with a TGM is introduced to enable unified modeling of temporally guided spatiotemporal dependencies. On the decoder side, a DUM is employed to address the common "ghosting effect." It utilizes the initial temporal state as an attention operator to preserve critical motion signatures. As a key contribution, we also introduce and release the ASI-CIS dataset. Extensive experiments on ASI-CIS demonstrate that USF-Net significantly outperforms state-of-the-art methods, establishing a superior balance between prediction accuracy and computational efficiency for ground-based cloud extrapolation. The dataset and source code will be available at https://github.com/she1110/ASI-CIS.

Related papers

FAIM: Frequency-Aware Interactive Mamba for Time Series Classification [87.84511960413715]
Time series classification (TSC) is crucial in numerous real-world applications, such as environmental monitoring, medical diagnosis, and posture recognition.<n>We propose FAIM, a lightweight Frequency-Aware Interactive Mamba model.<n>We show that FAIM consistently outperforms existing state-of-the-art (SOTA) methods, achieving a superior trade-off between accuracy and efficiency.
arXiv Detail & Related papers (2025-11-26T08:36:33Z)
Artificial Intelligence-Based Multiscale Temporal Modeling for Anomaly Detection in Cloud Services [10.421371572062595]
This study proposes an anomaly detection method based on the Transformer architecture with integrated multiscale feature perception.<n>The proposed method outperforms mainstream baseline models in key metrics, including precision, recall, AUC, and F1-score.
arXiv Detail & Related papers (2025-08-20T07:52:36Z)
Electromyography-Based Gesture Recognition: Hierarchical Feature Extraction for Enhanced Spatial-Temporal Dynamics [0.7083699704958353]
We propose a lightweight squeeze-excitation deep learning-based multi stream spatial temporal dynamics time-varying feature extraction approach.<n>The proposed model was tested on the Ninapro DB2, DB4, and DB5 datasets, achieving accuracy rates of 96.41%, 92.40%, and 93.34%, respectively.
arXiv Detail & Related papers (2025-04-04T07:11:12Z)
SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining [62.433137130087445]
SuperFlow++ is a novel framework that integrates pretraining and downstream tasks using consecutive camera pairs.<n>We show that SuperFlow++ outperforms state-of-the-art methods across diverse tasks and driving conditions.<n>With strong generalizability and computational efficiency, SuperFlow++ establishes a new benchmark for data-efficient LiDAR-based perception in autonomous driving.
arXiv Detail & Related papers (2025-03-25T17:59:57Z)
STNMamba: Mamba-based Spatial-Temporal Normality Learning for Video Anomaly Detection [48.997518615379995]
Video anomaly detection (VAD) has been extensively researched due to its potential for intelligent video systems.<n>Most existing methods based on CNNs and transformers still suffer from substantial computational burdens.<n>We propose a lightweight and effective Mamba-based network named STNMamba to enhance the learning of spatial-temporal normality.
arXiv Detail & Related papers (2024-12-28T08:49:23Z)
Cross Space and Time: A Spatio-Temporal Unitized Model for Traffic Flow Forecasting [16.782154479264126]
Predicting backbone-temporal traffic flow presents challenges due to complex interactions between temporal factors. Existing approaches address these dimensions in isolation, neglecting their critical interdependencies. In this paper, we introduce Sanonymous-Temporal Unitized Unitized Cell (ASTUC), a unified framework designed to capture both spatial and temporal dependencies.
arXiv Detail & Related papers (2024-11-14T07:34:31Z)
EffiCANet: Efficient Time Series Forecasting with Convolutional Attention [12.784289506021265]
EffiCANet is designed to enhance forecasting accuracy while maintaining computational efficiency. EffiCANet achieves the maximum reduction of 10.02% in MAE over state-of-the-art models.
arXiv Detail & Related papers (2024-11-07T12:54:42Z)
Temporal Feature Matters: A Framework for Diffusion Model Quantization [105.3033493564844]
Diffusion models rely on the time-step for the multi-round denoising.<n>We introduce a novel quantization framework that includes three strategies.<n>This framework preserves most of the temporal information and ensures high-quality end-to-end generation.
arXiv Detail & Related papers (2024-07-28T17:46:15Z)
SFANet: Spatial-Frequency Attention Network for Weather Forecasting [54.470205739015434]
Weather forecasting plays a critical role in various sectors, driving decision-making and risk management. Traditional methods often struggle to capture the complex dynamics of meteorological systems. We propose a novel framework designed to address these challenges and enhance the accuracy of weather prediction.
arXiv Detail & Related papers (2024-05-29T08:00:15Z)
Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning. Transformer models have been adopted to deliver high prediction capacity because of the high computational self-attention mechanism. We propose an efficient Transformerbased model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
TCJA-SNN: Temporal-Channel Joint Attention for Spiking Neural Networks [22.965024490694525]
Spiking Neural Networks (SNNs) are attracting widespread interest due to their biological plausibility, energy efficiency and powerful-temporal information representation ability. We present a Temporal-Channel Joint Attention mechanism for SNNs, referred to as TCJA-SNN. The proposed TCJA-SNN framework can effectively assess the significance of spike sequence from both spatial and temporal dimensions.
arXiv Detail & Related papers (2022-06-21T08:16:08Z)
STJLA: A Multi-Context Aware Spatio-Temporal Joint Linear Attention Network for Traffic Forecasting [7.232141271583618]
We propose a novel deep learning model for traffic forecasting named inefficient-Context Spatio-Temporal Joint Linear Attention (SSTLA) SSTLA applies linear attention to a joint graph to capture global dependence between alltemporal- nodes efficiently. Experiments on two real-world traffic datasets, England and Temporal7, demonstrate that our STJLA can achieve 9.83% and 3.08% 3.08% accuracy in MAE measure over state-of-the-art baselines.
arXiv Detail & Related papers (2021-12-04T06:39:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.