USF-Net: A Unified Spatiotemporal Fusion Network for Ground-Based Remote Sensing Cloud Image Sequence Extrapolation
- URL: http://arxiv.org/abs/2511.09045v1
- Date: Thu, 13 Nov 2025 01:28:12 GMT
- Title: USF-Net: A Unified Spatiotemporal Fusion Network for Ground-Based Remote Sensing Cloud Image Sequence Extrapolation
- Authors: Penghui Niu, Taotao Cai, Jiashuai She, Yajuan Zhang, Junhua Gu, Ping Zhang, Jungong Han, Jianxin Li
- Abstract summary: Ground-based remote sensing cloud image sequence extrapolation is a key research area in the development of photovoltaic power systems. We propose USF-Net, a Unified Spatiotemporal Fusion Network that integrates adaptive large-kernel convolutions and a low-complexity attention mechanism. As a key contribution, we also introduce and release the ASI-CIS dataset.
- Score: 7.868367798549883
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ground-based remote sensing cloud image sequence extrapolation is a key research area in the development of photovoltaic power systems. However, existing approaches exhibit several limitations: (1) they primarily rely on static kernels to augment feature information, lacking adaptive mechanisms to extract features at varying resolutions dynamically; (2) temporal guidance is insufficient, leading to suboptimal modeling of long-range spatiotemporal dependencies; and (3) the quadratic computational cost of attention mechanisms is often overlooked, limiting efficiency in practical deployment. To address these challenges, we propose USF-Net, a Unified Spatiotemporal Fusion Network that integrates adaptive large-kernel convolutions and a low-complexity attention mechanism, combining temporal flow information within an encoder-decoder framework. Specifically, the encoder employs three basic layers to extract features, followed by the USTM, which comprises: (1) a SiB equipped with an SSM that dynamically captures multi-scale contextual information, and (2) a TiB featuring a TAM that effectively models long-range temporal dependencies while maintaining computational efficiency. In addition, a DSM with a TGM is introduced to enable unified modeling of temporally guided spatiotemporal dependencies. On the decoder side, a DUM is employed to address the common "ghosting effect." It utilizes the initial temporal state as an attention operator to preserve critical motion signatures. As a key contribution, we also introduce and release the ASI-CIS dataset. Extensive experiments on ASI-CIS demonstrate that USF-Net significantly outperforms state-of-the-art methods, establishing a superior balance between prediction accuracy and computational efficiency for ground-based cloud extrapolation. The dataset and source code will be available at https://github.com/she1110/ASI-CIS.
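The quadratic attention cost named in the abstract can be illustrated with a generic kernelized ("linear") attention. The sketch below is not USF-Net's TAM, whose details the abstract does not give. It assumes an elu(x) + 1 feature map, and all function names are hypothetical. It shows that reassociating the product as phi(Q) (phi(K)^T V) gives the same output as the N x N form while avoiding the N x N weight matrix.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def feature_map(x):
    """Positive feature map phi(x) = elu(x) + 1 (an assumed choice)."""
    return [[v + 1.0 if v > 0 else math.exp(v) for v in row] for row in x]


def kernel_attention_quadratic(q, k, v):
    """Reference form: builds the full N x N weight matrix, O(N^2 d)."""
    fq, fk = feature_map(q), feature_map(k)
    weights = matmul(fq, transpose(fk))              # N x N similarity matrix
    weights = [[w / sum(row) for w in row] for row in weights]
    return matmul(weights, v)


def kernel_attention_linear(q, k, v):
    """Same result, reassociated as phi(Q) (phi(K)^T V), O(N d^2)."""
    fq, fk = feature_map(q), feature_map(k)
    kv = matmul(transpose(fk), v)                    # d x d_v, size independent of N
    k_sum = [sum(col) for col in zip(*fk)]           # length-d normalizer term
    out = []
    for row in fq:
        norm = sum(a * b for a, b in zip(row, k_sum))
        out.append([x / norm for x in matmul([row], kv)[0]])
    return out
```

For N frames or tokens of width d, the first function costs on the order of N^2 d operations and the second on the order of N d^2, so the saving only matters when N is much larger than d.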
Related papers
- FAIM: Frequency-Aware Interactive Mamba for Time Series Classification [87.84511960413715]
Time series classification (TSC) is crucial in numerous real-world applications, such as environmental monitoring, medical diagnosis, and posture recognition. We propose FAIM, a lightweight Frequency-Aware Interactive Mamba model. We show that FAIM consistently outperforms existing state-of-the-art (SOTA) methods, achieving a superior trade-off between accuracy and efficiency.
arXiv Detail & Related papers (2025-11-26T08:36:33Z)
- Artificial Intelligence-Based Multiscale Temporal Modeling for Anomaly Detection in Cloud Services [10.421371572062595]
This study proposes an anomaly detection method based on the Transformer architecture with integrated multiscale feature perception. The proposed method outperforms mainstream baseline models in key metrics, including precision, recall, AUC, and F1-score.
arXiv Detail & Related papers (2025-08-20T07:52:36Z)
- Electromyography-Based Gesture Recognition: Hierarchical Feature Extraction for Enhanced Spatial-Temporal Dynamics [0.7083699704958353]
We propose a lightweight squeeze-excitation deep learning-based multi-stream spatial-temporal dynamics time-varying feature extraction approach. The proposed model was tested on the Ninapro DB2, DB4, and DB5 datasets, achieving accuracy rates of 96.41%, 92.40%, and 93.34%, respectively.
arXiv Detail & Related papers (2025-04-04T07:11:12Z)
- SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining [62.433137130087445]
SuperFlow++ is a novel framework that integrates pretraining and downstream tasks using consecutive camera pairs. We show that SuperFlow++ outperforms state-of-the-art methods across diverse tasks and driving conditions. With strong generalizability and computational efficiency, SuperFlow++ establishes a new benchmark for data-efficient LiDAR-based perception in autonomous driving.
arXiv Detail & Related papers (2025-03-25T17:59:57Z)
- STNMamba: Mamba-based Spatial-Temporal Normality Learning for Video Anomaly Detection [48.997518615379995]
Video anomaly detection (VAD) has been extensively researched due to its potential for intelligent video systems. Most existing methods based on CNNs and transformers still suffer from substantial computational burdens. We propose a lightweight and effective Mamba-based network named STNMamba to enhance the learning of spatial-temporal normality.
arXiv Detail & Related papers (2024-12-28T08:49:23Z)
- Cross Space and Time: A Spatio-Temporal Unitized Model for Traffic Flow Forecasting [16.782154479264126]
Predicting spatio-temporal traffic flow presents challenges due to complex interactions between spatial and temporal factors.
Existing approaches address these dimensions in isolation, neglecting their critical interdependencies.
In this paper, we introduce the Adaptive Spatio-Temporal Unitized Cell (ASTUC), a unified framework designed to capture both spatial and temporal dependencies.
arXiv Detail & Related papers (2024-11-14T07:34:31Z)
- EffiCANet: Efficient Time Series Forecasting with Convolutional Attention [12.784289506021265]
EffiCANet is designed to enhance forecasting accuracy while maintaining computational efficiency.
EffiCANet achieves a maximum reduction of 10.02% in MAE over state-of-the-art models.
arXiv Detail & Related papers (2024-11-07T12:54:42Z)
- Temporal Feature Matters: A Framework for Diffusion Model Quantization [105.3033493564844]
Diffusion models rely on the time-step for the multi-round denoising. We introduce a novel quantization framework that includes three strategies. This framework preserves most of the temporal information and ensures high-quality end-to-end generation.
arXiv Detail & Related papers (2024-07-28T17:46:15Z)
- SFANet: Spatial-Frequency Attention Network for Weather Forecasting [54.470205739015434]
Weather forecasting plays a critical role in various sectors, driving decision-making and risk management.
Traditional methods often struggle to capture the complex dynamics of meteorological systems.
We propose a novel framework designed to address these challenges and enhance the accuracy of weather prediction.
arXiv Detail & Related papers (2024-05-29T08:00:15Z)
- Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution [57.71199089609161]
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted to deliver high prediction capacity because of the high computational self-attention mechanism.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
- TCJA-SNN: Temporal-Channel Joint Attention for Spiking Neural Networks [22.965024490694525]
Spiking Neural Networks (SNNs) are attracting widespread interest due to their biological plausibility, energy efficiency, and powerful spatio-temporal information representation ability.
We present a Temporal-Channel Joint Attention mechanism for SNNs, referred to as TCJA-SNN.
The proposed TCJA-SNN framework can effectively assess the significance of spike sequence from both spatial and temporal dimensions.
arXiv Detail & Related papers (2022-06-21T08:16:08Z)
- STJLA: A Multi-Context Aware Spatio-Temporal Joint Linear Attention Network for Traffic Forecasting [7.232141271583618]
We propose a novel deep learning model for traffic forecasting named Multi-Context Aware Spatio-Temporal Joint Linear Attention (STJLA).
STJLA applies linear attention to a spatio-temporal joint graph to capture global dependence between all spatio-temporal nodes efficiently.
Experiments on two real-world traffic datasets, England and PEMSD7, demonstrate that our STJLA can achieve 9.83% and 3.08% accuracy improvement in MAE measure over state-of-the-art baselines.
arXiv Detail & Related papers (2021-12-04T06:39:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.