Local-Global Temporal Difference Learning for Satellite Video Super-Resolution
- URL: http://arxiv.org/abs/2304.04421v2
- Date: Mon, 30 Oct 2023 06:29:33 GMT
- Title: Local-Global Temporal Difference Learning for Satellite Video Super-Resolution
- Authors: Yi Xiao, Qiangqiang Yuan, Kui Jiang, Xianyu Jin, Jiang He, Liangpei Zhang, Chia-Wen Lin
- Abstract summary: We propose to exploit the well-defined temporal difference for efficient and effective temporal compensation.
To fully utilize the local and global temporal information within frames, we systematically model the short-term and long-term temporal discrepancies.
Rigorous objective and subjective evaluations conducted across five mainstream video satellites demonstrate that our method performs favorably against state-of-the-art approaches.
- Score: 55.69322525367221
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Optical-flow-based and kernel-based approaches have been extensively explored
for temporal compensation in satellite Video Super-Resolution (VSR). However,
these techniques generalize poorly to large-scale or complex scenarios,
especially in satellite videos. In this paper, we propose to exploit the
well-defined temporal difference for efficient and effective temporal
compensation. To fully utilize the local and global temporal information within
frames, we systematically model the short-term and long-term temporal
discrepancies, since we observe that these discrepancies offer distinct and
mutually complementary properties. Specifically, we devise a Short-term
Temporal Difference Module (S-TDM) to extract local motion representations from
RGB difference maps between adjacent frames, which yields more clues for
accurate texture representation. To explore the global dependency in the entire
frame sequence, a Long-term Temporal Difference Module (L-TDM) is proposed,
where the differences between forward and backward segments are incorporated
and activated to guide the modulation of the temporal feature, leading to a
holistic global compensation. Moreover, we further propose a Difference
Compensation Unit (DCU) to enrich the interaction between the spatial
distribution of the target frame and the temporally compensated results, which helps
maintain spatial consistency while refining the features to avoid misalignment.
Rigorous objective and subjective evaluations conducted across five mainstream
video satellites demonstrate that our method performs favorably against
state-of-the-art approaches. Code will be available at
https://github.com/XY-boy/LGTD
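For readers who want a concrete picture of the pipeline the abstract describes, below is a minimal PyTorch sketch of the local-global temporal difference idea. The module names (S-TDM, L-TDM, DCU) come from the abstract, but every internal detail here (layer sizes, segment pooling, sigmoid gating, fusion wiring) is an illustrative assumption, not the authors' implementation; the real code is in the linked repository.

```python
# Minimal sketch of local-global temporal difference compensation.
# Module names follow the abstract (S-TDM, L-TDM, DCU); all internals are
# illustrative assumptions, not the authors' implementation (see the repo).
import torch
import torch.nn as nn


class ShortTermTDM(nn.Module):
    """S-TDM sketch: local motion cues from RGB differences of adjacent frames."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (B, T, 3, H, W); difference maps between adjacent frames
        diffs = frames[:, 1:] - frames[:, :-1]
        b, t, c, h, w = diffs.shape
        feat = self.encode(diffs.reshape(b * t, c, h, w))
        return feat.reshape(b, t, -1, h, w)  # (B, T-1, C, H, W)


class LongTermTDM(nn.Module):
    """L-TDM sketch: forward/backward segment difference gates the features."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Sigmoid(),  # the difference is "activated" into modulation weights
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, T, C, H, W); pool the forward and backward halves of the clip
        t = feats.size(1)
        diff = feats[:, : t // 2].mean(1) - feats[:, t // 2:].mean(1)
        return feats * self.gate(diff).unsqueeze(1)  # modulate every timestep


class DCU(nn.Module):
    """DCU sketch: fuse target-frame spatial features with compensated ones."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.fuse = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, target: torch.Tensor, compensated: torch.Tensor) -> torch.Tensor:
        # residual fusion keeps the target frame's spatial distribution intact
        return target + self.fuse(torch.cat([target, compensated], dim=1))


if __name__ == "__main__":
    frames = torch.randn(1, 5, 3, 64, 64)    # five hypothetical 64x64 RGB frames
    local = ShortTermTDM()(frames)           # (1, 4, 64, 64, 64)
    modulated = LongTermTDM()(local)         # globally modulated features
    target = modulated[:, 2]                 # features nearest the target frame
    out = DCU()(target, modulated.mean(1))   # (1, 64, 64, 64)
    print(out.shape)
```

The sigmoid gate stands in for the "activated" difference the abstract mentions; how the real L-TDM builds and applies the forward-backward discrepancy should be checked against the repository.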
Related papers
- Surgformer: Surgical Transformer with Hierarchical Temporal Attention for Surgical Phase Recognition [7.682613953680041]
We propose the Surgical Transformer (Surgformer) to address the issues of spatial-temporal modeling and redundancy in an end-to-end manner.
We show that our proposed Surgformer performs favorably against the state-of-the-art methods.
arXiv Detail & Related papers (2024-08-07T16:16:31Z)
- Collaborative Feedback Discriminative Propagation for Video Super-Resolution [66.61201445650323]
The key success of video super-resolution (VSR) methods stems mainly from exploring spatial and temporal information.
Inaccurate alignment usually leads to aligned features with significant artifacts.
Existing propagation modules only propagate features of the same timestep forward or backward.
arXiv Detail & Related papers (2024-04-06T22:08:20Z)
- Continuous Space-Time Video Super-Resolution Utilizing Long-Range Temporal Information [48.20843501171717]
We propose a continuous ST-VSR (CSTVSR) method that can convert a given video to any frame rate and spatial resolution.
We show that the proposed algorithm has good flexibility and achieves better performance on various datasets.
arXiv Detail & Related papers (2023-02-26T08:02:39Z)
- Temporal Consistency Learning of inter-frames for Video Super-Resolution [38.26035126565062]
Video super-resolution (VSR) aims to reconstruct high-resolution (HR) frames from a low-resolution (LR) reference frame and multiple neighboring frames.
Existing methods generally explore information propagation and frame alignment to improve the performance of VSR.
We propose a Temporal Consistency learning Network (TCNet) for VSR in an end-to-end manner, to enhance the consistency of the reconstructed videos.
arXiv Detail & Related papers (2022-11-03T08:23:57Z)
- Enhancing Space-time Video Super-resolution via Spatial-temporal Feature Interaction [9.456643513690633]
The aim of space-time video super-resolution (STVSR) is to increase both the frame rate and the spatial resolution of a video.
Recent approaches solve STVSR using end-to-end deep neural networks.
We propose a spatial-temporal feature interaction network to enhance STVSR by exploiting both spatial and temporal correlations.
arXiv Detail & Related papers (2022-07-18T22:10:57Z)
- Distortion-Aware Network Pruning and Feature Reuse for Real-time Video Segmentation [49.17930380106643]
We propose a novel framework to speed up any architecture with skip-connections for real-time vision tasks.
Specifically, at the arrival of each frame, we transform the features from the previous frame to reuse them at specific spatial bins.
We then perform partial computation of the backbone network on the regions of the current frame that capture temporal differences between the current and previous frames.
arXiv Detail & Related papers (2022-06-20T07:20:02Z)
- Look Back and Forth: Video Super-Resolution with Explicit Temporal Difference Modeling [105.69197687940505]
We propose to explore the role of explicit temporal difference modeling in both LR and HR space.
To further enhance the super-resolution result, not only are spatial residual features extracted, but the difference between consecutive frames in the high-frequency domain is also computed.
arXiv Detail & Related papers (2022-04-14T17:07:33Z)
- Confidence-guided Adaptive Gate and Dual Differential Enhancement for Video Salient Object Detection [47.68968739917077]
Video salient object detection (VSOD) aims to locate and segment the most attractive object by exploiting both spatial and temporal cues hidden in video sequences.
We propose a new framework to adaptively capture available information from spatial and temporal cues, built from Confidence-guided Adaptive Gate (CAG) modules and Dual Differential Enhancement (DDE) modules.
arXiv Detail & Related papers (2021-05-14T08:49:37Z)
- Exploring Rich and Efficient Spatial Temporal Interactions for Real Time Video Salient Object Detection [87.32774157186412]
Mainstream methods formulate video saliency from two independent venues, i.e., the spatial and temporal branches.
In this paper, we propose a spatiotemporal network to achieve such improvement in a full interactive fashion.
Our method is easy to implement yet effective, achieving high-quality video saliency detection in real time at 50 FPS.
arXiv Detail & Related papers (2020-08-07T03:24:04Z)