Spatio-Temporal Distortion Aware Omnidirectional Video Super-Resolution
- URL: http://arxiv.org/abs/2410.11506v1
- Date: Tue, 15 Oct 2024 11:17:19 GMT
- Title: Spatio-Temporal Distortion Aware Omnidirectional Video Super-Resolution
- Authors: Hongyu An, Xinfeng Zhang, Li Zhang, Ruiqin Xiong
- Abstract summary: Video super-resolution (VSR) methods have been proposed to enhance the resolution of videos, but omnidirectional video (ODV) projection distortions are not well addressed when such methods are applied directly.
We propose a novel Spatio-Temporal Distortion Aware Network (STDAN) to achieve better super-resolution reconstruction quality.
- Score: 26.166579083377556
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Omnidirectional video (ODV) can provide an immersive experience and is widely utilized in the fields of virtual reality and augmented reality. However, restricted capturing devices and transmission bandwidth lead to the low resolution of ODVs. Video super-resolution (VSR) methods have been proposed to enhance the resolution of videos, but ODV projection distortions are not well addressed by directly applying such methods. To achieve better super-resolution reconstruction quality, we propose a novel Spatio-Temporal Distortion Aware Network (STDAN) oriented to ODV characteristics. Specifically, a spatio-temporal distortion modulation module is introduced to alleviate spatial ODV projection distortions and exploit temporal correlation through intra- and inter-frame alignments. Next, we design a multi-frame reconstruction and fusion mechanism to refine the consistency of reconstructed ODV frames. Furthermore, we incorporate latitude-saliency adaptive maps in the loss function to concentrate on important viewpoint regions with higher texture complexity and stronger viewer interest. In addition, we collect a new ODV-SR dataset covering various scenarios. Extensive experimental results demonstrate that the proposed STDAN achieves superior super-resolution performance on ODVs and outperforms state-of-the-art methods.
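The latitude-saliency adaptive maps are only named above, not specified. The sketch below assumes one common realization: a cosine-of-latitude weight map (as in WS-PSNR-style weighting for equirectangular projection), optionally modulated by a saliency map, applied as per-pixel weights on an L1 reconstruction loss. The helpers `latitude_weight` and `weighted_l1_loss` are illustrative names, not taken from the paper.

```python
import torch
from typing import Optional

def latitude_weight(height: int, width: int) -> torch.Tensor:
    """Cosine-of-latitude weights for an equirectangular (ERP) frame."""
    # Latitude of each row centre, from +pi/2 at the top row to -pi/2 at the bottom.
    lat = torch.pi / 2 - (torch.arange(height, dtype=torch.float32) + 0.5) / height * torch.pi
    w = torch.cos(lat).clamp(min=0.0)                     # (H,)
    return w.view(1, 1, height, 1).expand(1, 1, height, width)

def weighted_l1_loss(sr: torch.Tensor, hr: torch.Tensor,
                     saliency: Optional[torch.Tensor] = None) -> torch.Tensor:
    """Per-pixel L1 loss modulated by a latitude map and an optional saliency map."""
    w = latitude_weight(sr.shape[-2], sr.shape[-1]).to(sr.device)
    if saliency is not None:                              # hypothetical saliency map in [0, 1]
        w = w * (1.0 + saliency)
    return (w * (sr - hr).abs()).mean() / w.mean()        # weighted mean over all pixels

# Toy usage: a batch of two RGB frames at 64x128 (ERP aspect ratio 1:2).
sr, hr = torch.rand(2, 3, 64, 128), torch.rand(2, 3, 64, 128)
print(weighted_l1_loss(sr, hr).item())
```

With a uniform saliency map this reduces to plain latitude weighting, which down-weights the heavily over-sampled polar rows of an ERP frame.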
Related papers
- Event-Enhanced Blurry Video Super-Resolution [52.894824081586776]
We tackle the task of blurry video super-resolution (BVSR), aiming to generate high-resolution (HR) videos from low-resolution (LR) and blurry inputs.
Current BVSR methods often fail to restore sharp details at high resolutions, resulting in noticeable artifacts and jitter.
We introduce event signals into BVSR and propose a novel event-enhanced network, Ev-DeVSR.
arXiv Detail & Related papers (2025-04-17T15:55:41Z) - Collaborative Feedback Discriminative Propagation for Video Super-Resolution [66.61201445650323]
The success of video super-resolution (VSR) methods stems mainly from exploring spatial and temporal information.
Inaccurate alignment usually leads to aligned features with significant artifacts.
Existing propagation modules only propagate features of the same timestep forward or backward.
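For context, the following is a minimal sketch of the conventional bidirectional propagation this summary refers to: two independent recurrent passes that only carry features forward or backward, with no feedback between the branches. It is a baseline illustration under assumed module and tensor shapes, not the paper's collaborative feedback propagation.

```python
import torch
import torch.nn as nn

class NaiveBidirectionalPropagation(nn.Module):
    """Plain forward/backward recurrent propagation over per-frame features."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.fwd = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.bwd = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, T, C, H, W) per-frame features.
        b, t, c, h, w = feats.shape
        hidden = feats.new_zeros(b, c, h, w)
        forward_out = []
        for i in range(t):                                   # forward pass
            hidden = torch.relu(self.fwd(torch.cat([feats[:, i], hidden], dim=1)))
            forward_out.append(hidden)
        hidden = feats.new_zeros(b, c, h, w)
        backward_out = [None] * t
        for i in reversed(range(t)):                         # backward pass
            hidden = torch.relu(self.bwd(torch.cat([feats[:, i], hidden], dim=1)))
            backward_out[i] = hidden
        # Fuse the two directions; a real model would use a learned fusion here.
        return torch.stack([f + b_ for f, b_ in zip(forward_out, backward_out)], dim=1)

x = torch.rand(1, 5, 64, 32, 32)
print(NaiveBidirectionalPropagation()(x).shape)              # torch.Size([1, 5, 64, 32, 32])
```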
arXiv Detail & Related papers (2024-04-06T22:08:20Z) - Learning Spatial Adaptation and Temporal Coherence in Diffusion Models for Video Super-Resolution [151.1255837803585]
We propose a novel approach, pursuing Spatial Adaptation and Temporal Coherence (SATeCo) for video super-resolution.
SATeCo pivots on learning spatial-temporal guidance from low-resolution videos to calibrate both latent-space high-resolution video denoising and pixel-space video reconstruction.
Experiments conducted on the REDS4 and Vid4 datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-03-25T17:59:26Z) - Motion-Guided Latent Diffusion for Temporally Consistent Real-world Video Super-resolution [15.197746480157651]
We propose an effective real-world VSR algorithm by leveraging the strength of pre-trained latent diffusion models.
We exploit the temporal dynamics in LR videos to guide the diffusion process by optimizing the latent sampling path with a motion-guided loss.
The proposed motion-guided latent diffusion based VSR algorithm achieves significantly better perceptual quality than state-of-the-arts on real-world VSR benchmark datasets.
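The motion-guided loss is only named above; below is a minimal sketch of one plausible form, assuming the previous frame (or its latent) is backward-warped to the current one with a given optical flow and the residual is penalized with L1. The `flow_warp` helper and the `grid_sample`-based warping are assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def flow_warp(x: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Backward-warp x (B, C, H, W) with a dense flow field (B, 2, H, W) in pixels.

    flow channel order is (dx, dy), mapping current-frame coordinates to
    locations in x (the previous frame).
    """
    b, _, h, w = x.shape
    ys, xs = torch.meshgrid(torch.arange(h, device=x.device),
                            torch.arange(w, device=x.device), indexing="ij")
    grid = torch.stack((xs, ys), dim=0).float().unsqueeze(0) + flow        # (B, 2, H, W)
    # Normalise to [-1, 1] as required by grid_sample (align_corners=True convention).
    grid_x = 2.0 * grid[:, 0] / max(w - 1, 1) - 1.0
    grid_y = 2.0 * grid[:, 1] / max(h - 1, 1) - 1.0
    norm_grid = torch.stack((grid_x, grid_y), dim=-1)                      # (B, H, W, 2)
    return F.grid_sample(x, norm_grid, align_corners=True)

def motion_guided_loss(curr: torch.Tensor, prev: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Penalise the difference between the current sample and the flow-warped previous one."""
    return F.l1_loss(curr, flow_warp(prev, flow))

curr, prev = torch.rand(1, 4, 32, 32), torch.rand(1, 4, 32, 32)            # e.g. per-frame latents
flow = torch.zeros(1, 2, 32, 32)                                           # zero flow -> plain L1
print(motion_guided_loss(curr, prev, flow).item())
```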
arXiv Detail & Related papers (2023-12-01T14:40:07Z) - Benchmark Dataset and Effective Inter-Frame Alignment for Real-World Video Super-Resolution [65.20905703823965]
Video super-resolution (VSR) aiming to reconstruct a high-resolution (HR) video from its low-resolution (LR) counterpart has made tremendous progress in recent years.
It remains challenging to deploy existing VSR methods to real-world data with complex degradations.
EAVSR employs the proposed multi-layer adaptive spatial transform network (MultiAdaSTN) to refine the offsets provided by a pre-trained optical flow estimation network.
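MultiAdaSTN itself is not detailed in this summary; the sketch below keeps only the core idea of residually refining flow-based alignment offsets with a small convolutional head. The `OffsetRefiner` module, its layer sizes, and its inputs are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class OffsetRefiner(nn.Module):
    """Residually refine coarse flow offsets using reference and neighbor features."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(2 * channels + 2, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 2, 3, padding=1),
        )

    def forward(self, ref_feat: torch.Tensor, neighbor_feat: torch.Tensor,
                flow: torch.Tensor) -> torch.Tensor:
        # flow: (B, 2, H, W) coarse offsets from a pre-trained flow estimator.
        residual = self.head(torch.cat([ref_feat, neighbor_feat, flow], dim=1))
        return flow + residual        # refined offsets, to be used for alignment downstream

refiner = OffsetRefiner()
f_ref, f_nbr = torch.rand(1, 64, 32, 32), torch.rand(1, 64, 32, 32)
flow = torch.rand(1, 2, 32, 32)
print(refiner(f_ref, f_nbr, flow).shape)          # torch.Size([1, 2, 32, 32])
```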
arXiv Detail & Related papers (2022-12-10T17:41:46Z) - Learned Video Compression via Heterogeneous Deformable Compensation Network [78.72508633457392]
We propose a learned video compression framework via a heterogeneous deformable compensation strategy (HDCVC) to tackle the problem of unstable compression performance.
More specifically, the proposed algorithm extracts features from the two adjacent frames to estimate content-neighborhood heterogeneous deformable (HetDeform) kernel offsets.
Experimental results indicate that HDCVC achieves superior performance compared with recent state-of-the-art learned video compression approaches.
arXiv Detail & Related papers (2022-07-11T02:31:31Z) - STRPM: A Spatiotemporal Residual Predictive Model for High-Resolution Video Prediction [78.129039340528]
We propose a Spatiotemporal Residual Predictive Model (STRPM) for high-resolution video prediction.
Experimental results show that STRPM generates more satisfactory results than various existing methods.
arXiv Detail & Related papers (2022-03-30T06:24:00Z) - Fast Online Video Super-Resolution with Deformable Attention Pyramid [172.16491820970646]
Video super-resolution (VSR) has many applications that pose strict causal, real-time, and latency constraints, including video streaming and TV.
We propose a recurrent VSR architecture based on a deformable attention pyramid (DAP).
arXiv Detail & Related papers (2022-02-03T17:49:04Z) - Video Face Super-Resolution with Motion-Adaptive Feedback Cell [90.73821618795512]
Video super-resolution (VSR) methods have recently achieved remarkable success due to the development of deep convolutional neural networks (CNNs).
In this paper, we propose a Motion-Adaptive Feedback Cell (MAFC), a simple but effective block, which can efficiently capture motion compensation information and feed it back to the network in an adaptive way.
arXiv Detail & Related papers (2020-02-15T13:14:10Z) - End-To-End Trainable Video Super-Resolution Based on a New Mechanism for Implicit Motion Estimation and Compensation [19.67999205691758]
Video super-resolution aims at generating a high-resolution video from its low-resolution counterpart.
We propose a novel dynamic local filter network to perform implicit motion estimation and compensation.
We also propose a global refinement network based on ResBlock and autoencoder structures.
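Implicit motion estimation with dynamic local filters generally means a network predicts a small filter per pixel and applies it to that pixel's neighborhood. The sketch below shows only the filter-application step, with filters shared across channels for simplicity; `apply_dynamic_filters` is an illustrative helper, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def apply_dynamic_filters(x: torch.Tensor, filters: torch.Tensor, k: int = 3) -> torch.Tensor:
    """Apply spatially varying k*k filters to each pixel of x.

    x:       (B, C, H, W) input features.
    filters: (B, k*k, H, W) per-pixel filter weights (e.g. softmax-normalised),
             shared across channels for simplicity.
    """
    b, c, h, w = x.shape
    patches = F.unfold(x, kernel_size=k, padding=k // 2)   # (B, C*k*k, H*W) neighbourhoods
    patches = patches.view(b, c, k * k, h, w)
    weights = filters.view(b, 1, k * k, h, w)
    return (patches * weights).sum(dim=2)                  # (B, C, H, W)

x = torch.rand(1, 8, 16, 16)
# A toy "predicted" filter field, normalised so each pixel's weights sum to 1.
filters = torch.softmax(torch.rand(1, 9, 16, 16), dim=1)
print(apply_dynamic_filters(x, filters).shape)             # torch.Size([1, 8, 16, 16])
```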
arXiv Detail & Related papers (2020-01-05T03:47:24Z)