A Systematic Investigation on Deep Learning-Based Omnidirectional Image and Video Super-Resolution
- URL: http://arxiv.org/abs/2506.06710v1
- Date: Sat, 07 Jun 2025 08:24:44 GMT
- Title: A Systematic Investigation on Deep Learning-Based Omnidirectional Image and Video Super-Resolution
- Authors: Qianqian Zhao, Chunle Guo, Tianyi Zhang, Junpei Zhang, Peiyang Jia, Tan Su, Wenjie Jiang, Chongyi Li
- Abstract summary: This paper presents a systematic review of recent progress in omnidirectional image and video super-resolution. We introduce a new dataset, 360Insta, that comprises authentically degraded omnidirectional images and videos. We conduct comprehensive qualitative and quantitative evaluations of existing methods on both public datasets and our proposed dataset.
- Score: 30.62413133817583
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Omnidirectional image and video super-resolution is a crucial research topic in low-level vision, playing an essential role in virtual reality and augmented reality applications. Its goal is to reconstruct high-resolution images or video frames from low-resolution inputs, thereby enhancing detail preservation and enabling more accurate scene analysis and interpretation. In recent years, numerous innovative and effective approaches have been proposed, predominantly based on deep learning techniques, involving diverse network architectures, loss functions, projection strategies, and training datasets. This paper presents a systematic review of recent progress in omnidirectional image and video super-resolution, focusing on deep learning-based methods. Given that existing datasets predominantly rely on synthetic degradation and fall short in capturing real-world distortions, we introduce a new dataset, 360Insta, that comprises authentically degraded omnidirectional images and videos collected under diverse conditions, including varying lighting, motion, and exposure settings. This dataset addresses a critical gap in current omnidirectional benchmarks and enables more robust evaluation of the generalization capabilities of omnidirectional super-resolution methods. We conduct comprehensive qualitative and quantitative evaluations of existing methods on both public datasets and our proposed dataset. Furthermore, we provide a systematic overview of the current status of research and discuss promising directions for future exploration. All datasets, methods, and evaluation metrics introduced in this work are publicly available and will be regularly updated. Project page: https://github.com/nqian1/Survey-on-ODISR-and-ODVSR.
Related papers
- From Waveforms to Pixels: A Survey on Audio-Visual Segmentation [43.79010208565961]
Audio-Visual Segmentation (AVS) aims to identify and segment sound-producing objects in videos by leveraging both visual and audio modalities. We present a comprehensive overview of the AVS field, covering its problem formulation, benchmark datasets, evaluation metrics, and the progression of methodologies.
arXiv Detail & Related papers (2025-07-29T22:20:51Z)
- Towards Depth Foundation Model: Recent Trends in Vision-Based Depth Estimation [75.30238170051291]
Depth estimation is a fundamental task in 3D computer vision, crucial for applications such as 3D reconstruction, free-viewpoint rendering, robotics, autonomous driving, and AR/VR technologies. Traditional methods relying on hardware sensors like LiDAR are often constrained by high costs, low resolution, and environmental sensitivity, limiting their applicability in real-world scenarios. Recent advances in vision-based methods offer a promising alternative, yet they face challenges in generalization and stability due to either low-capacity model architectures or reliance on domain-specific and small-scale datasets.
arXiv Detail & Related papers (2025-07-15T17:59:59Z)
- IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations [64.07859467542664]
Capturing geometric and material information from images remains a fundamental challenge in computer vision and graphics. Traditional optimization-based methods often require hours of computational time to reconstruct geometry, material properties, and environmental lighting from dense multi-view inputs. We introduce IDArb, a diffusion-based model designed to perform intrinsic decomposition on an arbitrary number of images under varying illuminations.
arXiv Detail & Related papers (2024-12-16T18:52:56Z)
- A Survey on All-in-One Image Restoration: Taxonomy, Evaluation and Future Trends [67.43992456058541]
Image restoration (IR) refers to the process of improving visual quality of images while removing degradation, such as noise, blur, weather effects, and so on.
Traditional IR methods typically target specific types of degradation, which limits their effectiveness in real-world scenarios with complex distortions.
The all-in-one image restoration (AiOIR) paradigm has emerged, offering a unified framework that adeptly addresses multiple degradation types.
arXiv Detail & Related papers (2024-10-19T11:11:09Z)
- Exposing AI-generated Videos: A Benchmark Dataset and a Local-and-Global Temporal Defect Based Detection Method [31.763312726582217]
Generative models have made significant advances in the creation of realistic videos, which raises security concerns.
In this paper, we first construct a video dataset using advanced diffusion-based video generation algorithms with various semantic contents.
By analyzing local and global temporal defects of current AI-generated videos, a novel detection framework is constructed to expose fake videos.
arXiv Detail & Related papers (2024-05-07T09:00:09Z)
- A Survey on Super Resolution for video Enhancement Using GAN [0.0]
Recent developments in image and video super-resolution using deep learning algorithms such as Generative Adversarial Networks are covered.
These advancements, which aim to increase the visual clarity and quality of low-resolution video, have tremendous potential in a variety of sectors, ranging from surveillance technology to medical imaging.
This collection delves into the wider field of Generative Adversarial Networks, exploring their principles, training approaches, and applications across a broad range of domains.
arXiv Detail & Related papers (2023-12-27T08:41:38Z)
- Lighting the Darkness in the Deep Learning Era [118.35081853500411]
Low-light image enhancement (LLIE) aims at improving the perception or interpretability of an image captured in an environment with poor illumination.
Recent advances in this area are dominated by deep learning-based solutions.
We provide a comprehensive survey to cover various aspects ranging from algorithm taxonomy to unsolved open issues.
arXiv Detail & Related papers (2021-04-21T19:12:19Z)
- Video Summarization Using Deep Neural Networks: A Survey [72.98424352264904]
Video summarization technologies aim to create a concise and complete synopsis by selecting the most informative parts of the video content.
This work focuses on the recent advances in the area and provides a comprehensive survey of the existing deep-learning-based methods for generic video summarization.
arXiv Detail & Related papers (2021-01-15T11:41:29Z)
- Video Super Resolution Based on Deep Learning: A Comprehensive Survey [87.30395002197344]
We comprehensively investigate 33 state-of-the-art video super-resolution (VSR) methods based on deep learning.
We propose a taxonomy and classify the methods into six sub-categories according to the ways of utilizing inter-frame information.
We summarize and compare the performance of representative VSR methods on several benchmark datasets.
arXiv Detail & Related papers (2020-07-25T13:39:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.