OPDN: Omnidirectional Position-aware Deformable Network for
Omnidirectional Image Super-Resolution
- URL: http://arxiv.org/abs/2304.13471v1
- Date: Wed, 26 Apr 2023 11:47:40 GMT
- Title: OPDN: Omnidirectional Position-aware Deformable Network for
Omnidirectional Image Super-Resolution
- Authors: Xiaopeng Sun and Weiqi Li and Zhenyu Zhang and Qiufang Ma and Xuhan
Sheng and Ming Cheng and Haoyu Ma and Shijie Zhao and Jian Zhang and Junlin
Li and Li Zhang
- Abstract summary: We propose a two-stage framework for 360° omnidirectional image super-resolution.
Our proposed method achieves superior performance and wins the NTIRE 2023 challenge on 360° omnidirectional image super-resolution.
- Score: 18.138867445188293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 360° omnidirectional images have gained research attention due to their
immersive and interactive experience, particularly in AR/VR applications.
However, they suffer from lower angular resolution because they are captured by
fisheye lenses with the same sensor size used for planar images. To address this
issue, we propose a two-stage framework for 360° omnidirectional image
super-resolution. The first stage employs two branches: model A, which
incorporates omnidirectional position-aware deformable blocks (OPDB) and
Fourier upsampling, and model B, which adds a spatial frequency fusion module
(SFF) to model A. Model A aims to enhance the extraction of positional
information from 360° images, while model B further focuses on their
high-frequency information. The second stage performs same-resolution
enhancement based on the structure of model A with a pixel unshuffle operation.
In addition, we collected data from YouTube to improve the fitting ability of
the transformer, and created pseudo low-resolution images using a degradation
network. Our proposed method achieves superior performance and wins the NTIRE
2023 challenge on 360° omnidirectional image super-resolution.
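No code accompanies this summary, so the following is a minimal, hypothetical sketch of two ideas named in the abstract: a deformable convolution whose sampling offsets are conditioned on an ERP latitude map (in the spirit of OPDB), and upsampling by zero-padding the image spectrum (Fourier upsampling). All module and variable names are ours, not the authors'.

```python
# Hypothetical sketch (not the authors' code): a position-aware deformable
# block conditioned on ERP latitude, plus spectral zero-padding upsampling.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class PositionAwareDeformBlock(nn.Module):
    """Deformable conv whose offset predictor sees a per-row latitude
    encoding, so sampling can adapt to ERP distortion (strongest at poles)."""
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        # 2 offsets (dy, dx) per kernel tap; input = features + 1 latitude map.
        self.offset_pred = nn.Conv2d(channels + 1, 2 * kernel_size * kernel_size,
                                     kernel_size, padding=pad)
        self.deform = DeformConv2d(channels, channels, kernel_size, padding=pad)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, _, h, w = x.shape
        # Latitude in [-pi/2, pi/2], one value per ERP row, broadcast over columns.
        lat = torch.linspace(-torch.pi / 2, torch.pi / 2, h, device=x.device)
        lat = lat.view(1, 1, h, 1).expand(b, 1, h, w)
        offsets = self.offset_pred(torch.cat([x, lat], dim=1))
        return x + self.act(self.deform(x, offsets))

def fourier_upsample(x: torch.Tensor, scale: int = 2) -> torch.Tensor:
    """Upsample by zero-padding the centered 2D spectrum (sinc interpolation).
    Taking .real at the end is a mild approximation for even-sized inputs."""
    b, c, h, w = x.shape
    spec = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    big = torch.zeros(b, c, scale * h, scale * w, dtype=spec.dtype, device=x.device)
    top, left = (scale * h - h) // 2, (scale * w - w) // 2
    big[..., top:top + h, left:left + w] = spec
    out = torch.fft.ifft2(torch.fft.ifftshift(big, dim=(-2, -1)))
    return out.real * (scale * scale)  # compensate ifft2 normalization
```

For the second stage, a same-resolution path could feed torch.nn.functional.pixel_unshuffle outputs into a similar trunk; again, this is our reading of the abstract, not the released implementation.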
Related papers
- Customizing 360-Degree Panoramas through Text-to-Image Diffusion Models [38.70079108858637]
We propose an approach that focuses on the customization of 360-degree panoramas using a T2I diffusion model.
To achieve this, we curate a paired image-text dataset specifically designed for the task and employ it to fine-tune a pre-trained T2I diffusion model with LoRA.
We propose a method called StitchDiffusion to ensure continuity between the leftmost and rightmost sides of the synthesized images.
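StitchDiffusion's actual procedure operates during diffusion sampling; as a loose illustration of the left-right continuity constraint it targets, here is a hypothetical post-hoc seam feather for an equirectangular panorama (the function name and blending scheme are ours, not the paper's method).

```python
# Hypothetical illustration (not StitchDiffusion itself): feather both edges
# of an ERP panorama toward a shared seam value so columns 0 and W-1 meet
# smoothly when the image is wrapped around the sphere.
import numpy as np

def blend_wraparound_seam(pano: np.ndarray, band: int = 32) -> np.ndarray:
    """pano: (H, W, C) float or uint8 image; returns a float32 copy whose
    first and last columns share the same value at the wrap-around seam."""
    out = pano.astype(np.float32).copy()
    target = 0.5 * (out[:, :1] + out[:, -1:])            # (H, 1, C) seam value
    w = np.linspace(1.0, 0.0, band, endpoint=False)[None, :, None]  # 1 at edge
    out[:, :band] = (1 - w) * out[:, :band] + w * target
    out[:, -band:] = (1 - w[:, ::-1]) * out[:, -band:] + w[:, ::-1] * target
    return out
```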
arXiv Detail & Related papers (2023-10-28T22:57:24Z)
- Distortion-aware Transformer in 360° Salient Object Detection [44.74647420381127]
We propose a Transformer-based model called DATFormer to address the distortion problem.
To exploit the unique characteristics of 360° data, we present a learnable relation matrix.
Our model outperforms existing 2D and 360° salient object detection (SOD) methods.
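The summary does not define the relation matrix; one plausible (hypothetical) reading is a trainable bias added to attention logits so token pairs can encode distortion-aware relations. DATFormer's actual design may differ.

```python
# Hypothetical reading of a "learnable relation matrix": a trainable (N, N)
# bias on attention logits, shared across heads for simplicity.
import torch
import torch.nn as nn

class RelationBiasedAttention(nn.Module):
    def __init__(self, num_tokens: int, dim: int):
        super().__init__()
        self.relation = nn.Parameter(torch.zeros(num_tokens, num_tokens))
        self.qkv = nn.Linear(dim, 3 * dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, N, C)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        logits = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
        attn = (logits + self.relation).softmax(dim=-1)   # add learned bias
        return attn @ v
```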
arXiv Detail & Related papers (2023-08-07T07:28:24Z)
- Generative Multiplane Neural Radiance for 3D-Aware Image Generation [102.15322193381617]
We present a method to efficiently generate 3D-aware high-resolution images that are view-consistent across multiple target views.
Our GMNR model generates 3D-aware images of 1024×1024 pixels at 17.6 FPS on a single V100 GPU.
arXiv Detail & Related papers (2023-04-03T17:41:20Z)
- Multi-Projection Fusion and Refinement Network for Salient Object Detection in 360° Omnidirectional Image [141.10227079090419]
We propose a Multi-Projection Fusion and Refinement Network (MPFR-Net) to detect salient objects in 360° omnidirectional images.
MPFR-Net uses the equirectangular projection image and four corresponding cube-unfolding images as inputs.
Experimental results on two omnidirectional datasets demonstrate that the proposed approach outperforms the state-of-the-art methods both qualitatively and quantitatively.
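Multi-projection methods like this one build on resampling the equirectangular image into cube faces. Here is a minimal, hypothetical sketch of that sampling for a single face (our own nearest-neighbor version, not MPFR-Net's pipeline).

```python
# Hypothetical sketch: extract the front (+z) cube face from an
# equirectangular (ERP) image by casting rays through the face plane.
import numpy as np

def erp_to_cube_face(erp: np.ndarray, face_size: int = 256) -> np.ndarray:
    """erp: (H, W, C), longitude in [-pi, pi], latitude in [-pi/2, pi/2]."""
    h, w = erp.shape[:2]
    u = np.linspace(-1, 1, face_size)
    x, y = np.meshgrid(u, u)                     # image y grows downward
    z = np.ones_like(x)                          # face plane z = 1
    lon = np.arctan2(x, z)                       # longitude of each ray
    lat = np.arctan2(-y, np.sqrt(x**2 + z**2))   # latitude, up is positive
    # Map angles to ERP pixel coordinates and sample (nearest neighbor).
    col = ((lon / (2 * np.pi) + 0.5) * (w - 1)).round().astype(int)
    row = ((0.5 - lat / np.pi) * (h - 1)).round().astype(int)
    return erp[row, col]
```

The remaining five faces follow by permuting and negating the (x, y, z) axes before computing the angles.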
arXiv Detail & Related papers (2022-12-23T14:50:40Z)
- Neural Volume Super-Resolution [49.879789224455436]
We propose a neural super-resolution network that operates directly on the volumetric representation of the scene.
To realize our method, we devise a novel 3D representation that hinges on multiple 2D feature planes.
We validate the proposed method by super-resolving multi-view consistent views on a diverse set of unseen 3D scenes.
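The paper's representation details differ, but the core trick of factorizing a volume into 2D feature planes can be sketched generically as follows (a tri-plane-style sampler of our own; the function name and plane layout are assumptions).

```python
# Hypothetical sketch: factorize a 3D feature volume into three axis-aligned
# 2D feature planes and query it by bilinear sampling; super-resolution can
# then operate on the 2D planes instead of a dense 3D grid.
import torch
import torch.nn.functional as F

def sample_feature_planes(planes: list, xyz: torch.Tensor) -> torch.Tensor:
    """planes: [xy, xz, yz], each (1, C, R, R); xyz: (N, 3) in [-1, 1].
    Returns (N, 3*C) features for the queried 3D points."""
    coords = [xyz[:, [0, 1]], xyz[:, [0, 2]], xyz[:, [1, 2]]]
    feats = []
    for plane, uv in zip(planes, coords):
        grid = uv.view(1, -1, 1, 2)                         # (1, N, 1, 2)
        f = F.grid_sample(plane, grid, align_corners=True)  # (1, C, N, 1)
        feats.append(f.squeeze(0).squeeze(-1).t())          # (N, C)
    return torch.cat(feats, dim=-1)
```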
arXiv Detail & Related papers (2022-12-09T04:54:13Z)
- View-aware Salient Object Detection for 360° Omnidirectional Image [33.43250302656753]
We construct a large-scale 360° ISOD dataset with object-level pixel-wise annotations in equirectangular projection (ERP).
Inspired by the human observation process, we propose a view-aware salient object detection method based on a Sample Adaptive View Transformer (SAVT) module.
arXiv Detail & Related papers (2022-09-27T07:44:08Z)
- TransFusion: Robust LiDAR-Camera Fusion for 3D Object Detection with Transformers [49.689566246504356]
We propose TransFusion, a robust solution to LiDAR-camera fusion with a soft-association mechanism to handle inferior image conditions.
TransFusion achieves state-of-the-art performance on large-scale datasets.
We extend the proposed method to the 3D tracking task and achieve 1st place on the nuScenes tracking leaderboard.
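The summary gives no detail on the soft-association mechanism; its gist, as we read it, is replacing hard LiDAR-to-pixel matching with attention weights. A minimal, hypothetical sketch (names and shapes are ours):

```python
# Hypothetical sketch of "soft association": each object query gathers image
# context with attention weights instead of a single hard pixel match, so
# degraded image regions are simply down-weighted.
import torch
import torch.nn.functional as F

def soft_associate(queries: torch.Tensor, img_feats: torch.Tensor) -> torch.Tensor:
    """queries: (N, C) object queries from LiDAR; img_feats: (HW, C) image
    features. Returns (N, C) attention-pooled image context per query."""
    logits = queries @ img_feats.t() / queries.shape[-1] ** 0.5
    attn = F.softmax(logits, dim=-1)        # soft weights over all pixels
    return attn @ img_feats
```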
arXiv Detail & Related papers (2022-03-22T07:15:13Z)
- Field-of-View IoU for Object Detection in 360° Images [36.72543749626039]
We propose two fundamental techniques, Field-of-View IoU (FoV-IoU) and 360Augmentation, for object detection in 360° images.
FoV-IoU computes the intersection-over-union of two field-of-view bounding boxes on a spherical image and can be used for training, inference, and evaluation.
360Augmentation is a data augmentation technique specific to the 360° object detection task; it randomly rotates a spherical image to counter the bias introduced by the sphere-to-plane projection.
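The paper derives an analytic FoV-IoU; purely to illustrate the quantity being measured, here is a hypothetical Monte Carlo estimate of the IoU of two spherical boxes (our own box format, simplified to lon/lat-aligned boxes rather than the paper's exact FoV definition).

```python
# Hypothetical Monte Carlo illustration of spherical box IoU (the paper
# itself gives a closed-form FoV-IoU; this only shows what it measures).
import numpy as np

def in_fov_box(lon, lat, box):
    """box = (lon_c, lat_c, fov_w, fov_h) in radians; membership test for a
    lon/lat-aligned box, with longitude difference wrapped to [-pi, pi]."""
    lon_c, lat_c, fw, fh = box
    dlon = (lon - lon_c + np.pi) % (2 * np.pi) - np.pi
    return (np.abs(dlon) <= fw / 2) & (np.abs(lat - lat_c) <= fh / 2)

def fov_iou_mc(box_a, box_b, n: int = 200_000, seed: int = 0) -> float:
    rng = np.random.default_rng(seed)
    lon = rng.uniform(-np.pi, np.pi, n)
    lat = np.arcsin(rng.uniform(-1.0, 1.0, n))   # uniform on the sphere
    a = in_fov_box(lon, lat, box_a)
    b = in_fov_box(lon, lat, box_b)
    union = np.count_nonzero(a | b)
    return np.count_nonzero(a & b) / union if union else 0.0
```

Sampling latitude as arcsin of a uniform variable makes the points area-uniform on the sphere, which is what makes the hit-count ratio an IoU estimate.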
arXiv Detail & Related papers (2022-02-07T14:01:59Z)
- StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis [92.25145204543904]
StyleNeRF is a 3D-aware generative model for high-resolution image synthesis with high multi-view consistency.
It integrates the neural radiance field (NeRF) into a style-based generator.
It can synthesize high-resolution images at interactive rates while preserving 3D consistency at high quality.
arXiv Detail & Related papers (2021-10-18T02:37:01Z)
- Improved Transformer for High-Resolution GANs [69.42469272015481]
We introduce two key ingredients into the Transformer to address the challenge of high-resolution generation.
We show in experiments that the proposed HiT achieves state-of-the-art FID scores of 31.87 and 2.95 on unconditional ImageNet 128×128 and FFHQ 256×256, respectively.
arXiv Detail & Related papers (2021-06-14T17:39:49Z)