Review on Panoramic Imaging and Its Applications in Scene Understanding
- URL: http://arxiv.org/abs/2205.05570v1
- Date: Wed, 11 May 2022 15:31:05 GMT
- Title: Review on Panoramic Imaging and Its Applications in Scene Understanding
- Authors: Shaohua Gao, Kailun Yang, Hao Shi, Kaiwei Wang, Jian Bai
- Abstract summary: Panoramic imaging instruments are expected to have high resolution, no blind area, miniaturization, and multi-dimensional intelligent perception.
Recent advances in freeform surfaces, thin-plate optics, and metasurfaces provide innovative approaches to address human perception of the environment.
- Score: 9.79276235622546
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid development of high-speed communication and artificial
intelligence technologies, human perception of real-world scenes is no longer
limited to the use of small Field of View (FoV) and low-dimensional scene
detection devices. Panoramic imaging emerges as the next generation of
innovative intelligent instruments for environmental perception and
measurement. However, beyond satisfying the need for large-FoV photographic
imaging, panoramic imaging instruments are expected to offer high resolution,
no blind area, miniaturization, and multi-dimensional intelligent perception,
and to be combined with artificial intelligence methods towards the next
generation of intelligent instruments, enabling deeper and more holistic
understanding of 360-degree real-world surrounding environments.
Fortunately, recent advances in freeform surfaces, thin-plate optics, and
metasurfaces provide innovative approaches to address human perception of the
environment, offering promising ideas beyond conventional optical imaging. In
this review, we begin by introducing the basic principles of panoramic
imaging systems, and then describe the architectures, features, and functions
of various panoramic imaging systems. Afterwards, we discuss in detail the
broad application prospects and great design potential of freeform surfaces,
thin-plate optics, and metasurfaces in panoramic imaging, and analyze how
these techniques can help enhance the performance of panoramic imaging
systems. We further examine applications of panoramic imaging in scene
understanding for autonomous driving and robotics, spanning panoramic semantic
image segmentation, panoramic depth estimation, panoramic visual localization,
and more. Finally, we offer a perspective on future potential and research
directions for panoramic imaging instruments.
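The panoramic scene-understanding tasks listed above (semantic segmentation, depth estimation, visual localization) typically operate on 360-degree images stored in an equirectangular projection. As an illustrative sketch only (not taken from the reviewed paper; the function name is hypothetical), the standard mapping from an equirectangular pixel to a viewing direction on the unit sphere is:

```python
import math

def equirect_pixel_to_ray(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit direction vector.

    Assumes the common layout: longitude spans [-pi, pi] left-to-right
    across the image width, latitude spans [pi/2, -pi/2] top-to-bottom.
    """
    lon = (u / width - 0.5) * 2.0 * math.pi   # longitude in [-pi, pi]
    lat = (0.5 - v / height) * math.pi        # latitude in [-pi/2, pi/2]
    # Spherical-to-Cartesian conversion; y points up, z points forward.
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)
```

Scaling such a ray by a per-pixel depth value yields a 3D point, which is how panoramic depth estimation provides the 3D structure used by downstream localization and mapping.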
Related papers
- MSI-NeRF: Linking Omni-Depth with View Synthesis through Multi-Sphere Image aided Generalizable Neural Radiance Field [1.3162012586770577] (2024-03-16)
  We introduce MSI-NeRF, which combines deep-learning omnidirectional depth estimation and novel view synthesis.
  We construct a multi-sphere image as a cost volume through feature extraction and warping of the input images.
  Our network has the generalization ability to reconstruct unknown scenes efficiently using only four images.
- OmniSCV: An Omnidirectional Synthetic Image Generator for Computer Vision [5.2178708158547025] (2024-01-30)
  We present a tool for generating datasets of omnidirectional images with semantic and depth information.
  These images are synthesized from a set of captures acquired in a realistic virtual environment for Unreal Engine 4.
  Our tool includes photorealistic non-central-projection systems such as non-central panoramas and non-central catadioptric systems.
- NeRF-Enhanced Outpainting for Faithful Field-of-View Extrapolation [18.682430719467202] (2023-09-23)
  In various applications, such as robotic navigation and remote visual assistance, expanding the field of view (FoV) of the camera proves beneficial for enhancing environmental perception.
  We formulate a new problem of faithful FoV extrapolation that utilizes a set of pre-captured images as prior knowledge of the scene.
  We present NeRF-Enhanced Outpainting (NEO), which uses extended-FoV images generated through NeRF to train a scene-specific image outpainting model.
- Calibrating Panoramic Depth Estimation for Practical Localization and Mapping [20.621442016969976] (2023-08-27)
  The absolute depth values of surrounding environments provide crucial cues for various assistive technologies, such as localization, navigation, and 3D structure estimation.
  We propose that accurate depth estimated from panoramic images can serve as a powerful and lightweight input for a wide range of downstream tasks requiring 3D information.
- Review of Large Vision Models and Visual Prompt Engineering [50.63394642549947] (2023-07-03)
  This review summarizes the methods employed in the computer vision domain for large vision models and visual prompt engineering.
  We present influential large models in the visual domain and a range of prompt engineering methods employed on these models.
- PanoGen: Text-Conditioned Panoramic Environment Generation for Vision-and-Language Navigation [96.8435716885159] (2023-05-30)
  Vision-and-Language Navigation (VLN) requires the agent to follow language instructions to navigate through 3D environments.
  One main challenge in VLN is the limited availability of training environments, which makes it hard to generalize to new and unseen environments.
  We propose PanoGen, a generation method that can potentially create an infinite number of diverse panoramic environments conditioned on text.
- HORIZON: High-Resolution Semantically Controlled Panorama Synthesis [105.55531244750019] (2022-10-10)
  Panorama synthesis endeavors to craft captivating 360-degree visual landscapes, immersing users in the heart of virtual worlds.
  Recent breakthroughs in visual synthesis have unlocked the potential for semantic control in 2D flat images, but a direct application of these methods to panorama synthesis yields distorted content.
  We present a framework for generating high-resolution panoramas that addresses spherical distortion and edge discontinuity through spherical modeling.
- Panoramic Panoptic Segmentation: Insights Into Surrounding Parsing for Mobile Agents via Unsupervised Contrastive Learning [93.6645991946674] (2022-06-21)
  We introduce panoramic panoptic segmentation as the most holistic form of scene understanding.
  A complete surrounding understanding provides a maximum of information to a mobile agent.
  We propose a framework that allows model training on standard pinhole images and transfers the learned features to a different domain.
- Unsupervised Learning of Depth and Ego-Motion from Cylindrical Panoramic Video with Applications for Virtual Reality [2.294014185517203] (2020-10-14)
  We introduce a convolutional neural network model for unsupervised learning of depth and ego-motion from cylindrical panoramic video.
  Panoramic depth estimation is an important technology for applications such as virtual reality, 3D modeling, and autonomous robotic navigation.
- State of the Art on Neural Rendering [141.22760314536438] (2020-04-08)
  We focus on approaches that combine classic computer graphics techniques with deep generative models to obtain controllable and photo-realistic outputs.
  This report focuses on important use cases for the described algorithms, such as novel view synthesis, semantic photo manipulation, facial and body reenactment, relighting, free-viewpoint video, and the creation of photo-realistic avatars for virtual and augmented reality telepresence.
- Learning Depth With Very Sparse Supervision [57.911425589947314] (2020-03-02)
  This paper explores the idea that perception gets coupled to 3D properties of the world via interaction with the environment.
  We train a specialized global-local network architecture with what would be available to a robot interacting with the environment.
  Experiments on several datasets show that, when ground truth is available for even just one of the image pixels, the proposed network can learn monocular dense depth estimation up to 22.5% more accurately than state-of-the-art approaches.
This list is automatically generated from the titles and abstracts of the papers on this site; its accuracy and completeness are not guaranteed.