Review on Panoramic Imaging and Its Applications in Scene Understanding
- URL: http://arxiv.org/abs/2205.05570v1
- Date: Wed, 11 May 2022 15:31:05 GMT
- Title: Review on Panoramic Imaging and Its Applications in Scene Understanding
- Authors: Shaohua Gao, Kailun Yang, Hao Shi, Kaiwei Wang, Jian Bai
- Abstract summary: Panoramic imaging instruments are expected to have high resolution, no blind area, miniaturization, and multi-dimensional intelligent perception.
Recent advances in freeform surfaces, thin-plate optics, and metasurfaces provide innovative approaches to address human perception of the environment.
- Score: 9.79276235622546
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid development of high-speed communication and artificial
intelligence technologies, human perception of real-world scenes is no longer
limited to the use of small Field of View (FoV) and low-dimensional scene
detection devices. Panoramic imaging emerges as the next generation of
innovative intelligent instruments for environmental perception and
measurement. However, beyond satisfying the need for large-FoV photographic
imaging, panoramic imaging instruments are expected to offer high resolution,
no blind area, miniaturization, and multi-dimensional intelligent perception,
and to be combined with artificial intelligence methods towards the next
generation of intelligent instruments, enabling deeper and more holistic
understanding of 360-degree real-world surrounding environments.
Fortunately, recent advances in freeform surfaces, thin-plate optics, and
metasurfaces provide innovative approaches to address human perception of the
environment, offering promising ideas beyond conventional optical imaging. In
this review, we begin by introducing the basic principles of panoramic
imaging systems, and then describe the architectures, features, and functions
of various panoramic imaging systems. Afterwards, we discuss in detail the
broad application prospects and great design potential of freeform surfaces,
thin-plate optics, and metasurfaces in panoramic imaging, and analyze how
these techniques can help enhance the performance of panoramic imaging
systems. We further examine applications of panoramic imaging in scene
understanding for autonomous driving and robotics, spanning panoramic semantic
image segmentation, panoramic depth estimation, panoramic visual localization,
and more. Finally, we offer a perspective on future potential and research
directions for panoramic imaging instruments.
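The panoramic scene-understanding tasks listed above (semantic segmentation, depth estimation, visual localization) typically operate on 360-degree images stored in an equirectangular projection. As an illustrative sketch only (not taken from the reviewed paper; the function name is hypothetical), the standard mapping from an equirectangular pixel to a viewing direction on the unit sphere is:

```python
import math

def equirect_pixel_to_ray(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit direction vector.

    Assumes the common layout: longitude spans [-pi, pi] left-to-right
    across the image width, latitude spans [pi/2, -pi/2] top-to-bottom.
    """
    lon = (u / width - 0.5) * 2.0 * math.pi   # longitude in [-pi, pi]
    lat = (0.5 - v / height) * math.pi        # latitude in [-pi/2, pi/2]
    # Spherical-to-Cartesian conversion; y points up, z points forward.
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)
```

Scaling such a ray by a per-pixel depth value yields a 3D point, which is how panoramic depth estimation provides the 3D structure used by downstream localization and mapping.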
Related papers
- MSI-NeRF: Linking Omni-Depth with View Synthesis through Multi-Sphere Image aided Generalizable Neural Radiance Field [1.3162012586770577] (2024-03-16)
  We introduce MSI-NeRF, which combines deep-learning omnidirectional depth estimation and novel view synthesis.
  We construct a multi-sphere image as a cost volume through feature extraction and warping of the input images.
  Our network has the generalization ability to reconstruct unknown scenes efficiently using only four images.
- OmniSCV: An Omnidirectional Synthetic Image Generator for Computer Vision [5.2178708158547025] (2024-01-30)
  We present a tool for generating datasets of omnidirectional images with semantic and depth information.
  These images are synthesized from a set of captures acquired in a realistic virtual environment for Unreal Engine 4.
  Our tool includes photorealistic non-central-projection systems such as non-central panoramas and non-central catadioptric systems.
- NeRF-Enhanced Outpainting for Faithful Field-of-View Extrapolation [18.682430719467202] (2023-09-23)
  In various applications, such as robotic navigation and remote visual assistance, expanding the field of view (FoV) of the camera proves beneficial for enhancing environmental perception.
  We formulate a new problem of faithful FoV extrapolation that utilizes a set of pre-captured images as prior knowledge of the scene.
  We present NeRF-Enhanced Outpainting (NEO), which uses extended-FoV images generated through NeRF to train a scene-specific image outpainting model.
- Calibrating Panoramic Depth Estimation for Practical Localization and Mapping [20.621442016969976] (2023-08-27)
  The absolute depth values of surrounding environments provide crucial cues for various assistive technologies, such as localization, navigation, and 3D structure estimation.
  We propose that accurate depth estimated from panoramic images can serve as a powerful and lightweight input for a wide range of downstream tasks requiring 3D information.
- Review of Large Vision Models and Visual Prompt Engineering [50.63394642549947] (2023-07-03)
  This review summarizes the methods employed in the computer vision domain for large vision models and visual prompt engineering.
  We present influential large models in the visual domain and a range of prompt engineering methods employed on these models.
- PanoGen: Text-Conditioned Panoramic Environment Generation for Vision-and-Language Navigation [96.8435716885159] (2023-05-30)
  Vision-and-Language Navigation (VLN) requires the agent to follow language instructions to navigate through 3D environments.
  One main challenge in VLN is the limited availability of training environments, which makes it hard to generalize to new and unseen environments.
  We propose PanoGen, a generation method that can potentially create an infinite number of diverse panoramic environments conditioned on text.
- HORIZON: High-Resolution Semantically Controlled Panorama Synthesis [105.55531244750019] (2022-10-10)
  Panorama synthesis endeavors to craft captivating 360-degree visual landscapes, immersing users in the heart of virtual worlds.
  Recent breakthroughs in visual synthesis have unlocked the potential for semantic control in 2D flat images, but a direct application of these methods to panorama synthesis yields distorted content.
  We present a framework for generating high-resolution panoramas that addresses spherical distortion and edge discontinuity through spherical modeling.
- Panoramic Panoptic Segmentation: Insights Into Surrounding Parsing for Mobile Agents via Unsupervised Contrastive Learning [93.6645991946674] (2022-06-21)
  We introduce panoramic panoptic segmentation as the most holistic form of scene understanding.
  A complete surrounding understanding provides a maximum of information to a mobile agent.
  We propose a framework that allows model training on standard pinhole images and transfers the learned features to a different domain.
- Unsupervised Learning of Depth and Ego-Motion from Cylindrical Panoramic Video with Applications for Virtual Reality [2.294014185517203] (2020-10-14)
  We introduce a convolutional neural network model for unsupervised learning of depth and ego-motion from cylindrical panoramic video.
  Panoramic depth estimation is an important technology for applications such as virtual reality, 3D modeling, and autonomous robotic navigation.
- State of the Art on Neural Rendering [141.22760314536438] (2020-04-08)
  We focus on approaches that combine classic computer graphics techniques with deep generative models to obtain controllable and photo-realistic outputs.
  This report focuses on important use cases for the described algorithms, such as novel view synthesis, semantic photo manipulation, facial and body reenactment, relighting, free-viewpoint video, and the creation of photo-realistic avatars for virtual and augmented reality telepresence.
- Learning Depth With Very Sparse Supervision [57.911425589947314] (2020-03-02)
  This paper explores the idea that perception gets coupled to 3D properties of the world via interaction with the environment.
  We train a specialized global-local network architecture with what would be available to a robot interacting with the environment.
  Experiments on several datasets show that, when ground truth is available for even just one of the image pixels, the proposed network can learn monocular dense depth estimation up to 22.5% more accurately than state-of-the-art approaches.
This list is automatically generated from the titles and abstracts of the papers on this site; its accuracy and completeness are not guaranteed.