Deep Learning for Omnidirectional Vision: A Survey and New Perspectives
- URL: http://arxiv.org/abs/2205.10468v2
- Date: Tue, 24 May 2022 08:49:08 GMT
- Title: Deep Learning for Omnidirectional Vision: A Survey and New Perspectives
- Authors: Hao Ai, Zidong Cao, Jinjing Zhu, Haotian Bai, Yucheng Chen and Lin Wang
- Abstract summary: This paper presents a systematic and comprehensive review and analysis of the recent progress in deep learning methods for omnidirectional vision.
Our work covers four main contents: (i) An introduction to the principle of omnidirectional imaging, the convolution methods on the ODI, and datasets to highlight the differences and difficulties compared with the 2D planar image data; (ii) A structural and hierarchical taxonomy of the DL methods for omnidirectional vision; (iii) A summarization of the latest novel learning strategies and applications; and (iv) An insightful discussion of the challenges and open problems, highlighting potential research directions.
- Score: 7.068031114801553
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Omnidirectional image (ODI) data is captured with a 360°x180° field-of-view,
which is much wider than that of pinhole cameras, and contains richer spatial
information than conventional planar images. Accordingly, omnidirectional
vision has attracted growing attention due to its advantageous performance
in numerous applications, such as autonomous driving and virtual reality. In
recent years, the availability of consumer-level 360° cameras has made
omnidirectional vision more popular, and the advance of deep learning (DL) has
significantly accelerated its research and applications. This paper presents a
systematic and comprehensive review and analysis of the recent progress in DL
methods for omnidirectional vision. Our work covers four main contents: (i) An
introduction to the principle of omnidirectional imaging, the convolution
methods on the ODI, and datasets to highlight the differences and difficulties
compared with the 2D planar image data; (ii) A structural and hierarchical
taxonomy of the DL methods for omnidirectional vision; (iii) A summarization of
the latest novel learning strategies and applications; (iv) An insightful
discussion of the challenges and open problems by highlighting the potential
research directions to trigger more research in the community.
Related papers
- VisionPAD: A Vision-Centric Pre-training Paradigm for Autonomous Driving [44.91443640710085]
VisionPAD is a novel self-supervised pre-training paradigm for vision-centric algorithms in autonomous driving.
It reconstructs multi-view representations using only images as supervision.
It significantly improves performance in 3D object detection, occupancy prediction and map segmentation.
arXiv Detail & Related papers (2024-11-22T03:59:41Z)
- Unlocking Textual and Visual Wisdom: Open-Vocabulary 3D Object Detection Enhanced by Comprehensive Guidance from Text and Image [70.02187124865627]
Open-vocabulary 3D object detection (OV-3DDet) aims to localize and recognize both seen and previously unseen object categories within any new 3D scene.
We leverage a vision foundation model to provide image-wise guidance for discovering novel classes in 3D scenes.
We demonstrate significant improvements in accuracy and generalization, highlighting the potential of foundation models in advancing open-vocabulary 3D object detection.
arXiv Detail & Related papers (2024-07-07T04:50:04Z)
- Technique Report of CVPR 2024 PBDL Challenges [211.79824163599872]
Physics-based vision aims to invert the processes to recover scene properties such as shape, reflectance, light distribution, and medium properties from images.
Deep learning has shown promising improvements for various vision tasks, and when combined with physics-based vision, these approaches can enhance the robustness and accuracy of vision systems.
This technical report summarizes the outcomes of the Physics-Based Vision Meets Deep Learning (PBDL) 2024 challenge, held at the CVPR 2024 workshop.
arXiv Detail & Related papers (2024-06-15T21:44:17Z)
- Vision-based Learning for Drones: A Survey [1.280979348722635]
Drones as advanced cyber-physical systems are undergoing a transformative shift with the advent of vision-based learning.
This review offers a comprehensive overview of vision-based learning in drones, emphasizing its pivotal role in enhancing their operational capabilities.
We explore various applications of vision-based drones with learning capabilities, ranging from single-agent systems to more complex multi-agent and heterogeneous system scenarios.
arXiv Detail & Related papers (2023-12-08T12:57:13Z)
- Deep Learning for Event-based Vision: A Comprehensive Survey and Benchmarks [55.81577205593956]
Event cameras are bio-inspired sensors that capture the per-pixel intensity changes asynchronously.
Deep learning (DL) has been brought to this emerging field and inspired active research endeavors in mining its potential.
arXiv Detail & Related papers (2023-02-17T14:19:28Z)
- Surround-View Vision-based 3D Detection for Autonomous Driving: A Survey [0.6091702876917281]
We provide a literature survey for the existing Vision Based 3D detection methods, focused on autonomous driving.
We highlight how the literature and industry trends have moved towards surround-view image-based methods and note which special cases these methods address.
arXiv Detail & Related papers (2023-02-13T19:30:17Z)
- 3D Object Detection from Images for Autonomous Driving: A Survey [68.33502122185813]
3D object detection from images is one of the fundamental and challenging problems in autonomous driving.
More than 200 works have studied this problem from 2015 to 2021, encompassing a broad spectrum of theories, algorithms, and applications.
We provide the first comprehensive survey of this novel and continuously growing research field, summarizing the most commonly used pipelines for image-based 3D detection.
arXiv Detail & Related papers (2022-02-07T07:12:24Z)
- Deep Learning on Monocular Object Pose Detection and Tracking: A Comprehensive Overview [8.442460766094674]
Object pose detection and tracking has attracted increasing attention due to its wide applications in many areas, such as autonomous driving, robotics, and augmented reality.
Among the available approaches, deep learning is the most promising and has shown better performance than the others.
This paper presents a comprehensive review of recent progress in object pose detection and tracking that belongs to the deep learning technical route.
arXiv Detail & Related papers (2021-05-29T12:59:29Z)
- Recent Advances in Monocular 2D and 3D Human Pose Estimation: A Deep Learning Perspective [69.44384540002358]
We provide a comprehensive and holistic 2D-to-3D perspective to tackle this problem.
We categorize the mainstream and milestone approaches since the year 2014 under unified frameworks.
We also summarize the pose representation styles, benchmarks, evaluation metrics, and the quantitative performance of popular approaches.
arXiv Detail & Related papers (2021-04-23T11:07:07Z)
- Weakly-Supervised 3D Human Pose Learning via Multi-view Images in the Wild [101.70320427145388]
We propose a weakly-supervised approach that does not require 3D annotations and learns to estimate 3D poses from unlabeled multi-view data.
We evaluate our proposed approach on two large scale datasets.
arXiv Detail & Related papers (2020-03-17T08:47:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.