A Threefold Review on Deep Semantic Segmentation: Efficiency-oriented,
Temporal and Depth-aware design
- URL: http://arxiv.org/abs/2303.04315v1
- Date: Wed, 8 Mar 2023 01:29:55 GMT
- Title: A Threefold Review on Deep Semantic Segmentation: Efficiency-oriented,
Temporal and Depth-aware design
- Authors: Felipe Manfio Barbosa, Fernando Santos Osório
- Abstract summary: We conduct a survey on the most relevant and recent advances in Deep Semantic Segmentation in the context of vision for autonomous vehicles.
Our main objective is to provide a comprehensive discussion on the main methods, advantages, limitations, results and challenges faced from each perspective.
- Score: 77.34726150561087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic image and video segmentation stand among the most important tasks in
computer vision nowadays, since they provide a complete and meaningful
representation of the environment by means of a dense classification of the
pixels in a given scene. Recently, Deep Learning, and more precisely
Convolutional Neural Networks, have boosted semantic segmentation to a new
level in terms of performance and generalization capabilities. However,
designing Deep Semantic Segmentation models is a complex task, as it may
involve application-dependent aspects. Particularly, when considering
autonomous driving applications, the robustness-efficiency trade-off, as well
as intrinsic limitations - computational/memory bounds and data-scarcity - and
constraints - real-time inference - should be taken into consideration. In this
respect, the use of additional data modalities, such as depth perception for
reasoning on the geometry of a scene, and temporal cues from videos to explore
redundancy and consistency, are promising directions yet not explored to their
full potential in the literature. In this paper, we conduct a survey on the
most relevant and recent advances in Deep Semantic Segmentation in the context
of vision for autonomous vehicles, from three different perspectives:
efficiency-oriented model development for real-time operation, RGB-Depth data
integration (RGB-D semantic segmentation), and the use of temporal information
from videos in temporally-aware models. Our main objective is to provide a
comprehensive discussion on the main methods, advantages, limitations, results
and challenges faced from each perspective, so that the reader can not only get
started, but also be up to date in respect to recent advances in this exciting
and challenging research field.
Related papers
- Deep Learning-Based 3D Instance and Semantic Segmentation: A Review [0.0]
3D segmentation is challenging with point cloud data due to substantial redundancy, fluctuating sample density and lack of organization.
Deep learning has been successfully applied to a spectrum of 2D vision domains as a prevailing A.I. method.
This study examines many strategies that have been proposed for 3D instance and semantic segmentation.
arXiv Detail & Related papers (2024-06-19T07:56:14Z)
- Frequency-based Matcher for Long-tailed Semantic Segmentation [22.199174076366003]
We focus on a relatively under-explored task setting, long-tailed semantic segmentation (LTSS).
We propose a dual-metric evaluation system and construct the LTSS benchmark to demonstrate the performance of semantic segmentation methods and long-tailed solutions.
We also propose a transformer-based algorithm to improve LTSS, the frequency-based matcher, which solves the oversuppression problem by one-to-many matching.
arXiv Detail & Related papers (2024-06-06T09:57:56Z)
- Deep Learning-Based Object Pose Estimation: A Comprehensive Survey [73.74933379151419]
We discuss the recent advances in deep learning-based object pose estimation.
Our survey also covers multiple input data modalities, degrees-of-freedom of output poses, object properties, and downstream tasks.
arXiv Detail & Related papers (2024-05-13T14:44:22Z)
- Appearance-Based Refinement for Object-Centric Motion Segmentation [85.2426540999329]
We introduce an appearance-based refinement method that leverages temporal consistency in video streams to correct inaccurate flow-based proposals.
Our approach involves a sequence-level selection mechanism that identifies accurate flow-predicted masks as exemplars.
Its performance is evaluated on multiple video segmentation benchmarks, including DAVIS, YouTube, SegTrackv2, and FBMS-59.
arXiv Detail & Related papers (2023-12-18T18:59:51Z)
- Few Shot Semantic Segmentation: a review of methodologies, benchmarks, and open challenges [5.0243930429558885]
Few-Shot Semantic Segmentation is a novel task in computer vision, which aims at designing models capable of segmenting new semantic classes with only a few examples.
This paper consists of a comprehensive survey of Few-Shot Semantic Segmentation, tracing its evolution and exploring various model designs.
arXiv Detail & Related papers (2023-04-12T13:07:37Z)
- On Efficient Real-Time Semantic Segmentation: A Survey [12.404169549562523]
We take a look at the works that aim to address this misalignment with more compact and efficient models capable of deployment on low-memory embedded systems.
We evaluate the inference speed of the discussed models under consistent hardware and software setups.
Our experimental results demonstrate that many works are capable of real-time performance on resource-constrained hardware, while illustrating the consistent trade-off between latency and accuracy.
arXiv Detail & Related papers (2022-06-17T08:00:27Z)
- A Survey on Deep Learning Technique for Video Segmentation [147.0767454918527]
Video segmentation plays a critical role in a broad range of practical applications.
Deep learning based approaches have been dedicated to video segmentation and delivered compelling performance.
arXiv Detail & Related papers (2021-07-02T15:51:07Z)
- Learning Long-term Visual Dynamics with Region Proposal Interaction Networks [75.06423516419862]
We build object representations that can capture inter-object and object-environment interactions over a long range.
Thanks to the simple yet effective object representation, our approach outperforms prior methods by a significant margin.
arXiv Detail & Related papers (2020-08-05T17:48:00Z)
- SideInfNet: A Deep Neural Network for Semi-Automatic Semantic Segmentation with Side Information [83.03179580646324]
This paper proposes a novel deep neural network architecture, namely SideInfNet.
It integrates features learnt from images with side information extracted from user annotations.
To evaluate our method, we applied the proposed network to three semantic segmentation tasks and conducted extensive experiments on benchmark datasets.
arXiv Detail & Related papers (2020-02-07T06:10:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.