Related papers: On Efficient Real-Time Semantic Segmentation: A Survey

On Efficient Real-Time Semantic Segmentation: A Survey

URL: http://arxiv.org/abs/2206.08605v1
Date: Fri, 17 Jun 2022 08:00:27 GMT
Title: On Efficient Real-Time Semantic Segmentation: A Survey
Authors: Christopher J. Holder, Muhammad Shafique
Abstract summary: We take a look at the works that aim to address this misalignment with more compact and efficient models capable of deployment on low-memory embedded systems. We evaluate the inference speed of the discussed models under consistent hardware and software setups. Our experimental results demonstrate that many works are capable of real-time performance on resource-constrained hardware, while illustrating the consistent trade-off between latency and accuracy.
Score: 12.404169549562523
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Semantic segmentation is the problem of assigning a class label to every pixel in an image, and is an important component of an autonomous vehicle vision stack for facilitating scene understanding and object detection. However, many of the top performing semantic segmentation models are extremely complex and cumbersome, and as such are not suited to deployment onboard autonomous vehicle platforms where computational resources are limited and low-latency operation is a vital requirement. In this survey, we take a thorough look at the works that aim to address this misalignment with more compact and efficient models capable of deployment on low-memory embedded systems while meeting the constraint of real-time inference. We discuss several of the most prominent works in the field, placing them within a taxonomy based on their major contributions, and finally we evaluate the inference speed of the discussed models under consistent hardware and software setups that represent a typical research environment with high-end GPU and a realistic deployed scenario using low-memory embedded GPU hardware. Our experimental results demonstrate that many works are capable of real-time performance on resource-constrained hardware, while illustrating the consistent trade-off between latency and accuracy.

Related papers

ContextFormer: Redefining Efficiency in Semantic Segmentation [48.81126061219231]
Convolutional methods, although capturing local dependencies well, struggle with long-range relationships. Vision Transformers (ViTs) excel in global context capture but are hindered by high computational demands. We propose ContextFormer, a hybrid framework leveraging the strengths of CNNs and ViTs in the bottleneck to balance efficiency, accuracy, and robustness for real-time semantic segmentation.
arXiv Detail & Related papers (2025-01-31T16:11:04Z)
Can foundation models actively gather information in interactive environments to test hypotheses? [56.651636971591536]
We introduce a framework in which a model must determine the factors influencing a hidden reward function. We investigate whether approaches such as self- throughput and increased inference time improve information gathering efficiency.
arXiv Detail & Related papers (2024-12-09T12:27:21Z)
Exploiting Local Features and Range Images for Small Data Real-Time Point Cloud Semantic Segmentation [4.02235104503587]
In this paper, we harness the information from the three-dimensional representation to proficiently capture local features. A GPU-based KDTree allows for rapid building, querying, and enhancing projection with straightforward operations. We show that a reduced version of our model not only demonstrates strong competitiveness against full-scale state-of-the-art models but also operates in real-time.
arXiv Detail & Related papers (2024-10-14T13:49:05Z)
A Threefold Review on Deep Semantic Segmentation: Efficiency-oriented, Temporal and Depth-aware design [77.34726150561087]
We conduct a survey on the most relevant and recent advances in Deep Semantic in the context of vision for autonomous vehicles. Our main objective is to provide a comprehensive discussion on the main methods, advantages, limitations, results and challenges faced from each perspective.
arXiv Detail & Related papers (2023-03-08T01:29:55Z)
ASAP: Accurate semantic segmentation for real time performance [3.5327983932835165]
We propose an efficient feature fusion method, Feature Fusion with Different Norms (FFDN) FFDN utilizes rich global context of multi-level scale and vertical pooling module before self-attention. We achieve the mean Interaction of-union(mIoU) of 73.1 and the Frame Per Second(FPS) of 191, which are comparable results with state-of-the-arts on Cityscapes test datasets.
arXiv Detail & Related papers (2022-10-04T02:35:53Z)
SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor framework and script language engines. This approach however does not supply the needed procedures and pipelines for the actual deployment of machine learning capabilities in real production grade systems.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
Efficient Multi-Organ Segmentation Using SpatialConfiguration-Net with Low GPU Memory Requirements [8.967700713755281]
In this work, we employ a multi-organ segmentation model based on the SpatialConfiguration-Net (SCN) We modified the architecture of the segmentation model to reduce its memory footprint without drastically impacting the quality of the predictions. Lastly, we implemented a minimal inference script for which we optimized both, execution time and required GPU memory.
arXiv Detail & Related papers (2021-11-26T17:47:10Z)
Semantic Scene Segmentation for Robotics Applications [51.66271681532262]
We investigate the behavior of the most successful semantic scene segmentation models, in terms of deployment (inference) speed, under various setups. The target of this work is to provide a comparative study of current state-of-the-art segmentation models so as to select the most compliant with the robotics applications requirements.
arXiv Detail & Related papers (2021-08-25T08:55:20Z)
Analysis of voxel-based 3D object detection methods efficiency for real-time embedded systems [93.73198973454944]
Two popular voxel-based 3D object detection methods are studied in this paper. Our experiments show that these methods mostly fail to detect distant small objects due to the sparsity of the input point clouds at large distances. Our findings suggest that a considerable part of the computations of existing methods is focused on locations of the scene that do not contribute with successful detection.
arXiv Detail & Related papers (2021-05-21T12:40:59Z)
Self-supervised Human Detection and Segmentation via Multi-view Consensus [116.92405645348185]
We propose a multi-camera framework in which geometric constraints are embedded in the form of multi-view consistency during training. We show that our approach outperforms state-of-the-art self-supervised person detection and segmentation techniques on images that visually depart from those of standard benchmarks.
arXiv Detail & Related papers (2020-12-09T15:47:21Z)
Multi-scale Interaction for Real-time LiDAR Data Segmentation on an Embedded Platform [62.91011959772665]
Real-time semantic segmentation of LiDAR data is crucial for autonomously driving vehicles. Current approaches that operate directly on the point cloud use complex spatial aggregation operations. We propose a projection-based method, called Multi-scale Interaction Network (MINet), which is very efficient and accurate.
arXiv Detail & Related papers (2020-08-20T19:06:11Z)

This list is automatically generated from the titles and abstracts of the papers in this site.