MonoLite3D: Lightweight 3D Object Properties Estimation
- URL: http://arxiv.org/abs/2503.02201v1
- Date: Tue, 04 Mar 2025 02:31:09 GMT
- Title: MonoLite3D: Lightweight 3D Object Properties Estimation
- Authors: Ahmed El-Dawy, Amr El-Zawawi, Mohamed El-Habrouk,
- Abstract summary: This paper introduces MonoLite3D network, an embedded-device friendly lightweight deep learning methodology designed for hardware environments with limited resources.<n>MonoLite3D network is a cutting-edge technique that focuses on estimating multiple properties of 3D objects, encompassing their dimensions and spatial orientation, solely from monocular images.
- Score: 2.895737747670724
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reliable perception of the environment plays a crucial role in enabling efficient self-driving vehicles. Therefore, the perception system necessitates the acquisition of comprehensive 3D data regarding the surrounding objects within a specific time constrain, including their dimensions, spatial location and orientation. Deep learning has gained significant popularity in perception systems, enabling the conversion of image features captured by a camera into meaningful semantic information. This research paper introduces MonoLite3D network, an embedded-device friendly lightweight deep learning methodology designed for hardware environments with limited resources. MonoLite3D network is a cutting-edge technique that focuses on estimating multiple properties of 3D objects, encompassing their dimensions and spatial orientation, solely from monocular images. This approach is specifically designed to meet the requirements of resource-constrained environments, making it highly suitable for deployment on devices with limited computational capabilities. The experimental results validate the accuracy and efficiency of the proposed approach on the orientation benchmark of the KITTI dataset. It achieves an impressive score of 82.27% on the moderate class and 69.81% on the hard class, while still meeting the real-time requirements.
Related papers
- Towards Flexible 3D Perception: Object-Centric Occupancy Completion Augments 3D Object Detection [54.78470057491049]
Occupancy has emerged as a promising alternative for 3D scene perception.<n>We introduce object-centric occupancy as a supplement to object bboxes.<n>We show that our occupancy features significantly enhance the detection results of state-of-the-art 3D object detectors.
arXiv Detail & Related papers (2024-12-06T16:12:38Z) - OccupancyDETR: Using DETR for Mixed Dense-sparse 3D Occupancy Prediction [10.87136340580404]
Visual-based 3D semantic occupancy perception is a key technology for robotics, including autonomous vehicles.
We propose a novel 3D semantic occupancy perception method, OccupancyDETR, which utilizes a DETR-like object detection, a mixed dense-sparse 3D occupancy decoder.
Our approach strikes a balance between efficiency and accuracy, achieving faster inference times, lower resource consumption, and improved performance for small object detection.
arXiv Detail & Related papers (2023-09-15T16:06:23Z) - Perspective-aware Convolution for Monocular 3D Object Detection [2.33877878310217]
We propose a novel perspective-aware convolutional layer that captures long-range dependencies in images.
By enforcing convolutional kernels to extract features along the depth axis of every image pixel, we incorporates perspective information into network architecture.
We demonstrate improved performance on the KITTI3D dataset, achieving a 23.9% average precision in the easy benchmark.
arXiv Detail & Related papers (2023-08-24T17:25:36Z) - Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z) - AutoDecoding Latent 3D Diffusion Models [95.7279510847827]
We present a novel approach to the generation of static and articulated 3D assets that has a 3D autodecoder at its core.
The 3D autodecoder framework embeds properties learned from the target dataset in the latent space.
We then identify the appropriate intermediate volumetric latent space, and introduce robust normalization and de-normalization operations.
arXiv Detail & Related papers (2023-07-07T17:59:14Z) - High-level camera-LiDAR fusion for 3D object detection with machine
learning [0.0]
This paper tackles the 3D object detection problem, which is of vital importance for applications such as autonomous driving.
It uses a Machine Learning pipeline on a combination of monocular camera and LiDAR data to detect vehicles in the surrounding 3D space of a moving platform.
Our results demonstrate an efficient and accurate inference on a validation set, achieving an overall accuracy of 87.1%.
arXiv Detail & Related papers (2021-05-24T01:57:34Z) - Analysis of voxel-based 3D object detection methods efficiency for
real-time embedded systems [93.73198973454944]
Two popular voxel-based 3D object detection methods are studied in this paper.
Our experiments show that these methods mostly fail to detect distant small objects due to the sparsity of the input point clouds at large distances.
Our findings suggest that a considerable part of the computations of existing methods is focused on locations of the scene that do not contribute with successful detection.
arXiv Detail & Related papers (2021-05-21T12:40:59Z) - Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified and learning based approach to the 3D MOT problem.
We employ a Neural Message Passing network for data association that is fully trainable.
We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z) - Real-time 3D object proposal generation and classification under limited
processing resources [1.6242924916178285]
We propose an efficient detection method consisting of 3D proposal generation and classification.
The experimental results demonstrate the capability of proposed real-time 3D object detection method from the point cloud.
arXiv Detail & Related papers (2020-03-24T05:36:53Z) - SESS: Self-Ensembling Semi-Supervised 3D Object Detection [138.80825169240302]
We propose SESS, a self-ensembling semi-supervised 3D object detection framework. Specifically, we design a thorough perturbation scheme to enhance generalization of the network on unlabeled and new unseen data.
Our SESS achieves competitive performance compared to the state-of-the-art fully-supervised method by using only 50% labeled data.
arXiv Detail & Related papers (2019-12-26T08:48:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.