Object-level 3D Semantic Mapping using a Network of Smart Edge Sensors
- URL: http://arxiv.org/abs/2211.11354v1
- Date: Mon, 21 Nov 2022 11:13:08 GMT
- Title: Object-level 3D Semantic Mapping using a Network of Smart Edge Sensors
- Authors: Julian Hau, Simon Bultmann, Sven Behnke
- Abstract summary: We extend a multi-view 3D semantic mapping system consisting of a network of distributed edge sensors with object-level information.
Our method is evaluated on the public Behave dataset, where it shows pose estimation accuracy within a few centimeters, and in real-world experiments with the sensor network in a challenging lab environment.
- Score: 25.393382192511716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Autonomous robots that interact with their environment require a detailed
semantic scene model. For this, volumetric semantic maps are frequently used.
The scene understanding can further be improved by including object-level
information in the map. In this work, we extend a multi-view 3D semantic
mapping system consisting of a network of distributed smart edge sensors with
object-level information, to enable downstream tasks that need object-level
input. Objects are represented in the map via their 3D mesh model or as an
object-centric volumetric sub-map that can model arbitrary object geometry when
no detailed 3D model is available. We propose a keypoint-based approach to
estimate object poses via PnP and refinement via ICP alignment of the 3D object
model with the observed point cloud segments. Object instances are tracked to
integrate observations over time and to be robust against temporary occlusions.
Our method is evaluated on the public Behave dataset, where it shows pose
estimation accuracy within a few centimeters, and in real-world experiments with
the sensor network in a challenging lab environment, where multiple chairs and a
table are tracked through the scene online, in real time, even under high
occlusions.
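To make the two-stage pose estimation described in the abstract concrete, the following is a minimal sketch, not the authors' implementation: it assumes OpenCV for the PnP step, Open3D for the ICP refinement, and hypothetical inputs (detected 2D keypoints, their corresponding 3D model keypoints, camera intrinsics K, and point clouds for the object model and the observed segment).

```python
import numpy as np
import cv2
import open3d as o3d

def estimate_object_pose(kpts_2d, kpts_3d, K, model_pcd, observed_pcd):
    """PnP initialization from 2D-3D keypoint matches, then ICP refinement
    of the object model against the observed point cloud segment."""
    # Stage 1: initial object-to-camera pose from keypoint correspondences.
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(kpts_3d, dtype=np.float64),   # keypoints on the 3D model
        np.asarray(kpts_2d, dtype=np.float64),   # detections in the image
        K, distCoeffs=None, flags=cv2.SOLVEPNP_EPNP)
    if not ok:
        return None
    T_init = np.eye(4)
    T_init[:3, :3], _ = cv2.Rodrigues(rvec)      # rotation vector -> matrix
    T_init[:3, 3] = tvec.ravel()

    # Stage 2: refine by aligning the model cloud to the observed segment.
    result = o3d.pipelines.registration.registration_icp(
        model_pcd, observed_pcd,
        max_correspondence_distance=0.05,        # 5 cm gate (illustrative)
        init=T_init,
        estimation_method=o3d.pipelines.registration
            .TransformationEstimationPointToPoint())
    return result.transformation                 # refined 4x4 object pose
```

A natural usage is to run this once per detected object segment and feed the refined poses to an instance tracker such as the one sketched next.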
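The abstract's occlusion-robust instance tracking can likewise be sketched. The greedy nearest-centroid association, distance gate, miss counter, and all parameter values below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

class InstanceTracker:
    """Minimal sketch of occlusion-robust object instance tracking.

    Greedy nearest-centroid association with a distance gate; unmatched
    tracks survive for up to `max_misses` frames so temporarily occluded
    objects keep their identity. All parameter values are illustrative.
    """

    def __init__(self, gate=0.5, max_misses=30, alpha=0.3):
        self.gate = gate              # association distance threshold [m]
        self.max_misses = max_misses  # frames a track may go unobserved
        self.alpha = alpha            # smoothing factor for position updates
        self.tracks = {}              # track id -> (centroid, miss count)
        self._next_id = 0

    def update(self, detections):
        """detections: list of 3D object centroids, shape (3,) each."""
        unmatched = list(detections)
        for tid, (centroid, misses) in list(self.tracks.items()):
            j = None
            if unmatched:
                dists = [np.linalg.norm(centroid - d) for d in unmatched]
                j = int(np.argmin(dists))
            if j is not None and dists[j] < self.gate:
                d = np.asarray(unmatched.pop(j), dtype=float)
                # Exponential smoothing integrates observations over time.
                self.tracks[tid] = ((1 - self.alpha) * centroid + self.alpha * d, 0)
            elif misses + 1 > self.max_misses:
                del self.tracks[tid]  # occluded or gone for too long
            else:
                self.tracks[tid] = (centroid, misses + 1)
        for d in unmatched:           # spawn tracks for new object instances
            self.tracks[self._next_id] = (np.asarray(d, dtype=float), 0)
            self._next_id += 1
        return self.tracks
```

In the full system, association would likely also use semantic class and segmentation information rather than distance alone, but gated matching plus a miss budget is a standard pattern for keeping identities stable through short occlusions.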
Related papers
- PatchContrast: Self-Supervised Pre-training for 3D Object Detection [14.603858163158625]
We introduce PatchContrast, a novel self-supervised point cloud pre-training framework for 3D object detection.
We show that our method outperforms existing state-of-the-art models on three commonly-used 3D detection datasets.
arXiv Detail & Related papers (2023-08-14T07:45:54Z)
- 3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding [58.924180772480504]
3D visual grounding aims to localize the target object in a 3D point cloud by a free-form language description.
We propose a relation-aware one-stage framework, named 3D Relative Position-aware Network (3DRP-Net).
arXiv Detail & Related papers (2023-07-25T09:33:25Z)
- Objects as Spatio-Temporal 2.5D points [5.588892124219713]
We propose a weakly supervised method to estimate the 3D positions of objects by jointly learning to regress the 2D object detections and the scene's depth prediction in a single feed-forward pass of a network.
Our proposed method extends a single-point-based object detector and introduces a novel object representation where each object is modeled as a BEV point spatio-temporally, without the need for any 3D or BEV annotations for training or LiDAR data at query time.
arXiv Detail & Related papers (2022-12-06T05:14:30Z)
- CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection [57.44434974289945]
We propose the Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework.
Our framework takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene.
In addition to 3D object detection, we investigate the effectiveness of our framework for the problem of 3D object counting.
arXiv Detail & Related papers (2022-09-13T05:26:09Z)
- AGO-Net: Association-Guided 3D Point Cloud Object Detection Network [86.10213302724085]
We propose a novel 3D detection framework that associates intact features for objects via domain adaptation.
We achieve new state-of-the-art performance on the KITTI 3D detection benchmark in both accuracy and speed.
arXiv Detail & Related papers (2022-08-24T16:54:38Z)
- S$^2$Contact: Graph-based Network for 3D Hand-Object Contact Estimation with Semi-Supervised Learning [70.72037296392642]
We propose a novel semi-supervised framework that allows us to learn contact from monocular images.
Specifically, we leverage visual and geometric consistency constraints in large-scale datasets for generating pseudo-labels.
We show the benefits of using a contact map that constrains hand-object interactions to produce more accurate reconstructions.
arXiv Detail & Related papers (2022-08-01T14:05:23Z)
- Semi-Perspective Decoupled Heatmaps for 3D Robot Pose Estimation from Depth Maps [66.24554680709417]
Knowing the exact 3D location of workers and robots in a collaborative environment enables several real-world applications.
We propose a non-invasive framework based on depth devices and deep neural networks to estimate the 3D pose of robots from an external camera.
arXiv Detail & Related papers (2022-07-06T08:52:12Z)
- 3D Semantic Scene Perception using Distributed Smart Edge Sensors [29.998917158604694]
We present a system for 3D semantic scene perception consisting of a network of distributed smart edge sensors.
The sensor nodes are based on an embedded CNN inference accelerator and RGB-D and thermal cameras.
The proposed perception system provides a complete scene view containing semantically annotated 3D geometry and estimates 3D poses of multiple persons in real time.
arXiv Detail & Related papers (2022-05-03T12:46:26Z)
- Stereo RGB and Deeper LIDAR Based Network for 3D Object Detection [40.34710686994996]
3D object detection is an emerging task in autonomous driving scenarios.
Previous works process 3D point clouds using either projection-based or voxel-based models.
We propose the Stereo RGB and Deeper LIDAR framework which can utilize semantic and spatial information simultaneously.
arXiv Detail & Related papers (2020-06-09T11:19:24Z)
- DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes [54.239416488865565]
We propose a fast single-stage 3D object detection method for LIDAR data.
The core novelty of our method is a fast, single-pass architecture that both detects objects in 3D and estimates their shapes.
We find that our proposed method achieves state-of-the-art results on object detection in ScanNet scenes, improving over prior work by 5%, and improves on top results by 3.4% on the Waymo Open Dataset.
arXiv Detail & Related papers (2020-04-02T17:48:50Z)
- Extending Maps with Semantic and Contextual Object Information for Robot Navigation: a Learning-Based Framework using Visual and Depth Cues [12.984393386954219]
This paper addresses the problem of building augmented metric representations of scenes with semantic information from RGB-D images.
We propose a complete framework to create an enhanced map representation of the environment with object-level information.
arXiv Detail & Related papers (2020-03-13T15:05:23Z)