Multi-Resolution POMDP Planning for Multi-Object Search in 3D
- URL: http://arxiv.org/abs/2005.02878v5
- Date: Fri, 18 Mar 2022 17:29:01 GMT
- Title: Multi-Resolution POMDP Planning for Multi-Object Search in 3D
- Authors: Kaiyu Zheng, Yoonchang Sung, George Konidaris, Stefanie Tellex
- Abstract summary: We present a POMDP formulation for multi-object search in a 3D region with a frustum-shaped field-of-view.
We design a novel octree-based belief representation to capture uncertainty of the target objects at different resolution levels.
We demonstrate our approach on a mobile robot to find objects placed at different heights in two 10m$^2 \times 2$m regions by moving its base and actuating its torso.
- Score: 26.683481431467783
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robots operating in households must find objects on shelves, under tables,
and in cupboards. In such environments, it is crucial to search efficiently at
3D scale while coping with limited field of view and the complexity of
searching for multiple objects. Principled approaches to object search
frequently use Partially Observable Markov Decision Process (POMDP) as the
underlying framework for computing search strategies, but constrain the search
space in 2D. In this paper, we present a POMDP formulation for multi-object
search in a 3D region with a frustum-shaped field-of-view. To efficiently solve
this POMDP, we propose a multi-resolution planning algorithm based on online
Monte-Carlo tree search. In this approach, we design a novel octree-based
belief representation to capture uncertainty of the target objects at different
resolution levels, then derive abstract POMDPs at lower resolutions with
dramatically smaller state and observation spaces. Evaluation in a simulated 3D
domain shows that our approach finds objects more efficiently and successfully
compared to a set of baselines without resolution hierarchy in larger instances
under the same computational requirement. We demonstrate our approach on a
mobile robot to find objects placed at different heights in two 10m$^2 \times 2$m regions by moving its base and actuating its torso.
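The octree-based belief representation described in the abstract can be sketched in a few lines: belief mass is stored at the finest resolution and aggregated upward, so a coarser cell's value is the sum of its eight children. The class and method names below are illustrative only, a minimal sketch of the idea rather than the paper's actual implementation.

```python
class OctreeBelief:
    """Hypothetical sketch: unnormalized belief over a 3D grid, queryable
    at multiple resolution levels (level 0 = root, `depth` = finest)."""

    def __init__(self, depth):
        self.depth = depth   # number of levels below the root
        self.values = {}     # (level, x, y, z) -> unnormalized belief mass

    def set_value(self, x, y, z, value):
        # Assign a value to a leaf cell, then propagate the change
        # up through every ancestor so coarse cells stay consistent.
        old = self.values.get((self.depth, x, y, z), 0.0)
        delta = value - old
        for level in range(self.depth, -1, -1):
            key = (level, x, y, z)
            self.values[key] = self.values.get(key, 0.0) + delta
            x, y, z = x // 2, y // 2, z // 2

    def value_at(self, level, x, y, z):
        # Belief mass of a cell at any resolution level; a coarse cell's
        # mass is the probability of the object lying anywhere inside it.
        return self.values.get((level, x, y, z), 0.0)


belief = OctreeBelief(depth=3)      # 8x8x8 grid at the finest level
belief.set_value(5, 2, 7, 0.6)
belief.set_value(4, 2, 7, 0.4)
print(belief.value_at(3, 5, 2, 7))  # finest cell: 0.6
print(belief.value_at(2, 2, 1, 3))  # parent cell aggregates both: 1.0
print(belief.value_at(0, 0, 0, 0))  # root holds the total mass: 1.0
```

Coarse-level queries like this are what make it cheap to derive the abstract POMDPs at lower resolutions that the planner searches over.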
Related papers
- Towards Unified 3D Object Detection via Algorithm and Data Unification [70.27631528933482]
We build the first unified multi-modal 3D object detection benchmark MM-Omni3D and extend the aforementioned monocular detector to its multi-modal version.
We name the designed monocular and multi-modal detectors as UniMODE and MM-UniMODE, respectively.
arXiv Detail & Related papers (2024-02-28T18:59:31Z)
- 3D Small Object Detection with Dynamic Spatial Pruning [62.72638845817799]
We propose an efficient feature pruning strategy for 3D small object detection.
We present a multi-level 3D detector named DSPDet3D which benefits from high spatial resolution.
It takes less than 2s to directly process a whole building consisting of more than 4500k points while detecting almost all objects.
arXiv Detail & Related papers (2023-05-05T17:57:04Z)
- MMRDN: Consistent Representation for Multi-View Manipulation Relationship Detection in Object-Stacked Scenes [62.20046129613934]
We propose a novel multi-view fusion framework, namely multi-view MRD network (MMRDN)
We project the 2D data from different views into a common hidden space and fit the embeddings with a set of Von-Mises-Fisher distributions.
We select a set of $K$ Maximum Vertical Neighbors (KMVN) points from the point cloud of each object pair, which encodes the relative position of these two objects.
arXiv Detail & Related papers (2023-04-25T05:55:29Z)
- Scatter Points in Space: 3D Detection from Multi-view Monocular Images [8.71944437852952]
3D object detection from monocular images is a challenging and long-standing problem in computer vision.
Recent methods tend to aggregate multi-view features by densely sampling regular 3D grids in space.
We propose a learnable keypoint sampling method, which scatters pseudo-surface points in 3D space in order to preserve data sparsity.
arXiv Detail & Related papers (2022-08-31T09:38:05Z)
- A Simple Baseline for Multi-Camera 3D Object Detection [94.63944826540491]
3D object detection with surrounding cameras has been a promising direction for autonomous driving.
We present SimMOD, a Simple baseline for Multi-camera Object Detection.
We conduct extensive experiments on the 3D object detection benchmark of nuScenes to demonstrate the effectiveness of SimMOD.
arXiv Detail & Related papers (2022-08-22T03:38:01Z)
- MonoGRNet: A General Framework for Monocular 3D Object Detection [23.59839921644492]
We propose MonoGRNet for the amodal 3D object detection from a monocular image via geometric reasoning.
MonoGRNet decomposes the monocular 3D object detection task into four sub-tasks including 2D object detection, instance-level depth estimation, projected 3D center estimation and local corner regression.
Experiments are conducted on KITTI, Cityscapes and MS COCO datasets.
arXiv Detail & Related papers (2021-04-18T10:07:52Z)
- PLUME: Efficient 3D Object Detection from Stereo Images [95.31278688164646]
Existing methods tackle the problem in two steps: first, depth estimation is performed and a pseudo-LiDAR point cloud representation is computed from the depth estimates; then, object detection is performed in 3D space.
We propose a model that unifies these two tasks in the same metric space.
Our approach achieves state-of-the-art performance on the challenging KITTI benchmark, with significantly reduced inference time compared with existing methods.
arXiv Detail & Related papers (2021-01-17T05:11:38Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in the 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.