Explain What You See: Open-Ended Segmentation and Recognition of Occluded 3D Objects
- URL: http://arxiv.org/abs/2301.07037v1
- Date: Tue, 17 Jan 2023 17:43:46 GMT
- Title: Explain What You See: Open-Ended Segmentation and Recognition of Occluded 3D Objects
- Authors: H. Ayoobi, H. Kasaei, M. Cao, R. Verbrugge, B. Verheij
- Abstract summary: We propose a novel semantic 3D object-parts segmentation method that has the flexibility of Local-HDP.
We show that the proposed method achieves a higher mean intersection over union (mIoU) while using fewer learning instances.
We show that the resulting model produces an explicit set of explanations for the 3D object category recognition task.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Local-HDP (for Local Hierarchical Dirichlet Process) is a hierarchical Bayesian method that has recently been used for open-ended 3D object category recognition. The method has proven efficient in real-time robotic applications; however, it is not robust to a high degree of occlusion.
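For context, Local-HDP builds on the hierarchical Dirichlet process (HDP). The block below gives the standard HDP generative process (the textbook formulation; the paper's local variant adapts it, so treat this as background rather than the authors' exact model):

```latex
G_0 \mid \gamma, H \sim \mathrm{DP}(\gamma, H)
G_j \mid \alpha_0, G_0 \sim \mathrm{DP}(\alpha_0, G_0) \quad \text{for each group } j
\theta_{ji} \mid G_j \sim G_j, \qquad x_{ji} \mid \theta_{ji} \sim F(\theta_{ji})
```

Because each group-level measure G_j draws its atoms from the shared G_0, mixture components (here, object parts or categories) are shared across groups while their number can grow as new observations arrive, which is what makes this family of models suitable for open-ended recognition.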
We address this limitation in two steps. First, we propose a novel semantic 3D object-parts segmentation method that has the flexibility of Local-HDP. This method is shown to be suitable for open-ended scenarios where the number of 3D objects or object parts is not fixed and can grow over time. We show that the proposed method achieves a higher mean intersection over union (mIoU) while using fewer learning instances. Second, we integrate this technique with a recently introduced argumentation-based online incremental learning method, thereby enabling the model to handle a high degree of occlusion. We show that the resulting model produces an explicit set of explanations for the 3D object category recognition task.
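The headline metric in this abstract is mean intersection over union (mIoU). As a reference point, a minimal per-part mIoU computation could look like the following NumPy sketch; the function name, array layout, and label convention are illustrative assumptions, not the authors' evaluation code:

```python
import numpy as np

def mean_iou(pred: np.ndarray, gt: np.ndarray, num_parts: int) -> float:
    """Mean intersection-over-union across part labels (illustrative sketch).

    pred, gt: integer arrays of per-point part labels, same shape.
    num_parts: number of part classes; labels assumed in [0, num_parts).
    """
    ious = []
    for part in range(num_parts):
        intersection = np.logical_and(pred == part, gt == part).sum()
        union = np.logical_or(pred == part, gt == part).sum()
        if union > 0:  # skip parts absent from both prediction and ground truth
            ious.append(intersection / union)
    return float(np.mean(ious)) if ious else 0.0
```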
Related papers
- Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance [49.14140194332482]
We introduce Open3DIS, a novel solution designed to tackle the problem of Open-Vocabulary 3D Instance Segmentation within 3D scenes.
Objects within 3D environments exhibit diverse shapes, scales, and colors, making precise instance-level identification a challenging task.
arXiv Detail & Related papers (2023-12-17T10:07:03Z)
- SAI3D: Segment Any Instance in 3D Scenes [68.57002591841034]
We introduce SAI3D, a novel zero-shot 3D instance segmentation approach.
Our method partitions a 3D scene into geometric primitives, which are then progressively merged into 3D instance segmentations.
Empirical evaluations on ScanNet, Matterport3D and the more challenging ScanNet++ datasets demonstrate the superiority of our approach.
arXiv Detail & Related papers (2023-12-17T09:05:47Z)
- ALSTER: A Local Spatio-Temporal Expert for Online 3D Semantic Reconstruction [62.599588577671796]
We propose an online 3D semantic segmentation method that incrementally reconstructs a 3D semantic map from a stream of RGB-D frames.
Unlike offline methods, ours is directly applicable to scenarios with real-time constraints, such as robotics or mixed reality.
arXiv Detail & Related papers (2023-11-29T20:30:18Z)
- Fine-grained 3D object recognition: an approach and experiments [0.0]
Three-dimensional (3D) object recognition is used as a core technology in advanced applications such as autonomous driving.
There are two sets of approaches for 3D object recognition: (i) hand-crafted approaches like Global Orthographic Object Descriptor (GOOD), and (ii) deep learning-based approaches such as MobileNet and VGG.
In this paper, we first implemented an offline 3D object recognition system that takes an object view as input and generates category labels as output.
In the offline stage, instance-based learning (IBL) is used to form a new category.
arXiv Detail & Related papers (2023-06-28T04:48:21Z)
- LocATe: End-to-end Localization of Actions in 3D with Transformers [91.28982770522329]
LocATe is an end-to-end approach that jointly localizes and recognizes actions in a 3D sequence.
Unlike transformer-based object-detection and classification models which consider image or patch features as input, LocATe's transformer model is capable of capturing long-term correlations between actions in a sequence.
We introduce a new, challenging, and more realistic benchmark dataset, BABEL-TAL-20 (BT20), where the performance of state-of-the-art methods is significantly worse.
arXiv Detail & Related papers (2022-03-21T03:35:32Z)
- End-to-End Learning of Multi-category 3D Pose and Shape Estimation [128.881857704338]
We propose an end-to-end method that simultaneously detects 2D keypoints from an image and lifts them to 3D.
The proposed method learns both 2D detection and 3D lifting only from 2D keypoints annotations.
In addition to being end-to-end in image to 3D learning, our method also handles objects from multiple categories using a single neural network.
arXiv Detail & Related papers (2021-12-19T17:10:40Z)
- Objects are Different: Flexible Monocular 3D Object Detection [87.82253067302561]
We propose a flexible framework for monocular 3D object detection which explicitly decouples the truncated objects and adaptively combines multiple approaches for object depth estimation.
Experiments demonstrate that our method outperforms the state-of-the-art method by a relative 27% at the moderate level and 30% at the hard level on the KITTI benchmark test set.
arXiv Detail & Related papers (2021-04-06T07:01:28Z)
- Sim2Real 3D Object Classification using Spherical Kernel Point Convolution and a Deep Center Voting Scheme [28.072144989298298]
Learning from artificial 3D models alleviates the cost of annotation necessary to approach this problem.
We conjecture that the cause of these issues is that many methods learn directly from point coordinates instead of from the shape.
We introduce spherical kernel point convolutions that directly exploit the object surface, represented as a graph, and a voting scheme to limit the impact of poor segmentation (a minimal sketch of such a voting step follows this list).
arXiv Detail & Related papers (2021-03-10T15:32:04Z)
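As referenced in the Sim2Real entry above, here is a minimal sketch of a center-voting aggregation step. It is an illustrative assumption of how such a scheme can work (the function name, the median-based consensus, and the agreement radius are all invented for illustration), not the paper's actual method:

```python
import numpy as np

def vote_for_class(point_classes: np.ndarray,
                   center_offsets: np.ndarray,
                   points: np.ndarray,
                   agreement_radius: float = 0.05) -> int:
    """Aggregate per-point class predictions via center voting (illustrative).

    point_classes:  (N,) integer class prediction per point.
    center_offsets: (N, 3) predicted offset from each point to the object center.
    points:         (N, 3) point coordinates.
    Points whose predicted centers disagree with the consensus are treated as
    poorly segmented (e.g., background clutter) and excluded from the vote.
    """
    predicted_centers = points + center_offsets
    consensus_center = np.median(predicted_centers, axis=0)
    distances = np.linalg.norm(predicted_centers - consensus_center, axis=1)
    inliers = distances < agreement_radius  # keep only agreeing votes
    votes = point_classes[inliers] if inliers.any() else point_classes
    classes, counts = np.unique(votes, return_counts=True)
    return int(classes[np.argmax(counts)])
```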
This list is automatically generated from the titles and abstracts of the papers on this site.