Related papers: Detecting 3D Line Segments for 6DoF Pose Estimation with Limited Data

URL: http://arxiv.org/abs/2601.12090v1
Date: Sat, 17 Jan 2026 15:49:26 GMT
Title: Detecting 3D Line Segments for 6DoF Pose Estimation with Limited Data
Authors: Matej Mok, Lukáš Gajdošech, Michal Mesároš, Martin Madaras, Viktor Kocur
Abstract summary: We propose a novel method for 6DoF pose estimation focused specifically on bins used in industrial settings. We exploit the cuboid geometry of bins by first detecting intermediate 3D line segments corresponding to their top edges. We show that our method significantly outperforms current state-of-the-art 6DoF pose estimation methods in terms of pose accuracy.
Score: 3.3243678439936133
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The task of 6DoF object pose estimation is one of the fundamental problems of 3D vision with many practical applications such as industrial automation. Traditional deep learning approaches for this task often require extensive training data or CAD models, limiting their application in real-world industrial settings where data is scarce and object instances vary. We propose a novel method for 6DoF pose estimation focused specifically on bins used in industrial settings. We exploit the cuboid geometry of bins by first detecting intermediate 3D line segments corresponding to their top edges. Our approach extends the 2D line segment detection network LeTR to operate on structured point cloud data. The detected 3D line segments are then processed using a simple geometric procedure to robustly determine the bin's 6DoF pose. To evaluate our method, we extend an existing dataset with a newly collected and annotated dataset, which we make publicly available. We show that incorporating synthetic training data significantly improves pose estimation accuracy on real scans. Moreover, we show that our method significantly outperforms current state-of-the-art 6DoF pose estimation methods in terms of the pose accuracy (3 cm translation error, 8.2$^\circ$ rotation error) while not requiring instance-specific CAD models during inference.
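
The abstract reports pose accuracy as a translation error and a rotation error. As a rough illustration only, the sketch below shows one common way such errors are computed: Euclidean distance between translations and the geodesic angle between rotation matrices. The function names `pose_errors` and `rot_z` are hypothetical, and the sketch ignores the symmetry of a cuboid bin; the abstract does not state the exact metric definition the authors use.

```python
import numpy as np

def pose_errors(R_est, t_est, R_gt, t_gt):
    """Return (translation error, rotation error in degrees) between two poses.

    Assumption: a generic 6DoF error measure, not necessarily the paper's metric.
    """
    # Euclidean distance between translations (same unit as the inputs).
    t_err = float(np.linalg.norm(np.asarray(t_est, float) - np.asarray(t_gt, float)))
    # Relative rotation; its rotation angle is the geodesic distance on SO(3).
    R_rel = np.asarray(R_est, float).T @ np.asarray(R_gt, float)
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    r_err = float(np.degrees(np.arccos(cos_angle)))
    return t_err, r_err

def rot_z(deg):
    """Rotation matrix for a rotation of `deg` degrees about the z-axis."""
    a = np.radians(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Example: an estimate offset by 3 cm (0.03 m) and rotated 8.2 degrees about z.
t_err, r_err = pose_errors(rot_z(8.2), [0.03, 0.0, 0.0], np.eye(3), [0.0, 0.0, 0.0])
```

Clipping the cosine guards against values slightly outside [-1, 1] caused by floating-point error before `arccos` is applied.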

Related papers

Sparse Multiview Open-Vocabulary 3D Detection [27.57172918603858]
3D object detection has traditionally been solved by training to detect a fixed set of categories. In this work, we investigate open-vocabulary 3D object detection in the challenging yet practical sparse-view setting. Our approach is training-free, relying on pre-trained, off-the-shelf 2D foundation models instead of employing computationally expensive 3D feature fusion.
arXiv Detail & Related papers (2025-09-19T12:22:24Z)
3D Adversarial Augmentations for Robust Out-of-Domain Predictions [115.74319739738571]
We focus on improving the generalization to out-of-domain data. We learn a set of vectors that deform the objects in an adversarial fashion. We perform adversarial augmentation by applying the learned sample-independent vectors to the available objects when training a model.
arXiv Detail & Related papers (2023-08-29T17:58:55Z)
V-DETR: DETR with Vertex Relative Position Encoding for 3D Object Detection [73.37781484123536]
We introduce a highly performant 3D object detector for point clouds using the DETR framework. To address the limitation, we introduce a novel 3D Vertex Relative Position Encoding (3DV-RPE) method. We show exceptional results on the challenging ScanNetV2 benchmark.
arXiv Detail & Related papers (2023-08-08T17:14:14Z)
Lifting 2D Object Locations to 3D by Discounting LiDAR Outliers across Objects and Views [70.1586005070678]
We present a system for automatically converting 2D mask object predictions and raw LiDAR point clouds into full 3D bounding boxes of objects. Our method significantly outperforms previous work despite the fact that those methods use significantly more complex pipelines, 3D models and additional human-annotated external sources of prior information.
arXiv Detail & Related papers (2021-09-16T13:01:13Z)
RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of synthetic datasets, which consist of CAD object models, to boost learning on real datasets. Recent work on 3D pre-training fails when transferring features learned on synthetic objects to other real-world applications. In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z)
Self-Supervised Pretraining of 3D Features on any Point-Cloud [40.26575888582241]
We present a simple self-supervised pretraining method that can work with any 3D data without 3D registration. We evaluate our models on 9 benchmarks for object detection, semantic segmentation, and object classification, where they achieve state-of-the-art results and can outperform supervised pretraining.
arXiv Detail & Related papers (2021-01-07T18:55:21Z)
3D Registration for Self-Occluded Objects in Context [66.41922513553367]
We introduce the first deep learning framework capable of effectively handling this scenario. Our method consists of an instance segmentation module followed by a pose estimation one. It allows us to perform 3D registration in a one-shot manner, without requiring an expensive iterative procedure.
arXiv Detail & Related papers (2020-11-23T08:05:28Z)
DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes [54.239416488865565]
We propose a fast single-stage 3D object detection method for LIDAR data. The core novelty of our method is a fast, single-pass architecture that both detects objects in 3D and estimates their shapes. We find that our proposed method achieves state-of-the-art results by 5% on object detection in ScanNet scenes, and it gets top results by 3.4% in the Open dataset.
arXiv Detail & Related papers (2020-04-02T17:48:50Z)
L6DNet: Light 6 DoF Network for Robust and Precise Object Pose Estimation with Small Datasets [0.0]
We propose a novel approach to perform 6 DoF object pose estimation from a single RGB-D image. We adopt a hybrid pipeline in two stages: data-driven and geometric. Our approach is more robust and accurate than state-of-the-art methods.
arXiv Detail & Related papers (2020-02-03T17:41:29Z)
Simultaneous 3D Object Segmentation and 6-DOF Pose Estimation [0.7252027234425334]
We propose a method for simultaneous 3D object segmentation and 6-DOF pose estimation in pure 3D point cloud scenes. The key component of our method is a multi-task CNN architecture that can simultaneously predict the 3D object segmentation and 6-DOF pose estimation in pure 3D point clouds. For experimental evaluation, we generate expanded training data for two state-of-the-art 3D object datasets (cited as PL and TLINEMOD) by using Augmented Reality (AR).
arXiv Detail & Related papers (2019-12-27T13:48:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.