Related papers: PIRATR: Parametric Object Inference for Robotic Applications with Transformers in 3D Point Clouds

PIRATR: Parametric Object Inference for Robotic Applications with Transformers in 3D Point Clouds

URL: http://arxiv.org/abs/2602.05557v1
Date: Thu, 05 Feb 2026 11:29:09 GMT
Title: PIRATR: Parametric Object Inference for Robotic Applications with Transformers in 3D Point Clouds
Authors: Michael Schwingshackl, Fabio F. Oberweger, Mario Niedermeyer, Huemer Johannes, Markus Murschitz,
Abstract summary: We present PIRATR, an end-to-end 3D object detection framework for robotic use cases in point clouds.<n>Our method streamlines parametric 3D object detection by jointly estimating multi-class 6-DoF poses and class-specific parametric attributes.<n>PIRATR generalizes effectively to real outdoor LiDAR scans, achieving a detection mAP of 0.919 without additional fine-tuning.
Score: 0.9134244356393665
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: We present PIRATR, an end-to-end 3D object detection framework for robotic use cases in point clouds. Extending PI3DETR, our method streamlines parametric 3D object detection by jointly estimating multi-class 6-DoF poses and class-specific parametric attributes directly from occlusion-affected point cloud data. This formulation enables not only geometric localization but also the estimation of task-relevant properties for parametric objects, such as a gripper's opening, where the 3D model is adjusted according to simple, predefined rules. The architecture employs modular, class-specific heads, making it straightforward to extend to novel object types without re-designing the pipeline. We validate PIRATR on an automated forklift platform, focusing on three structurally and functionally diverse categories: crane grippers, loading platforms, and pallets. Trained entirely in a synthetic environment, PIRATR generalizes effectively to real outdoor LiDAR scans, achieving a detection mAP of 0.919 without additional fine-tuning. PIRATR establishes a new paradigm of pose-aware, parameterized perception. This bridges the gap between low-level geometric reasoning and actionable world models, paving the way for scalable, simulation-trained perception systems that can be deployed in dynamic robotic environments. Code available at https://github.com/swingaxe/piratr.

Related papers

PLANA3R: Zero-shot Metric Planar 3D Reconstruction via Feed-Forward Planar Splatting [56.188624157291024]
We introduce PLANA3R, a pose-free framework for metric Planar 3D Reconstruction from unposed two-view images.<n>Unlike prior feedforward methods that require 3D plane annotations during training, PLANA3R learns planar 3D structures without explicit plane supervision.<n>We validate PLANA3R on multiple indoor-scene datasets with metric supervision and demonstrate strong generalization to out-of-domain indoor environments.
arXiv Detail & Related papers (2025-10-21T15:15:33Z)
PI3DETR: Parametric Instance Detection of 3D Point Cloud Edges with a Geometry-Aware 3DETR [0.0]
We present PI3DETR, an end-to-end framework that directly predicts 3D parametric curve instances from raw point clouds.<n> PI3DETR sets a new state-of-the-art on the ABC dataset and generalizes effectively to real sensor data, offering a simple yet powerful solution for 3D edge and curve estimation.
arXiv Detail & Related papers (2025-09-03T12:24:25Z)
Adaptive Point-Prompt Tuning: Fine-Tuning Heterogeneous Foundation Models for 3D Point Cloud Analysis [51.37795317716487]
We propose the Adaptive Point-Prompt Tuning (APPT) method, which fine-tunes pre-trained models with a modest number of parameters.<n>We convert raw point clouds into point embeddings by aggregating local geometry to capture spatial features followed by linear layers.<n>To calibrate self-attention across source domains of any modality to 3D, we introduce a prompt generator that shares weights with the point embedding module.
arXiv Detail & Related papers (2025-08-30T06:02:21Z)
Robust 3D Object Detection using Probabilistic Point Clouds from Single-Photon LiDARs [29.92823252627008]
LiDAR-based 3D sensors provide point clouds, a canonical 3D representation used in various scene understanding tasks.<n>Modern LiDARs face challenges in several real-world scenarios, such as long-distance or low-albedo objects, producing sparse or erroneous point clouds.<n>We propose Probabilistic Point Clouds (PPC), a novel 3D scene representation where each point is augmented with a probability attribute.
arXiv Detail & Related papers (2025-07-31T21:32:21Z)
On Geometry-Enhanced Parameter-Efficient Fine-Tuning for 3D Scene Segmentation [52.96632954620623]
We introduce a novel geometry-aware PEFT module specifically designed for 3D point cloud transformers.<n>Our approach sets a new benchmark for efficient, scalable, and geometry-aware fine-tuning of large-scale 3D point cloud models.
arXiv Detail & Related papers (2025-05-28T15:08:36Z)
3D-CDRGP: Towards Cross-Device Robotic Grasping Policy in 3D Open World [20.406334587479623]
Cross-device research has become an urgent issue that needs to be tackled.<n>We pioneer in probing the cross-device (cameras & robotics) grasping policy in the 3D open world.<n>We introduce the SSGC-Seg module that enables category-agnostic 3D object detection.
arXiv Detail & Related papers (2024-11-27T08:23:28Z)
Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries. We propose a novel Priors Distillation (RPD) method to extract priors from the well-trained transformers on massive images. Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
Flatten Anything: Unsupervised Neural Surface Parameterization [76.4422287292541]
We introduce the Flatten Anything Model (FAM), an unsupervised neural architecture to achieve global free-boundary surface parameterization. Compared with previous methods, our FAM directly operates on discrete surface points without utilizing connectivity information. Our FAM is fully-automated without the need for pre-cutting and can deal with highly-complex topologies.
arXiv Detail & Related papers (2024-05-23T14:39:52Z)
ParaPoint: Learning Global Free-Boundary Surface Parameterization of 3D Point Clouds [52.03819676074455]
ParaPoint is an unsupervised neural learning pipeline for achieving global free-boundary surface parameterization. This work makes the first attempt to investigate neural point cloud parameterization that pursues both global mappings and free boundaries.
arXiv Detail & Related papers (2024-03-15T14:35:05Z)
Hierarchical Point Attention for Indoor 3D Object Detection [111.04397308495618]
This work proposes two novel attention operations as generic hierarchical designs for point-based transformer detectors. First, we propose Multi-Scale Attention (MS-A) that builds multi-scale tokens from a single-scale input feature to enable more fine-grained feature learning. Second, we propose Size-Adaptive Local Attention (Local-A) with adaptive attention regions for localized feature aggregation within bounding box proposals.
arXiv Detail & Related papers (2023-01-06T18:52:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.