PIRATR: Parametric Object Inference for Robotic Applications with Transformers in 3D Point Clouds
- URL: http://arxiv.org/abs/2602.05557v1
- Date: Thu, 05 Feb 2026 11:29:09 GMT
- Title: PIRATR: Parametric Object Inference for Robotic Applications with Transformers in 3D Point Clouds
- Authors: Michael Schwingshackl, Fabio F. Oberweger, Mario Niedermeyer, Huemer Johannes, Markus Murschitz,
- Abstract summary: We present PIRATR, an end-to-end 3D object detection framework for robotic use cases in point clouds. Our method streamlines parametric 3D object detection by jointly estimating multi-class 6-DoF poses and class-specific parametric attributes. PIRATR generalizes effectively to real outdoor LiDAR scans, achieving a detection mAP of 0.919 without additional fine-tuning.
- Score: 0.9134244356393665
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We present PIRATR, an end-to-end 3D object detection framework for robotic use cases in point clouds. Extending PI3DETR, our method streamlines parametric 3D object detection by jointly estimating multi-class 6-DoF poses and class-specific parametric attributes directly from occlusion-affected point cloud data. This formulation enables not only geometric localization but also the estimation of task-relevant properties for parametric objects, such as a gripper's opening, where the 3D model is adjusted according to simple, predefined rules. The architecture employs modular, class-specific heads, making it straightforward to extend to novel object types without re-designing the pipeline. We validate PIRATR on an automated forklift platform, focusing on three structurally and functionally diverse categories: crane grippers, loading platforms, and pallets. Trained entirely in a synthetic environment, PIRATR generalizes effectively to real outdoor LiDAR scans, achieving a detection mAP of 0.919 without additional fine-tuning. PIRATR establishes a new paradigm of pose-aware, parameterized perception. This bridges the gap between low-level geometric reasoning and actionable world models, paving the way for scalable, simulation-trained perception systems that can be deployed in dynamic robotic environments. Code available at https://github.com/swingaxe/piratr.
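The abstract's idea of a parametric object (a class template adjusted by simple, predefined rules and then placed with a 6-DoF pose) can be illustrated with a minimal sketch. This is not PIRATR's code: the names `Pose6DoF`, `gripper_template`, `PARAMETRIC_RULES`, and `instantiate`, the two-point gripper rule, and the roll-pitch-yaw rotation convention are all assumptions made for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose6DoF:
    """Translation (x, y, z) plus roll, pitch, yaw in radians (assumed convention)."""
    x: float
    y: float
    z: float
    roll: float
    pitch: float
    yaw: float


def rotation_matrix(pose):
    """Return R = Rz(yaw) * Ry(pitch) * Rx(roll) as a 3x3 nested list."""
    cr, sr = math.cos(pose.roll), math.sin(pose.roll)
    cp, sp = math.cos(pose.pitch), math.sin(pose.pitch)
    cy, sy = math.cos(pose.yaw), math.sin(pose.yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def gripper_template(opening, jaw_length=1.0):
    """Hypothetical rule: two jaw tips sit symmetrically at +/- opening / 2
    along the local x-axis, jaw_length below the gripper origin."""
    half = opening / 2.0
    return [(-half, 0.0, -jaw_length), (half, 0.0, -jaw_length)]


# Hypothetical registry: one rule per object class, so a new class only
# needs a new entry here (mirroring the idea of modular, class-specific heads).
PARAMETRIC_RULES = {"gripper": gripper_template}


def instantiate(class_name, pose, **params):
    """Build the class template from its parameters, then place it with the pose."""
    local_points = PARAMETRIC_RULES[class_name](**params)
    R = rotation_matrix(pose)
    t = (pose.x, pose.y, pose.z)
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in local_points
    ]


# Example: a gripper opened to 2.0 units, rotated 90 degrees about z.
pose = Pose6DoF(1.0, 2.0, 3.0, roll=0.0, pitch=0.0, yaw=math.pi / 2)
tips = instantiate("gripper", pose, opening=2.0)
```

In this sketch the predicted class selects the rule, the predicted parameter (`opening`) shapes the template, and the predicted pose places it in the scene. Up to floating-point rounding, the two jaw tips land at (1, 1, 2) and (1, 3, 2).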
Related papers
- PLANA3R: Zero-shot Metric Planar 3D Reconstruction via Feed-Forward Planar Splatting [56.188624157291024]
We introduce PLANA3R, a pose-free framework for metric Planar 3D Reconstruction from unposed two-view images.
Unlike prior feedforward methods that require 3D plane annotations during training, PLANA3R learns planar 3D structures without explicit plane supervision.
We validate PLANA3R on multiple indoor-scene datasets with metric supervision and demonstrate strong generalization to out-of-domain indoor environments.
arXiv Detail & Related papers (2025-10-21T15:15:33Z)
- PI3DETR: Parametric Instance Detection of 3D Point Cloud Edges with a Geometry-Aware 3DETR [0.0]
We present PI3DETR, an end-to-end framework that directly predicts 3D parametric curve instances from raw point clouds.
PI3DETR sets a new state-of-the-art on the ABC dataset and generalizes effectively to real sensor data, offering a simple yet powerful solution for 3D edge and curve estimation.
arXiv Detail & Related papers (2025-09-03T12:24:25Z)
- Adaptive Point-Prompt Tuning: Fine-Tuning Heterogeneous Foundation Models for 3D Point Cloud Analysis [51.37795317716487]
We propose the Adaptive Point-Prompt Tuning (APPT) method, which fine-tunes pre-trained models with a modest number of parameters.
We convert raw point clouds into point embeddings by aggregating local geometry to capture spatial features followed by linear layers.
To calibrate self-attention across source domains of any modality to 3D, we introduce a prompt generator that shares weights with the point embedding module.
arXiv Detail & Related papers (2025-08-30T06:02:21Z)
- Robust 3D Object Detection using Probabilistic Point Clouds from Single-Photon LiDARs [29.92823252627008]
LiDAR-based 3D sensors provide point clouds, a canonical 3D representation used in various scene understanding tasks.
Modern LiDARs face challenges in several real-world scenarios, such as long-distance or low-albedo objects, producing sparse or erroneous point clouds.
We propose Probabilistic Point Clouds (PPC), a novel 3D scene representation where each point is augmented with a probability attribute.
arXiv Detail & Related papers (2025-07-31T21:32:21Z)
- On Geometry-Enhanced Parameter-Efficient Fine-Tuning for 3D Scene Segmentation [52.96632954620623]
We introduce a novel geometry-aware PEFT module specifically designed for 3D point cloud transformers.
Our approach sets a new benchmark for efficient, scalable, and geometry-aware fine-tuning of large-scale 3D point cloud models.
arXiv Detail & Related papers (2025-05-28T15:08:36Z)
- 3D-CDRGP: Towards Cross-Device Robotic Grasping Policy in 3D Open World [20.406334587479623]
Cross-device research has become an urgent issue that needs to be tackled.
We pioneer in probing the cross-device (cameras & robotics) grasping policy in the 3D open world.
We introduce the SSGC-Seg module that enables category-agnostic 3D object detection.
arXiv Detail & Related papers (2024-11-27T08:23:28Z)
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract priors from the well-trained transformers on massive images.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
- Flatten Anything: Unsupervised Neural Surface Parameterization [76.4422287292541]
We introduce the Flatten Anything Model (FAM), an unsupervised neural architecture to achieve global free-boundary surface parameterization.
Compared with previous methods, our FAM directly operates on discrete surface points without utilizing connectivity information.
Our FAM is fully-automated without the need for pre-cutting and can deal with highly-complex topologies.
arXiv Detail & Related papers (2024-05-23T14:39:52Z)
- ParaPoint: Learning Global Free-Boundary Surface Parameterization of 3D Point Clouds [52.03819676074455]
ParaPoint is an unsupervised neural learning pipeline for achieving global free-boundary surface parameterization.
This work makes the first attempt to investigate neural point cloud parameterization that pursues both global mappings and free boundaries.
arXiv Detail & Related papers (2024-03-15T14:35:05Z)
- Hierarchical Point Attention for Indoor 3D Object Detection [111.04397308495618]
This work proposes two novel attention operations as generic hierarchical designs for point-based transformer detectors.
First, we propose Multi-Scale Attention (MS-A) that builds multi-scale tokens from a single-scale input feature to enable more fine-grained feature learning.
Second, we propose Size-Adaptive Local Attention (Local-A) with adaptive attention regions for localized feature aggregation within bounding box proposals.
arXiv Detail & Related papers (2023-01-06T18:52:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.