MaskPlanner: Learning-Based Object-Centric Motion Generation from 3D Point Clouds
- URL: http://arxiv.org/abs/2502.18745v1
- Date: Wed, 26 Feb 2025 01:39:25 GMT
- Title: MaskPlanner: Learning-Based Object-Centric Motion Generation from 3D Point Clouds
- Authors: Gabriele Tiboni, Raffaello Camoriano, Tatiana Tommasi
- Abstract summary: Object-Centric Motion Generation (OCMG) plays a key role in a variety of industrial applications. We propose MaskPlanner, a deep learning method that predicts local path segments for a given object while simultaneously inferring "path masks" that group these segments into distinct paths. Our findings highlight the potential of the proposed learning method for OCMG to reduce engineering overhead and seamlessly adapt to several industrial use cases.
- Score: 11.72951300809094
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Object-Centric Motion Generation (OCMG) plays a key role in a variety of industrial applications, such as robotic spray painting and welding, which require efficient, scalable, and generalizable algorithms to plan multiple long-horizon trajectories over free-form 3D objects. However, existing solutions rely on specialized heuristics, expensive optimization routines, or restrictive geometry assumptions that limit their adaptability to real-world scenarios. In this work, we introduce a novel, fully data-driven framework that tackles OCMG directly from 3D point clouds, learning to generalize expert path patterns across free-form surfaces. We propose MaskPlanner, a deep learning method that predicts local path segments for a given object while simultaneously inferring "path masks" to group these segments into distinct paths. This design induces the network to capture both local geometric patterns and global task requirements in a single forward pass. Extensive experimentation on a realistic robotic spray painting scenario shows that our approach attains near-complete coverage (above 99%) for unseen objects, while it remains task-agnostic and does not explicitly optimize for paint deposition. Moreover, our real-world validation on a 6-DoF specialized painting robot demonstrates that the generated trajectories are directly executable and yield expert-level painting quality. Our findings crucially highlight the potential of the proposed learning method for OCMG to reduce engineering overhead and seamlessly adapt to several industrial use cases.
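The abstract outlines a single-forward-pass design: one head predicts local path segments from the object point cloud, while a second head infers "path masks" that assign each segment to a distinct path. The PyTorch sketch below illustrates that two-head interface only; the PointNet-style encoder, the fixed segment/path counts, and the 6-D segment parameterization are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch of a MaskPlanner-style two-head network (hypothetical;
# encoder, dimensions, and output parameterization are assumptions).
import torch
import torch.nn as nn

class MaskPlannerSketch(nn.Module):
    def __init__(self, num_segments=64, num_paths=8, segment_dim=6):
        super().__init__()
        # Shared per-point MLP + max-pooling: a PointNet-style global feature.
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 256), nn.ReLU(),
        )
        # Head 1: local path segments (here, a 6-D pose-like vector each).
        self.segment_head = nn.Linear(256, num_segments * segment_dim)
        # Head 2: "path mask" logits assigning every segment to one of the paths.
        self.mask_head = nn.Linear(256, num_segments * num_paths)
        self.num_segments = num_segments
        self.num_paths = num_paths
        self.segment_dim = segment_dim

    def forward(self, points):
        # points: (B, N, 3) object point cloud.
        feat = self.point_mlp(points).max(dim=1).values  # (B, 256) global feature
        segments = self.segment_head(feat).view(-1, self.num_segments, self.segment_dim)
        mask_logits = self.mask_head(feat).view(-1, self.num_segments, self.num_paths)
        return segments, mask_logits  # both outputs from one forward pass

# Usage: group predicted segments into distinct paths via the mask logits.
model = MaskPlannerSketch()
segs, masks = model(torch.randn(2, 1024, 3))
path_assignment = masks.argmax(dim=-1)  # (B, num_segments) path index per segment
```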
Related papers
- Towards Cross-device and Training-free Robotic Grasping in 3D Open World [20.406334587479623]
This paper presents a novel pipeline capable of executing object grasping tasks in open-world scenarios without the need for training. We propose a training-free binary clustering algorithm that improves segmentation precision and can cluster and localize unseen objects for grasping.
arXiv Detail & Related papers (2024-11-27T08:23:28Z)
- Triple Point Masking [49.39218611030084]
Existing 3D mask learning methods encounter performance bottlenecks under limited data.
We introduce a triple point masking scheme, named TPM, which serves as a scalable framework for pre-training of masked autoencoders.
Extensive experiments show that the four baselines equipped with the proposed TPM achieve comprehensive performance improvements on various downstream tasks.
arXiv Detail & Related papers (2024-09-26T05:33:30Z)
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from 2D transformers well-trained on massive images.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves the state-of-the-art performance of UDA for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
- Flatten Anything: Unsupervised Neural Surface Parameterization [76.4422287292541]
We introduce the Flatten Anything Model (FAM), an unsupervised neural architecture to achieve global free-boundary surface parameterization.
Compared with previous methods, our FAM directly operates on discrete surface points without utilizing connectivity information.
Our FAM is fully automated, requires no pre-cutting, and can deal with highly complex topologies.
arXiv Detail & Related papers (2024-05-23T14:39:52Z)
- ParaPoint: Learning Global Free-Boundary Surface Parameterization of 3D Point Clouds [52.03819676074455]
ParaPoint is an unsupervised neural learning pipeline for achieving global free-boundary surface parameterization.
This work makes the first attempt to investigate neural point cloud parameterization that pursues both global mappings and free boundaries.
arXiv Detail & Related papers (2024-03-15T14:35:05Z)
- Toward a Plug-and-Play Vision-Based Grasping Module for Robotics [0.0]
This paper introduces a vision-based grasping framework that can easily be transferred across multiple manipulators.
The framework generates diverse repertoires of open-loop grasping trajectories, enhancing adaptability while maintaining a diversity of grasps.
arXiv Detail & Related papers (2023-10-06T16:16:00Z)
- Transferring Foundation Models for Generalizable Robotic Manipulation [82.12754319808197]
We propose a novel paradigm that effectively leverages language-reasoning segmentation masks generated by internet-scale foundation models. Our approach can effectively and robustly perceive object pose and enable sample-efficient generalization learning. Demos can be found in our submitted video, and more comprehensive ones can be found in link1 or link2.
arXiv Detail & Related papers (2023-06-09T07:22:12Z)
- PaintNet: Unstructured Multi-Path Learning from 3D Point Clouds for Robotic Spray Painting [13.182797149468204]
Industrial robotic problems such as spray painting and welding require planning of multiple trajectories to solve the task.
Existing solutions make strong assumptions on the form of input surfaces and the nature of output paths.
By leveraging recent advances in 3D deep learning, we introduce a novel framework capable of dealing with arbitrary 3D surfaces.
arXiv Detail & Related papers (2022-11-13T15:41:50Z)
- DiffSkill: Skill Abstraction from Differentiable Physics for Deformable Object Manipulations with Tools [96.38972082580294]
DiffSkill is a novel framework that uses a differentiable physics simulator for skill abstraction to solve deformable object manipulation tasks.
In particular, we first obtain short-horizon skills using individual tools from a gradient-based simulator.
We then learn, from the demonstration trajectories, a neural skill abstractor that takes RGBD images as input.
arXiv Detail & Related papers (2022-03-31T17:59:38Z)
- Supervised Training of Dense Object Nets using Optimal Descriptors for Industrial Robotic Applications [57.87136703404356]
Dense Object Nets (DONs) by Florence, Manuelli and Tedrake introduced dense object descriptors as a novel visual object representation for the robotics community.
In this paper we show that given a 3D model of an object, we can generate its descriptor space image, which allows for supervised training of DONs.
We compare the training methods on generating 6D grasps for industrial objects and show that our novel supervised training approach improves the pick-and-place performance in industry-relevant tasks.
arXiv Detail & Related papers (2021-02-16T11:40:12Z)