Multi-level Reasoning for Robotic Assembly: From Sequence Inference to
Contact Selection
- URL: http://arxiv.org/abs/2312.10571v1
- Date: Sun, 17 Dec 2023 00:47:13 GMT
- Title: Multi-level Reasoning for Robotic Assembly: From Sequence Inference to
Contact Selection
- Authors: Xinghao Zhu, Devesh K. Jha, Diego Romeres, Lingfeng Sun, Masayoshi
Tomizuka, Anoop Cherian
- Abstract summary: We present the Part Assembly Sequence Transformer (PAST) to infer assembly sequences from a target blueprint.
We then use a motion planner and optimization to generate part movements and contacts.
Experimental results show that our approach generalizes better than prior methods.
- Score: 74.40109927350856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automating the assembly of objects from their parts is a complex problem with
innumerable applications in manufacturing, maintenance, and recycling. Unlike
existing research, which is limited to target segmentation, pose regression, or
using fixed target blueprints, our work presents a holistic multi-level
framework for part assembly planning consisting of part assembly sequence
inference, part motion planning, and robot contact optimization. We present the
Part Assembly Sequence Transformer (PAST) -- a sequence-to-sequence neural
network -- to infer assembly sequences recursively from a target blueprint. We
then use a motion planner and optimization to generate part movements and
contacts. To train PAST, we introduce D4PAS, a large-scale Dataset for Part
Assembly Sequences consisting of physically valid sequences for
industrial objects. Experimental results show that our approach generalizes
better than prior methods while needing significantly less computational time
for inference.
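The recursive sequence inference described above can be illustrated with a minimal sketch. This is a hypothetical greedy decoder, not the paper's trained transformer: `score_next_part` stands in for PAST's learned scoring of candidate parts given the parts assembled so far, and the height-based toy scorer below mimics a bottom-up, physically valid ordering.

```python
def infer_assembly_sequence(parts, score_next_part):
    """Greedily decode an assembly order: at each step, pick the
    highest-scoring remaining part given the parts placed so far."""
    placed, remaining = [], list(parts)
    while remaining:
        best = max(remaining, key=lambda p: score_next_part(placed, p))
        placed.append(best)
        remaining.remove(best)
    return placed

# Toy stand-in scorer: prefer parts lower in the assembly,
# mimicking a bottom-up physically valid ordering.
part_heights = {"base": 0, "frame": 1, "axle": 2, "wheel": 3}
order = infer_assembly_sequence(
    part_heights, lambda placed, p: -part_heights[p]
)
print(order)  # → ['base', 'frame', 'axle', 'wheel']
```

In the actual framework, each emitted part would then be handed to the motion planner and contact optimizer; the sketch only covers the sequence-inference level.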
Related papers
- SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation [62.58480650443393]
SAM-E leverages Segment Anything (SAM), a visual foundation model, for generalizable scene understanding and sequence imitation.
We develop a novel multi-channel heatmap that enables the prediction of the action sequence in a single pass.
arXiv Detail & Related papers (2024-05-30T00:32:51Z)
- SPAFormer: Sequential 3D Part Assembly with Transformers [52.980803808373516]
We introduce SPAFormer, an innovative model designed to overcome the explosion challenge in the 3D Part Assembly task.
It addresses this problem by leveraging constraints from assembly sequences, effectively reducing the solution space's complexity.
It further enhances assembly through knowledge enhancement strategies that utilize the attributes of parts and their sequence information.
arXiv Detail & Related papers (2024-03-09T10:53:11Z)
- ASAP: Automated Sequence Planning for Complex Robotic Assembly with Physical Feasibility [27.424678100675163]
We present ASAP, a physics-based planning approach for automatically generating a sequence for general-shaped assemblies.
The search can be guided either by geometric heuristics or by graph neural networks trained on data with simulation labels.
We show the superior performance of ASAP at generating physically realistic assembly sequence plans on a large dataset of hundreds of complex product assemblies.
arXiv Detail & Related papers (2023-09-29T00:27:40Z)
- ZJU ReLER Submission for EPIC-KITCHEN Challenge 2023: Semi-Supervised Video Object Segmentation [62.98078087018469]
We introduce MSDeAOT, a variant of the AOT framework that incorporates transformers at multiple feature scales.
MSDeAOT efficiently propagates object masks from previous frames to the current frame using a feature scale with a stride of 16.
We also employ GPM in a more refined feature scale with a stride of 8, leading to improved accuracy in detecting and tracking small objects.
arXiv Detail & Related papers (2023-07-05T03:43:15Z)
- RegFormer: An Efficient Projection-Aware Transformer Network for Large-Scale Point Cloud Registration [73.69415797389195]
We propose an end-to-end transformer network (RegFormer) for large-scale point cloud alignment.
Specifically, a projection-aware hierarchical transformer is proposed to capture long-range dependencies and filter outliers.
Our transformer has linear complexity, which guarantees high efficiency even for large-scale scenes.
arXiv Detail & Related papers (2023-03-22T08:47:37Z)
- Efficient and Feasible Robotic Assembly Sequence Planning via Graph Representation Learning [22.447462847331312]
We propose a holistic graphical approach including a graph representation called Assembly Graph for product assemblies.
With the proposed model, GRACE, we are able to extract meaningful information from the graph input and predict assembly sequences step by step.
In experiments, we show that our approach can predict feasible assembly sequences across product variants of aluminum profiles.
arXiv Detail & Related papers (2023-03-17T17:23:14Z)
- 3D Part Assembly Generation with Instance Encoded Transformer [22.330218525999857]
We propose a multi-layer transformer-based framework that involves geometric and relational reasoning between parts to update the part poses iteratively.
We extend our framework to a new task called in-process part assembly.
Our method achieves improvements of more than 10% over the current state of the art on multiple metrics on the public PartNet dataset.
arXiv Detail & Related papers (2022-07-05T02:40:57Z)
- Efficient and Robust Training of Dense Object Nets for Multi-Object Robot Manipulation [8.321536457963655]
We propose a framework for robust and efficient training of Dense Object Nets (DON).
We focus on training with multi-object data instead of singulated objects, combined with a well-chosen augmentation scheme.
We demonstrate the robustness and accuracy of our proposed framework on a real-world robotic grasping task.
arXiv Detail & Related papers (2022-06-24T08:24:42Z)
- Graph-based Reinforcement Learning meets Mixed Integer Programs: An application to 3D robot assembly discovery [34.25379651790627]
We tackle the problem of building arbitrary, predefined target structures entirely from scratch using a set of Tetris-like building blocks and a robotic manipulator.
Our novel hierarchical approach aims at efficiently decomposing the overall task into three feasible levels that benefit mutually from each other.
arXiv Detail & Related papers (2022-03-08T14:44:51Z)
- Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation [95.74244714914052]
Multiple object tracking and segmentation requires detecting, tracking, and segmenting objects belonging to a set of given classes.
We propose Prototypical Cross-Attention Network (PCAN), capable of leveraging rich-temporal information online.
PCAN outperforms current video instance tracking and segmentation competition winners on Youtube-VIS and BDD100K datasets.
arXiv Detail & Related papers (2021-06-22T17:57:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.