Brick-by-Brick: Combinatorial Construction with Deep Reinforcement
Learning
- URL: http://arxiv.org/abs/2110.15481v1
- Date: Fri, 29 Oct 2021 01:09:51 GMT
- Title: Brick-by-Brick: Combinatorial Construction with Deep Reinforcement
Learning
- Authors: Hyunsoo Chung, Jungtaek Kim, Boris Knyazev, Jinhwi Lee, Graham W.
Taylor, Jaesik Park, Minsu Cho
- Abstract summary: We introduce a novel formulation, combinatorial construction, which requires a building agent to assemble unit primitives sequentially.
To construct a target object, we provide incomplete knowledge about the desired target (i.e., 2D images) instead of exact and explicit information to the agent.
We demonstrate that the proposed method successfully learns to construct an unseen object conditioned on a single image or multiple views of a target object.
- Score: 52.85981207514049
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Discovering a solution in a combinatorial space is prevalent in many
real-world problems but it is also challenging due to diverse complex
constraints and the vast number of possible combinations. To address such a
problem, we introduce a novel formulation, combinatorial construction, which
requires a building agent to assemble unit primitives (i.e., LEGO bricks)
sequentially -- every connection between two bricks must follow a fixed rule,
while no bricks mutually overlap. To construct a target object, we provide
incomplete knowledge about the desired target (i.e., 2D images) instead of
exact and explicit volumetric information to the agent. This problem requires a
comprehensive understanding of partial information and long-term planning to
append a brick sequentially, which leads us to employ reinforcement learning.
The approach has to consider a variable-sized action space where a large number
of invalid actions, which would cause overlap between bricks, exist. To resolve
these issues, our model, dubbed Brick-by-Brick, adopts an action validity
prediction network that efficiently filters invalid actions for an actor-critic
network. We demonstrate that the proposed method successfully learns to
construct an unseen object conditioned on a single image or multiple views of a
target object.
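
The filtering idea in the abstract (discard actions that would make bricks overlap, then let the policy choose among the rest) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the paper uses a learned action validity prediction network, whereas this sketch substitutes an exact occupancy-grid overlap check, ignores the brick connection rule, and uses hypothetical names (`BRICK`, `footprint_cells`, `is_valid`, `masked_policy`) and a hypothetical 2x4 brick that is one layer tall.

```python
import numpy as np

# Hypothetical brick footprint in studs (width, depth); one layer tall.
BRICK = (2, 4)


def footprint_cells(x, y, z, rot):
    """Grid cells covered by a brick whose corner is at (x, y, z).

    rot == 0 keeps the footprint as BRICK; any other value swaps width and depth.
    """
    w, d = BRICK if rot == 0 else BRICK[::-1]
    return [(x + i, y + j, z) for i in range(w) for j in range(d)]


def is_valid(grid, action):
    """True if the brick stays inside the grid and overlaps no occupied cell.

    This exact check is a stand-in for a learned validity predictor.
    """
    size_x, size_y, size_z = grid.shape
    for cx, cy, cz in footprint_cells(*action):
        if not (0 <= cx < size_x and 0 <= cy < size_y and 0 <= cz < size_z):
            return False
        if grid[cx, cy, cz]:
            return False
    return True


def place(grid, action):
    """Mark the cells of a (valid) brick as occupied, in place."""
    for cell in footprint_cells(*action):
        grid[cell] = True


def masked_policy(logits, mask):
    """Softmax over logits with invalid actions forced to probability 0."""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("no valid action available")
    masked = np.where(mask, np.asarray(logits, dtype=float), -np.inf)
    masked = masked - masked.max()  # stabilise the exponentials
    probs = np.exp(masked)
    return probs / probs.sum()


# Usage: one brick already placed, five candidate actions (x, y, z, rot).
grid = np.zeros((4, 4, 2), dtype=bool)
place(grid, (0, 0, 0, 0))
candidates = [(0, 0, 0, 0), (2, 0, 0, 0), (1, 0, 0, 0), (0, 0, 1, 1), (3, 0, 0, 0)]
mask = np.array([is_valid(grid, a) for a in candidates])
probs = masked_policy(np.zeros(len(candidates)), mask)
```

Because the candidate list is rebuilt at every step, the mask and the logits can change length as the structure grows, which is one simple way to handle a variable-sized action space.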
Related papers
- TreeSBA: Tree-Transformer for Self-Supervised Sequential Brick Assembly [51.29305265324916]
We propose a class-agnostic tree-transformer framework to predict the sequential assembly actions from input multi-view images.
A major challenge of the sequential brick assembly task is that the step-wise action labels are costly and tedious to obtain in practice.
We mitigate this problem by leveraging synthetic-to-real transfer learning.
arXiv Detail & Related papers (2024-07-22T14:05:27Z)
- Sequential Brick Assembly with Efficient Constraint Satisfaction [39.869693447362145]
We address the problem of generating LEGO brick assembly sequences that produce high-fidelity structures.
Our method performs a brick structure assessment to predict the next brick position and its confidence by employing a U-shaped sparse 3D convolutional network.
Instead of using handcrafted brick assembly datasets, our model is trained with a large number of 3D objects, which allows it to create new high-fidelity structures.
arXiv Detail & Related papers (2022-10-03T15:35:08Z)
- Break and Make: Interactive Structural Understanding Using LEGO Bricks [61.01136603613139]
We build a fully interactive 3D simulator that allows learning agents to assemble, disassemble and manipulate LEGO models.
We take a first step towards solving this problem using sequence-to-sequence models.
arXiv Detail & Related papers (2022-07-27T18:33:09Z)
- GANzzle: Reframing jigsaw puzzle solving as a retrieval task using a
generative mental image [15.132848477903314]
We infer a mental image from all pieces, which a given piece can then be matched against, avoiding the combinatorial explosion.
We learn how to reconstruct the image given a set of unordered pieces, allowing the model to learn a joint embedding space to match an encoding of each piece to the cropped layer of the generator.
In doing so our model is puzzle size agnostic, in contrast to prior deep learning methods which are single size.
arXiv Detail & Related papers (2022-07-12T16:02:00Z)
- Collaborative Learning for Hand and Object Reconstruction with
Attention-guided Graph Convolution [49.10497573378427]
Estimating the pose and shape of hands and objects under interaction finds numerous applications including augmented and virtual reality.
Our algorithm is agnostic to object models, and it learns the physical rules governing hand-object interaction.
Experiments on four widely used benchmarks show that our framework surpasses state-of-the-art accuracy in 3D pose estimation and recovers dense 3D hand and object shapes.
arXiv Detail & Related papers (2022-04-27T17:00:54Z)
- Blocks Assemble! Learning to Assemble with Large-Scale Structured
Reinforcement Learning [23.85678777628229]
Assembly of multi-part physical structures is a valuable end product for autonomous robotics.
We introduce a naturalistic physics-based environment with a set of connectable magnet blocks inspired by children's toy kits.
We find that the combination of large-scale reinforcement learning and graph-based policies is an effective recipe for training agents.
arXiv Detail & Related papers (2022-03-15T18:21:02Z)
- Reinforcement Learning with Combinatorial Actions: An Application to
Vehicle Routing [9.995347522610674]
We develop a framework for value-function-based deep reinforcement learning with a combinatorial action space.
We present an application of this framework to the capacitated vehicle routing problem (CVRP).
On each instance, we model an action as the construction of a single route, and consider a deterministic policy which is improved through a simple policy iteration algorithm.
arXiv Detail & Related papers (2020-10-22T19:32:21Z)
- CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and
Transfer Learning [138.40338621974954]
CausalWorld is a benchmark for causal structure and transfer learning in a robotic manipulation environment.
Tasks consist of constructing 3D shapes from a given set of blocks - inspired by how children learn to build complex structures.
arXiv Detail & Related papers (2020-10-08T23:01:13Z)
- Object-Aware Multi-Branch Relation Networks for Spatio-Temporal Video
Grounding [90.12181414070496]
We propose a novel object-aware multi-branch relation network for object-aware relation discovery.
We then propose multi-branch reasoning to capture critical object relationships between the main branch and auxiliary branches.
arXiv Detail & Related papers (2020-08-16T15:39:56Z)