Multi-3D-Models Registration-Based Augmented Reality (AR) Instructions
for Assembly
- URL: http://arxiv.org/abs/2311.16337v2
- Date: Wed, 29 Nov 2023 03:24:31 GMT
- Title: Multi-3D-Models Registration-Based Augmented Reality (AR) Instructions
for Assembly
- Authors: Seda Tuzun Canadinc and Wei Yan
- Abstract summary: BRICKxAR (M3D) visualizes rendered 3D assembly parts at the assembly location of the physical assembly model.
BRICKxAR (M3D) utilizes deep learning-trained 3D model-based registration.
- Score: 7.716174636585781
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper introduces a novel, markerless, step-by-step, in-situ 3D Augmented
Reality (AR) instruction method and its application - BRICKxAR (Multi 3D
Models/M3D) - for small parts assembly. BRICKxAR (M3D) realistically visualizes
rendered 3D assembly parts at the assembly location of the physical assembly
model (Figure 1). The user controls the assembly process through a user
interface. BRICKxAR (M3D) utilizes deep learning-trained 3D model-based
registration. Object recognition and tracking become challenging as the
assembly model updates at each step. Additionally, not every part in a 3D
assembly may be visible to the camera during the assembly. BRICKxAR (M3D)
combines multiple assembly phases with a step count to address these
challenges. Thus, using fewer phases simplifies the complex assembly process
while step count facilitates accurate object recognition and precise
visualization of each step. A testing and heuristic evaluation of the BRICKxAR
(M3D) prototype and qualitative analysis were conducted with users and experts
in visualization and human-computer interaction. Providing robust 3D AR
instructions and allowing the handling of the assembly model, BRICKxAR (M3D)
has the potential to be used at different scales ranging from manufacturing
assembly to construction.
Related papers
- Neural Assembler: Learning to Generate Fine-Grained Robotic Assembly Instructions from Multi-View Images [24.10809783713574]
This paper introduces a novel task: translating multi-view images of a structural 3D model into a detailed sequence of assembly instructions.
We propose an end-to-end model known as the Neural Assembler.
arXiv Detail & Related papers (2024-04-25T08:53:23Z) - ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance [76.7746870349809]
We present ComboVerse, a 3D generation framework that produces high-quality 3D assets with complex compositions by learning to combine multiple models.
Our proposed framework emphasizes spatial alignment of objects, compared with standard score distillation sampling.
arXiv Detail & Related papers (2024-03-19T03:39:43Z) - Weakly Supervised Monocular 3D Detection with a Single-View Image [58.57978772009438]
Monocular 3D detection aims for precise 3D object localization from a single-view image.
We propose SKD-WM3D, a weakly supervised monocular 3D detection framework.
We show that SKD-WM3D surpasses the state-of-the-art clearly and is even on par with many fully supervised methods.
arXiv Detail & Related papers (2024-02-29T13:26:47Z) - 3D-GPT: Procedural 3D Modeling with Large Language Models [47.72968643115063]
We introduce 3D-GPT, a framework utilizing large language models(LLMs) for instruction-driven 3D modeling.
3D-GPT positions LLMs as proficient problem solvers, dissecting the procedural 3D modeling tasks into accessible segments and appointing the apt agent for each task.
Our empirical investigations confirm that 3D-GPT not only interprets and executes instructions, delivering reliable results but also collaborates effectively with human designers.
arXiv Detail & Related papers (2023-10-19T17:41:48Z) - Score-PA: Score-based 3D Part Assembly [6.25037277839849]
We introduce the Score-based 3D Part Assembly framework (Score-PA) for 3D part assembly.
Knowing that score-based methods are typically time-consuming during the inference stage.
We introduce a novel algorithm called the Fast Predictor-Corrector Sampler (FPC) that accelerates the sampling process.
arXiv Detail & Related papers (2023-09-08T09:10:03Z) - A Unified Framework for 3D Point Cloud Visual Grounding [60.75319271082741]
This paper takes the initiative step to integrate 3DREC and 3DRES into a unified framework, termed 3DRefTR.
Its key idea is to build upon a mature 3DREC model and leverage ready query embeddings and visual tokens from the 3DREC model to construct a dedicated mask branch.
This elaborate design enables 3DRefTR to achieve both well-performing 3DRES and 3DREC capacities with only a 6% additional latency compared to the original 3DREC model.
arXiv Detail & Related papers (2023-08-23T03:20:31Z) - Multiview Compressive Coding for 3D Reconstruction [77.95706553743626]
We introduce a simple framework that operates on 3D points of single objects or whole scenes.
Our model, Multiview Compressive Coding, learns to compress the input appearance and geometry to predict the 3D structure.
arXiv Detail & Related papers (2023-01-19T18:59:52Z) - Translating a Visual LEGO Manual to a Machine-Executable Plan [26.0127179598152]
We study the problem of translating an image-based, step-by-step assembly manual created by human designers into machine-interpretable instructions.
We present a novel learning-based framework, the Manual-to-Executable-Plan Network (MEPNet), which reconstructs the assembly steps from a sequence of manual images.
arXiv Detail & Related papers (2022-07-25T23:35:46Z) - Learning 3D Part Assembly from a Single Image [20.175502864488493]
We introduce a novel problem, single-image-guided 3D part assembly, along with a learningbased solution.
We study this problem in the setting of furniture assembly from a given complete set of parts and a single image depicting the entire assembled object.
arXiv Detail & Related papers (2020-03-21T21:19:28Z) - Monocular Real-time Hand Shape and Motion Capture using Multi-modal Data [77.34069717612493]
We present a novel method for monocular hand shape and pose estimation at unprecedented runtime performance of 100fps.
This is enabled by a new learning based architecture designed such that it can make use of all the sources of available hand training data.
It features a 3D hand joint detection module and an inverse kinematics module which regresses not only 3D joint positions but also maps them to joint rotations in a single feed-forward pass.
arXiv Detail & Related papers (2020-03-21T03:51:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.