Teachable Reality: Prototyping Tangible Augmented Reality with Everyday
Objects by Leveraging Interactive Machine Teaching
- URL: http://arxiv.org/abs/2302.11046v1
- Date: Tue, 21 Feb 2023 23:03:49 GMT
- Authors: Kyzyl Monteiro, Ritik Vatsal, Neil Chulpongsatorn, Aman Parnami, Ryo
Suzuki
- Abstract summary: Teachable Reality is an augmented reality (AR) prototyping tool for creating interactive tangible AR applications with arbitrary everyday objects.
It identifies the user-defined tangible and gestural interactions using an on-demand computer vision model.
Our approach can lower the barrier to creating functional AR prototypes while also allowing flexible and general-purpose prototyping experiences.
- Score: 4.019017835137353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces Teachable Reality, an augmented reality (AR)
prototyping tool for creating interactive tangible AR applications with
arbitrary everyday objects. Teachable Reality leverages vision-based
interactive machine teaching (e.g., Teachable Machine), which captures
real-world interactions for AR prototyping. It identifies the user-defined
tangible and gestural interactions using an on-demand computer vision model.
Based on this, the user can easily create functional AR prototypes without
programming, enabled by a trigger-action authoring interface. Our approach
therefore affords the flexibility, customizability, and generalizability of
tangible AR applications, addressing the limitations of current marker-based
approaches. We explore the design space and demonstrate various AR
prototypes, which include tangible and deformable interfaces, context-aware
assistants, and body-driven AR applications. The results of our user study and
expert interviews confirm that our approach can lower the barrier to creating
functional AR prototypes while also allowing flexible and general-purpose
prototyping experiences.
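The trigger-action authoring described in the abstract can be illustrated with a minimal sketch. Note that the names below (`TriggerActionAuthoring`, `when`, `step`) are hypothetical, not the paper's actual API, and the on-demand vision model is replaced by a stub classifier standing in for a model trained interactively on a few user-captured examples, as in Teachable Machine.

```python
# Hypothetical sketch of a trigger-action authoring layer; not the
# Teachable Reality API. The vision model is stubbed with a callable.

from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TriggerActionAuthoring:
    """Maps user-taught visual states (triggers) to AR actions."""

    # classify(frame) stands in for the on-demand computer-vision model
    # that recognizes user-defined tangible and gestural states.
    classify: Callable[[str], str]
    rules: Dict[str, List[Callable[[], str]]] = field(default_factory=dict)

    def when(self, state: str, action: Callable[[], str]) -> None:
        """Author a rule: when `state` is detected, run `action`."""
        self.rules.setdefault(state, []).append(action)

    def step(self, frame: str) -> List[str]:
        """Classify one camera frame and fire all matching actions."""
        state = self.classify(frame)
        return [action() for action in self.rules.get(state, [])]


# Stub "vision model": a lookup table in place of a classifier trained
# on a handful of user demonstrations.
def fake_classifier(frame: str) -> str:
    labels = {"img_cup_tilted": "cup_tilted", "img_hand_open": "hand_open"}
    return labels.get(frame, "idle")


authoring = TriggerActionAuthoring(classify=fake_classifier)
authoring.when("cup_tilted", lambda: "show 'pouring' AR overlay")
authoring.when("hand_open", lambda: "spawn virtual menu")

fired = authoring.step("img_cup_tilted")  # one action fires for 'cup_tilted'
print(fired)
```

Under this reading, "prototyping without programming" amounts to the user supplying only the two ingredients above: a few visual examples to teach each trigger state, and a menu choice of which AR action each state should fire.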
Related papers
- Survey of User Interface Design and Interaction Techniques in Generative AI Applications [79.55963742878684]
We aim to create a compendium of different user-interaction patterns that can be used as a reference for designers and developers alike.
We also strive to lower the entry barrier for those attempting to learn more about the design of generative AI applications.
arXiv Detail & Related papers (2024-10-28T23:10:06Z)
- ARPOV: Expanding Visualization of Object Detection in AR with Panoramic Mosaic Stitching [0.0]
ARPOV is an interactive visual analytics tool for analyzing object detection model outputs tailored to video captured by an AR headset.
The proposed tool leverages panorama stitching to expand the view of the environment while automatically filtering undesirable frames.
arXiv Detail & Related papers (2024-10-01T20:29:14Z)
- Polaris: Open-ended Interactive Robotic Manipulation via Syn2Real Visual Grounding and Large Language Models [53.22792173053473]
We introduce an interactive robotic manipulation framework called Polaris.
Polaris integrates perception and interaction by utilizing GPT-4 alongside grounded vision models.
We propose a novel Synthetic-to-Real (Syn2Real) pose estimation pipeline.
arXiv Detail & Related papers (2024-08-15T06:40:38Z)
- LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models [50.259006481656094]
We present a novel interactive application aimed at understanding the internal mechanisms of large vision-language models.
Our interface is designed to enhance the interpretability of the image patches, which are instrumental in generating an answer.
We present a case study of how our application can aid in understanding failure mechanisms in a popular large multi-modal model: LLaVA.
arXiv Detail & Related papers (2024-04-03T23:57:34Z)
- Voila-A: Aligning Vision-Language Models with User's Gaze Attention [56.755993500556734]
We introduce gaze information as a proxy for human attention to guide Vision-Language Models (VLMs).
We propose a novel approach, Voila-A, for gaze alignment to enhance the interpretability and effectiveness of these models in real-world applications.
arXiv Detail & Related papers (2023-12-22T17:34:01Z)
- Typing on Any Surface: A Deep Learning-based Method for Real-Time Keystroke Detection in Augmented Reality [4.857109990499532]
Mid-air keyboard interfaces, wireless keyboards, and voice input either suffer from poor ergonomic design and limited accuracy or are simply embarrassing to use in public.
This paper proposes and validates a deep-learning-based approach that enables AR applications to accurately predict keystrokes from a user-perspective RGB video stream.
A two-stage model, combining an off-the-shelf hand landmark extractor and a novel adaptive Convolutional Recurrent Neural Network (C-RNN), was trained.
arXiv Detail & Related papers (2023-08-31T23:58:25Z)
- Systematic Adaptation of Communication-focused Machine Learning Models from Real to Virtual Environments for Human-Robot Collaboration [1.392250707100996]
This paper presents a systematic framework for real-to-virtual adaptation using a limited-size virtual dataset.
Hand gesture recognition, a topic of much research and subsequent commercialization in the real world, has been possible because of the creation of large, labelled datasets.
arXiv Detail & Related papers (2023-07-21T03:24:55Z)
- ArK: Augmented Reality with Knowledge Interactive Emergent Ability [115.72679420999535]
We develop an infinite agent that learns to transfer knowledge memory from general foundation models to novel domains.
The heart of our approach is an emerging mechanism, dubbed Augmented Reality with Knowledge Inference Interaction (ArK).
We show that our ArK approach, combined with large foundation models, significantly improves the quality of generated 2D/3D scenes.
arXiv Detail & Related papers (2023-05-01T17:57:01Z)
- Learning Action-Effect Dynamics for Hypothetical Vision-Language Reasoning Task [50.72283841720014]
We propose a novel learning strategy that can improve reasoning about the effects of actions.
We demonstrate the effectiveness of our proposed approach and discuss its advantages over previous baselines in terms of performance, data efficiency, and generalization capability.
arXiv Detail & Related papers (2022-12-07T05:41:58Z)
- OpenDR: An Open Toolkit for Enabling High Performance, Low Footprint Deep Learning for Robotics [0.0]
We present the Open Deep Learning Toolkit for Robotics (OpenDR).
OpenDR aims at developing an open, non-proprietary, efficient, and modular toolkit that can be easily used by robotics companies and research institutions.
arXiv Detail & Related papers (2022-03-01T12:59:59Z)
- Modular approach to data preprocessing in ALOHA and application to a smart industry use case [0.0]
The paper presents a modular approach, integrated into the ALOHA tool flow, that supports the data preprocessing and transformation pipeline.
To demonstrate the effectiveness of the approach, we present some experimental results related to a keyword spotting use case.
arXiv Detail & Related papers (2021-02-02T06:48:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.