Teachable Reality: Prototyping Tangible Augmented Reality with Everyday
Objects by Leveraging Interactive Machine Teaching
- URL: http://arxiv.org/abs/2302.11046v1
- Date: Tue, 21 Feb 2023 23:03:49 GMT
- Title: Teachable Reality: Prototyping Tangible Augmented Reality with Everyday
Objects by Leveraging Interactive Machine Teaching
- Authors: Kyzyl Monteiro, Ritik Vatsal, Neil Chulpongsatorn, Aman Parnami, Ryo
Suzuki
- Abstract summary: Teachable Reality is an augmented reality (AR) prototyping tool for creating interactive tangible AR applications with arbitrary everyday objects.
It identifies the user-defined tangible and gestural interactions using an on-demand computer vision model.
Our approach can lower the barrier to creating functional AR prototypes while also allowing flexible and general-purpose prototyping experiences.
- Score: 4.019017835137353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces Teachable Reality, an augmented reality (AR)
prototyping tool for creating interactive tangible AR applications with
arbitrary everyday objects. Teachable Reality leverages vision-based
interactive machine teaching (e.g., Teachable Machine), which captures
real-world interactions for AR prototyping. It identifies the user-defined
tangible and gestural interactions using an on-demand computer vision model.
Based on this, the user can easily create functional AR prototypes without
programming, enabled by a trigger-action authoring interface. Our approach
therefore offers the flexibility, customizability, and generalizability of
tangible AR applications, addressing the limitations of current
marker-based approaches. We explore the design space and demonstrate various AR
prototypes, which include tangible and deformable interfaces, context-aware
assistants, and body-driven AR applications. The results of our user study and
expert interviews confirm that our approach can lower the barrier to creating
functional AR prototypes while also allowing flexible and general-purpose
prototyping experiences.
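The abstract describes a trigger-action authoring interface driven by an on-demand vision model. A minimal sketch of that pattern, assuming a stubbed classifier (the class names, rule engine, and actions below are illustrative, not the paper's actual API): a user-taught class label acts as the trigger, and a rule table maps it to an AR action fired on each camera frame.

```python
# Sketch of a trigger-action pattern like the one described in the abstract.
# The vision model is stubbed; in the paper a user-taught classifier
# (Teachable Machine style) would emit the predicted label per frame.

from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    trigger: str                # class label the vision model emits
    action: Callable[[], str]   # AR action to fire when the label is seen


class TriggerActionEngine:
    def __init__(self) -> None:
        self.rules: Dict[str, Rule] = {}

    def add_rule(self, trigger: str, action: Callable[[], str]) -> None:
        # The end user "authors" behaviour by pairing a taught class
        # with an action, without writing any program logic.
        self.rules[trigger] = Rule(trigger, action)

    def on_frame(self, predicted_label: str) -> List[str]:
        # Called once per camera frame with the model's top prediction.
        rule = self.rules.get(predicted_label)
        return [rule.action()] if rule else []


engine = TriggerActionEngine()
engine.add_rule("cup_tilted", lambda: "show 'pouring' AR overlay")
engine.add_rule("lid_open", lambda: "play open animation")

print(engine.on_frame("cup_tilted"))  # matching rule fires its action
print(engine.on_frame("unknown"))     # no rule matches, nothing fires
```

Keeping the trigger vocabulary identical to the classifier's label set is what lets end users extend the prototype by teaching new classes rather than editing code.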
Related papers
- LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models [50.259006481656094]
We present a novel interactive application aimed towards understanding the internal mechanisms of large vision-language models.
Our interface is designed to enhance the interpretability of the image patches, which are instrumental in generating an answer.
We present a case study of how our application can aid in understanding failure mechanisms in a popular large multi-modal model: LLaVA.
arXiv Detail & Related papers (2024-04-03T23:57:34Z)
- Voila-A: Aligning Vision-Language Models with User's Gaze Attention [56.755993500556734]
We introduce gaze information as a proxy for human attention to guide Vision-Language Models (VLMs).
We propose a novel approach, Voila-A, for gaze alignment to enhance the interpretability and effectiveness of these models in real-world applications.
arXiv Detail & Related papers (2023-12-22T17:34:01Z)
- Typing on Any Surface: A Deep Learning-based Method for Real-Time Keystroke Detection in Augmented Reality [4.857109990499532]
Mid-air keyboard interfaces, wireless keyboards, and voice input either suffer from poor ergonomic design or limited accuracy, or are simply embarrassing to use in public.
This paper proposes and validates a deep-learning-based approach that enables AR applications to accurately predict keystrokes from the user-perspective RGB video stream.
A two-stage model, combining an off-the-shelf hand landmark extractor with a novel adaptive Convolutional Recurrent Neural Network (C-RNN), was trained.
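The two-stage structure can be sketched as a pipeline, with both stages stubbed (the thresholding "model" below stands in for the paper's adaptive C-RNN, and the landmark extractor is a placeholder for an off-the-shelf hand-tracking network; all names are assumptions for illustration):

```python
# Hedged sketch of a two-stage keystroke detector: stage one extracts a
# fingertip landmark per frame, stage two scans the landmark sequence for
# a press. A real system would use a hand-landmark network and a temporal
# model (the paper's C-RNN) instead of these stubs.

from typing import List, Optional, Sequence, Tuple

Landmark = Tuple[float, float]  # normalised (x, y) fingertip position


def extract_landmarks(frame: Sequence[float]) -> Landmark:
    # Stage 1 stub: pretend the frame already encodes the fingertip.
    return (frame[0], frame[1])


def detect_keystroke(frames: Sequence[Sequence[float]],
                     press_threshold: float = 0.2) -> Optional[int]:
    # Stage 2 stub: flag a keystroke when the fingertip's y-coordinate
    # dips below the threshold; return the index of the press frame.
    track: List[Landmark] = [extract_landmarks(f) for f in frames]
    for i, (_, y) in enumerate(track):
        if y < press_threshold:
            return i
    return None


# A fingertip descending toward the surface, then lifting again.
frames = [(0.5, 0.6), (0.5, 0.4), (0.5, 0.1), (0.5, 0.5)]
print(detect_keystroke(frames))  # -> 2
```

Separating the stages this way is what lets an off-the-shelf extractor be reused while only the temporal classifier is trained per task.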
arXiv Detail & Related papers (2023-08-31T23:58:25Z)
- Systematic Adaptation of Communication-focused Machine Learning Models from Real to Virtual Environments for Human-Robot Collaboration [1.392250707100996]
This paper presents a systematic framework for the real to virtual adaptation using limited size of virtual dataset.
Hand gesture recognition, a topic of much research and subsequent commercialization in the real world, has been possible because of the creation of large, labelled datasets.
arXiv Detail & Related papers (2023-07-21T03:24:55Z)
- Active Class Selection for Few-Shot Class-Incremental Learning [14.386434861320023]
For real-world applications, robots will need to continually learn in their environments through limited interactions with their users.
We develop a novel framework that can allow an autonomous agent to continually learn new objects by asking its users to label only a few of the most informative objects in the environment.
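One standard way to pick "the most informative objects" is uncertainty sampling; a minimal sketch under that assumption (this is a generic active-learning heuristic, not necessarily the paper's framework, and all names are illustrative): rank unlabeled objects by the predictive entropy of the agent's current classifier and query the user about the top-k.

```python
# Illustrative active-learning query selection: ask the user to label the
# objects the classifier is least certain about, measured by the Shannon
# entropy of its predictive distribution over classes.

import math
from typing import List, Sequence


def entropy(probs: Sequence[float]) -> float:
    # Shannon entropy of a predictive distribution; higher = less certain.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(predictions: Sequence[Sequence[float]],
                   k: int = 1) -> List[int]:
    # Return indices of the k objects the model is most uncertain about.
    ranked = sorted(range(len(predictions)),
                    key=lambda i: entropy(predictions[i]),
                    reverse=True)
    return ranked[:k]


preds = [
    [0.98, 0.01, 0.01],  # confident -> little to gain from asking
    [0.34, 0.33, 0.33],  # near-uniform -> most informative query
    [0.70, 0.20, 0.10],
]
print(select_queries(preds, k=1))  # -> [1]
```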
arXiv Detail & Related papers (2023-07-05T20:16:57Z)
- Visual Affordance Prediction for Guiding Robot Exploration [56.17795036091848]
We develop an approach for learning visual affordances for guiding robot exploration.
We use a Transformer-based model to learn a conditional distribution in the latent embedding space of a VQ-VAE.
We show how the trained affordance model can be used for guiding exploration by acting as a goal-sampling distribution, during visual goal-conditioned policy learning in robotic manipulation.
arXiv Detail & Related papers (2023-05-28T17:53:09Z)
- ArK: Augmented Reality with Knowledge Interactive Emergent Ability [115.72679420999535]
We develop an infinite agent that learns to transfer knowledge memory from general foundation models to novel domains.
The heart of our approach is an emerging mechanism, dubbed Augmented Reality with Knowledge Inference Interaction (ArK).
We show that our ArK approach, combined with large foundation models, significantly improves the quality of generated 2D/3D scenes.
arXiv Detail & Related papers (2023-05-01T17:57:01Z) - Learning Action-Effect Dynamics for Hypothetical Vision-Language
Reasoning Task [50.72283841720014]
We propose a novel learning strategy that can improve reasoning about the effects of actions.
We demonstrate the effectiveness of our proposed approach and discuss its advantages over previous baselines in terms of performance, data efficiency, and generalization capability.
arXiv Detail & Related papers (2022-12-07T05:41:58Z) - MONAI Label: A framework for AI-assisted Interactive Labeling of 3D
Medical Images [49.664220687980006]
The lack of annotated datasets is a major bottleneck for training new task-specific supervised machine learning models.
We present MONAI Label, a free and open-source framework that facilitates the development of applications based on artificial intelligence (AI) models.
arXiv Detail & Related papers (2022-03-23T12:33:11Z) - OpenDR: An Open Toolkit for Enabling High Performance, Low Footprint
Deep Learning for Robotics [0.0]
We present the Open Deep Learning Toolkit for Robotics (OpenDR).
OpenDR aims at developing an open, non-proprietary, efficient, and modular toolkit that can be easily used by robotics companies and research institutions.
arXiv Detail & Related papers (2022-03-01T12:59:59Z) - Modular approach to data preprocessing in ALOHA and application to a
smart industry use case [0.0]
The paper addresses a modular approach, integrated into the ALOHA tool flow, to support the data preprocessing and transformation pipeline.
To demonstrate the effectiveness of the approach, we present some experimental results related to a keyword spotting use case.
arXiv Detail & Related papers (2021-02-02T06:48:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.