Systematic Adaptation of Communication-focused Machine Learning Models
from Real to Virtual Environments for Human-Robot Collaboration
- URL: http://arxiv.org/abs/2307.11327v1
- Date: Fri, 21 Jul 2023 03:24:55 GMT
- Authors: Debasmita Mukherjee, Ritwik Singhai and Homayoun Najjaran
- Abstract summary: This paper presents a systematic framework for real-to-virtual adaptation using a virtual dataset of limited size.
Hand gesture recognition, which has been a topic of much research and subsequent commercialization in the real world, has been possible because of the creation of large, labelled datasets.
- Score: 1.392250707100996
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Virtual reality has proved to be useful in applications in several fields
ranging from gaming, medicine, and training to development of interfaces that
enable human-robot collaboration. It empowers designers to explore applications
outside of the constraints posed by the real world environment and develop
innovative solutions and experiences. Hand gesture recognition, which has been
a topic of much research and subsequent commercialization in the real world, has
been possible because of the creation of large, labelled datasets. In order to
utilize the power of natural and intuitive hand gestures in the virtual domain
for enabling embodied teleoperation of collaborative robots, similarly large
datasets must be created so as to keep the working interface easy to learn and
flexible enough to add more gestures. Depending on the application, this may be
computationally or economically prohibitive. Thus, the adaptation of trained
deep learning models that perform well in the real environment to the virtual
may be a solution to this challenge. This paper presents a systematic framework
for real-to-virtual adaptation using a virtual dataset of limited size, along
with guidelines for creating a curated dataset. Finally, while hand gestures
have been considered as the communication mode, the guidelines and
recommendations presented are generic. These are applicable to other modes such
as body poses and facial expressions, for which large datasets are available in
the real domain and which must likewise be adapted to the virtual one.
Related papers
- Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.
Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.
We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z)
- VR-GPT: Visual Language Model for Intelligent Virtual Reality Applications [2.5022287664959446]
This study introduces a pioneering approach utilizing Visual Language Models within VR environments to enhance user interaction and task efficiency.
Our system facilitates real-time, intuitive user interactions through natural language processing, without relying on visual text instructions.
arXiv Detail & Related papers (2024-05-19T12:56:00Z)
- RealDex: Towards Human-like Grasping for Robotic Dexterous Hand [64.47045863999061]
We introduce RealDex, a pioneering dataset capturing authentic dexterous hand grasping motions infused with human behavioral patterns.
RealDex holds immense promise in advancing humanoid robots for automated perception, cognition, and manipulation in real-world scenarios.
arXiv Detail & Related papers (2024-02-21T14:59:46Z)
- ArK: Augmented Reality with Knowledge Interactive Emergent Ability [115.72679420999535]
We develop an infinite agent that learns to transfer knowledge memory from general foundation models to novel domains.
The heart of our approach is an emerging mechanism, dubbed Augmented Reality with Knowledge Inference Interaction (ArK).
We show that our ArK approach, combined with large foundation models, significantly improves the quality of generated 2D/3D scenes.
arXiv Detail & Related papers (2023-05-01T17:57:01Z)
- Synthetic-to-Real Domain Adaptation for Action Recognition: A Dataset and Baseline Performances [76.34037366117234]
We introduce a new dataset called Robot Control Gestures (RoCoG-v2).
The dataset is composed of both real and synthetic videos from seven gesture classes.
We present results using state-of-the-art action recognition and domain adaptation algorithms.
arXiv Detail & Related papers (2023-03-17T23:23:55Z)
- Accelerating Interactive Human-like Manipulation Learning with GPU-based Simulation and High-quality Demonstrations [25.393382192511716]
We present an immersive virtual reality teleoperation interface designed for interactive human-like manipulation on contact rich tasks.
We demonstrate the complementary strengths of massively parallel RL and imitation learning, yielding robust and natural behaviors.
arXiv Detail & Related papers (2022-12-05T09:37:27Z)
- DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality [64.51295032956118]
We train a policy that can perform robust dexterous manipulation on an anthropomorphic robot hand.
Our work reaffirms the possibilities of sim-to-real transfer for dexterous manipulation in diverse kinds of hardware and simulator setups.
arXiv Detail & Related papers (2022-10-25T01:51:36Z)
- The Gesture Authoring Space: Authoring Customised Hand Gestures for Grasping Virtual Objects in Immersive Virtual Environments [81.5101473684021]
This work proposes a hand gesture authoring tool for object specific grab gestures allowing virtual objects to be grabbed as in the real world.
The presented solution uses template matching for gesture recognition and requires no technical knowledge to design and create custom tailored hand gestures.
The study showed that gestures created with the proposed approach are perceived by users as a more natural input modality than the others.
arXiv Detail & Related papers (2022-07-03T18:33:33Z)
- Mutual Scene Synthesis for Mixed Reality Telepresence [4.504833177846264]
Mixed reality telepresence allows participants to engage in a wide spectrum of activities, previously not possible in 2D screen-based communication methods.
We propose a novel mutual scene synthesis method that takes the participants' spaces as input, and generates a virtual synthetic scene that corresponds to the functional features of all participants' local spaces.
Our method combines a mutual function optimization module with a deep-learning conditional scene augmentation process to generate a scene mutually and physically accessible to all participants of a mixed reality telepresence scenario.
arXiv Detail & Related papers (2022-04-01T02:08:11Z)
- Efficient Realistic Data Generation Framework leveraging Deep Learning-based Human Digitization [0.0]
The proposed method takes as input real background images and populates them with human figures in various poses.
Benchmarking and evaluation on the corresponding tasks show that synthetic data can be used effectively as a supplement to real data.
arXiv Detail & Related papers (2021-06-28T08:07:31Z)