MotionInput v2.0 supporting DirectX: A modular library of open-source
gesture-based machine learning and computer vision methods for interacting
and controlling existing software with a webcam
- URL: http://arxiv.org/abs/2108.04357v1
- Date: Tue, 10 Aug 2021 08:23:21 GMT
- Title: MotionInput v2.0 supporting DirectX: A modular library of open-source
gesture-based machine learning and computer vision methods for interacting
and controlling existing software with a webcam
- Authors: Ashild Kummen, Guanlin Li, Ali Hassan, Teodora Ganeva, Qianying Lu,
Robert Shaw, Chenuka Ratwatte, Yang Zou, Lu Han, Emil Almazov, Sheena Visram,
Andrew Taylor, Neil J Sebire, Lee Stott, Yvonne Rogers, Graham Roberts, Dean
Mohamedally
- Abstract summary: MotionInput v2.0 maps human motion gestures to input operations for existing applications and games.
Three use case areas assisted the development of the modules: creativity software, office and clinical software, and gaming software.
- Score: 11.120698968989108
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Touchless computer interaction has become an important consideration during
the COVID-19 pandemic period. Despite progress in machine learning and computer
vision that allows for advanced gesture recognition, an integrated collection
of such open-source methods and a user-customisable approach to utilising them
in a low-cost solution for touchless interaction in existing software is still
missing. In this paper, we introduce the MotionInput v2.0 application. This
application utilises published open-source libraries and additional gesture
definitions developed to take the video stream from a standard RGB webcam as
input. It then maps human motion gestures to input operations for existing
applications and games. The user can choose their own preferred way of
interacting from a series of motion types, including single and bi-modal hand
gesturing, full-body repetitive or extremities-based exercises, head and facial
movements, eye tracking, and combinations of the above. We also introduce a
series of bespoke gesture recognition classifications as DirectInput triggers,
including gestures for idle states, auto calibration, depth capture from a 2D
RGB webcam stream and tracking of facial motions such as mouth motions,
winking, and head direction with rotation. Three use case areas assisted the
development of the modules: creativity software, office and clinical software,
and gaming software. A collection of open-source libraries has been integrated
to provide a layer of modular gesture mapping on top of the existing mouse and
keyboard controls in Windows via DirectX. With webcams integrated into most
laptops and desktop computers, touchless computing becomes more widely
available with MotionInput v2.0, in a federated and locally processed manner.
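To make the mapping concrete, the sketch below reads the webcam with OpenCV, detects a single-hand pinch with MediaPipe Hands, and injects cursor movement and a left click through the Win32 user32 API via ctypes. It is a minimal illustration under those assumptions, not the MotionInput v2.0 implementation: the paper layers its gesture mapping over DirectX/DirectInput and covers many more motion types, and the landmark choices and thresholds here are hypothetical.

    import ctypes
    import math

    import cv2
    import mediapipe as mp

    MOUSEEVENTF_LEFTDOWN, MOUSEEVENTF_LEFTUP = 0x0002, 0x0004
    user32 = ctypes.windll.user32                       # Windows-only input injection
    screen_w = user32.GetSystemMetrics(0)
    screen_h = user32.GetSystemMetrics(1)

    mp_hands = mp.solutions.hands

    def pinch_distance(landmarks):
        """Normalised distance between the thumb tip and the index finger tip."""
        thumb = landmarks[mp_hands.HandLandmark.THUMB_TIP]
        index = landmarks[mp_hands.HandLandmark.INDEX_FINGER_TIP]
        return math.hypot(thumb.x - index.x, thumb.y - index.y)

    cap = cv2.VideoCapture(0)                           # standard RGB webcam
    hands = mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7)
    clicking = False

    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB input; OpenCV delivers BGR frames.
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            lm = results.multi_hand_landmarks[0].landmark
            # Drive the cursor with the index fingertip (x mirrored for a webcam view).
            tip = lm[mp_hands.HandLandmark.INDEX_FINGER_TIP]
            user32.SetCursorPos(int((1 - tip.x) * screen_w), int(tip.y * screen_h))
            # A pinch (thumb and index tips close together) presses the left button;
            # opening the pinch releases it. Thresholds are illustrative only.
            if pinch_distance(lm) < 0.05 and not clicking:
                user32.mouse_event(MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0)
                clicking = True
            elif pinch_distance(lm) > 0.08 and clicking:
                user32.mouse_event(MOUSEEVENTF_LEFTUP, 0, 0, 0, 0)
                clicking = False
        cv2.imshow("gesture sketch", frame)
        if cv2.waitKey(1) & 0xFF == 27:                 # Esc quits
            break

    cap.release()
    cv2.destroyAllWindows()

The two different thresholds (0.05 to press, 0.08 to release) act as simple hysteresis so that a hand hovering near the pinch distance does not generate a stream of spurious clicks.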
Related papers
- Generating Physically Realistic and Directable Human Motions from Multi-Modal Inputs [16.41735119504929]
This work focuses on generating realistic, physically-based human behaviors from multi-modal inputs, which may only partially specify the desired motion.
The input may come from a VR controller providing arm motion and body velocity, partial key-point animation, computer vision applied to videos, or even higher-level motion goals.
We introduce the Masked Humanoid Controller (MHC), a novel approach that applies multi-objective imitation learning on augmented and selectively masked motion demonstrations.
arXiv Detail & Related papers (2025-02-08T17:02:11Z) - MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation [65.74312406211213]
This paper presents a method that allows users to design cinematic video shots in the context of image-to-video generation.
By connecting insights from classical computer graphics and contemporary video generation techniques, we demonstrate the ability to achieve 3D-aware motion control in I2V synthesis.
arXiv Detail & Related papers (2025-02-06T18:41:04Z) - Extraction Of Cumulative Blobs From Dynamic Gestures [0.0]
Gesture recognition is based on CV technology that allows the computer to interpret human motions as commands.
A simple night vision camera can be used as our camera for motion capture.
The video stream from the camera is fed into a Raspberry Pi running a Python program that uses the OpenCV module.
arXiv Detail & Related papers (2025-01-07T18:59:28Z) - Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes [90.39860012099393]
Sitcom-Crafter is a system for human motion generation in 3D space.
Central to the function generation modules is our novel 3D scene-aware human-human interaction module.
Augmentation modules encompass plot comprehension for command generation and motion synchronization for the seamless integration of different motion types.
arXiv Detail & Related papers (2024-10-14T17:56:19Z) - Animate Your Motion: Turning Still Images into Dynamic Videos [58.63109848837741]
We introduce Scene and Motion Conditional Diffusion (SMCD), a novel methodology for managing multimodal inputs.
SMCD incorporates a recognized motion conditioning module and investigates various approaches to integrate scene conditions.
Our design significantly enhances video quality, motion precision, and semantic coherence.
arXiv Detail & Related papers (2024-03-15T10:36:24Z) - The Gesture Authoring Space: Authoring Customised Hand Gestures for
Grasping Virtual Objects in Immersive Virtual Environments [81.5101473684021]
This work proposes a hand gesture authoring tool for object-specific grab gestures, allowing virtual objects to be grabbed as in the real world.
The presented solution uses template matching for gesture recognition and requires no technical knowledge to design and create custom tailored hand gestures.
The study showed that gestures created with the proposed approach are perceived by users as a more natural input modality than the others.
arXiv Detail & Related papers (2022-07-03T18:33:33Z) - Muscle Vision: Real Time Keypoint Based Pose Classification of Physical
Exercises [52.77024349608834]
3D human pose recognition extrapolated from video has advanced to the point of enabling real-time software applications.
We propose a new machine learning pipeline and web interface that performs human pose recognition on a live video feed to detect when common exercises are performed and classify them accordingly (a minimal sketch of this kind of keypoint pipeline follows this list).
arXiv Detail & Related papers (2022-03-23T00:55:07Z) - Click to Move: Controlling Video Generation with Sparse Motion [30.437648200928603]
Click to Move (C2M) is a novel framework for video generation where the user can control the motion of the synthesized video through mouse clicks.
Our model receives as input an initial frame, its corresponding segmentation map and the sparse motion vectors encoding the input provided by the user.
It outputs a plausible video sequence starting from the given frame and with a motion that is consistent with user input.
arXiv Detail & Related papers (2021-08-19T17:33:13Z) - SHREC 2021: Track on Skeleton-based Hand Gesture Recognition in the Wild [62.450907796261646]
Recognition of hand gestures can be performed directly from the stream of hand skeletons estimated by software.
Despite the recent advancements in gesture and action recognition from skeletons, it is unclear how well the current state-of-the-art techniques can perform in a real-world scenario.
This paper presents the results of the SHREC 2021: Track on Skeleton-based Hand Gesture Recognition in the Wild contest.
arXiv Detail & Related papers (2021-06-21T10:57:49Z) - Gestop : Customizable Gesture Control of Computer Systems [0.3553493344868413]
Gestop is a framework that learns to detect gestures from demonstrations and is customizable by end-users.
It enables users to interact in real-time with computers having only RGB cameras, using gestures.
arXiv Detail & Related papers (2020-10-25T19:13:01Z)