Simultaneous Gesture Classification and Localization with an Automatic
Gesture Annotation Model
- URL: http://arxiv.org/abs/2401.11150v1
- Date: Sat, 20 Jan 2024 07:11:03 GMT
- Authors: Junxiao Shen, Xuhai Xu, Ran Tan, Amy Karlson, Evan Strasnick
- Abstract summary: We propose a novel annotation model that can automatically annotate gesture classes and identify their temporal ranges.
Our ablation study demonstrates that our annotation model design surpasses the baseline in terms of both gesture classification accuracy (3-4% improvement) and localization accuracy (71-75% improvement).
- Score: 10.898703544071934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training a real-time gesture recognition model heavily relies on annotated
data. However, manual data annotation is costly and demands substantial human
effort. In order to address this challenge, we propose a novel annotation model
that can automatically annotate gesture classes and identify their temporal
ranges. Our ablation study demonstrates that our annotation model design
surpasses the baseline in terms of both gesture classification accuracy (3-4%
improvement) and localization accuracy (71-75% improvement). We believe that
this annotation model has immense potential to improve the training of
downstream gesture recognition models using unlabeled datasets.
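As an illustrative sketch (not the authors' architecture), simultaneous classification and localization can be viewed as turning per-frame class probabilities into labeled temporal ranges; the function and parameter names below are hypothetical:

```python
import numpy as np

def annotate_gestures(frame_probs, threshold=0.5, min_len=5):
    # frame_probs: (T, C) per-frame class probabilities; class 0 = "no gesture".
    labels = frame_probs.argmax(axis=1)
    labels[frame_probs.max(axis=1) < threshold] = 0  # suppress low-confidence frames
    annotations, start, cls = [], None, 0
    for t, lab in enumerate(np.append(labels, 0)):  # sentinel closes the last run
        if start is None:
            if lab != 0:
                start, cls = t, lab
        elif lab != cls:
            if t - start >= min_len:
                annotations.append((int(cls), start, t))  # (class, start, end)
            start, cls = (t, lab) if lab != 0 else (None, 0)
    return annotations
```

The minimum-length filter is one simple way to discard spurious single-frame detections; the paper's actual localization mechanism may differ.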
Related papers
- Selective Annotation via Data Allocation: These Data Should Be Triaged to Experts for Annotation Rather Than the Model [42.70608373297776]
We propose a selective annotation framework called SANT.
It effectively takes advantage of both the triage-to-human and triage-to-model data through the proposed error-aware triage and bi-weighting mechanisms.
Experimental results show that SANT consistently outperforms other baselines, leading to higher-quality annotation through its proper allocation of data to both expert and model workers.
arXiv Detail & Related papers (2024-05-20T14:52:05Z) - Guiding Attention in End-to-End Driving Models [49.762868784033785]
Vision-based end-to-end driving models trained by imitation learning can lead to affordable solutions for autonomous driving.
We study how to guide the attention of these models to improve their driving quality by adding a loss term during training.
In contrast to previous work, our method does not require salient semantic maps to be available at test time.
arXiv Detail & Related papers (2024-04-30T23:18:51Z) - Self-Supervised Representation Learning for Online Handwriting Text
Classification [0.8594140167290099]
We propose the novel Part of Stroke Masking (POSM) as a pretext task for pretraining models to extract informative representations from the online handwriting of individuals in English and Chinese languages.
To evaluate the quality of the extracted representations, we use both intrinsic and extrinsic evaluation methods.
The pretrained models are fine-tuned to achieve state-of-the-art results in tasks such as writer identification, gender classification, and handedness classification.
arXiv Detail & Related papers (2023-10-10T14:07:49Z) - How Good is the Model in Model-in-the-loop Event Coreference Resolution
Annotation? [3.712417884848568]
We propose a model-in-the-loop annotation approach for event coreference resolution, where a machine learning model suggests only likely coreferring event pairs.
We evaluate the effectiveness of this approach by first simulating the annotation process and then, using a novel annotator-centric recall-effort trade-off metric, comparing the results of various underlying models and datasets.
arXiv Detail & Related papers (2023-06-06T18:06:24Z) - Self-Training of Handwritten Word Recognition for Synthetic-to-Real
Adaptation [4.111899441919165]
We propose a self-training approach to train a Handwritten Text Recognition model.
The proposed training scheme uses an initial model trained on synthetic data to make predictions for the unlabeled target dataset.
We evaluate the proposed method on four widely used benchmark datasets and show its effectiveness in closing the gap to a fully supervised model.
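A minimal sketch of such a self-training loop, assuming a hypothetical model interface with `fit()` and `predict_with_confidence()` (the confidence filter is a common addition to this scheme, not necessarily the paper's exact procedure):

```python
def self_train(model, labeled, unlabeled, rounds=3, conf_threshold=0.9):
    # Start from the labeled (here: synthetic) data, then repeatedly
    # fold in confident predictions on the unlabeled target data.
    train_set = list(labeled)
    for _ in range(rounds):
        model.fit(train_set)
        pseudo = []
        for x in unlabeled:
            y, conf = model.predict_with_confidence(x)
            if conf >= conf_threshold:       # keep only confident predictions
                pseudo.append((x, y))
        train_set = list(labeled) + pseudo   # retrain on real + pseudo labels
    return model
```

Iterating the loop lets the model gradually adapt to the target domain as its pseudo-labels improve.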
arXiv Detail & Related papers (2022-06-07T09:43:25Z) - Annotation Error Detection: Analyzing the Past and Present for a More
Coherent Future [63.99570204416711]
We reimplement 18 methods for detecting potential annotation errors and evaluate them on 9 English datasets.
We define a uniform evaluation setup including a new formalization of the annotation error detection task.
We release our datasets and implementations in an easy-to-use and open source software package.
arXiv Detail & Related papers (2022-06-05T22:31:45Z) - Dynamic Supervisor for Cross-dataset Object Detection [52.95818230087297]
Cross-dataset training in object detection tasks is complicated because the inconsistency in the category range across datasets transforms fully supervised learning into semi-supervised learning.
We propose a dynamic supervisor framework that updates the annotations multiple times through repeatedly updated submodels trained using hard and soft labels.
In the final generated annotations, both recall and precision improve significantly through the integration of hard-label training with soft-label training.
arXiv Detail & Related papers (2022-04-01T03:18:46Z) - Distilling Interpretable Models into Human-Readable Code [71.11328360614479]
Human-readability is an important and desirable standard for machine-learned model interpretability.
We propose to train interpretable models using conventional methods, and then distill them into concise, human-readable code.
We describe a piecewise-linear curve-fitting algorithm that produces high-quality results efficiently and reliably across a broad range of use cases.
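One way to picture this kind of distillation (an illustrative sketch, not the paper's algorithm): sample a learned 1-D function at breakpoints and emit human-readable code that linearly interpolates between them. All names here are hypothetical:

```python
import numpy as np

def distill_to_code(model_fn, x_min, x_max, n_segments=4):
    # Sample the learned function at evenly spaced breakpoints and emit
    # a plain Python function that linearly interpolates between them.
    xs = np.linspace(x_min, x_max, n_segments + 1)
    ys = [model_fn(x) for x in xs]
    lines = ["def score(x):"]
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        slope = (y1 - y0) / (x1 - x0)
        lines.append(
            f"    if x <= {x1:.4g}: return {y0:.4g} + {slope:.4g} * (x - {x0:.4g})"
        )
    lines.append(f"    return {ys[-1]:.4g}")  # clamp beyond the last breakpoint
    return "\n".join(lines)
```

The emitted source is concise enough to be audited by hand, which is the point of distilling into code rather than into another opaque model.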
arXiv Detail & Related papers (2021-01-21T01:46:36Z) - Data Cleansing with Contrastive Learning for Vocal Note Event
Annotations [1.859931123372708]
We propose a novel data cleansing model for time-varying, structured labels.
Our model is trained in a contrastive learning manner by automatically creating local deformations of likely correct labels.
We demonstrate that the accuracy of a transcription model improves greatly when trained using our proposed strategy.
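The "local deformation" idea can be sketched as follows (a hypothetical simplification, assuming note-event labels are (onset, pitch) pairs): the given labels serve as positives, and randomly shifted copies serve as negatives for contrastive training.

```python
import random

def make_contrastive_pairs(labels, max_shift=3, seed=0):
    # labels: list of (onset, pitch) note events assumed to be likely correct.
    # Each positive (label=1) is paired with a locally deformed negative
    # (label=0) whose onset is shifted by a small nonzero amount.
    rng = random.Random(seed)
    pairs = []
    shifts = [s for s in range(-max_shift, max_shift + 1) if s != 0]
    for onset, pitch in labels:
        shift = rng.choice(shifts)
        pairs.append(((onset, pitch, 1), (onset + shift, pitch, 0)))
    return pairs
```

A cleansing model trained to separate such pairs can then flag real annotations that look more like deformations than like correct labels.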
arXiv Detail & Related papers (2020-08-05T12:24:37Z) - Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data.
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performances on Cityscapes, CamVid and KITTI datasets.
arXiv Detail & Related papers (2020-04-30T17:09:17Z) - Temporal Embeddings and Transformer Models for Narrative Text
Understanding [72.88083067388155]
We present two approaches to narrative text understanding for character relationship modelling.
The temporal evolution of these relations is described by dynamic word embeddings, that are designed to learn semantic changes over time.
A supervised learning approach based on the state-of-the-art transformer model BERT is used instead to detect static relations between characters.
arXiv Detail & Related papers (2020-03-19T14:23:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.