Single-stage intake gesture detection using CTC loss and extended prefix beam search
- URL: http://arxiv.org/abs/2008.02999v2
- Date: Sat, 21 Nov 2020 01:05:45 GMT
- Title: Single-stage intake gesture detection using CTC loss and extended prefix beam search
- Authors: Philipp V. Rouast and Marc T. P. Adam
- Abstract summary: Accurate detection of individual intake gestures is a key step towards automatic dietary monitoring.
We propose a single-stage approach which directly decodes the probabilities learned from sensor data into sparse intake detections.
- Score: 8.22379888383833
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Accurate detection of individual intake gestures is a key step towards
automatic dietary monitoring. Both inertial sensor data of wrist movements and
video data depicting the upper body have been used for this purpose. The most
advanced approaches to date use a two-stage approach, in which (i) frame-level
intake probabilities are learned from the sensor data using a deep neural
network, and then (ii) sparse intake events are detected by finding the maxima
of the frame-level probabilities. In this study, we propose a single-stage
approach which directly decodes the probabilities learned from sensor data into
sparse intake detections. This is achieved by weakly supervised training using
Connectionist Temporal Classification (CTC) loss, and decoding using a novel
extended prefix beam search decoding algorithm. Benefits of this approach
include (i) end-to-end training for detections, (ii) simplified timing
requirements for intake gesture labels, and (iii) improved detection
performance compared to existing approaches. Across two separate datasets, we
achieve relative $F_1$ score improvements between 1.9% and 6.2% over the
two-stage approach for intake detection and eating/drinking detection tasks,
for both video and inertial sensors.
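To make the contrast with the two-stage baseline (frame-level probabilities followed by peak finding) concrete, the sketch below illustrates the single-stage idea: weakly supervised training with CTC loss, where each clip is labelled only with its ordered sequence of intake events rather than frame-level timings. This is a minimal sketch assuming PyTorch; the encoder, the label scheme (blank = 0, eat = 1, drink = 2), and all shapes are illustrative assumptions, not the authors' architecture, and the paper's novel extended prefix beam search is replaced here by a plain greedy CTC decode for brevity.

```python
# Minimal sketch (not the authors' code) of single-stage training with CTC loss.
import torch
import torch.nn as nn

NUM_CLASSES = 3  # assumed labels: 0 = CTC blank, 1 = eat, 2 = drink

class FrameEncoder(nn.Module):
    """Toy stand-in for the deep network that maps sensor frames to
    per-frame class log-probabilities."""
    def __init__(self, in_dim=6, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(in_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, NUM_CLASSES)

    def forward(self, x):                      # x: (batch, time, in_dim)
        h, _ = self.rnn(x)
        return self.head(h).log_softmax(-1)    # (batch, time, classes)

model = FrameEncoder()
ctc_loss = nn.CTCLoss(blank=0)

# One batch of 4 clips, each 128 frames of 6-axis inertial data.
x = torch.randn(4, 128, 6)
log_probs = model(x).transpose(0, 1)           # CTCLoss expects (time, batch, classes)

# Weak supervision: only the ordered event sequence per clip, no timings.
targets = torch.tensor([1, 1, 2, 1, 2, 1, 1])  # concatenated label sequences
target_lengths = torch.tensor([2, 2, 2, 1])    # number of events per clip
input_lengths = torch.full((4,), 128, dtype=torch.long)

loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)
loss.backward()

def greedy_ctc_decode(log_probs):
    """Baseline decode: collapse repeats, drop blanks. The paper's extended
    prefix beam search replaces this step and additionally recovers the
    time of each detected event."""
    best = log_probs.argmax(-1)                # (time, batch)
    decoded = []
    for b in range(best.shape[1]):
        seq, prev = [], 0
        for t in best[:, b].tolist():
            if t != 0 and t != prev:
                seq.append(t)
            prev = t
        decoded.append(seq)
    return decoded

print(greedy_ctc_decode(log_probs))            # sparse event sequences per clip
```

Standard prefix beam search keeps the most probable label prefixes alive during decoding, tracking blank and non-blank path probabilities separately; per the abstract, the paper's extension to this algorithm is what turns the decoded sequences into localized intake detections.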
Related papers
- A Low-cost Strategic Monitoring Approach for Scalable and Interpretable Error Detection in Deep Neural Networks [6.537257913467249]
We present a highly compact run-time monitoring approach for deep computer vision networks.
It can efficiently detect silent data corruption originating from both hardware memory and input faults.
arXiv Detail & Related papers (2023-10-31T10:45:55Z)
- DiffusionEngine: Diffusion Model is Scalable Data Engine for Object Detection [41.436817746749384]
Diffusion Model is a scalable data engine for object detection.
DiffusionEngine (DE) provides high-quality detection-oriented training pairs in a single stage.
arXiv Detail & Related papers (2023-09-07T17:55:01Z)
- Domain Adaptive Synapse Detection with Weak Point Annotations [63.97144211520869]
We present AdaSyn, a framework for domain adaptive synapse detection with weak point annotations.
In the WASPSYN challenge at ISBI 2023, our method ranked first.
arXiv Detail & Related papers (2023-08-31T05:05:53Z)
- A Novel Two Stream Decision Level Fusion of Vision and Inertial Sensors Data for Automatic Multimodal Human Activity Recognition System [2.5214116139219787]
This paper presents a novel multimodal human activity recognition system.
It uses a two-stream decision level fusion of vision and inertial sensors.
The accuracies obtained by the proposed system are 96.9%, 97.6%, 98.7%, and 95.9%, respectively.
arXiv Detail & Related papers (2023-06-27T19:29:35Z)
- Multi-modal Sensor Data Fusion for In-situ Classification of Animal Behavior Using Accelerometry and GNSS Data [16.47484520898938]
We examine the use of data from multiple sensing modes, i.e., accelerometry and global navigation satellite system (GNSS) data, for classifying animal behavior.
We develop multi-modal animal behavior classification algorithms using two real-world datasets collected via smart cattle collars and ear tags.
arXiv Detail & Related papers (2022-06-24T04:54:03Z)
- Weakly Supervised Change Detection Using Guided Anisotropic Diffusion [97.43170678509478]
We propose original ideas for leveraging weakly annotated datasets in the context of change detection.
First, we propose the guided anisotropic diffusion (GAD) algorithm, which improves semantic segmentation results.
We then show its potential in two weakly-supervised learning strategies tailored for change detection.
arXiv Detail & Related papers (2021-12-31T10:03:47Z)
- From One to Many: A Deep Learning Coincident Gravitational-Wave Search [58.720142291102135]
We construct a two-detector search for gravitational waves from binary black hole mergers using neural networks trained on non-spinning binary black hole data from a single detector.
We find that none of these simple two-detector networks are capable of improving the sensitivity over applying networks individually to the data from the detectors.
arXiv Detail & Related papers (2021-08-24T13:25:02Z)
- ESAD: End-to-end Deep Semi-supervised Anomaly Detection [85.81138474858197]
We propose a new objective function that measures the KL-divergence between normal and anomalous data.
The proposed method significantly outperforms several state-of-the-art methods on multiple benchmark datasets.
arXiv Detail & Related papers (2020-12-09T08:16:35Z)
- Dense Label Encoding for Boundary Discontinuity Free Rotation Detection [69.75559390700887]
This paper explores a relatively less-studied methodology based on classification.
We propose new techniques to push its frontier in two aspects.
Experiments and visual analysis on large-scale public datasets for aerial images show the effectiveness of our approach.
arXiv Detail & Related papers (2020-11-19T05:42:02Z)
- Sequential Drift Detection in Deep Learning Classifiers [4.022057598291766]
We utilize neural network embeddings to detect data drift by formulating the drift detection within an appropriate sequential decision framework.
We introduce a loss function which evaluates an algorithm's ability to balance these two concerns.
arXiv Detail & Related papers (2020-07-31T14:46:21Z)
- EHSOD: CAM-Guided End-to-end Hybrid-Supervised Object Detection with Cascade Refinement [53.69674636044927]
We present EHSOD, an end-to-end hybrid-supervised object detection system.
It can be trained in one shot on both fully and weakly-annotated data.
It achieves comparable results on multiple object detection benchmarks with only 30% fully-annotated data.
arXiv Detail & Related papers (2020-02-18T08:04:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.