Multimodal Contrastive Learning with Hard Negative Sampling for Human
Activity Recognition
- URL: http://arxiv.org/abs/2309.01262v1
- Date: Sun, 3 Sep 2023 20:00:37 GMT
- Title: Multimodal Contrastive Learning with Hard Negative Sampling for Human
Activity Recognition
- Authors: Hyeongju Choi, Apoorva Beedu, Irfan Essa
- Abstract summary: Human Activity Recognition (HAR) systems have been extensively studied by the vision and ubiquitous computing communities.
We propose a hard negative sampling method for multimodal HAR with a hard negative sampling loss for skeleton and IMU data pairs.
We demonstrate the robustness of our approach for learning strong feature representations for HAR tasks, and in the limited data setting.
- Score: 14.88934924520362
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human Activity Recognition (HAR) systems have been extensively studied by the
vision and ubiquitous computing communities due to their practical applications
in daily life, such as smart homes, surveillance, and health monitoring.
Typically, this process is supervised in nature and the development of such
systems requires access to large quantities of annotated data.
However, the high cost and difficulty of obtaining good-quality annotations
have made self-supervised methods an attractive option, and contrastive
learning is one such method.
A major component of successful contrastive learning, in turn, is the
selection of good positive and negative samples.
Although positive samples are directly obtainable, sampling good negative
samples remains a challenge.
As human activities can be recorded by several modalities like camera and IMU
sensors, we propose a hard negative sampling method for multimodal HAR with a
hard negative sampling loss for skeleton and IMU data pairs.
We exploit hard negatives that have different labels from the anchor but are
projected nearby in the latent space using an adjustable concentration
parameter.
Through extensive experiments on two benchmark datasets, UTD-MHAD and MMAct,
we demonstrate the robustness of our approach for learning strong feature
representations for HAR tasks, and in the limited data setting.
We further show that our model outperforms all other state-of-the-art methods
on the UTD-MHAD dataset, and self-supervised methods on MMAct (cross-session),
even when uni-modal data are used during downstream activity recognition.
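The hard negative idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes in-batch negatives (every non-matching skeleton/IMU pair in a batch is treated as a negative), and it up-weights negatives that lie close to the anchor using importance weights proportional to exp(beta * similarity), where beta plays the role of the adjustable concentration parameter. The function names, the temperature tau, and the default values are hypothetical choices made for this example.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def hard_negative_contrastive_loss(z_skel, z_imu, tau=0.1, beta=1.0):
    """Contrastive loss over skeleton/IMU pairs with re-weighted negatives.

    z_skel, z_imu: arrays of shape (n, d); row i of each comes from the same
    recording, so (i, i) is the positive pair and (i, j != i) are negatives.
    tau:  temperature (assumed value, not taken from the paper).
    beta: concentration parameter; beta = 0 gives uniform weights (plain
          InfoNCE), larger beta emphasizes negatives close to the anchor.
    """
    z_skel = l2_normalize(np.asarray(z_skel, dtype=np.float64))
    z_imu = l2_normalize(np.asarray(z_imu, dtype=np.float64))
    n = z_skel.shape[0]
    if n < 2:
        raise ValueError("need at least two pairs to form negatives")

    sim = z_skel @ z_imu.T               # (n, n) cosine similarities
    logits = sim / tau
    pos = np.exp(np.diag(logits))        # matching skeleton/IMU pairs
    neg_mask = ~np.eye(n, dtype=bool)    # True for every non-matching pair
    neg = np.exp(logits) * neg_mask

    # Importance weights: harder (more similar) negatives get larger weight.
    # They are normalized so the mean weight over each row's negatives is 1.
    w = np.exp(beta * sim) * neg_mask
    w = w / w.sum(axis=1, keepdims=True) * (n - 1)

    neg_term = (w * neg).sum(axis=1)
    return float(np.mean(-np.log(pos / (pos + neg_term))))
```

With beta set to 0 every negative receives weight 1 and the expression reduces to the standard InfoNCE loss, so beta can be read as a dial between uniform and hardness-focused negative sampling.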
Related papers
- C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
- Cross-Domain HAR: Few Shot Transfer Learning for Human Activity Recognition [0.2944538605197902]
We present an approach for economic use of publicly available labeled HAR datasets for effective transfer learning.
We introduce a novel transfer learning framework, Cross-Domain HAR, which follows the teacher-student self-training paradigm.
We demonstrate the effectiveness of our approach for practically relevant few shot activity recognition scenarios.
arXiv Detail & Related papers (2023-10-22T19:13:25Z)
- Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement Learning [53.00683059396803]
Masked image modeling (MIM) has been widely used due to its simplicity and effectiveness in recovering original information from masked images.
We propose a decision-based MIM that utilizes reinforcement learning (RL) to automatically search for optimal image masking ratio and masking strategy.
Our approach has a significant advantage over alternative self-supervised methods on the task of neuron segmentation.
arXiv Detail & Related papers (2023-10-06T10:40:46Z)
- Value function estimation using conditional diffusion models for control [62.27184818047923]
We propose a simple algorithm called Diffused Value Function (DVF).
It learns a joint multi-step model of the environment-robot interaction dynamics using a diffusion model.
We show how DVF can be used to efficiently capture the state visitation measure for multiple controllers.
arXiv Detail & Related papers (2023-06-09T18:40:55Z)
- Temporal Output Discrepancy for Loss Estimation-based Active Learning [65.93767110342502]
We present a novel deep active learning approach that queries the oracle for data annotation when the unlabeled sample is believed to incorporate high loss.
Our approach outperforms state-of-the-art active learning methods on image classification and semantic segmentation tasks.
arXiv Detail & Related papers (2022-12-20T19:29:37Z)
- Split-PU: Hardness-aware Training Strategy for Positive-Unlabeled Learning [42.26185670834855]
Positive-Unlabeled (PU) learning aims to learn a model with rare positive samples and abundant unlabeled samples.
This paper focuses on improving the commonly-used nnPU with a novel training pipeline.
arXiv Detail & Related papers (2022-11-30T05:48:31Z)
- Responsible Active Learning via Human-in-the-loop Peer Study [88.01358655203441]
We propose a responsible active learning method, namely Peer Study Learning (PSL), to simultaneously preserve data privacy and improve model stability.
We first introduce a human-in-the-loop teacher-student architecture to isolate unlabelled data from the task learner (teacher) on the cloud-side.
During training, the task learner instructs the light-weight active learner which then provides feedback on the active sampling criterion.
arXiv Detail & Related papers (2022-11-24T13:18:27Z)
- ColloSSL: Collaborative Self-Supervised Learning for Human Activity Recognition [9.652822438412903]
A major bottleneck in training robust Human Activity Recognition (HAR) models is the need for large-scale labeled sensor datasets.
Because labeling large amounts of sensor data is an expensive task, unsupervised and semi-supervised learning techniques have emerged.
We present a novel technique called Collaborative Self-Supervised Learning (ColloSSL) which leverages unlabeled data collected from multiple devices.
arXiv Detail & Related papers (2022-02-01T21:05:05Z)
- DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z)
- An Efficient Data Imputation Technique for Human Activity Recognition [3.0117625632585705]
We propose a methodology for extrapolating missing samples of a dataset to better recognize the human daily living activities.
The proposed method efficiently pre-processes the data captures and utilizes the k-Nearest Neighbors (KNN) imputation technique.
The proposed methodology elegantly extrapolated a similar pattern of activities as they were in the real dataset.
arXiv Detail & Related papers (2020-07-08T22:05:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.