CLAN: A Contrastive Learning based Novelty Detection Framework for Human
Activity Recognition
- URL: http://arxiv.org/abs/2401.10288v1
- Date: Wed, 17 Jan 2024 03:57:36 GMT
- Title: CLAN: A Contrastive Learning based Novelty Detection Framework for Human
Activity Recognition
- Authors: Hyunju Kim and Dongman Lee
- Abstract summary: CLAN is a two-tower contrastive learning-based novelty detection framework for human activity recognition.
It is tailored to the challenges posed by human activity characteristics, including the significance of temporal and frequency features.
Experiments on four real-world human activity datasets show that CLAN surpasses the best performance of existing novelty detection methods.
- Score: 3.0108863071498035
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In ambient assisted living, human activity recognition from time series
sensor data mainly focuses on predefined activities, often overlooking new
activity patterns. We propose CLAN, a two-tower contrastive learning-based
novelty detection framework with diverse types of negative pairs for human
activity recognition. It is tailored to the challenges posed by human activity
characteristics, including the significance of temporal and frequency features,
complex activity dynamics, features shared across activities, and sensor
modality variations. The framework aims to construct representations of known
activities that are invariant and robust to these challenges. To generate suitable negative
pairs, it selects data augmentation methods according to the temporal and
frequency characteristics of each dataset. It derives key representations that are
insensitive to meaningless dynamics through representation learning based on
contrastive and classification losses, together with score-function-based novelty
detection, both of which accommodate varying numbers of the different types of augmented samples. The
proposed two-tower model extracts the representations in terms of time and
frequency, mutually enhancing expressiveness for distinguishing between new and
known activities, even when they share common features. Experiments on four
real-world human activity datasets show that CLAN surpasses the best
performance of existing novelty detection methods, improving by 8.3%, 13.7%,
and 53.3% in the AUROC, balanced accuracy, and FPR@TPR0.95 metrics, respectively.
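
As a rough sketch only (not the authors' released implementation), the PyTorch code below illustrates the kind of two-tower design the abstract describes: one encoder consumes raw sensor windows, the other consumes their magnitude spectra, an InfoNCE-style contrastive loss treats an augmented view of a known-activity window as the positive and strongly shifted augmentations as negatives, and a similarity-based score flags windows that lie far from every known-activity embedding. All names here (TwoTowerCLAN, contrastive_loss, novelty_score) and architectural details are illustrative assumptions.

```python
# Minimal sketch of a two-tower contrastive novelty detector for sensor windows.
# Names and architecture are assumptions for illustration, not the CLAN code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """1D-CNN encoder producing a normalized embedding for a sensor window."""
    def __init__(self, in_channels, emb_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.proj = nn.Linear(128, emb_dim)

    def forward(self, x):                      # x: (batch, channels, time)
        h = self.net(x).squeeze(-1)            # (batch, 128)
        return F.normalize(self.proj(h), dim=-1)

class TwoTowerCLAN(nn.Module):
    """Time tower sees raw windows; frequency tower sees magnitude spectra."""
    def __init__(self, in_channels, emb_dim=128):
        super().__init__()
        self.time_tower = Encoder(in_channels, emb_dim)
        self.freq_tower = Encoder(in_channels, emb_dim)

    def forward(self, x):
        z_time = self.time_tower(x)
        z_freq = self.freq_tower(torch.abs(torch.fft.rfft(x, dim=-1)))
        return z_time, z_freq

def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: one positive view per anchor, N negative views.
    anchor, positive: (B, D); negatives: (N, D); all embeddings are normalized."""
    pos = (anchor * positive).sum(-1, keepdim=True) / temperature   # (B, 1)
    neg = anchor @ negatives.T / temperature                        # (B, N)
    logits = torch.cat([pos, neg], dim=1)
    targets = torch.zeros(anchor.size(0), dtype=torch.long)         # positive is index 0
    return F.cross_entropy(logits, targets)

@torch.no_grad()
def novelty_score(model, x, known_time_emb, known_freq_emb):
    """Score a test window by its maximum similarity to stored embeddings of
    known activities; low similarity in both towers suggests a new activity."""
    z_time, z_freq = model(x)
    sim_time = (z_time @ known_time_emb.T).max(dim=-1).values
    sim_freq = (z_freq @ known_freq_emb.T).max(dim=-1).values
    return -(sim_time + sim_freq)   # higher score = more likely novel
```

At test time the score is thresholded to separate new from known activities; the FPR@TPR0.95 figure quoted above is the false positive rate measured at the operating point where 95% of truly novel samples are detected, so lower values are better.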
Related papers
- Visual-Geometric Collaborative Guidance for Affordance Learning [63.038406948791454]
We propose a visual-geometric collaborative guided affordance learning network that incorporates visual and geometric cues.
Our method outperforms representative models in both objective metrics and visual quality.
arXiv Detail & Related papers (2024-10-15T07:35:51Z)
- Disentangled Interaction Representation for One-Stage Human-Object Interaction Detection [70.96299509159981]
Human-Object Interaction (HOI) detection is a core task for human-centric image understanding.
Recent one-stage methods adopt a transformer decoder to collect image-wide cues that are useful for interaction prediction.
Traditional two-stage methods benefit significantly from their ability to compose interaction features in a disentangled and explainable manner.
arXiv Detail & Related papers (2023-12-04T08:02:59Z)
- Dataset Bias in Human Activity Recognition [57.91018542715725]
This contribution statistically curates the training data to assess to what degree the physical characteristics of humans influence HAR performance.
We evaluate a state-of-the-art convolutional neural network on two time-series HAR datasets that vary in sensors, activities, and recording conditions.
arXiv Detail & Related papers (2023-01-19T12:33:50Z)
- DisenHCN: Disentangled Hypergraph Convolutional Networks for Spatiotemporal Activity Prediction [53.76601630407521]
We propose a hypergraph network model called DisenHCN to bridge the gaps in existing solutions.
In particular, we first unify fine-grained user similarity and the complex matching between user preferences and temporal activity into a heterogeneous hypergraph.
We then disentangle the user representations into different aspects (location-aware, time-aware, and activity-aware) and aggregate corresponding aspect's features on the constructed hypergraph.
arXiv Detail & Related papers (2022-08-14T06:51:54Z)
- UMSNet: An Universal Multi-sensor Network for Human Activity Recognition [10.952666953066542]
This paper proposes a universal multi-sensor network (UMSNet) for human activity recognition.
In particular, we propose a new lightweight sensor residual block (called LSR block), which improves the performance.
Our framework has a clear structure and can be directly applied to various types of multi-modal Time Series Classification tasks.
arXiv Detail & Related papers (2022-05-24T03:29:54Z)
- A Novel Skeleton-Based Human Activity Discovery Technique Using Particle Swarm Optimization with Gaussian Mutation [0.0]
Human activity discovery aims to distinguish the activities performed by humans, without any prior information of what defines each activity.
In this paper, a novel unsupervised approach is proposed to perform human activity discovery in 3D skeleton sequences.
Experiments on three datasets show that the proposed method achieves superior performance in discovering activities.
arXiv Detail & Related papers (2022-01-14T06:28:38Z)
- Generative Partial Visual-Tactile Fused Object Clustering [81.17645983141773]
We propose a Generative Partial Visual-Tactile Fused (i.e., GPVTF) framework for object clustering.
A conditional cross-modal clustering generative adversarial network is then developed to synthesize one modality conditioning on the other modality.
To this end, two pseudo-label-based KL-divergence losses are employed to update the corresponding modality-specific encoders.
arXiv Detail & Related papers (2020-12-28T02:37:03Z)
- A Tree-structure Convolutional Neural Network for Temporal Features Exaction on Sensor-based Multi-resident Activity Recognition [4.619245607612873]
We propose an end-to-end Tree-Structure Convolutional Neural Network based framework for Multi-Resident Activity Recognition (TSC-MRAR).
First, we treat each sample as an event and obtain the current event embedding through the previous sensor readings in the sliding window.
Then, in order to automatically generate the temporal features, a tree-structure network is designed to derive the temporal dependence of nearby readings.
arXiv Detail & Related papers (2020-11-05T14:31:00Z)
- Sequential Weakly Labeled Multi-Activity Localization and Recognition on Wearable Sensors using Recurrent Attention Networks [13.64024154785943]
We propose a recurrent attention network (RAN) to handle sequential weakly labeled multi-activity recognition and localization tasks.
Our RAN model can simultaneously infer multi-activity types from the coarse-grained sequential weak labels.
It will greatly reduce the burden of manual labeling.
arXiv Detail & Related papers (2020-04-13T04:57:09Z)
- Human Activity Recognition from Wearable Sensor Data Using Self-Attention [2.9023633922848586]
We present a self-attention based neural network model for activity recognition from body-worn sensor data.
We performed experiments on four popular publicly available HAR datasets: PAMAP2, Opportunity, Skoda and USC-HAD.
Our model achieves significant performance improvements over recent state-of-the-art models in both the benchmark test-subject and leave-one-out-subject evaluations.
arXiv Detail & Related papers (2020-03-17T14:16:57Z)
- ZSTAD: Zero-Shot Temporal Activity Detection [107.63759089583382]
We propose a novel task setting called zero-shot temporal activity detection (ZSTAD), where activities that have never been seen in training can still be detected.
We design an end-to-end deep network based on R-C3D as the architecture for this solution.
Experiments on both the THUMOS14 and the Charades datasets show promising performance in terms of detecting unseen activities.
arXiv Detail & Related papers (2020-03-12T02:40:36Z)