Investigating Enhancements to Contrastive Predictive Coding for Human
Activity Recognition
- URL: http://arxiv.org/abs/2211.06173v1
- Date: Fri, 11 Nov 2022 12:54:58 GMT
- Title: Investigating Enhancements to Contrastive Predictive Coding for Human
Activity Recognition
- Authors: Harish Haresamudram, Irfan Essa, Thomas Ploetz
- Abstract summary: Contrastive Predictive Coding (CPC) is a technique that learns effective representations by leveraging properties of time-series data.
In this work, we propose enhancements to CPC by systematically investigating the encoder architecture, the aggregator network, and the future timestep prediction task.
Our method shows substantial improvements on four of six target datasets, demonstrating its ability to empower a wide range of application scenarios.
- Score: 7.086647707011785
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The dichotomy between the challenging nature of obtaining annotations for
activities, and the more straightforward nature of data collection from
wearables, has resulted in significant interest in the development of
techniques that utilize large quantities of unlabeled data for learning
representations. Contrastive Predictive Coding (CPC) is one such method,
learning effective representations by leveraging properties of time-series data
to set up a contrastive future timestep prediction task. In this work, we
propose enhancements to CPC, by systematically investigating the encoder
architecture, the aggregator network, and the future timestep prediction,
resulting in a fully convolutional architecture, thereby improving
parallelizability. Across sensor positions and activities, our method shows
substantial improvements on four of six target datasets, demonstrating its
ability to empower a wide range of application scenarios. Further, in the
presence of very limited labeled data, our technique significantly outperforms
both supervised and self-supervised baselines, positively impacting situations
where collecting only a few seconds of labeled data may be possible. This is
promising, as CPC does not require specialized data transformations or
reconstructions for learning effective representations.
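
To make the pieces named in the abstract concrete, the following is a minimal sketch of a fully convolutional CPC setup for windows of wearable sensor data, written in PyTorch. The layer sizes, the causal-convolution aggregator depth, the single reference timestep, and the number of predicted future steps are illustrative placeholders, not the configuration evaluated in the paper; the causal convolution is used here as one way to realize a convolutional aggregator, and the authors' exact encoder and aggregator designs may differ.

```python
# Illustrative sketch of a fully convolutional CPC pipeline for sensor windows.
# All dimensions and depths are placeholders, not the paper's reported setup.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvEncoder(nn.Module):
    """Maps raw sensor frames (B, C_in, T) to latent frames z of shape (B, D, T')."""
    def __init__(self, in_channels=3, dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, dim, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
            nn.Conv1d(dim, dim, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)


class CausalConvAggregator(nn.Module):
    """Causal convolutions summarize z_{<=t} into a context c_t, replacing the
    recurrent aggregator of standard CPC with a fully convolutional one."""
    def __init__(self, dim=128, kernel_size=3, layers=2):
        super().__init__()
        self.pad = kernel_size - 1
        self.convs = nn.ModuleList(
            [nn.Conv1d(dim, dim, kernel_size) for _ in range(layers)]
        )

    def forward(self, z):
        c = z
        for conv in self.convs:
            # left-pad so c_t depends only on current and past latent frames
            c = F.relu(conv(F.pad(c, (self.pad, 0))))
        return c


class CPC(nn.Module):
    def __init__(self, in_channels=3, dim=128, num_future_steps=4):
        super().__init__()
        self.encoder = ConvEncoder(in_channels, dim)
        self.aggregator = CausalConvAggregator(dim)
        # one linear projection per predicted future step
        self.predictors = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(num_future_steps)]
        )
        self.k = num_future_steps

    def info_nce(self, x):
        z = self.encoder(x)              # (B, D, T)
        c = self.aggregator(z)           # (B, D, T)
        z = z.permute(0, 2, 1)           # (B, T, D)
        c = c.permute(0, 2, 1)           # (B, T, D)
        B, T, _ = z.shape
        t = T // 2                       # reference timestep (fixed here for brevity)
        loss = 0.0
        for k, predictor in enumerate(self.predictors, start=1):
            pred = predictor(c[:, t])    # (B, D) prediction of z_{t+k}
            targets = z[:, t + k]        # (B, D) true future latents
            # scores[i, j]: similarity of prediction i with target j; the matching
            # sample in the batch is the positive, all others act as negatives
            scores = pred @ targets.t()  # (B, B)
            labels = torch.arange(B, device=x.device)
            loss = loss + F.cross_entropy(scores, labels)
        return loss / self.k


# usage: a batch of 8 windows, 3 sensor channels, 100 samples each
model = CPC(in_channels=3, dim=128, num_future_steps=4)
x = torch.randn(8, 3, 100)
loss = model.info_nce(x)
loss.backward()
```

The point of the convolutional aggregator is that context vectors for all timesteps come from parallel convolutions rather than a sequential recurrence, which is where the parallelizability gain mentioned in the abstract comes from.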
Related papers
- Interactive Counterfactual Generation for Univariate Time Series [7.331969743532515]
Our approach aims to enhance the transparency and understanding of deep learning models' decision processes.
By abstracting user interactions with the projected data points, our method facilitates an intuitive generation of counterfactual explanations.
We validate this method using the ECG5000 benchmark dataset, demonstrating significant improvements in interpretability and user understanding of time series classification.
arXiv Detail & Related papers (2024-08-20T08:19:55Z)
- CUDC: A Curiosity-Driven Unsupervised Data Collection Method with Adaptive Temporal Distances for Offline Reinforcement Learning [62.58375643251612]
We propose a Curiosity-driven Unsupervised Data Collection (CUDC) method that expands the feature space using adaptive temporal distances for task-agnostic data collection.
With this adaptive reachability mechanism in place, the feature representation can be diversified, and the agent can guide itself, driven by curiosity, to collect higher-quality data.
Empirically, CUDC surpasses existing unsupervised methods in efficiency and learning performance on various downstream offline RL tasks from the DeepMind Control Suite.
arXiv Detail & Related papers (2023-12-19T14:26:23Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- Multi-dataset Training of Transformers for Robust Action Recognition [75.5695991766902]
We study the task of learning robust feature representations that generalize well across multiple datasets for action recognition.
Here, we propose MultiTrain, a novel multi-dataset training paradigm with two new loss terms, namely an informative loss and a projection loss.
We verify the effectiveness of our method on five challenging datasets: Kinetics-400, Kinetics-700, Moments-in-Time, ActivityNet, and Something-Something-v2.
arXiv Detail & Related papers (2022-09-26T01:30:43Z)
- Contrastive Predictive Coding for Human Activity Recognition [5.766384728949437]
We introduce the Contrastive Predictive Coding framework to human activity recognition, which captures the long-term temporal structure of sensor data streams.
CPC-based pre-training is self-supervised, and the resulting learned representations can be integrated into standard activity recognition chains.
It leads to significantly improved recognition performance when only small amounts of labeled training data are available.
arXiv Detail & Related papers (2020-12-09T21:44:36Z)
- Online Descriptor Enhancement via Self-Labelling Triplets for Visual Data Association [28.03285334702022]
We propose a self-supervised method for incrementally refining visual descriptors to improve performance in the task of object-level visual data association.
Our method optimizes deep descriptor generators online by continuously training a widely available image classification network pre-trained with domain-independent data.
We show that our approach surpasses other visual data-association methods applied to a tracking-by-detection task, and that it provides larger performance gains than other methods that attempt to adapt to observed information.
arXiv Detail & Related papers (2020-11-06T17:42:04Z)
- Representation Learning for Sequence Data with Deep Autoencoding Predictive Components [96.42805872177067]
We propose a self-supervised representation learning method for sequence data, based on the intuition that useful representations of sequence data should exhibit a simple structure in the latent space.
We encourage this latent structure by maximizing an estimate of the predictive information of latent feature sequences, which is the mutual information between past and future windows at each time step (see the equation sketch after this list).
We demonstrate that our method recovers the latent space of noisy dynamical systems, extracts predictive features for forecasting tasks, and improves automatic speech recognition when used to pretrain the encoder on large amounts of unlabeled data.
arXiv Detail & Related papers (2020-10-07T03:34:01Z)
- Provably Efficient Causal Reinforcement Learning with Confounded Observational Data [135.64775986546505]
We study how to incorporate observational data collected offline, which is often abundantly available in practice, to improve sample efficiency in the online setting.
We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
arXiv Detail & Related papers (2020-06-22T14:49:33Z)
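
As a side note on the Deep Autoencoding Predictive Components entry above: the "predictive information" it maximizes can be written, in rough form, as the mutual information between a past and a future window of the latent sequence at each time step. The notation below (latents z_1..z_T, window length k) is chosen here for illustration; the estimator used in that paper differs in its details.

```latex
% Predictive information of a latent sequence z_{1:T}: the average mutual
% information I(.;.) between a length-k past window and a length-k future
% window. Illustrative notation only, not taken from the cited paper.
I_{\mathrm{pred}}(z_{1:T}) \;=\; \frac{1}{T - 2k + 1}
    \sum_{t=k}^{T-k} I\!\left( z_{t-k+1:t} \;;\; z_{t+1:t+k} \right)
```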
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.