Representation Learning for Wearable-Based Applications in the Case of
Missing Data
- URL: http://arxiv.org/abs/2401.05437v2
- Date: Fri, 12 Jan 2024 11:14:58 GMT
- Title: Representation Learning for Wearable-Based Applications in the Case of
Missing Data
- Authors: Janosch Jungo, Yutong Xiang, Shkurta Gashi, Christian Holz
- Abstract summary: Modeling multimodal sensor data in real-world environments is still challenging due to low data quality and limited data annotations.
We investigate representation learning for imputing missing wearable data and compare it with state-of-the-art statistical approaches.
Our study provides insights for the design and development of masking-based self-supervised learning tasks.
- Score: 20.37256375888501
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Wearable devices continuously collect sensor data and use it to infer an
individual's behavior, such as sleep, physical activity, and emotions. Despite
the significant interest and advancements in this field, modeling multimodal
sensor data in real-world environments is still challenging due to low data
quality and limited data annotations. In this work, we investigate
representation learning for imputing missing wearable data and compare it with
state-of-the-art statistical approaches. We investigate the performance of the
transformer model on 10 physiological and behavioral signals with different
masking ratios. Our results show that transformers outperform baselines for
missing data imputation of signals that change more frequently, but not for
monotonic signals. We further investigate the impact of imputation strategies
and masking ratios on downstream classification tasks. Our study provides
insights for the design and development of masking-based self-supervised
learning tasks and advocates the adoption of hybrid-based imputation strategies
to address the challenge of missing data in wearable devices.
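The masking-and-imputation evaluation described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the two synthetic signals, the helper names (`mask_signal`, `impute_mean`, `impute_linear`, `masked_mae`), and the choice of mean filling and linear interpolation as simple stand-in baselines. It does not reproduce the paper's transformer, its statistical baselines, or its data; it only shows how a signal can be masked at a given ratio and how an imputation can be scored on the masked positions alone.

```python
import numpy as np


def mask_signal(x, ratio, rng):
    """Hide a random fraction `ratio` of samples (hypothetical helper).

    Returns a float copy with NaN at masked positions and the boolean mask.
    The first and last samples stay observed so interpolation is bracketed.
    """
    n_masked = int(round(ratio * x.size))
    idx = rng.choice(np.arange(1, x.size - 1), size=n_masked, replace=False)
    mask = np.zeros(x.size, dtype=bool)
    mask[idx] = True
    x_masked = x.astype(float).copy()
    x_masked[mask] = np.nan
    return x_masked, mask


def impute_mean(x_masked):
    """Fill every gap with the mean of the observed samples."""
    out = x_masked.copy()
    out[np.isnan(out)] = np.nanmean(x_masked)
    return out


def impute_linear(x_masked):
    """Fill gaps by linear interpolation between observed neighbors."""
    out = x_masked.copy()
    t = np.arange(out.size)
    missing = np.isnan(out)
    out[missing] = np.interp(t[missing], t[~missing], out[~missing])
    return out


def masked_mae(x_true, x_imputed, mask):
    """Mean absolute error computed on the masked positions only."""
    return float(np.mean(np.abs(x_true[mask] - x_imputed[mask])))


rng = np.random.default_rng(0)
t = np.arange(500)

# Synthetic stand-ins (assumptions, not the paper's signals).
signals = {
    "monotonic": np.cumsum(rng.integers(0, 5, size=t.size)).astype(float),
    "fast-changing": 70 + 10 * np.sin(t / 3.0) + rng.normal(0, 3, size=t.size),
}

results = {}
for name, x in signals.items():
    for ratio in (0.1, 0.3, 0.5):
        x_masked, mask = mask_signal(x, ratio, rng)
        results[(name, ratio)] = {
            "mean": masked_mae(x, impute_mean(x_masked), mask),
            "linear": masked_mae(x, impute_linear(x_masked), mask),
        }

for (name, ratio), errs in results.items():
    print(f"{name:14s} ratio={ratio:.1f} "
          f"mean-fill MAE={errs['mean']:.2f} linear MAE={errs['linear']:.2f}")
```

A learned imputer could be scored with the same `masked_mae` call by substituting its reconstruction for the baseline output, which keeps the comparison across masking ratios like for like.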
Related papers
- Heterogeneous quantization regularizes spiking neural network activity [0.0]
We present a data-blind neuromorphic signal conditioning strategy whereby analog data are normalized and quantized into spike phase representations.
We extend this mechanism by adding a data-aware calibration step whereby the range and density of the quantization weights adapt to accumulated input statistics.
arXiv Detail & Related papers (2024-09-27T02:25:44Z)
- Machine Learning Techniques for Sensor-based Human Activity Recognition with Data Heterogeneity -- A Review [0.8142555609235358]
Sensor-based Human Activity Recognition (HAR) is crucial in ubiquitous computing.
HAR confronts challenges, particularly in data distribution assumptions.
This review investigates how machine learning addresses data heterogeneity in HAR.
arXiv Detail & Related papers (2024-03-12T22:22:14Z)
- Amplifying Pathological Detection in EEG Signaling Pathways through
Cross-Dataset Transfer Learning [10.212217551908525]
We study the effectiveness of data and model scaling and cross-dataset knowledge transfer in a real-world pathology classification task.
We identify the challenges of possible negative transfer and emphasize the significance of some key components.
Our findings indicate that a small and generic model (e.g. ShallowNet) performs well on a single dataset; however, a larger model (e.g. TCN) performs better on transfer and when learning from a larger and more diverse dataset.
arXiv Detail & Related papers (2023-09-19T20:09:15Z)
- Convolutional Monge Mapping Normalization for learning on sleep data [63.22081662149488]
We propose a new method called Convolutional Monge Mapping Normalization (CMMN).
CMMN consists of filtering the signals in order to adapt their power spectral density (PSD) to a Wasserstein barycenter estimated on training data.
Numerical experiments on sleep EEG data show that CMMN leads to significant and consistent performance gains independent of the neural network architecture.
arXiv Detail & Related papers (2023-05-30T08:24:01Z)
- Graph Neural Networks with Trainable Adjacency Matrices for Fault
Diagnosis on Multivariate Sensor Data [69.25738064847175]
It is necessary to consider the behavior of the signals in each sensor separately, to take into account their correlation and hidden relationships with each other.
The graph nodes can be represented as data from the different sensors, and the edges can display the influence of these data on each other.
We propose constructing the graph during the training of the graph neural network. This allows models to be trained on data where the dependencies between the sensors are not known in advance.
arXiv Detail & Related papers (2022-10-20T11:03:21Z)
- DynImp: Dynamic Imputation for Wearable Sensing Data Through Sensory and
Temporal Relatedness [78.98998551326812]
We argue that traditional methods have rarely made use of both the time-series dynamics of the data and the relatedness of the features from different sensors.
We propose a model, termed DynImp, to handle missingness at different time points with nearest neighbors along the feature axis.
We show that the method can exploit the multi-modality features from related sensors and also learn from history time-series dynamics to reconstruct the data under extreme missingness.
arXiv Detail & Related papers (2022-09-26T21:59:14Z)
- Convolutional generative adversarial imputation networks for
spatio-temporal missing data in storm surge simulations [86.5302150777089]
Generative Adversarial Imputation Nets (GAIN) and GAN-based techniques have attracted attention as unsupervised machine learning methods.
We name our proposed method Convolutional Generative Adversarial Imputation Nets (Conv-GAIN).
arXiv Detail & Related papers (2021-11-03T03:50:48Z)
- Improved Speech Emotion Recognition using Transfer Learning and
Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation.
arXiv Detail & Related papers (2021-08-05T10:39:39Z)
- ReLearn: A Robust Machine Learning Framework in Presence of Missing Data
for Multimodal Stress Detection from Physiological Signals [5.042598205771715]
We propose ReLearn, a robust machine learning framework for stress detection from biomarkers extracted from multimodal physiological signals.
ReLearn effectively copes with missing data and outliers both at training and inference phases.
Our experiments show that the proposed framework obtains a cross-validation accuracy of 86.8% even if more than 50% of samples within the features are missing.
arXiv Detail & Related papers (2021-04-29T11:53:01Z)
- Description of Structural Biases and Associated Data in Sensor-Rich
Environments [6.548580592686077]
We study activity recognition in the context of sensor-rich environments.
We address the problem of inductive biases and their impact on the data collection process.
We propose a metamodeling process in which the sensor data is structured in layers.
arXiv Detail & Related papers (2021-04-11T00:26:59Z)
- Negative Data Augmentation [127.28042046152954]
We show that negative data augmentation samples provide information on the support of the data distribution.
We introduce a new GAN training objective where we use NDA as an additional source of synthetic data for the discriminator.
Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities.
arXiv Detail & Related papers (2021-02-09T20:28:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.