Unsupervised Multi-Modal Representation Learning for Affective Computing
with Multi-Corpus Wearable Data
- URL: http://arxiv.org/abs/2008.10726v1
- Date: Mon, 24 Aug 2020 22:01:55 GMT
- Title: Unsupervised Multi-Modal Representation Learning for Affective Computing
with Multi-Corpus Wearable Data
- Authors: Kyle Ross, Paul Hungler, Ali Etemad
- Abstract summary: We propose an unsupervised framework to reduce the reliance on human supervision.
The proposed framework utilizes two stacked convolutional autoencoders to learn latent representations from wearable electrocardiogram (ECG) and electrodermal activity (EDA) signals.
Our method outperforms current state-of-the-art approaches for arousal detection on the same datasets.
- Score: 16.457778420360537
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With recent developments in smart technologies, there has been a growing
focus on the use of artificial intelligence and machine learning for affective
computing to further enhance the user experience through emotion recognition.
Typically, machine learning models used for affective computing are trained
using manually extracted features from biological signals. Such features may
not generalize well for large datasets and may be sub-optimal in capturing the
information from the raw input data. One approach to address this issue is to
use fully supervised deep learning methods to learn latent representations of
the biosignals. However, this method requires human supervision to label the
data, which may be unavailable or difficult to obtain. In this work, we propose
an unsupervised framework to reduce the reliance on human supervision. The
proposed framework utilizes two stacked convolutional autoencoders to learn
latent representations from wearable electrocardiogram (ECG) and electrodermal
activity (EDA) signals. These representations are utilized within a random
forest model for binary arousal classification. This approach reduces human
supervision and enables the aggregation of datasets, allowing for higher
generalizability. To validate this framework, an aggregated dataset composed
of the AMIGOS, ASCERTAIN, CLEAS, and MAHNOB-HCI datasets is created. The
results of our proposed method are compared with those of convolutional neural
networks, as well as methods that employ manual extraction of hand-crafted
features. The methodology used for fusing the two modalities is also
investigated. Lastly, we show that our method outperforms current
state-of-the-art approaches for arousal detection on the same datasets using
ECG and EDA biosignals. The results demonstrate the broad applicability of
stacked convolutional autoencoders, combined with machine learning, for
affective computing.
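To make the pipeline concrete, below is a minimal sketch of the kind of framework the abstract describes, written with PyTorch and scikit-learn: one stacked 1-D convolutional autoencoder per modality is trained to reconstruct raw signal windows (no labels required), the ECG and EDA latent vectors are fused by concatenation, and a random forest performs the binary arousal classification. All layer sizes, the window length, and the hyperparameters are illustrative assumptions, not the authors' reported configuration; the demo arrays standing in for real signal windows are hypothetical.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import RandomForestClassifier


class ConvAutoencoder(nn.Module):
    """Stacked 1-D convolutional autoencoder for one biosignal modality."""

    def __init__(self, in_channels=1, latent_dim=64, seq_len=1280):
        super().__init__()
        # Each stride-2 conv halves the temporal length (seq_len -> seq_len // 4).
        self.encoder = nn.Sequential(
            nn.Conv1d(in_channels, 16, kernel_size=8, stride=2, padding=3),
            nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=8, stride=2, padding=3),
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * (seq_len // 4), latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32 * (seq_len // 4)),
            nn.Unflatten(1, (32, seq_len // 4)),
            nn.ConvTranspose1d(32, 16, kernel_size=8, stride=2, padding=3),
            nn.ReLU(),
            nn.ConvTranspose1d(16, in_channels, kernel_size=8, stride=2, padding=3),
        )

    def forward(self, x):
        z = self.encoder(x)        # latent representation
        return self.decoder(z), z  # reconstruction and latent


def train_autoencoder(model, loader, epochs=10, lr=1e-3):
    """Unsupervised training: reconstruct raw signal windows, no labels needed."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x in loader:  # x: (batch, 1, seq_len)
            recon, _ = model(x)
            loss = loss_fn(recon, x)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model


@torch.no_grad()
def extract_latents(model, x):
    """Encode windows into latent features for the downstream classifier."""
    model.eval()
    _, z = model(x)
    return z.cpu().numpy()


# Demo with random tensors standing in for real ECG/EDA windows (hypothetical).
# In practice each autoencoder would first be trained on the aggregated
# unlabeled corpus, e.g. train_autoencoder(ecg_ae, ecg_loader).
ecg_windows = torch.randn(32, 1, 1280)
eda_windows = torch.randn(32, 1, 1280)
arousal_labels = np.random.randint(0, 2, size=32)

ecg_ae = ConvAutoencoder()
eda_ae = ConvAutoencoder()

# Feature-level fusion: concatenate the two latent vectors, then classify
# binary arousal with a random forest on the fused representation.
features = np.concatenate(
    [extract_latents(ecg_ae, ecg_windows),
     extract_latents(eda_ae, eda_windows)], axis=1)
clf = RandomForestClassifier(n_estimators=100).fit(features, arousal_labels)
```

Concatenation is only one plausible fusion strategy; since the abstract notes that the fusion methodology is itself investigated, a decision-level alternative (one classifier per modality with combined outputs) would fit the same setup.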
Related papers
- Joint-Embedding Masked Autoencoder for Self-supervised Learning of
Dynamic Functional Connectivity from the Human Brain [18.165807360855435]
Graph Neural Networks (GNNs) have shown promise in learning dynamic functional connectivity for distinguishing phenotypes from human brain networks.
We introduce the Spatio-Temporal Joint Embedding Masked Autoencoder (ST-JEMA), drawing inspiration from the Joint Embedding Predictive Architecture (JEPA) in computer vision.
arXiv Detail & Related papers (2024-03-11T04:49:41Z) - Reinforcement Learning Based Multi-modal Feature Fusion Network for
Novel Class Discovery [47.28191501836041]
In this paper, we employ a Reinforcement Learning framework to simulate the cognitive processes of humans.
We also deploy a Member-to-Leader Multi-Agent framework to extract and fuse features from multi-modal information.
We demonstrate the performance of our approach in both the 3D and 2D domains by employing the OS-MN40, OS-MN40-Miss, and CIFAR-10 datasets.
arXiv Detail & Related papers (2023-08-26T07:55:32Z) - Defect Classification in Additive Manufacturing Using CNN-Based Vision
Processing [76.72662577101988]
This paper examines two scenarios: first, using convolutional neural networks (CNNs) to accurately classify defects in an image dataset from AM and second, applying active learning techniques to the developed classification model.
This allows the construction of a human-in-the-loop mechanism that reduces the amount of data required for training and helps generate training data.
arXiv Detail & Related papers (2023-07-14T14:36:58Z) - Application of federated learning techniques for arrhythmia
classification using 12-lead ECG signals [0.11184789007828977]
This work uses a Federated Learning (FL) privacy-preserving methodology to train AI models over heterogeneous sets of high-definition ECG signals.
We demonstrate comparable performance to models trained using CL, IID, and non-IID approaches.
arXiv Detail & Related papers (2022-08-23T14:21:16Z) - Decision Forest Based EMG Signal Classification with Low Volume Dataset
Augmented with Random Variance Gaussian Noise [51.76329821186873]
We produce a model that can classify six different hand gestures with a limited number of samples that generalizes well to a wider audience.
We appeal to a set of more elementary methods, such as the use of random bounds on a signal, and aim to show the power these methods can carry in an online setting.
arXiv Detail & Related papers (2022-06-29T23:22:18Z) - Neurosymbolic hybrid approach to driver collision warning [64.02492460600905]
There are two main algorithmic approaches to autonomous driving systems.
Deep learning alone has achieved state-of-the-art results in many areas.
But it can be very difficult to debug a deep learning model when it does not work.
arXiv Detail & Related papers (2022-03-28T20:29:50Z) - Addressing Data Scarcity in Multimodal User State Recognition by
Combining Semi-Supervised and Supervised Learning [1.1688030627514532]
We present a multimodal machine learning approach for detecting dis-/agreement and confusion states in a human-robot interaction environment.
We achieve an average F1-score of 81.1% for dis-/agreement detection with a small amount of labeled data and a large unlabeled data set.
arXiv Detail & Related papers (2022-02-08T10:41:41Z) - Clustering augmented Self-Supervised Learning: Anapplication to Land
Cover Mapping [10.720852987343896]
We introduce a new method for land cover mapping by using a clustering based pretext task for self-supervised learning.
We demonstrate the effectiveness of the method on two societally relevant applications.
arXiv Detail & Related papers (2021-08-16T19:35:43Z) - Improved Speech Emotion Recognition using Transfer Learning and
Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation.
arXiv Detail & Related papers (2021-08-05T10:39:39Z) - Uncovering the structure of clinical EEG signals with self-supervised
learning [64.4754948595556]
Supervised learning paradigms are often limited by the amount of labeled data that is available.
This phenomenon is particularly problematic in clinically relevant data, such as electroencephalography (EEG).
By extracting information from unlabeled data, it might be possible to reach competitive performance with deep neural networks.
arXiv Detail & Related papers (2020-07-31T14:34:47Z)