CNN Autoencoders for Hierarchical Feature Extraction and Fusion in Multi-sensor Human Activity Recognition
- URL: http://arxiv.org/abs/2502.04489v1
- Date: Thu, 06 Feb 2025 20:36:41 GMT
- Title: CNN Autoencoders for Hierarchical Feature Extraction and Fusion in Multi-sensor Human Activity Recognition
- Authors: Saeed Arabzadeh, Farshad Almasganj, Mohammad Mahdi Ahmadi
- Abstract summary: We introduce a Hierarchically Unsupervised Fusion (HUF) model designed to extract and fuse features from IMU sensor data.
The tuned model is applied to the UCI-HAR, DaLiAc, and Parkinson's disease gait datasets.
- Score: 0.0
- Abstract: Deep learning methods have been widely used for Human Activity Recognition (HAR) using signals recorded from Inertial Measurement Unit (IMU) sensors installed on various parts of the human body. For this type of HAR, several challenges exist, the most significant of which is the analysis of multivariate data from multiple IMU sensors. Here, we introduce a Hierarchically Unsupervised Fusion (HUF) model designed to extract and fuse features from IMU sensor data via a hybrid structure of Convolutional Neural Networks (CNNs) and Autoencoders (AEs). First, we design a stacked CNN-AE to embed short-time signals into sets of high-dimensional features. Second, we develop another CNN-AE network to locally fuse the extracted features from each sensor unit. Finally, we unify all the sensor features through a third CNN-AE architecture, acting as a global feature-fusion stage, to create a unique feature set. Additionally, we analyze the effects of varying the model hyperparameters. The best results are achieved with eight convolutional layers in each AE. Furthermore, an overcomplete AE with 256 kernels in the code layer proves suitable for feature extraction in the first block of the proposed HUF model; this number is reduced to 64 in the last block to match the size of the applied features to the classifier. The tuned model is applied to the UCI-HAR, DaLiAc, and Parkinson's disease gait datasets, achieving classification accuracies of 97%, 97%, and 88%, respectively, nearly 3% better than state-of-the-art supervised methods.
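Read literally, the abstract describes three stacked CNN-AE blocks whose code layers shrink from 256 to 64 kernels. The PyTorch snippet below is a minimal sketch of that layout, not the authors' implementation: the five 6-channel IMU units, 128-sample windows, and per-block layer counts are illustrative assumptions, and only the 256-kernel first-block code and 64-kernel final code come from the abstract. In the paper each AE would be trained unsupervised with a reconstruction loss before its code feeds the next block.

```python
# Minimal sketch of the three-block HUF idea (illustrative, not the authors' code).
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    """Small 1D convolutional autoencoder; the code layer is the feature map."""
    def __init__(self, in_ch, code_ch):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(in_ch, code_ch // 2, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(code_ch // 2, code_ch, kernel_size=5, padding=2), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.Conv1d(code_ch, code_ch // 2, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(code_ch // 2, in_ch, kernel_size=5, padding=2))

    def forward(self, x):
        code = self.encoder(x)
        return code, self.decoder(code)   # reconstruction drives unsupervised training

class HUF(nn.Module):
    """Block 1: per-unit extraction into an overcomplete 256-channel code.
    Block 2: local fusion within each sensor unit.
    Block 3: global fusion across units down to a 64-channel code."""
    def __init__(self, n_units=5, ch_per_unit=6):
        super().__init__()
        self.block1 = nn.ModuleList(ConvAE(ch_per_unit, 256) for _ in range(n_units))
        self.block2 = nn.ModuleList(ConvAE(256, 128) for _ in range(n_units))
        self.block3 = ConvAE(128 * n_units, 64)

    def forward(self, xs):                       # xs: one (batch, ch, T) tensor per unit
        local = [b2(b1(x)[0])[0] for x, b1, b2 in zip(xs, self.block1, self.block2)]
        fused, _ = self.block3(torch.cat(local, dim=1))
        return fused                             # (batch, 64, T) features for a classifier

xs = [torch.randn(8, 6, 128) for _ in range(5)]  # 5 IMU units, 6 channels, 128-sample windows
print(HUF()(xs).shape)                           # torch.Size([8, 64, 128])
```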
Related papers
- Decomposing and Fusing Intra- and Inter-Sensor Spatio-Temporal Signal for Multi-Sensor Wearable Human Activity Recognition [12.359681612030682]
We propose the DecomposeWHAR model to better capture the relationships between modality variables.
The decomposition creates high-dimensional representations of each intra-sensor variable.
The fusion phase begins by capturing relationships between intra-sensor variables and fusing their features at both the channel and variable levels.
arXiv Detail & Related papers (2025-01-19T01:52:28Z)
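A rough sketch of the decompose-then-fuse idea in the entry above, assuming per-variable 1D convolutional embeddings, a 1x1 convolution for channel-level mixing, and attention across variables for variable-level fusion; the names, shapes, and the attention choice are illustrative, not DecomposeWHAR's actual design.

```python
# Illustrative decompose-then-fuse layout (not the paper's code).
import torch
import torch.nn as nn

class DecomposeFuse(nn.Module):
    def __init__(self, n_vars, d=64):
        super().__init__()
        # Decomposition: one conv stack per scalar variable -> high-dim representation.
        self.embed = nn.ModuleList(
            nn.Sequential(nn.Conv1d(1, d, 5, padding=2), nn.ReLU(),
                          nn.Conv1d(d, d, 5, padding=2), nn.ReLU())
            for _ in range(n_vars))
        # Channel-level fusion: mix the d feature channels within each variable.
        self.channel_mix = nn.Conv1d(d, d, kernel_size=1)
        # Variable-level fusion: attention over the variable tokens.
        self.var_attn = nn.MultiheadAttention(embed_dim=d, num_heads=4, batch_first=True)

    def forward(self, x):                            # x: (batch, n_vars, T)
        feats = [m(x[:, i:i + 1]) for i, m in enumerate(self.embed)]
        feats = [self.channel_mix(f).mean(dim=-1) for f in feats]  # (batch, d) each
        tokens = torch.stack(feats, dim=1)           # (batch, n_vars, d)
        fused, _ = self.var_attn(tokens, tokens, tokens)
        return fused.mean(dim=1)                     # (batch, d) fused representation

print(DecomposeFuse(n_vars=9)(torch.randn(4, 9, 128)).shape)  # torch.Size([4, 64])
```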
- Multi-Sensor Fusion for UAV Classification Based on Feature Maps of Image and Radar Data [4.392337343771302]
We propose a system that fuses already processed multi-sensor data into a new Deep Neural Network to increase its classification accuracy for UAV detection.
The model fuses high-level features extracted from individual object detection and classification models associated with thermal, optronic, and radar data.
arXiv Detail & Related papers (2024-10-21T15:12:37Z)
- Multiple-Input Auto-Encoder Guided Feature Selection for IoT Intrusion Detection Systems [30.16714420093091]
This paper first introduces a novel neural network architecture called the Multiple-Input Auto-Encoder (MIAE).
MIAE consists of multiple sub-encoders that can process inputs from different sources with different characteristics.
To distil and retain more relevant features but remove less important/redundant ones during the training process, we further design and embed a feature selection layer.
This layer learns the importance of features in the representation vector, facilitating the selection of informative features from the representation vector.
arXiv Detail & Related papers (2024-03-22T03:54:04Z)
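The MIAE summary above suggests per-source sub-encoders feeding a joint representation, plus a trainable layer that weights features by importance. A hedged PyTorch sketch of that idea, with made-up dimensions and a sigmoid gate standing in for the feature-selection layer:

```python
# Hedged sketch of a multiple-input auto-encoder with a feature-selection layer
# (illustrative; MIAE's exact architecture, sizes, and loss may differ).
import torch
import torch.nn as nn

class MIAESketch(nn.Module):
    def __init__(self, input_dims, code_dim=32):
        super().__init__()
        # One sub-encoder per data source, each sized to its own input.
        self.encoders = nn.ModuleList(
            nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, code_dim))
            for d in input_dims)
        joint = code_dim * len(input_dims)
        # Feature-selection layer: a learnable gate over the joint representation;
        # a sparsity penalty pushes the gate to drop redundant features.
        self.gate = nn.Parameter(torch.ones(joint))
        self.decoder = nn.Sequential(nn.Linear(joint, 64), nn.ReLU(),
                                     nn.Linear(64, sum(input_dims)))

    def forward(self, xs):                         # xs: list of (batch, d_i) tensors
        z = torch.cat([enc(x) for enc, x in zip(self.encoders, xs)], dim=1)
        z = z * torch.sigmoid(self.gate)           # soft feature selection
        return z, self.decoder(z)

dims = (20, 36, 8)                                 # three heterogeneous sources (made up)
model = MIAESketch(list(dims))
xs = [torch.randn(16, d) for d in dims]
z, recon = model(xs)
loss = nn.functional.mse_loss(recon, torch.cat(xs, dim=1)) \
       + 1e-3 * torch.sigmoid(model.gate).sum()    # reconstruction + gate sparsity
```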
- A Novel Two Stream Decision Level Fusion of Vision and Inertial Sensors Data for Automatic Multimodal Human Activity Recognition System [2.5214116139219787]
This paper presents a novel multimodal human activity recognition system.
It uses a two-stream decision level fusion of vision and inertial sensors.
The accuracies obtained by the proposed system are 96.9%, 97.6%, 98.7%, and 95.9%, respectively.
arXiv Detail & Related papers (2023-06-27T19:29:35Z)
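Decision-level fusion, as used in the entry above, combines the two streams only at the level of class scores rather than features. A minimal sketch, with stand-in linear backbones and an assumed fixed fusion weight:

```python
# Minimal sketch of two-stream decision-level fusion; the linear "backbones"
# and the fixed fusion weight are stand-ins, not the paper's actual streams.
import torch
import torch.nn as nn

class DecisionFusion(nn.Module):
    def __init__(self, vision_net, inertial_net, w_vision=0.5):
        super().__init__()
        self.vision_net, self.inertial_net = vision_net, inertial_net
        self.w = w_vision                           # tunable on validation data

    def forward(self, frames, imu):
        p_v = torch.softmax(self.vision_net(frames), dim=1)  # class scores, stream 1
        p_i = torch.softmax(self.inertial_net(imu), dim=1)   # class scores, stream 2
        return self.w * p_v + (1.0 - self.w) * p_i           # fused decision

# Stand-ins; real streams would be full vision and inertial CNNs.
fusion = DecisionFusion(nn.Linear(512, 8), nn.Linear(128, 8))
scores = fusion(torch.randn(4, 512), torch.randn(4, 128))
pred = scores.argmax(dim=1)                         # fused activity prediction
```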
- Differentiable Frequency-based Disentanglement for Aerial Video Action Recognition [56.91538445510214]
We present a learning algorithm for human activity recognition in videos.
Our approach is designed for UAV videos, which are mainly acquired from obliquely placed dynamic cameras.
We conduct extensive experiments on the UAV Human dataset and the NEC Drone dataset.
arXiv Detail & Related papers (2022-09-15T22:16:52Z)
- DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection [83.18142309597984]
Lidars and cameras are critical sensors that provide complementary information for 3D detection in autonomous driving.
We develop a family of generic multi-modal 3D detection models named DeepFusion, which is more accurate than previous methods.
arXiv Detail & Related papers (2022-03-15T18:46:06Z)
- EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation [62.210091681352914]
We study multi-sensor fusion for 3D semantic segmentation for many applications, such as autonomous driving and robotics.
In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF).
We propose a two-stream network to extract features from the two modalities separately. The extracted features are fused by effective residual-based fusion modules.
arXiv Detail & Related papers (2021-06-21T10:47:26Z)
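One plausible reading of the "residual-based fusion modules" above is a block that injects the second modality as a learned correction added back onto the first stream. A sketch under that assumption, with illustrative channel sizes:

```python
# Sketch of a residual-based fusion module in the spirit of PMF (illustrative;
# the paper fuses camera and projected-LiDAR feature maps inside a two-stream net).
import torch
import torch.nn as nn

class ResidualFusion(nn.Module):
    """Fuse a second modality into the first as a learned residual correction."""
    def __init__(self, ch_a, ch_b):
        super().__init__()
        self.mix = nn.Sequential(
            nn.Conv2d(ch_a + ch_b, ch_a, kernel_size=3, padding=1),
            nn.BatchNorm2d(ch_a), nn.ReLU(),
            nn.Conv2d(ch_a, ch_a, kernel_size=3, padding=1))

    def forward(self, feat_a, feat_b):               # same spatial size, two modalities
        residual = self.mix(torch.cat([feat_a, feat_b], dim=1))
        return feat_a + residual                     # residual fusion keeps stream A intact

fused = ResidualFusion(64, 32)(torch.randn(2, 64, 32, 32), torch.randn(2, 32, 32, 32))
print(fused.shape)                                   # torch.Size([2, 64, 32, 32])
```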
- CNN based Multistage Gated Average Fusion (MGAF) for Human Action Recognition Using Depth and Inertial Sensors [1.52292571922932]
Convolutional Neural Networks (CNNs) make it possible to extract and fuse features from all layers of their architecture.
We propose a novel Multistage Gated Average Fusion (MGAF) network which extracts and fuses features from all layers of the CNN.
arXiv Detail & Related papers (2020-10-29T11:49:13Z)
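Gated average fusion, as named above, can be pictured as a learned, per-location convex combination of the two modality feature maps, applied at several CNN stages. A sketch of that reading, with illustrative channel counts:

```python
# Sketch of gated average fusion across CNN stages (illustrative; MGAF's exact
# gating and stage wiring follow the paper, not this snippet).
import torch
import torch.nn as nn

class GatedAverageFusion(nn.Module):
    """Fuse depth and inertial feature maps with a learned gate, per stage."""
    def __init__(self, ch):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(2 * ch, ch, kernel_size=1), nn.Sigmoid())

    def forward(self, f_depth, f_inertial):
        g = self.gate(torch.cat([f_depth, f_inertial], dim=1))  # weight in [0, 1]
        return g * f_depth + (1 - g) * f_inertial               # gated average

# Multistage use: fuse features taken from several CNN layers, then pool/classify.
stage1 = GatedAverageFusion(32)(torch.randn(2, 32, 16, 16), torch.randn(2, 32, 16, 16))
stage2 = GatedAverageFusion(64)(torch.randn(2, 64, 8, 8), torch.randn(2, 64, 8, 8))
```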
- DecAug: Augmenting HOI Detection via Decomposition [54.65572599920679]
Current HOI detection algorithms suffer from insufficient training samples and category imbalance within datasets.
We propose an efficient and effective data augmentation method called DecAug for HOI detection.
Experiments show that our method brings up to 3.3 mAP and 1.6 mAP improvements on the V-COCO and HICO-DET datasets.
arXiv Detail & Related papers (2020-10-02T13:59:05Z)
- Contextual-Bandit Anomaly Detection for IoT Data in Distributed Hierarchical Edge Computing [65.78881372074983]
IoT devices can hardly afford complex deep neural network (DNN) models, and offloading anomaly detection tasks to the cloud incurs a long delay.
We propose and build a demo for an adaptive anomaly detection approach for distributed hierarchical edge computing (HEC) systems.
We show that our proposed approach significantly reduces detection delay without sacrificing accuracy, as compared to offloading detection tasks to the cloud.
arXiv Detail & Related papers (2020-04-15T06:13:33Z)
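The adaptive scheme above amounts to choosing, per input, which tier of the hierarchy runs the detector. A hedged numpy sketch using a LinUCB-style contextual bandit; the context features, rewards, and delay penalties are all simulated, not from the paper:

```python
# LinUCB-style sketch: pick a tier (device / edge / cloud) per input,
# trading detection accuracy against delay. All numbers are made up.
import numpy as np

rng = np.random.default_rng(0)
n_tiers, d = 3, 4                        # arms: device, edge, cloud; d context features
A = [np.eye(d) for _ in range(n_tiers)]  # per-arm design matrices
b = [np.zeros(d) for _ in range(n_tiers)]

def choose(ctx, alpha=1.0):
    scores = []
    for k in range(n_tiers):
        theta = np.linalg.solve(A[k], b[k])                       # ridge estimate
        ucb = theta @ ctx + alpha * np.sqrt(ctx @ np.linalg.solve(A[k], ctx))
        scores.append(ucb)
    return int(np.argmax(scores))

for t in range(1000):
    ctx = rng.normal(size=d)             # e.g. input complexity, queue length, bandwidth
    k = choose(ctx)
    # Simulated reward: accuracy of bigger models minus a delay penalty.
    reward = (0.6, 0.8, 0.9)[k] - (0.05, 0.15, 0.40)[k] + 0.05 * rng.normal()
    A[k] += np.outer(ctx, ctx)           # update the chosen arm's statistics
    b[k] += reward * ctx
```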
- ASFD: Automatic and Scalable Face Detector [129.82350993748258]
We propose a novel Automatic and Scalable Face Detector (ASFD).
ASFD is based on a combination of neural architecture search techniques as well as a new loss design.
Our ASFD-D6 outperforms the prior strong competitors, and our lightweight ASFD-D0 runs at more than 120 FPS with Mobilenet for VGA-resolution images.
arXiv Detail & Related papers (2020-03-25T06:00:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.