Inertial Sensor Data To Image Encoding For Human Action Recognition
- URL: http://arxiv.org/abs/2105.13533v1
- Date: Fri, 28 May 2021 01:22:52 GMT
- Title: Inertial Sensor Data To Image Encoding For Human Action Recognition
- Authors: Zeeshan Ahmad, Naimul Khan
- Abstract summary: Convolutional Neural Networks (CNNs) are successful deep learning models in the field of computer vision.
In this paper, we use 4 types of spatial domain methods for transforming inertial sensor data to activity images.
To create a multimodal fusion framework, we make each type of activity image multimodal by convolving it with two spatial domain filters.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convolutional Neural Networks (CNNs) are successful deep learning models in
the field of computer vision. To take full advantage of a CNN model for
Human Action Recognition (HAR) using inertial sensor data, in this paper, we
use 4 types of spatial domain methods for transforming inertial sensor data to
activity images, which are then utilized in a novel fusion framework. These
four types of activity images are Signal Images (SI), Gramian Angular Field
(GAF) Images, Markov Transition Field (MTF) Images and Recurrence Plot (RP)
Images. Furthermore, to create a multimodal fusion framework and to exploit
the activity images, we make each type of activity image multimodal by convolving
it with two spatial domain filters: the Prewitt filter and the High-boost filter.
ResNet-18, a CNN model, is used to learn deep features from the multiple modalities.
Learned features are extracted from the last pooling layer of each ResNet and
then fused by canonical correlation based fusion (CCF) to improve the
accuracy of human action recognition. These highly informative features
serve as input to a multiclass Support Vector Machine (SVM). Experimental
results on three publicly available inertial datasets show the superiority of
the proposed method over the current state-of-the-art.
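The encodings and filters named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it uses the commonly cited textbook definitions of the Gramian Angular (Summation) Field, Recurrence Plot and Markov Transition Field, and the function names, the threshold `eps`, the bin count `n_bins`, the boost factor `a`, and the exact 3x3 kernels are all assumptions chosen for illustration. The paper's own image sizes, kernels and parameters may differ.

```python
import numpy as np

def rescale(x):
    """Min-max rescale a 1-D signal to [-1, 1] (assumed preprocessing)."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        return np.zeros_like(x)
    return 2.0 * (x - x.min()) / span - 1.0

def gramian_angular_field(x):
    """Summation variant: G[i, j] = cos(phi_i + phi_j), phi = arccos(rescaled x)."""
    phi = np.arccos(np.clip(rescale(x), -1.0, 1.0))
    return np.cos(phi[:, None] + phi[None, :])

def recurrence_plot(x, eps=0.1):
    """Binary recurrence plot: R[i, j] = 1 where |x_i - x_j| <= eps."""
    x = np.asarray(x, dtype=float)
    return (np.abs(x[:, None] - x[None, :]) <= eps).astype(float)

def markov_transition_field(x, n_bins=4):
    """M[i, j] = estimated probability of moving from bin(x_i) to bin(x_j)."""
    x = np.asarray(x, dtype=float)
    # Interior quantile edges split the samples into n_bins bins.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    q = np.digitize(x, edges)  # bin index of each sample, in 0..n_bins-1
    counts = np.zeros((n_bins, n_bins))
    np.add.at(counts, (q[:-1], q[1:]), 1.0)  # count consecutive transitions
    rows = counts.sum(axis=1, keepdims=True)
    probs = np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)
    return probs[q[:, None], q[None, :]]

# Horizontal Prewitt kernel (responds to left-right intensity changes).
PREWITT_X = np.array([[-1.0, 0.0, 1.0],
                      [-1.0, 0.0, 1.0],
                      [-1.0, 0.0, 1.0]])

def high_boost_kernel(a=1.5):
    """High-boost as a * identity - 3x3 mean blur (one common formulation)."""
    kernel = -np.ones((3, 3)) / 9.0
    kernel[1, 1] += a
    return kernel

def filter2d(img, kernel):
    """Same-size 2-D convolution with edge padding."""
    img = np.asarray(img, dtype=float)
    flipped = kernel[::-1, ::-1]  # flip so this is convolution, not correlation
    kh, kw = flipped.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros_like(img)
    h, w = img.shape
    for di in range(kh):
        for dj in range(kw):
            out += flipped[di, dj] * padded[di:di + h, dj:dj + w]
    return out

# Usage: encode one synthetic sensor channel, then derive two extra modalities.
signal = np.sin(np.linspace(0.0, 2.0 * np.pi, 32))
gaf_image = gramian_angular_field(signal)            # shape (32, 32)
gaf_prewitt = filter2d(gaf_image, PREWITT_X)         # edge-enhanced modality
gaf_boost = filter2d(gaf_image, high_boost_kernel()) # sharpened modality
```

In the pipeline the abstract describes, each such image and its two filtered versions would be passed to separate ResNet-18 networks, with the pooled features fused by canonical correlation and classified by an SVM. Those stages are omitted here because the abstract does not give enough detail to sketch them faithfully.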
Related papers
- DA-HFNet: Progressive Fine-Grained Forgery Image Detection and Localization Based on Dual Attention [12.36906630199689]
We construct a DA-HFNet forged image dataset guided by text or image-assisted GAN and Diffusion model.
Our goal is to utilize a hierarchical progressive network to capture forged artifacts at different scales for detection and localization.
arXiv Detail & Related papers (2024-06-03T16:13:33Z)
- Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery [78.43828998065071]
Recent advances in unsupervised learning have demonstrated the ability of large vision models to achieve promising results on downstream tasks.
Such pre-training techniques have also been explored recently in the remote sensing domain due to the availability of large amounts of unlabelled data.
In this paper, we re-visit transformers pre-training and leverage multi-scale information that is effectively utilized with multiple modalities.
arXiv Detail & Related papers (2024-03-08T16:18:04Z)
- Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z)
- Attention Mechanism for Contrastive Learning in GAN-based Image-to-Image Translation [3.90801108629495]
We propose a GAN-based model that is capable of generating high-quality images across different domains.
We leverage Contrastive Learning to train the model in a self-supervised way using image data acquired in the real world using real sensors and simulated images from 3D games.
arXiv Detail & Related papers (2023-02-23T14:23:23Z)
- Semantic Labeling of High Resolution Images Using EfficientUNets and Transformers [5.177947445379688]
We propose a new segmentation model that combines convolutional neural networks with deep transformers.
Our results demonstrate that the proposed methodology improves segmentation accuracy compared to state-of-the-art techniques.
arXiv Detail & Related papers (2022-06-20T12:03:54Z)
- Learning Enriched Features for Fast Image Restoration and Enhancement [166.17296369600774]
This paper presents a holistic goal of maintaining spatially-precise high-resolution representations through the entire network.
We learn an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
Our approach achieves state-of-the-art results for a variety of image processing tasks, including defocus deblurring, image denoising, super-resolution, and image enhancement.
arXiv Detail & Related papers (2022-04-19T17:59:45Z)
- M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection [74.19291916812921]
Forged images generated by Deepfake techniques pose a serious threat to the trustworthiness of digital information.
In this paper, we aim to capture the subtle manipulation artifacts at different scales for Deepfake detection.
We introduce a high-quality Deepfake dataset, SR-DF, which consists of 4,000 DeepFake videos generated by state-of-the-art face swapping and facial reenactment methods.
arXiv Detail & Related papers (2021-04-20T05:43:44Z)
- CNN based Multistage Gated Average Fusion (MGAF) for Human Action Recognition Using Depth and Inertial Sensors [1.52292571922932]
Convolutional Neural Network (CNN) provides leverage to extract and fuse features from all layers of its architecture.
We propose a novel Multistage Gated Average Fusion (MGAF) network which extracts and fuses features from all layers of the CNN.
arXiv Detail & Related papers (2020-10-29T11:49:13Z)
- DoFE: Domain-oriented Feature Embedding for Generalizable Fundus Image Segmentation on Unseen Datasets [96.92018649136217]
We present a novel Domain-oriented Feature Embedding (DoFE) framework to improve the generalization ability of CNNs on unseen target domains.
Our DoFE framework dynamically enriches the image features with additional domain prior knowledge learned from multi-source domains.
Our framework generates satisfying segmentation results on unseen datasets and surpasses other domain generalization and network regularization methods.
arXiv Detail & Related papers (2020-10-13T07:28:39Z)
- Towards Improved Human Action Recognition Using Convolutional Neural Networks and Multimodal Fusion of Depth and Inertial Sensor Data [1.52292571922932]
This paper attempts to improve the accuracy of Human Action Recognition (HAR) by fusing depth and inertial sensor data.
We transform the depth data into Sequential Front view Images (SFI) and fine-tune the pre-trained AlexNet on these images.
Inertial data is converted into Signal Images (SI) and another convolutional neural network (CNN) is trained on these images.
arXiv Detail & Related papers (2020-08-22T03:41:34Z)
- Learning Enriched Features for Real Image Restoration and Enhancement [166.17296369600774]
Convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for image restoration tasks.
We present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network.
Our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details.
arXiv Detail & Related papers (2020-03-15T11:04:30Z)