Multidomain Multimodal Fusion For Human Action Recognition Using
Inertial Sensors
- URL: http://arxiv.org/abs/2008.09748v1
- Date: Sat, 22 Aug 2020 03:46:12 GMT
- Title: Multidomain Multimodal Fusion For Human Action Recognition Using
Inertial Sensors
- Authors: Zeeshan Ahmad and Naimul Khan
- Abstract summary: We propose a novel multidomain multimodal fusion framework that extracts complementary and distinct features from different domains of the input modality.
Features in different domains are extracted by Convolutional Neural Networks (CNNs) and then fused by Canonical Correlation based Fusion (CCF) to improve the accuracy of human action recognition.
- Score: 1.52292571922932
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the major reasons for misclassification of multiplex actions during
action recognition is the unavailability of complementary features that provide
semantic information about the actions. These features are present in different
domains with varying scales and intensities. In the existing literature,
features are extracted independently in each domain, but the benefits of
fusing these multidomain features are not realized. To address this
challenge and to extract a complete set of complementary information, in this
paper, we propose a novel multidomain multimodal fusion framework that extracts
complementary and distinct features from different domains of the input
modality. We transform input inertial data into signal images, and then make
the input modality multidomain and multimodal by transforming spatial domain
information into frequency and time-spectrum domain using Discrete Fourier
Transform (DFT) and Gabor Wavelet Transform (GWT), respectively. Features in
different domains are extracted by Convolutional Neural Networks (CNNs) and
then fused by Canonical Correlation based Fusion (CCF) to improve the
accuracy of human action recognition. Experimental results on three inertial
datasets show the superiority of the proposed method when compared to the
state-of-the-art.
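As a rough illustration of the pipeline the abstract describes, the sketch below builds a signal image from multichannel inertial data, derives frequency-domain (DFT magnitude) and time-spectrum (single Gabor filter) representations, and fuses two feature sets with a toy canonical-correlation step. This is not the authors' code: the row-stacking scheme, filter parameters, and the additive CCA fusion are all assumptions, and a real system would feed each domain to a CNN before fusion.

```python
import numpy as np

def signal_image(signals, rows=24):
    """Stack inertial channels row-wise into a 2-D 'signal image'.
    Plain cyclic repetition of channels is an assumption; signal-image
    papers typically use a specific row-permutation scheme."""
    n_ch, _ = signals.shape
    return np.vstack([signals[i % n_ch] for i in range(rows)])

def frequency_domain(img):
    """Frequency-domain representation: log-magnitude of the 2-D DFT."""
    return np.log1p(np.abs(np.fft.fft2(img)))

def gabor_kernel(ksize=9, sigma=2.0, theta=0.0, lam=4.0):
    """One real Gabor kernel (single orientation/scale; a full GWT
    would apply a bank of orientations and scales)."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)

def gabor_response(img, kernel):
    """Time-spectrum representation via FFT-based (circular) 2-D convolution."""
    H, W = img.shape
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(kernel, s=(H, W))))

def cca_fuse(X, Y, k=2, reg=1e-3):
    """Toy canonical-correlation fusion: project both feature sets
    (n samples x d features) onto their top-k canonical directions
    and fuse by summation."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    n = X.shape[0]
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n
    Lx, Ly = np.linalg.cholesky(Cxx), np.linalg.cholesky(Cyy)
    # SVD of the whitened cross-covariance yields canonical directions.
    K = np.linalg.inv(Lx) @ Cxy @ np.linalg.inv(Ly).T
    U, _, Vt = np.linalg.svd(K)
    A = np.linalg.inv(Lx).T @ U[:, :k]   # directions for X
    B = np.linalg.inv(Ly).T @ Vt[:k].T   # directions for Y
    return Xc @ A + Yc @ B               # additive fusion
```

In the paper's actual framework, the three domain images would each pass through a CNN, and CCF would operate on the learned CNN features rather than on raw projections as in this toy version.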
Related papers
- Investigating the potential of Sparse Mixtures-of-Experts for multi-domain neural machine translation [59.41178047749177]
We focus on multi-domain Neural Machine Translation, with the goal of developing efficient models which can handle data from various domains seen during training and are robust to domains unseen during training.
We hypothesize that Sparse Mixture-of-Experts (SMoE) models are a good fit for this task, as they enable efficient model scaling.
We conduct a series of experiments aimed at validating the utility of SMoE for the multi-domain scenario, and find that a straightforward width scaling of Transformer is a simpler and surprisingly more efficient approach in practice, and reaches the same performance level as SMoE.
arXiv Detail & Related papers (2024-07-01T09:45:22Z) - A Multi-Stage Adaptive Feature Fusion Neural Network for Multimodal Gait
Recognition [15.080096318551346]
Most existing gait recognition algorithms are unimodal, and a few multimodal gait recognition algorithms perform multimodal fusion only once.
We propose a multi-stage feature fusion strategy (MSFFS), which performs multimodal fusions at different stages in the feature extraction process.
Also, we propose an adaptive feature fusion module (AFFM) that considers the semantic association between silhouettes and skeletons.
arXiv Detail & Related papers (2023-12-22T03:25:15Z) - Unified Contrastive Fusion Transformer for Multimodal Human Action
Recognition [13.104967563769533]
We introduce a new multimodal fusion architecture, referred to as Unified Contrastive Fusion Transformer (UCFFormer)
UCFFormer integrates data with diverse distributions to enhance human action recognition (HAR) performance.
We present the Factorized Time-Modality Attention to perform self-attention efficiently for the Unified Transformer.
arXiv Detail & Related papers (2023-09-10T14:10:56Z) - Improving Anomaly Segmentation with Multi-Granularity Cross-Domain
Alignment [17.086123737443714]
Anomaly segmentation plays a pivotal role in identifying atypical objects in images, crucial for hazard detection in autonomous driving systems.
While existing methods demonstrate noteworthy results on synthetic data, they often fail to consider the disparity between synthetic and real-world data domains.
We introduce the Multi-Granularity Cross-Domain Alignment framework, tailored to harmonize features across domains at both the scene and individual sample levels.
arXiv Detail & Related papers (2023-08-16T22:54:49Z) - Learning multi-domain feature relation for visible and Long-wave
Infrared image patch matching [39.88037892637296]
We present the largest visible and Long-wave Infrared (LWIR) image patch matching dataset, termed VL-CMIM.
In addition, a multi-domain feature relation learning network (MD-FRN) is proposed.
arXiv Detail & Related papers (2023-08-09T11:23:32Z) - Robust Domain Adaptive Object Detection with Unified Multi-Granularity Alignment [59.831917206058435]
Domain adaptive detection aims to improve the generalization of detectors on target domain.
Recent approaches achieve domain adaption through feature alignment in different granularities via adversarial learning.
We introduce a unified multi-granularity alignment (MGA)-based detection framework for domain-invariant feature learning.
arXiv Detail & Related papers (2023-01-01T08:38:07Z) - Consistency and Diversity induced Human Motion Segmentation [231.36289425663702]
We propose a novel Consistency and Diversity induced human Motion (CDMS) algorithm.
Our model factorizes the source and target data into distinct multi-layer feature spaces.
A multi-mutual learning strategy is carried out to reduce the domain gap between the source and target data.
arXiv Detail & Related papers (2022-02-10T06:23:56Z) - Variational Attention: Propagating Domain-Specific Knowledge for
Multi-Domain Learning in Crowd Counting [75.80116276369694]
In crowd counting, due to the laborious labelling involved, collecting a new large-scale dataset is perceived as intractable.
We resort to the multi-domain joint learning and propose a simple but effective Domain-specific Knowledge Propagating Network (DKPNet)
It is mainly achieved by proposing the novel Variational Attention(VA) technique for explicitly modeling the attention distributions for different domains.
arXiv Detail & Related papers (2021-08-18T08:06:37Z) - AFAN: Augmented Feature Alignment Network for Cross-Domain Object
Detection [90.18752912204778]
Unsupervised domain adaptation for object detection is a challenging problem with many real-world applications.
We propose a novel augmented feature alignment network (AFAN) which integrates intermediate domain image generation and domain-adversarial training.
Our approach significantly outperforms the state-of-the-art methods on standard benchmarks for both similar and dissimilar domain adaptations.
arXiv Detail & Related papers (2021-06-10T05:01:20Z) - Learning to Combine: Knowledge Aggregation for Multi-Source Domain
Adaptation [56.694330303488435]
We propose a Learning to Combine for Multi-Source Domain Adaptation (LtC-MSDA) framework.
In a nutshell, a knowledge graph is constructed on the prototypes of various domains to realize information propagation among semantically adjacent representations.
Our approach outperforms existing methods with a remarkable margin.
arXiv Detail & Related papers (2020-07-17T07:52:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.