Multi-Modal Recurrent Fusion for Indoor Localization
- URL: http://arxiv.org/abs/2203.00510v2
- Date: Wed, 2 Mar 2022 02:40:09 GMT
- Title: Multi-Modal Recurrent Fusion for Indoor Localization
- Authors: Jianyuan Yu, Pu (Perry) Wang, Toshiaki Koike-Akino, and Philip V. Orlik
- Abstract summary: This paper considers indoor localization using multi-modal wireless signals including Wi-Fi, inertial measurement unit (IMU), and ultra-wideband (UWB).
A multi-stream recurrent fusion method is proposed to combine the current hidden state of each modality in the context of recurrent neural networks.
- Score: 24.138127040942127
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper considers indoor localization using multi-modal wireless signals
including Wi-Fi, inertial measurement unit (IMU), and ultra-wideband (UWB). By
formulating the localization as a multi-modal sequence regression problem, a
multi-stream recurrent fusion method is proposed to combine the current hidden
state of each modality in the context of recurrent neural networks while
accounting for the modality uncertainty which is directly learned from its own
immediate past states. The proposed method was evaluated on the large-scale
SPAWC2021 multi-modal localization dataset and compared with a wide range of
baseline methods including the trilateration method, traditional fingerprinting
methods, and convolution network-based methods.
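The core fusion idea can be sketched in a few lines. The following is a hedged NumPy illustration, not the authors' implementation: each modality (Wi-Fi, IMU, UWB, with made-up feature sizes) runs its own recurrent stream, a stand-in uncertainty is computed from each stream's recent hidden states (the paper learns it from the immediate past states), and the current hidden states are combined with softmax weights over negative uncertainty, so less-uncertain modalities dominate the fused state.

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_step(h, x, Wh, Wx, b):
    # One vanilla RNN step: h_t = tanh(Wh @ h_{t-1} + Wx @ x_t + b)
    return np.tanh(Wh @ h + Wx @ x + b)

def fuse(hidden_states, uncertainties):
    # Softmax over negative uncertainty: lower-uncertainty
    # modalities receive larger fusion weights.
    w = np.exp(-np.asarray(uncertainties))
    w = w / w.sum()
    return sum(wi * hi for wi, hi in zip(w, hidden_states)), w

# Toy setup: 3 modalities with hypothetical feature dimensions.
dims = {"wifi": 4, "imu": 6, "uwb": 2}
H = 8  # shared hidden size so states can be combined directly
params = {m: (rng.standard_normal((H, H)) * 0.1,   # Wh
              rng.standard_normal((H, d)) * 0.1,   # Wx
              np.zeros(H))                         # b
          for m, d in dims.items()}

T = 5  # sequence length
h = {m: np.zeros(H) for m in dims}
history = {m: [] for m in dims}
for t in range(T):
    for m, d in dims.items():
        h[m] = rnn_step(h[m], rng.standard_normal(d), *params[m])
        history[m].append(h[m])

# Proxy uncertainty per modality: variance of its recent hidden
# states (a stand-in for the learned uncertainty in the paper).
unc = [np.var(np.stack(history[m][-3:])) for m in dims]
fused, weights = fuse([h[m] for m in dims], unc)
print(weights, fused.shape)
```

A regression head mapping the fused state to 2D/3D coordinates would complete the sequence-regression formulation described in the abstract.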
Related papers
- Application of Multimodal Fusion Deep Learning Model in Disease Recognition [14.655086303102575]
This paper introduces an innovative multi-modal fusion deep learning approach to overcome the drawbacks of traditional single-modal recognition techniques.
During the feature extraction stage, cutting-edge deep learning models are applied to distill advanced features from image-based, temporal, and structured data sources.
The findings demonstrate significant advantages of the multimodal fusion model across multiple evaluation metrics.
arXiv Detail & Related papers (2024-05-22T23:09:49Z)
- A Multimodal Intermediate Fusion Network with Manifold Learning for Stress Detection [1.2430809884830318]
This paper introduces an intermediate multimodal fusion network with manifold learning-based dimensionality reduction.
We compare various dimensionality reduction techniques for different variations of unimodal and multimodal networks.
We observe that the intermediate-level fusion with the Multi-Dimensional Scaling (MDS) manifold method showed promising results with an accuracy of 96.00%.
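The MDS step mentioned above can be illustrated with classical (Torgerson) MDS in plain NumPy. This is a generic sketch of the technique under simplifying assumptions, not the paper's fusion pipeline or its exact MDS variant:

```python
import numpy as np

def classical_mds(X, k):
    """Classical (Torgerson) MDS: embed n points into k dimensions
    from pairwise squared Euclidean distances, via double centering
    and an eigendecomposition of the resulting Gram matrix."""
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared distances
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n                  # centering matrix
    B = -0.5 * J @ D2 @ J                                # Gram matrix
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:k]                     # largest eigenvalues
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

def pdist2(A):
    # Pairwise squared Euclidean distances.
    return ((A[:, None, :] - A[None, :, :]) ** 2).sum(-1)

# Sanity check: points that truly live in 3 dimensions embed into
# 3 dimensions with their pairwise distances preserved.
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 3))
Y = classical_mds(X, 3)
```

In an intermediate-fusion network, such a reduction would be applied to per-modality feature vectors before they are concatenated and fed to the shared layers.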
arXiv Detail & Related papers (2024-03-12T21:06:19Z)
- Improved off-policy training of diffusion samplers [93.66433483772055]
We study the problem of training diffusion models to sample from a distribution with an unnormalized density or energy function.
We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods.
Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work.
arXiv Detail & Related papers (2024-02-07T18:51:49Z) - Score-based Source Separation with Applications to Digital Communication
Signals [72.6570125649502]
We propose a new method for separating superimposed sources using diffusion-based generative models.
Motivated by applications in radio-frequency (RF) systems, we are interested in sources with underlying discrete nature.
Our method can be viewed as a multi-source extension to the recently proposed score distillation sampling scheme.
arXiv Detail & Related papers (2023-06-26T04:12:40Z) - Generalizing Multimodal Variational Methods to Sets [35.69942798534849]
This paper presents a novel variational method on sets called the Set Multimodal VAE (SMVAE) for learning a multimodal latent space.
By modeling the joint-modality posterior distribution directly, the proposed SMVAE learns to exchange information between multiple modalities and compensate for the drawbacks caused by factorization.
arXiv Detail & Related papers (2022-12-19T23:50:19Z) - Multi-view Multi-label Anomaly Network Traffic Classification based on
MLP-Mixer Neural Network [55.21501819988941]
Existing network traffic classification based on convolutional neural networks (CNNs) often emphasizes local patterns of traffic data while ignoring global information associations.
We propose an end-to-end network traffic classification method.
arXiv Detail & Related papers (2022-10-30T01:52:05Z) - Multi-Modal Mutual Information Maximization: A Novel Approach for
Unsupervised Deep Cross-Modal Hashing [73.29587731448345]
We propose a novel method, dubbed Cross-Modal Info-Max Hashing (CMIMH).
We learn informative representations that can preserve both intra- and inter-modal similarities.
The proposed method consistently outperforms other state-of-the-art cross-modal retrieval methods.
arXiv Detail & Related papers (2021-12-13T08:58:03Z) - Globally Convergent Multilevel Training of Deep Residual Networks [0.0]
We propose a globally convergent multilevel training method for deep residual networks (ResNets).
The devised method operates in hybrid (stochastic-deterministic) settings by adaptively adjusting mini-batch sizes during the training.
arXiv Detail & Related papers (2021-07-15T19:08:58Z)
- Deep Multimodal Fusion by Channel Exchanging [87.40768169300898]
This paper proposes a parameter-free multimodal fusion framework that dynamically exchanges channels between sub-networks of different modalities.
The validity of such exchanging process is also guaranteed by sharing convolutional filters yet keeping separate BN layers across modalities, which, as an add-on benefit, allows our multimodal architecture to be almost as compact as a unimodal network.
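A minimal sketch of the channel-exchanging idea follows. Shapes, values, and the threshold are illustrative assumptions; in the actual method the per-channel BN scaling factors are sparsity-regularized during training, whereas here they are simply given. A channel whose BN scale is near zero carries little information, so its values are overwritten with the other modality's channel at the same index:

```python
import numpy as np

def channel_exchange(feats, gammas, threshold=0.02):
    """Exchange channels between two modalities' feature maps.

    feats:  two (C, H, W) arrays from modality-specific sub-networks.
    gammas: two length-C arrays of per-channel BN scaling factors.
    A channel whose |gamma| falls below the threshold is considered
    uninformative and is replaced by the partner modality's channel
    at the same index.
    """
    out = [f.copy() for f in feats]
    for i in (0, 1):
        dead = np.abs(gammas[i]) < threshold   # near-zero BN scales
        out[i][dead] = feats[1 - i][dead]      # take partner's channels
    return out

C, H, W = 4, 2, 2
rgb = np.full((C, H, W), 1.0)
depth = np.full((C, H, W), 2.0)
# Channel 2 of RGB and channel 0 of depth have near-zero BN scales.
gammas = [np.array([0.9, 0.5, 0.001, 0.3]),
          np.array([0.01, 0.7, 0.4, 0.6])]
fused_rgb, fused_depth = channel_exchange([rgb, depth], gammas)
```

Because the exchange only reroutes existing activations, it adds no parameters, which is what makes the framework "parameter-free" on top of the shared convolutional filters.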
arXiv Detail & Related papers (2020-11-10T09:53:20Z)
- Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition [89.0152015268929]
We propose the first neural architecture search (NAS)-based method for RGB-D gesture recognition.
The proposed method includes two key components: 1) enhanced temporal representation via the 3D Central Difference Convolution (3D-CDC) family, and 2) optimized backbones for multi-modal-rate branches and lateral connections.
The resultant multi-rate network provides a new perspective to understand the relationship between RGB and depth modalities and their temporal dynamics.
arXiv Detail & Related papers (2020-08-21T10:45:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.