Spatial Attention-based Distribution Integration Network for Human Pose
Estimation
- URL: http://arxiv.org/abs/2311.05323v1
- Date: Thu, 9 Nov 2023 12:43:01 GMT
- Authors: Sihan Gao, Jing Zhu, Xiaoxuan Zhuang, Zhaoyue Wang, and Qijin Li
- Abstract summary: We present the Spatial Attention-based Distribution Integration Network (SADI-NET) to improve the accuracy of localization.
Our network consists of three efficient modules: the receptive fortified module (RFM), spatial fusion module (SFM), and distribution learning module (DLM).
Our model obtained $92.10\%$ accuracy on the MPII test dataset, demonstrating significant improvements over existing models and establishing state-of-the-art performance.
- Score: 0.8052382324386398
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, human pose estimation has made significant progress through
the implementation of deep learning techniques. However, these techniques still
face limitations when confronted with challenging scenarios, including
occlusion, diverse appearances, variations in illumination, and overlap. To
cope with such drawbacks, we present the Spatial Attention-based Distribution
Integration Network (SADI-NET) to improve the accuracy of localization in such
situations. Our network consists of three efficient modules: the receptive
fortified module (RFM), spatial fusion module (SFM), and distribution learning
module (DLM). Building upon the classic HourglassNet architecture, we replace
the basic block with our proposed RFM. The RFM incorporates a dilated residual
block and attention mechanism to expand receptive fields while enhancing
sensitivity to spatial information. In addition, the SFM incorporates
multi-scale characteristics by employing both global and local attention
mechanisms. Furthermore, the DLM, inspired by residual log-likelihood
estimation (RLE), reconfigures a predicted heatmap using a trainable
distribution weight. For the purpose of determining the efficacy of our model,
we conducted extensive experiments on the MPII and LSP benchmarks.
In particular, our model obtained a remarkable $92.10\%$ accuracy on the
MPII test dataset, demonstrating significant improvements over existing models
and establishing state-of-the-art performance.
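The abstract says the RFM uses a dilated residual block to expand receptive fields. As a rough illustration of why dilation helps, the sketch below computes the receptive field of a stack of stride-1 convolutions using the standard relation r = 1 + sum((k_i - 1) * d_i). The function name `receptive_field` and the kernel sizes and dilation rates are hypothetical choices for illustration; they are not taken from the paper.

```python
def receptive_field(kernel_sizes, dilations):
    """Receptive field along one axis of stacked stride-1 convolutions.

    Each layer with kernel size k and dilation d widens the field by (k - 1) * d.
    """
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf


# Hypothetical configurations (assumed, not the paper's settings):
# three 3x3 convolutions without dilation vs. with dilations 1, 2, 4.
plain = receptive_field([3, 3, 3], [1, 1, 1])
dilated = receptive_field([3, 3, 3], [1, 2, 4])
print(plain, dilated)  # prints "7 15"
```

With the same number of layers and parameters, the dilated stack in this example covers roughly twice the spatial extent, which is the general motivation for dilation in a block like the RFM.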
Related papers
- OccLoff: Learning Optimized Feature Fusion for 3D Occupancy Prediction [5.285847977231642]
3D semantic occupancy prediction is crucial for ensuring safety in autonomous driving.
Existing fusion-based occupancy methods typically involve performing a 2D-to-3D view transformation on image features.
We propose OccLoff, a framework that Learns to optimize Feature Fusion for 3D occupancy prediction.
arXiv Detail & Related papers (2024-11-06T06:34:27Z)
- IRASNet: Improved Feature-Level Clutter Reduction for Domain Generalized SAR-ATR [11.197991954581155]
This study proposes a framework particularly designed for domain-generalized SAR-ATR called IRASNet.
IRASNet enables effective feature-level clutter reduction and domain-invariant feature learning.
IRASNet not only enhances performance but also significantly improves feature-level clutter reduction, making it a valuable advancement in the field of radar image pattern recognition.
arXiv Detail & Related papers (2024-09-25T11:53:58Z)
- Enhancing Automatic Modulation Recognition through Robust Global Feature Extraction [12.868218616042292]
Modulated signals exhibit long temporal dependencies.
Human experts analyze patterns in constellation diagrams to classify modulation schemes.
Classical convolutional-based networks excel at extracting local features but struggle to capture global relationships.
arXiv Detail & Related papers (2024-01-02T06:31:24Z)
- Embedded feature selection in LSTM networks with multi-objective evolutionary ensemble learning for time series forecasting [49.1574468325115]
We present a novel feature selection method embedded in Long Short-Term Memory networks.
Our approach optimizes the weights and biases of the LSTM in a partitioned manner.
Experimental evaluations on air quality time series data from Italy and southeast Spain demonstrate that our method substantially improves the generalization ability of conventional LSTMs.
arXiv Detail & Related papers (2023-12-29T08:42:10Z)
- Diffusion Models Without Attention [110.5623058129782]
Diffusion State Space Model (DiffuSSM) is an architecture that supplants attention mechanisms with a more scalable state space model backbone.
Our focus on FLOP-efficient architectures in diffusion training marks a significant step forward.
arXiv Detail & Related papers (2023-11-30T05:15:35Z)
- SatDM: Synthesizing Realistic Satellite Image with Semantic Layout Conditioning using Diffusion Models [0.0]
Denoising Diffusion Probabilistic Models (DDPMs) have demonstrated significant promise in synthesizing realistic images from semantic layouts.
In this paper, a conditional DDPM model capable of taking a semantic map and generating high-quality, diverse, and correspondingly accurate satellite images is implemented.
The effectiveness of our proposed model is validated using a meticulously labeled dataset introduced within the context of this study.
arXiv Detail & Related papers (2023-09-28T19:39:13Z)
- Multi-Agent Reinforcement Learning for Adaptive Mesh Refinement [17.72127385405445]
We present a novel formulation of adaptive mesh refinement (AMR) as a fully-cooperative Markov game.
We design a novel deep multi-agent reinforcement learning algorithm called Value Decomposition Graph Network (VDGN).
We show that VDGN policies significantly outperform error threshold-based policies in global error and cost metrics.
arXiv Detail & Related papers (2022-11-02T00:41:32Z)
- FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning [87.08902493524556]
Federated learning (FL) has recently attracted increasing attention from academia and industry.
We propose FedDM to build the global training objective from multiple local surrogate functions.
In detail, we construct synthetic sets of data on each client to locally match the loss landscape from original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z)
- Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution Detection [55.028065567756066]
Out-of-distribution (OOD) detection has recently received much attention from the machine learning community due to its importance in deploying machine learning models in real-world applications.
In this paper we propose an uncertainty quantification approach by modelling the distribution of features.
We incorporate an efficient ensemble mechanism, namely batch-ensemble, to construct the batch-ensemble neural networks (BE-SNNs) and overcome the feature collapse problem.
We show that BE-SNNs yield superior performance on several OOD benchmarks, such as the Two-Moons dataset, the FashionMNIST vs MNIST dataset, FashionM
arXiv Detail & Related papers (2022-06-26T16:00:22Z)
- Multi-Branch Deep Radial Basis Function Networks for Facial Emotion Recognition [80.35852245488043]
We propose a CNN based architecture enhanced with multiple branches formed by radial basis function (RBF) units.
RBF units capture local patterns shared by similar instances using an intermediate representation.
We show that it is the incorporation of local information that makes the proposed model competitive.
arXiv Detail & Related papers (2021-09-07T21:05:56Z)
- Edge-assisted Democratized Learning Towards Federated Analytics [67.44078999945722]
We show the hierarchical learning structure of the proposed edge-assisted democratized learning mechanism, namely Edge-DemLearn.
We also validate Edge-DemLearn as a flexible model training mechanism to build a distributed control and aggregation methodology in regions.
arXiv Detail & Related papers (2020-12-01T11:46:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.