$P^2$ Net: Augmented Parallel-Pyramid Net for Attention Guided Pose
Estimation
- URL: http://arxiv.org/abs/2010.14076v1
- Date: Mon, 26 Oct 2020 02:10:12 GMT
- Title: $P^2$ Net: Augmented Parallel-Pyramid Net for Attention Guided Pose
Estimation
- Authors: Luanxuan Hou, Jie Cao, Yuan Zhao, Haifeng Shen, Jian Tang, Ran He
- Abstract summary: We propose an augmented Parallel-Pyramid Net with feature refinement by dilated bottleneck and attention module.
A parallel-pyramid structure is followed to compensate for the information loss introduced by the network.
Our method achieves the best performance on the challenging MSCOCO and MPII datasets.
- Score: 69.25492391672064
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose an augmented Parallel-Pyramid Net ($P^2~Net$) with feature
refinement by dilated bottleneck and attention module. During data
preprocessing, we propose a differentiable auto data augmentation ($DA^2$)
method. We formulate the problem of searching for a data augmentation policy in a
differentiable form, so that the optimal policy setting can be easily updated
by back propagation during training. $DA^2$ improves the training efficiency. A
parallel-pyramid structure is followed to compensate for the information loss
introduced by the network. We introduce two fusion structures, i.e., Parallel
Fusion and Progressive Fusion, to process pyramid features from backbone
network. Both fusion structures leverage the advantages of rich spatial
information at high resolution and semantic comprehension at low resolution
effectively. We propose a refinement stage for the pyramid features to further
boost the accuracy of our network. By introducing dilated bottleneck and
attention module, we increase the receptive field of the features with limited
complexity and adjust the importance of different feature channels. To further
refine the feature maps after completion of feature extraction stage, an
Attention Module ($AM$) is defined to extract weighted features from different
scale feature maps generated by the parallel-pyramid structure. Compared with
the traditional up-sampling refining, $AM$ can better capture the relationship
between channels. Experiments corroborate the effectiveness of our proposed
method. Notably, our method achieves the best performance on the challenging
MSCOCO and MPII datasets.
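The abstract does not spell out the internals of the dilated bottleneck or of $AM$, so the following is only a minimal sketch of the two general ideas it names, not the authors' implementation. It assumes a standard dilated convolution (whose kernel span grows with the dilation rate while the number of weights stays the same) and a generic squeeze-and-excitation-style channel gate. The function names and the `weights` and `biases` parameters are hypothetical.

```python
import math


def effective_kernel_size(kernel: int, dilation: int) -> int:
    """Span covered by a dilated kernel along one axis: k + (k - 1) * (d - 1)."""
    return kernel + (kernel - 1) * (dilation - 1)


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(features, weights, biases):
    """Reweight channels with a learned gate (generic sketch, assumed design).

    features: list of C channels, each an H x W list of lists of floats.
    weights:  C x C matrix (hypothetical learned parameters).
    biases:   list of C floats (hypothetical learned parameters).
    Returns the rescaled features and the per-channel gates.
    """
    # Squeeze: global average pooling reduces each channel to one scalar.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]
    # Excite: a single linear layer followed by a sigmoid gives one gate
    # in (0, 1) per channel, computed from all pooled channel statistics.
    gates = [
        sigmoid(sum(w * p for w, p in zip(w_row, pooled)) + b)
        for w_row, b in zip(weights, biases)
    ]
    # Rescale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(features, gates)
    ]
    return scaled, gates


if __name__ == "__main__":
    # A 3x3 kernel with dilation 2 spans 5 positions but still has 9 weights.
    print(effective_kernel_size(3, 1), effective_kernel_size(3, 2))
    feats = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
    zero_w = [[0.0, 0.0], [0.0, 0.0]]
    print(channel_attention(feats, zero_w, [0.0, 0.0])[1])
```

With all-zero gate parameters every channel receives the same gate of 0.5; trained parameters would instead raise or lower individual channels, which is the sense in which such a module can "tune the importance" of feature channels.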
Related papers
- EPMF: Efficient Perception-aware Multi-sensor Fusion for 3D Semantic Segmentation [62.210091681352914]
We study multi-sensor fusion for 3D semantic segmentation for many applications, such as autonomous driving and robotics.
In this work, we investigate a collaborative fusion scheme called perception-aware multi-sensor fusion (PMF).
We propose a two-stream network to extract features from the two modalities separately. The extracted features are fused by effective residual-based fusion modules.
arXiv Detail & Related papers (2021-06-21T10:47:26Z)
- A^2-FPN: Attention Aggregation based Feature Pyramid Network for Instance Segmentation [68.10621089649486]
We propose Attention Aggregation based Feature Pyramid Network (A2-FPN) to improve multi-scale feature learning.
A2-FPN achieves an improvement of 2.0% and 1.4% mask AP when integrated into the strong baselines such as Cascade Mask R-CNN and Hybrid Task Cascade.
arXiv Detail & Related papers (2021-05-07T11:51:08Z)
- GridDehazeNet+: An Enhanced Multi-Scale Network with Intra-Task Knowledge Transfer for Single Image Dehazing [12.982905875008214]
We propose an enhanced multi-scale network, dubbed GridDehazeNet+, for single image dehazing.
It consists of three modules: pre-processing, backbone, and post-processing.
arXiv Detail & Related papers (2021-03-25T17:35:36Z)
- Efficient Human Pose Estimation by Learning Deeply Aggregated Representations [67.24496300046255]
We propose an efficient human pose estimation network (DANet) by learning deeply aggregated representations.
Our networks could achieve comparable or even better accuracy with much smaller model complexity.
arXiv Detail & Related papers (2020-12-13T10:58:07Z)
- Adaptive Context-Aware Multi-Modal Network for Depth Completion [107.15344488719322]
We propose to adopt the graph propagation to capture the observed spatial contexts.
We then apply the attention mechanism on the propagation, which encourages the network to model the contextual information adaptively.
Finally, we introduce the symmetric gated fusion strategy to exploit the extracted multi-modal features effectively.
Our model, named Adaptive Context-Aware Multi-Modal Network (ACMNet), achieves the state-of-the-art performance on two benchmarks.
arXiv Detail & Related papers (2020-08-25T06:00:06Z)
- Multi-Fidelity Bayesian Optimization via Deep Neural Networks [19.699020509495437]
In many applications, the objective function can be evaluated at multiple fidelities to enable a trade-off between the cost and accuracy.
We propose Deep Neural Network Multi-Fidelity Bayesian Optimization (DNN-MFBO) that can flexibly capture all kinds of complicated relationships between the fidelities.
We show the advantages of our method in both synthetic benchmark datasets and real-world applications in engineering design.
arXiv Detail & Related papers (2020-07-06T23:28:40Z)
- Augmented Parallel-Pyramid Net for Attention Guided Pose-Estimation [90.28365183660438]
This paper proposes an augmented parallel-pyramid net with attention partial module and differentiable auto-data augmentation.
We define a new pose search space where the sequences of data augmentations are formulated as a trainable and operational CNN component.
Notably, our method achieves the top-1 accuracy on the challenging COCO keypoint benchmark and the state-of-the-art results on the MPII datasets.
arXiv Detail & Related papers (2020-03-17T03:52:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.