Toward fast and accurate human pose estimation via soft-gated skip
connections
- URL: http://arxiv.org/abs/2002.11098v1
- Date: Tue, 25 Feb 2020 18:51:51 GMT
- Title: Toward fast and accurate human pose estimation via soft-gated skip
connections
- Authors: Adrian Bulat and Jean Kossaifi and Georgios Tzimiropoulos and Maja
Pantic
- Abstract summary: This paper is on highly accurate and highly efficient human pose estimation.
We re-analyze the use of residual connections within Fully Convolutional Networks, with the aim of improving both accuracy and efficiency over the state-of-the-art.
Our model achieves state-of-the-art results on the MPII and LSP datasets.
- Score: 97.06882200076096
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper is on highly accurate and highly efficient human pose estimation.
Recent works based on Fully Convolutional Networks (FCNs) have demonstrated
excellent results for this difficult problem. While residual connections within
FCNs have proved to be quintessential for achieving high accuracy, we
re-analyze this design choice in the context of improving both the accuracy and
the efficiency over the state-of-the-art. In particular, we make the following
contributions: (a) We propose gated skip connections with per-channel learnable
parameters to control the data flow for each channel within the module within
the macro-module. (b) We introduce a hybrid network that combines the HourGlass
and U-Net architectures which minimizes the number of identity connections
within the network and increases the performance for the same parameter budget.
Our model achieves state-of-the-art results on the MPII and LSP datasets. In
addition, with a reduction of 3x in model size and complexity, we show no
decrease in performance when compared to the original HourGlass network.
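Contribution (a) can be illustrated with a small sketch. The code below is only one plausible reading of "gated skip connections with per-channel learnable parameters": the skip path is scaled by one scalar per channel before being added to the output of the residual branch. The function name soft_gated_skip, the parameter name alpha, and the list-of-lists data layout are assumptions made for illustration and are not taken from the paper; in a real network alpha would be a learned parameter and the residual branch a stack of convolutions.

```python
def soft_gated_skip(x, residual, alpha):
    """Combine a skip path and a residual branch with a per-channel gate.

    x        -- input features, one list of floats per channel (hypothetical layout)
    residual -- output of the residual branch, same shape as x
    alpha    -- one gate value per channel (learned in a real model)
    """
    if not (len(x) == len(residual) == len(alpha)):
        raise ValueError("x, residual and alpha must have the same number of channels")
    out = []
    for x_c, r_c, a_c in zip(x, residual, alpha):
        if len(x_c) != len(r_c):
            raise ValueError("channel lengths of x and residual must match")
        # Scale the identity (skip) path by this channel's gate, then add the branch output.
        out.append([a_c * xi + ri for xi, ri in zip(x_c, r_c)])
    return out


if __name__ == "__main__":
    features = [[1.0, 2.0], [3.0, 4.0]]  # two channels, two values each
    branch = [[0.5, 0.5], [0.5, 0.5]]    # stand-in for the residual branch output
    gates = [1.0, 0.0]                   # channel 0 keeps its skip path, channel 1 drops it
    print(soft_gated_skip(features, branch, gates))
```

With a gate of 1.0 the channel behaves like an ordinary residual connection, and with a gate of 0.0 the identity path for that channel is removed, so the gate values control how much of the input flows through each channel's skip connection.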
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- DA-Flow: Dual Attention Normalizing Flow for Skeleton-based Video Anomaly Detection [52.74152717667157]
We propose a lightweight module called Dual Attention Module (DAM) for capturing cross-dimension interaction relationships in spatio-temporal skeletal data.
It employs a frame attention mechanism to identify the most significant frames and a skeleton attention mechanism to capture broader relationships across fixed partitions with minimal parameters and FLOPs.
arXiv Detail & Related papers (2024-06-05T06:18:03Z)
- Accelerating Deep Neural Networks via Semi-Structured Activation Sparsity [0.0]
Exploiting sparsity in the network's feature maps is one of the ways to reduce its inference latency.
We propose a solution to induce semi-structured activation sparsity exploitable through minor runtime modifications.
Our approach yields a speed improvement of $1.25\times$ with a minimal accuracy drop of $1.1\%$ for the ResNet18 model on the ImageNet dataset.
arXiv Detail & Related papers (2023-09-12T22:28:53Z)
- Hierarchical Federated Learning in Wireless Networks: Pruning Tackles Bandwidth Scarcity and System Heterogeneity [32.321021292376315]
We propose pruning-enabled hierarchical federated learning (PHFL) in heterogeneous networks (HetNets).
We first derive an upper bound of the convergence rate that clearly demonstrates the impact of the model pruning and wireless communications.
We validate the effectiveness of our proposed PHFL algorithm in terms of test accuracy, wall clock time, energy consumption and bandwidth requirement.
arXiv Detail & Related papers (2023-08-03T07:03:33Z)
- IMDeception: Grouped Information Distilling Super-Resolution Network [7.6146285961466]
Single-Image-Super-Resolution (SISR) is a classical computer vision problem that has benefited from the recent advancements in deep learning methods.
In this work, we propose the Global Progressive Refinement Module (GPRM) as a less parameter-demanding alternative to the IIC module for feature aggregation.
We also propose Grouped Information Distilling Blocks (GIDB) to further decrease the number of parameters and floating point operations per second (FLOPS).
Experiments reveal that the proposed network performs on par with state-of-the-art models despite having a limited number of parameters and FLOPS.
arXiv Detail & Related papers (2022-04-25T06:43:45Z)
- Towards Bi-directional Skip Connections in Encoder-Decoder Architectures and Beyond [95.46272735589648]
We propose backward skip connections that bring decoded features back to the encoder.
Our design can be jointly adopted with forward skip connections in any encoder-decoder architecture.
We propose a novel two-phase Neural Architecture Search (NAS) algorithm, namely BiX-NAS, to search for the best multi-scale skip connections.
arXiv Detail & Related papers (2022-03-11T01:38:52Z)
- CONetV2: Efficient Auto-Channel Size Optimization for CNNs [35.951376988552695]
This work introduces a method that is efficient in computationally constrained environments by examining the micro-search space of channel size.
In tackling channel-size optimization, we design an automated algorithm to extract the dependencies within different connected layers of the network.
We also introduce a novel metric that highly correlates with test accuracy and enables analysis of individual network layers.
arXiv Detail & Related papers (2021-10-13T16:17:19Z)
- FasterPose: A Faster Simple Baseline for Human Pose Estimation [65.8413964785972]
We propose a design paradigm, named FasterPose, for cost-effective networks that use a low-resolution (LR) representation for efficient pose estimation.
We study the training behavior of FasterPose, and formulate a novel regressive cross-entropy (RCE) loss function for accelerating the convergence.
Compared with the previously dominant pose estimation network, our method reduces FLOPs by 58% while improving accuracy by 1.3%.
arXiv Detail & Related papers (2021-07-07T13:39:08Z)
- DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator Search [55.164053971213576]
Convolutional neural networks have achieved great success in computer vision tasks despite their large computation overhead.
Structured (channel) pruning is usually applied to reduce the model redundancy while preserving the network structure.
Existing structured pruning methods require hand-crafted rules which may lead to tremendous pruning space.
arXiv Detail & Related papers (2020-11-04T07:43:01Z)
- FDFlowNet: Fast Optical Flow Estimation using a Deep Lightweight Network [12.249680550252327]
We present a lightweight yet effective model for real-time optical flow estimation, termed FDFlowNet (fast deep flownet).
We achieve better or similar accuracy on the challenging KITTI and Sintel benchmarks while being about 2 times faster than PWC-Net.
arXiv Detail & Related papers (2020-06-22T14:01:01Z)