Parallel Multi-Scale Networks with Deep Supervision for Hand Keypoint
Detection
- URL: http://arxiv.org/abs/2112.10275v1
- Date: Sun, 19 Dec 2021 22:38:16 GMT
- Title: Parallel Multi-Scale Networks with Deep Supervision for Hand Keypoint
Detection
- Authors: Renjie Li, Son Tran, Saurabh Garg, Katherine Lawler, Jane Alty, Quan
Bai
- Abstract summary: We propose a novel CNN model named Parallel Multi-Scale Deep Supervision Network (P-MSDSNet).
P-MSDSNet learns feature maps at different scales with deep supervision to produce attention maps for adaptive feature propagation from layer to layer.
We show that P-MSDSNet outperforms state-of-the-art approaches on benchmark datasets while requiring fewer parameters.
- Score: 3.1781111932870716
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Keypoint detection plays an important role in a wide range of applications.
However, predicting keypoints of small objects such as human hands is a
challenging problem. Recent works fuse feature maps of deep Convolutional
Neural Networks (CNNs), either via multi-level feature integration or
multi-resolution aggregation. Despite achieving some success, the feature
fusion approaches increase the complexity and the opacity of CNNs. To address
this issue, we propose a novel CNN model named Parallel Multi-Scale Deep
Supervision Network (P-MSDSNet), which learns feature maps at different scales
with deep supervision to produce attention maps for adaptive feature
propagation from layer to layer. P-MSDSNet has a multi-stage architecture,
which makes it scalable, while its deep supervision with spatial attention
improves the transparency of feature learning at each stage. We show that
P-MSDSNet outperforms state-of-the-art approaches on benchmark datasets while
requiring fewer parameters. We also show the application of P-MSDSNet to
quantify finger-tapping hand movements in a neuroscience study.
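The core mechanism described above, deeply supervised stages whose spatial attention maps gate the features passed to the next stage, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the 1x1-convolution weights, stage count, and MSE auxiliary loss are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def stage(features, w_feat, w_attn):
    """One stage: refine features and emit a spatial attention map.

    features: (C, H, W); w_feat: (C, C) 1x1 conv; w_attn: (1, C) 1x1 conv.
    """
    refined = np.maximum(0.0, np.einsum('oc,chw->ohw', w_feat, features))  # 1x1 conv + ReLU
    attn = sigmoid(np.einsum('oc,chw->ohw', w_attn, refined))              # (1, H, W), values in (0, 1)
    return refined * attn, refined  # attention-gated features flow to the next stage

C, H, W = 8, 16, 16
x = rng.standard_normal((C, H, W))
target = rng.random((H, W))  # stand-in keypoint heatmap
losses = []

for _ in range(3):  # three stacked stages
    w_feat = rng.standard_normal((C, C)) * 0.1
    w_attn = rng.standard_normal((1, C)) * 0.1
    x, refined = stage(x, w_feat, w_attn)
    # deep supervision: an auxiliary loss on each stage's own prediction
    pred = refined.mean(axis=0)
    losses.append(float(np.mean((pred - target) ** 2)))

total_loss = sum(losses)  # every stage receives a direct training signal
```

The key design point is that each stage is supervised directly (one loss term per stage), so the attention maps are trained to highlight keypoint-relevant regions rather than emerging only from the final loss.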
Related papers
- Multi-scale Unified Network for Image Classification [33.560003528712414]
CNNs face notable challenges in performance and computational efficiency when dealing with real-world, multi-scale image inputs.
We propose a Multi-scale Unified Network (MUSN) consisting of multi-scale subnetworks, a unified network, and a scale-invariant constraint.
MUSN yields an accuracy increase of up to 44.53% and reduces FLOPs by 7.01-16.13% in multi-scale scenarios.
arXiv Detail & Related papers (2024-03-27T06:40:26Z)
- Pyramid Feature Attention Network for Monocular Depth Prediction [8.615717738037823]
We propose a Pyramid Feature Attention Network (PFANet) to improve the high-level context features and low-level spatial features.
Our method outperforms state-of-the-art methods on the KITTI dataset.
arXiv Detail & Related papers (2024-03-03T08:33:23Z)
- HiDAnet: RGB-D Salient Object Detection via Hierarchical Depth Awareness [2.341385717236931]
We propose a novel Hierarchical Depth Awareness network (HiDAnet) for RGB-D saliency detection.
Our motivation comes from the observation that the multi-granularity properties of geometric priors correlate well with the neural network hierarchies.
Our HiDAnet performs favorably over the state-of-the-art methods by large margins.
arXiv Detail & Related papers (2023-01-18T10:00:59Z)
- FuNNscope: Visual microscope for interactively exploring the loss landscape of fully connected neural networks [77.34726150561087]
We show how to explore high-dimensional landscape characteristics of neural networks.
We generalize observations on small neural networks to more complex systems.
An interactive dashboard opens up a number of possible applications.
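The standard way to make a high-dimensional loss landscape explorable, which a tool like this builds on, is to evaluate the loss on a 2-D slice spanned by two random directions around a reference parameter vector. A minimal NumPy sketch, with a toy loss standing in for a real network's loss:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "network": a parameter vector and a smooth surrogate loss.
theta0 = rng.standard_normal(50)

def loss(theta):
    return float(np.sum(theta ** 2) + np.sin(theta).sum())

# Two random, normalized directions span the 2-D slice.
d1 = rng.standard_normal(theta0.shape); d1 /= np.linalg.norm(d1)
d2 = rng.standard_normal(theta0.shape); d2 /= np.linalg.norm(d2)

alphas = np.linspace(-1.0, 1.0, 21)
grid = np.array([[loss(theta0 + a * d1 + b * d2) for b in alphas]
                 for a in alphas])
# `grid` is what a dashboard would render as a contour or heat map
```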
arXiv Detail & Related papers (2022-04-09T16:41:53Z)
- Multi-level Second-order Few-shot Learning [111.0648869396828]
We propose a Multi-level Second-order (MlSo) few-shot learning network for supervised or unsupervised few-shot image classification and few-shot action recognition.
We leverage so-called power-normalized second-order base learner streams combined with features that express multiple levels of visual abstraction.
We demonstrate respectable results on standard datasets such as Omniglot, mini-ImageNet, tiered-ImageNet, Open MIC, fine-grained datasets such as CUB Birds, Stanford Dogs and Cars, and action recognition datasets such as HMDB51, UCF101, and mini-MIT.
arXiv Detail & Related papers (2022-01-15T19:49:00Z)
- Semi-supervised Network Embedding with Differentiable Deep Quantisation [81.49184987430333]
We develop d-SNEQ, a differentiable quantisation method for network embedding.
d-SNEQ incorporates a rank loss to equip the learned quantisation codes with rich high-order information.
It is able to substantially compress the size of trained embeddings, thus reducing storage footprint and accelerating retrieval speed.
arXiv Detail & Related papers (2021-08-20T11:53:05Z)
- Interflow: Aggregating Multi-layer Feature Mappings with Attention Mechanism [0.7614628596146599]
This paper proposes the Interflow algorithm specifically for traditional CNN models.
Interflow divides a CNN into several stages according to depth and makes predictions from the feature mappings in each stage.
It can alleviate the vanishing-gradient problem, lower the difficulty of network depth selection, and mitigate possible over-fitting.
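The fusion step described above, per-stage predictions combined through learned attention weights, can be sketched in a few lines of NumPy. This is an illustrative reading of the summary, not Interflow's actual implementation; the stage count, class count, and softmax weighting are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

n_stages, n_classes = 4, 10
# Hypothetical per-stage logits taken from feature mappings at different depths.
stage_logits = rng.standard_normal((n_stages, n_classes))
# Learnable per-stage scores decide how much each depth contributes.
stage_scores = rng.standard_normal(n_stages)
w = softmax(stage_scores)                       # attention weights, sum to 1
fused = np.einsum('s,sc->c', w, stage_logits)   # weighted sum of stage predictions
```

Because every stage produces its own prediction, gradients flow into shallow layers directly, which is what alleviates vanishing gradients and makes the effective depth a soft, learned choice.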
arXiv Detail & Related papers (2021-06-26T18:22:01Z)
- Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
- Multi-Attention-Network for Semantic Segmentation of Fine Resolution Remote Sensing Images [10.835342317692884]
Deep convolutional neural networks have significantly improved the accuracy of semantic segmentation in remote sensing images.
This paper proposes a Multi-Attention-Network (MANet) to address the remaining challenges.
A novel kernel attention mechanism with linear complexity is proposed to alleviate the large computational demand of attention.
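Linear-complexity kernel attention generally works by replacing the softmax with a positive kernel feature map and reassociating the matrix product, so cost scales with the number of positions rather than its square. A minimal NumPy sketch of this general idea (not MANet's specific kernel, which the summary does not detail):

```python
import numpy as np

rng = np.random.default_rng(3)

def phi(x):
    # A simple positive kernel feature map (ELU + 1), one common choice.
    return np.where(x > 0, x + 1.0, np.exp(x))

N, d = 256, 32  # N spatial positions, d channels
Q, K, V = (rng.standard_normal((N, d)) for _ in range(3))

# Reassociate (Q K^T) V as Q (K^T V): O(N * d^2) instead of O(N^2 * d).
KV = phi(K).T @ V                  # (d, d) summary of keys and values
z = phi(Q) @ phi(K).sum(axis=0)    # (N,) per-position normalizer
out = (phi(Q) @ KV) / z[:, None]   # (N, d) attention output in linear time
```

For fine-resolution remote sensing tiles, where N can reach hundreds of thousands of pixels, this reassociation is what makes attention affordable.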
arXiv Detail & Related papers (2020-09-03T09:08:02Z)
- Cross-Attention in Coupled Unmixing Nets for Unsupervised Hyperspectral Super-Resolution [79.97180849505294]
We propose a novel coupled unmixing network with a cross-attention mechanism, CUCaNet, to enhance the spatial resolution of HSI.
Experiments are conducted on three widely used HS-MS datasets in comparison with state-of-the-art HSI-SR models.
arXiv Detail & Related papers (2020-07-10T08:08:20Z)
- When Residual Learning Meets Dense Aggregation: Rethinking the Aggregation of Deep Neural Networks [57.0502745301132]
We propose Micro-Dense Nets, a novel architecture with global residual learning and local micro-dense aggregations.
Our micro-dense block can be integrated with neural architecture search based models to boost their performance.
arXiv Detail & Related papers (2020-04-19T08:34:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.