Binarized 3D Whole-body Human Mesh Recovery
- URL: http://arxiv.org/abs/2311.14323v1
- Date: Fri, 24 Nov 2023 07:51:50 GMT
- Title: Binarized 3D Whole-body Human Mesh Recovery
- Authors: Zhiteng Li, Yulun Zhang, Jing Lin, Haotong Qin, Jinjin Gu, Xin Yuan,
Linghe Kong, Xiaokang Yang
- Abstract summary: We propose a Binarized Dual Residual Network (BiDRN) to efficiently estimate 3D human body, face, and hand parameters.
BiDRN achieves performance comparable to the full-precision method Hand4Whole while using just 22.1% of the parameters and 14.8% of the operations.
- Score: 104.13364878565737
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D whole-body human mesh recovery aims to reconstruct the 3D human body,
face, and hands from a single image. Although powerful deep learning models
have achieved accurate estimation in this task, they require enormous memory
and computational resources. Consequently, these methods can hardly be deployed
on resource-limited edge devices. In this work, we propose a Binarized Dual
Residual Network (BiDRN), a novel quantization method to estimate the 3D human
body, face, and hands parameters efficiently. Specifically, we design a basic
unit Binarized Dual Residual Block (BiDRB) composed of Local Convolution
Residual (LCR) and Block Residual (BR), which can preserve full-precision
information as much as possible. For LCR, we generalize it to four kinds of
convolutional modules so that full-precision information can be propagated even
between mismatched dimensions. We also binarize the face and hands
box-prediction network as Binarized BoxNet, which can further reduce the model
redundancy. Comprehensive quantitative and qualitative experiments demonstrate
the effectiveness of BiDRN, which has a significant improvement over
state-of-the-art binarization algorithms. Moreover, our proposed BiDRN achieves
performance comparable to the full-precision method Hand4Whole while using just
22.1% of the parameters and 14.8% of the operations. We will release all the code and
pretrained models.
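To make the binarization idea concrete, below is a minimal, illustrative PyTorch sketch (not the authors' released code) of a binarized convolution paired with a full-precision residual path, in the spirit of the Local Convolution Residual inside BiDRB: weights and activations are quantized to {-1, +1} with sign() and a straight-through estimator, while the identity branch carries full-precision information around the 1-bit convolution. The module names (BinaryQuant, BinarizedConv2d, BiDualResidualBlock) and hyperparameters are assumptions for illustration only.

```python
# Minimal sketch of a binarized conv block with a full-precision residual.
# This is an assumption-based illustration, not the BiDRN implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinaryQuant(torch.autograd.Function):
    """Sign binarization with a straight-through estimator in the backward pass."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Pass gradients only where the input lies in [-1, 1] (hard-tanh STE).
        return grad_output * (x.abs() <= 1).float()


class BinarizedConv2d(nn.Conv2d):
    """Convolution whose weights and inputs are binarized to {-1, +1} at forward time."""

    def forward(self, x):
        bin_x = BinaryQuant.apply(x)
        bin_w = BinaryQuant.apply(self.weight)
        return F.conv2d(bin_x, bin_w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)


class BiDualResidualBlock(nn.Module):
    """Binarized conv unit with a full-precision identity residual, so that
    full-precision activations still flow around the 1-bit convolution."""

    def __init__(self, channels):
        super().__init__()
        self.conv = BinarizedConv2d(channels, channels, kernel_size=3,
                                    padding=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x):
        # Local residual: add the full-precision input back after the binarized conv.
        return x + self.bn(self.conv(x))


if __name__ == "__main__":
    block = BiDualResidualBlock(channels=64)
    feat = torch.randn(1, 64, 32, 32)
    print(block(feat).shape)  # torch.Size([1, 64, 32, 32])
```

In this sketch, the full-precision skip addition is what preserves the information that the 1-bit convolution would otherwise discard, which is the general mechanism that lets binarized blocks approach full-precision accuracy at a fraction of the parameter and operation cost.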
Related papers
- E2ENet: Dynamic Sparse Feature Fusion for Accurate and Efficient 3D
Medical Image Segmentation [36.367368163120794]
We propose a 3D medical image segmentation model named Efficient to Efficient Network (E2ENet).
It incorporates two parametrically and computationally efficient designs.
It consistently achieves a superior trade-off between accuracy and efficiency across various resource constraints.
arXiv Detail & Related papers (2023-12-07T22:13:37Z)
- Spatiotemporal Modeling Encounters 3D Medical Image Analysis:
Slice-Shift UNet with Multi-View Fusion [0.0]
We propose a new 2D-based model dubbed Slice SHift UNet, which encodes three-dimensional features at a 2D CNN's complexity.
More precisely, multi-view features are collaboratively learned by performing 2D convolutions along the three planes of a volume.
The effectiveness of our approach is validated on the Multi-Modality Abdominal Multi-Organ Segmentation (AMOS) and Multi-Atlas Labeling Beyond the Cranial Vault (BTCV) datasets.
arXiv Detail & Related papers (2023-07-24T14:53:23Z)
- Binarized Spectral Compressive Imaging [59.18636040850608]
Existing deep learning models for hyperspectral image (HSI) reconstruction achieve good performance but require powerful hardware with enormous memory and computational resources.
We propose a novel method, the Binarized Spectral-Redistribution Network (BiSRNet).
BiSRNet is derived by using the proposed techniques to binarize the base model.
arXiv Detail & Related papers (2023-05-17T15:36:08Z)
- KTN: Knowledge Transfer Network for Learning Multi-person 2D-3D
Correspondences [77.56222946832237]
We present a novel framework to detect the dense pose of multiple people in an image.
The proposed method, which we refer to as the Knowledge Transfer Network (KTN), tackles two main problems.
It simultaneously maintains feature resolution and suppresses background pixels, and this strategy results in a substantial increase in accuracy.
arXiv Detail & Related papers (2022-06-21T03:11:37Z)
- Med-DANet: Dynamic Architecture Network for Efficient Medical Volumetric
Segmentation [13.158995287578316]
We propose a dynamic architecture network named Med-DANet to achieve an effective trade-off between accuracy and efficiency.
For each slice of the input 3D MRI volume, our proposed method learns a slice-specific decision by the Decision Network.
Our proposed method achieves comparable or better results than previous state-of-the-art methods for 3D MRI brain tumor segmentation.
arXiv Detail & Related papers (2022-06-14T03:25:58Z)
- A Neural Anthropometer Learning from Body Dimensions Computed on Human
3D Meshes [0.0]
We present a method to calculate right and left arm length, shoulder width, and inseam (crotch height) from 3D meshes, with a focus on potential medical, virtual try-on, and distance tailoring applications.
We then use four additional body dimensions, calculated with recently published methods, to assemble a set of eight body dimensions that serve as the supervision signal for our Neural Anthropometer: a convolutional neural network capable of estimating these dimensions.
arXiv Detail & Related papers (2021-10-06T12:56:05Z)
- Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate 3D mesh of multiple body parts with large-scale differences from a single RGB image.
The main challenge is the lack of training data with complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z)
- Modelling the Distribution of 3D Brain MRI using a 2D Slice VAE [66.63629641650572]
We propose a method to model the distribution of 3D MR brain volumes by combining a 2D slice VAE with a Gaussian model that captures the relationships between slices.
We also introduce a novel evaluation method for generated volumes that quantifies how well their segmentations match those of true brain anatomy.
arXiv Detail & Related papers (2020-07-09T13:23:15Z)
- Attention-Guided Version of 2D UNet for Automatic Brain Tumor
Segmentation [2.371982686172067]
Gliomas are the most common and aggressive brain tumors, and their highest grade leads to a short life expectancy.
Deep convolutional neural networks (DCNNs) have achieved a remarkable performance in brain tumor segmentation.
However, this task remains difficult owing to the highly varying intensity and appearance of gliomas.
arXiv Detail & Related papers (2020-04-04T20:09:06Z)
- HEMlets PoSh: Learning Part-Centric Heatmap Triplets for 3D Human Pose
and Shape Estimation [60.35776484235304]
This work attempts to address the uncertainty of lifting the detected 2D joints to 3D space by introducing an intermediate state, Part-Centric Heatmap Triplets (HEMlets).
The HEMlets utilize three joint-heatmaps to represent the relative depth information of the end-joints for each skeletal body part.
A Convolutional Network (ConvNet) is first trained to predict HEMlets from the input image, followed by a volumetric joint-heatmap regression.
arXiv Detail & Related papers (2020-03-10T04:03:45Z)