Related papers: High Fidelity 3D Hand Shape Reconstruction via Scalable Graph Frequency Decomposition

URL: http://arxiv.org/abs/2307.05541v1
Date: Sat, 8 Jul 2023 19:26:09 GMT
Title: High Fidelity 3D Hand Shape Reconstruction via Scalable Graph Frequency Decomposition
Authors: Tianyu Luan, Yuanhao Zhai, Jingjing Meng, Zhong Li, Zhang Chen, Yi Xu, and Junsong Yuan
Abstract summary: We design a frequency split network to generate 3D hand mesh using different frequency bands in a coarse-to-fine manner. To capture high-frequency personalized details, we transform the 3D mesh into the frequency domain, and propose a novel frequency decomposition loss. Our approach generates fine-grained details for high-fidelity 3D hand reconstruction.
Score: 77.29516516532439
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Despite the impressive performance obtained by recent single-image hand modeling techniques, they lack the capability to capture sufficient details of the 3D hand mesh. This deficiency greatly limits their applications when high-fidelity hand modeling is required, e.g., personalized hand modeling. To address this problem, we design a frequency split network to generate 3D hand mesh using different frequency bands in a coarse-to-fine manner. To capture high-frequency personalized details, we transform the 3D mesh into the frequency domain, and propose a novel frequency decomposition loss to supervise each frequency component. By leveraging such a coarse-to-fine scheme, hand details that correspond to the higher frequency domain can be preserved. In addition, the proposed network is scalable, and can stop the inference at any resolution level to accommodate different hardware with varying computational powers. To quantitatively evaluate the performance of our method in terms of recovering personalized shape details, we introduce a new evaluation metric named Mean Signal-to-Noise Ratio (MSNR) to measure the signal-to-noise ratio of each mesh frequency component. Extensive experiments demonstrate that our approach generates fine-grained details for high-fidelity 3D hand reconstruction, and our evaluation metric is more effective for measuring mesh details compared with traditional metrics.
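
The abstract describes transforming a mesh into the frequency domain and scoring each frequency component with a signal-to-noise ratio. A minimal sketch of that idea follows, assuming a combinatorial graph Laplacian built from the triangle edges and an even split of the spectrum into bands. The function names (`graph_laplacian`, `band_snr_db`) and the band split are illustrative assumptions, and the averaged value is a simplified stand-in rather than the paper's exact MSNR definition.

```python
import numpy as np


def graph_laplacian(num_vertices, faces):
    """Combinatorial Laplacian L = D - A built from triangle edges (assumed form)."""
    adjacency = np.zeros((num_vertices, num_vertices))
    for i, j, k in faces:
        for a, b in ((i, j), (j, k), (k, i)):
            adjacency[a, b] = adjacency[b, a] = 1.0
    return np.diag(adjacency.sum(axis=1)) - adjacency


def band_snr_db(v_gt, v_pred, faces, num_bands=2, eps=1e-12):
    """Per-band SNR (dB) between two meshes that share the same topology."""
    laplacian = graph_laplacian(len(v_gt), faces)
    # Eigenvectors of the symmetric Laplacian act as the graph Fourier basis,
    # ordered from low to high frequency (ascending eigenvalue).
    _, basis = np.linalg.eigh(laplacian)
    coeff_gt = basis.T @ v_gt      # spectral coefficients, shape (N, 3)
    coeff_pred = basis.T @ v_pred
    snrs = []
    for idx in np.array_split(np.arange(len(v_gt)), num_bands):
        signal = np.sum(coeff_gt[idx] ** 2)
        noise = np.sum((coeff_pred[idx] - coeff_gt[idx]) ** 2)
        snrs.append(10.0 * np.log10((signal + eps) / (noise + eps)))
    return np.array(snrs)


# Toy example: a tetrahedron with one vertex slightly displaced.
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
v_gt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
v_pred = v_gt.copy()
v_pred[1, 0] += 0.05

snr_per_band = band_snr_db(v_gt, v_pred, faces)
mean_snr = snr_per_band.mean()  # simplified stand-in for a mean-SNR score
```

Because both meshes are projected onto the same eigenbasis, the per-band errors are directly comparable; a cotangent Laplacian or a different band partition could be substituted without changing the overall structure.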

Related papers

Freqformer: Image-Demoiréing Transformer via Efficient Frequency Decomposition [83.40450475728792]
We present Freqformer, a Transformer-based framework specifically designed for image demoiréing through targeted frequency separation. Our method performs an effective frequency decomposition that explicitly splits moiré patterns into high-frequency spatially-localized textures and low-frequency scale-robust color distortions. Experiments on various demoiréing benchmarks demonstrate that Freqformer achieves state-of-the-art performance with a compact model size.
arXiv Detail & Related papers (2025-05-25T12:23:10Z)
EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization. We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z)
A Neural Network Architecture Based on Attention Gate Mechanism for 3D Magnetotelluric Forward Modeling [1.5862483908050367]
We propose a novel neural network architecture named MTAGU-Net, which integrates an attention gating mechanism for 3D MT forward modeling. A dual-path attention gating module is designed based on forward response data images and embedded in the skip connections between the encoder and decoder. A synthetic model generation method utilizing 3D Gaussian random field (GRF) accurately replicates the electrical structures of real-world geological scenarios.
arXiv Detail & Related papers (2025-03-14T13:48:25Z)
Sharpening Neural Implicit Functions with Frequency Consolidation Priors [53.6277160912059]
Signed Distance Functions (SDFs) are vital implicit representations for high-fidelity 3D surfaces. Current methods mainly leverage a neural network to learn an SDF from various supervisions, including signed distances, 3D point clouds, or multi-view images. We introduce a method to sharpen a low-frequency SDF observation by recovering its high-frequency components, pursuing a sharper and more complete surface.
arXiv Detail & Related papers (2024-12-27T16:18:46Z)
WiLoR: End-to-end 3D Hand Localization and Reconstruction in-the-wild [53.288327629960364]
We present a data-driven pipeline for efficient multi-hand reconstruction in the wild. The proposed pipeline is composed of two components: a real-time fully convolutional hand localization and a high-fidelity transformer-based 3D hand reconstruction model. Our approach outperforms previous methods in both efficiency and accuracy on popular 2D and 3D benchmarks.
arXiv Detail & Related papers (2024-09-18T18:46:51Z)
Implicit Neural Representations with Fourier Kolmogorov-Arnold Networks [4.499833362998488]
Implicit neural representations (INRs) use neural networks to provide continuous and resolution-independent representations of complex signals. The proposed FKAN utilizes learnable activation functions modeled as Fourier series in the first layer to effectively control and learn the task-specific frequency components. Experimental results show that our proposed FKAN model outperforms three state-of-the-art baseline schemes.
arXiv Detail & Related papers (2024-09-14T05:53:33Z)
WiNet: Wavelet-based Incremental Learning for Efficient Medical Image Registration [68.25711405944239]
Deep image registration has demonstrated exceptional accuracy and fast inference. Recent advances have adopted either multiple cascades or pyramid architectures to estimate dense deformation fields in a coarse-to-fine manner. We introduce a model-driven WiNet that incrementally estimates scale-wise wavelet coefficients for the displacement/velocity field across various scales.
arXiv Detail & Related papers (2024-07-18T11:51:01Z)
NeRF-DetS: Enhancing Multi-View 3D Object Detection with Sampling-adaptive Network of Continuous NeRF-based Representation [60.47114985993196]
NeRF-Det unifies the tasks of novel view synthesis and 3D perception. We introduce a novel 3D perception network structure, NeRF-DetS. NeRF-DetS outperforms the competitive NeRF-Det on the ScanNetV2 dataset.
arXiv Detail & Related papers (2024-04-22T06:59:03Z)
Deep Spectral Meshes: Multi-Frequency Facial Mesh Processing with Graph Neural Networks [1.170907599257096]
Spectral meshes are introduced as a method to decompose mesh deformations into low- and high-frequency deformations. A parametric model for 3D facial mesh synthesis is built upon the proposed framework. Our model takes further advantage of spectral partitioning by representing different frequency levels with disparate, more suitable representations.
arXiv Detail & Related papers (2024-02-15T23:17:08Z)
Frequency-Adaptive Pan-Sharpening with Mixture of Experts [22.28680499480492]
We propose a novel Frequency Adaptive Mixture of Experts (FAME) learning framework for pan-sharpening. Our method performs best against other state-of-the-art methods and shows strong generalization ability on real-world scenes.
arXiv Detail & Related papers (2024-01-04T08:58:25Z)
HartleyMHA: Self-Attention in Frequency Domain for Resolution-Robust and Parameter-Efficient 3D Image Segmentation [4.48473804240016]
We introduce the HartleyMHA model which is robust to training image resolution with efficient self-attention. We modify the FNO by using the Hartley transform with shared parameters to reduce the model size by orders of magnitude. When tested on the BraTS'19 dataset, it achieved superior robustness to training image resolution than other tested models with less than 1% of their model parameters.
arXiv Detail & Related papers (2023-10-05T18:44:41Z)
Neural Progressive Meshes [54.52990060976026]
We propose a method to transmit 3D meshes with a shared learned generative space. We learn this space using a subdivision-based encoder-decoder architecture trained in advance on a large collection of surfaces. We evaluate our method on a diverse set of complex 3D shapes and demonstrate that it outperforms baselines in terms of compression ratio and reconstruction quality.
arXiv Detail & Related papers (2023-08-10T17:58:02Z)
Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction [57.3636347704271]
3D hand-mesh reconstruction from RGB images facilitates many applications, including augmented reality (AR). This paper presents a novel pipeline by decoupling the hand-mesh reconstruction task into three stages. We can promote high-quality finger-level mesh-image alignment and drive the models together to deliver real-time predictions.
arXiv Detail & Related papers (2021-09-03T20:42:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.