Fine Structure-Aware Sampling: A New Sampling Training Scheme for
Pixel-Aligned Implicit Models in Single-View Human Reconstruction
- URL: http://arxiv.org/abs/2402.19197v1
- Date: Thu, 29 Feb 2024 14:26:46 GMT
- Title: Fine Structure-Aware Sampling: A New Sampling Training Scheme for
Pixel-Aligned Implicit Models in Single-View Human Reconstruction
- Authors: Kennard Yanting Chan, Fayao Liu, Guosheng Lin, Chuan Sheng Foo, Weisi
Lin
- Abstract summary: We introduce Fine Structure-Aware Sampling (FSS) to train pixel-aligned implicit models for single-view human reconstruction.
FSS proactively adapts to the thickness and complexity of surfaces.
It also proposes a mesh thickness loss signal for pixel-aligned implicit models.
- Score: 105.46091601932524
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Pixel-aligned implicit models, such as PIFu, PIFuHD, and ICON, are used for
single-view clothed human reconstruction. These models need to be trained using
a sampling training scheme. Existing sampling training schemes either fail to
capture thin surfaces (e.g. ears, fingers) or cause noisy artefacts in
reconstructed meshes. To address these problems, we introduce Fine
Structure-Aware Sampling (FSS), a new sampling training scheme to train
pixel-aligned implicit models for single-view human reconstruction. FSS
resolves the aforementioned problems by proactively adapting to the thickness
and complexity of surfaces. In addition, unlike existing sampling training
schemes, FSS shows how the normals of sample points can be leveraged in the
training process to improve results. Lastly, to further improve the training
process, FSS proposes a mesh thickness loss signal for pixel-aligned implicit
models. It becomes computationally feasible to introduce this loss once a
slight reworking of the pixel-aligned implicit function framework is carried
out. Our results show that our methods significantly outperform SOTA methods
qualitatively and quantitatively. Our code is publicly available at
https://github.com/kcyt/FSS.
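The core idea of the abstract, adapting the sampling offset to local surface thickness so thin structures such as ears and fingers are not skipped, can be illustrated with a toy sketch. This is a hypothetical simplification written for this listing, not the authors' implementation; the function name, the thickness input, and the inside/outside labeling convention are all our assumptions.

```python
import random

def adaptive_offsets(surface_points, normals, thicknesses, base_sigma=0.05):
    """Toy sketch of thickness-adaptive sampling: draw inside/outside
    point pairs along each surface normal, shrinking the offset where
    the local surface is thin so that thin structures are not tunneled
    through. (Illustrative only; not the FSS implementation.)"""
    samples = []
    for p, n, t in zip(surface_points, normals, thicknesses):
        # Cap the offset at half the local thickness so the "inside"
        # sample cannot cross to the far side of a thin surface.
        sigma = min(base_sigma, 0.5 * t)
        d = abs(random.gauss(0.0, sigma))
        outside = tuple(pi + d * ni for pi, ni in zip(p, n))
        inside = tuple(pi - d * ni for pi, ni in zip(p, n))
        samples.append((outside, 1.0))  # label 1.0: outside the surface
        samples.append((inside, 0.0))   # label 0.0: inside the surface
    return samples
```

A pixel-aligned implicit model would then be supervised to predict the occupancy label at each sampled location.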
Related papers
- Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think [72.48325960659822]
One main bottleneck in training large-scale diffusion models for generation lies in effectively learning high-quality internal representations.
We study this by introducing a straightforward regularization called REPresentation Alignment (REPA), which aligns the projections of noisy input hidden states in denoising networks with clean image representations obtained from external, pretrained visual encoders.
The results are striking: our simple strategy yields significant improvements in both training efficiency and generation quality when applied to popular diffusion and flow-based transformers, such as DiTs and SiTs.
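The alignment term this summary describes can be sketched as a negative cosine similarity between a projected hidden state and a frozen encoder's feature of the clean image. The function name is ours, the projection head is assumed to have been applied already, and this is a generic sketch rather than the paper's exact regularizer.

```python
import math

def alignment_loss(hidden, target):
    """Sketch of a REPA-style regularizer: negative cosine similarity
    between a denoiser's projected hidden state and a pretrained
    encoder's feature of the clean image. Minimizing it pulls the two
    representations into agreement."""
    dot = sum(h * t for h, t in zip(hidden, target))
    norm_h = math.sqrt(sum(h * h for h in hidden))
    norm_t = math.sqrt(sum(t * t for t in target))
    return -dot / (norm_h * norm_t + 1e-8)

# The total objective would add this term to the usual denoising loss,
# weighted by a coefficient (an assumption on our part).
```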
arXiv Detail & Related papers (2024-10-09T14:34:53Z)
Efficient NeRF Optimization -- Not All Samples Remain Equally Hard [9.404889815088161]
We propose an application of online hard sample mining for efficient training of Neural Radiance Fields (NeRF).
NeRF models produce state-of-the-art quality for many 3D reconstruction and rendering tasks but require substantial computational resources.
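Online hard sample mining, as summarized above, can be sketched generically: of the candidate rays traced in a step, keep only those with the highest loss for the backward pass. The function name and selection rule are our illustration, not the paper's exact scheduling.

```python
def mine_hard_rays(ray_losses, budget):
    """Toy online hard-sample mining: rank the rays traced this step by
    their photometric loss and keep only the `budget` hardest ones, so
    compute is spent where the model is still wrong."""
    ranked = sorted(range(len(ray_losses)),
                    key=lambda i: ray_losses[i], reverse=True)
    return ranked[:budget]
```

For example, with per-ray losses `[0.1, 0.9, 0.3, 0.7]` and a budget of 2, the miner keeps ray indices 1 and 3.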
arXiv Detail & Related papers (2024-08-06T13:49:01Z)
SGM-PINN: Sampling Graphical Models for Faster Training of Physics-Informed Neural Networks [4.262342157729123]
SGM-PINN is a graph-based importance sampling framework to improve the training efficacy of Physics-Informed Neural Networks (PINNs).
Experiments demonstrate the advantages of the proposed framework, achieving $3\times$ faster convergence compared to prior state-of-the-art sampling methods.
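The importance-sampling ingredient here can be illustrated with a generic sketch: collocation points with larger PDE residuals are drawn more often for the next batch. SGM-PINN additionally builds a graphical model over the points, which this sketch omits; the function name and residual-proportional rule are our assumptions.

```python
import random

def sample_collocation(points, residuals, k):
    """Sketch of residual-proportional importance sampling for a PINN:
    collocation points whose PDE residual is large are more likely to
    be selected for the next training batch."""
    weights = [abs(r) for r in residuals]
    return random.choices(points, weights=weights, k=k)
```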
arXiv Detail & Related papers (2024-07-10T04:31:50Z)
FSL-Rectifier: Rectify Outliers in Few-Shot Learning via Test-Time Augmentation [7.477118370563593]
Few-shot-learning (FSL) commonly requires a model to identify images (queries) that belong to classes unseen during training.
We generate additional test-class samples by combining original samples with suitable train-class samples via a generative image combiner.
An augmentor then averages the resulting features, yielding more typical class representations.
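The feature-averaging step can be sketched generically: embed the query together with several augmented copies and average the embeddings, so an outlier query moves toward a more typical representation. This is plain test-time-augmentation averaging; the paper's generative image combiner is not reproduced, and the names here are ours.

```python
def rectified_feature(embed, query, augment_fns):
    """Sketch of test-time augmentation averaging: embed the original
    query and its augmented variants, then average the feature vectors
    element-wise to obtain a more typical representation."""
    variants = [query] + [aug(query) for aug in augment_fns]
    feats = [embed(v) for v in variants]
    dim = len(feats[0])
    return [sum(f[i] for f in feats) / len(feats) for i in range(dim)]
```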
arXiv Detail & Related papers (2024-02-28T12:37:30Z)
SDWNet: A Straight Dilated Network with Wavelet Transformation for Image Deblurring [23.86692375792203]
Image deblurring is a computer vision problem that aims to recover a sharp image from a blurred image.
Our model uses dilated convolution to obtain a large receptive field at high spatial resolution.
We propose a novel module using the wavelet transform, which effectively helps the network to recover clear high-frequency texture details.
arXiv Detail & Related papers (2021-10-12T07:58:10Z)
Toward Real-World Super-Resolution via Adaptive Downsampling Models [58.38683820192415]
This study proposes a novel method to simulate an unknown downsampling process without imposing restrictive prior knowledge.
We propose a generalizable low-frequency loss (LFL) in the adversarial training framework to imitate the distribution of target LR images without using any paired examples.
arXiv Detail & Related papers (2021-09-08T06:00:32Z)
3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop [128.07841893637337]
Regression-based methods have recently shown promising results in reconstructing human meshes from monocular images.
Minor deviations in parameters may lead to noticeable misalignment between the estimated meshes and image evidence.
We propose a Pyramidal Mesh Alignment Feedback (PyMAF) loop to leverage a feature pyramid and rectify the predicted parameters.
arXiv Detail & Related papers (2021-03-30T17:07:49Z)
Scalable Deep Compressive Sensing [43.92187349325869]
Most existing deep learning methods train different models for different subsampling ratios, which imposes an additional hardware burden.
We develop a general framework named scalable deep compressive sensing (SDCS) for the scalable sampling and reconstruction (SSR) of all existing end-to-end-trained models.
Experimental results show that models with SDCS can achieve SSR without changing their structure while maintaining good performance, and SDCS outperforms other SSR methods.
arXiv Detail & Related papers (2021-01-20T08:42:50Z)
DeFlow: Learning Complex Image Degradations from Unpaired Data with Conditional Flows [145.83812019515818]
We propose DeFlow, a method for learning image degradations from unpaired data.
We model the degradation process in the latent space of a shared flow-decoder network.
We validate our DeFlow formulation on the task of joint image restoration and super-resolution.
arXiv Detail & Related papers (2021-01-14T18:58:01Z)
Shape My Face: Registering 3D Face Scans by Surface-to-Surface Translation [75.59415852802958]
Shape-My-Face (SMF) is a powerful encoder-decoder architecture based on an improved point cloud encoder, a novel visual attention mechanism, graph convolutional decoders with skip connections, and a specialized mouth model.
Our model provides topologically-sound meshes with minimal supervision, offers faster training time, has orders of magnitude fewer trainable parameters, is more robust to noise, and can generalize to previously unseen datasets.
arXiv Detail & Related papers (2020-12-16T20:02:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.