Guardians of the Hair: Rescuing Soft Boundaries in Depth, Stereo, and Novel Views
- URL: http://arxiv.org/abs/2601.03362v1
- Date: Tue, 06 Jan 2026 19:02:34 GMT
- Title: Guardians of the Hair: Rescuing Soft Boundaries in Depth, Stereo, and Novel Views
- Authors: Xiang Zhang, Yang Zhang, Lukas Mehl, Markus Gross, Christopher Schroers
- Abstract summary: This paper introduces HairGuard, a framework designed to recover fine-grained soft boundary details in 3D vision tasks. Experiments demonstrate that HairGuard achieves state-of-the-art performance across monocular depth estimation, stereo image/video conversion, and novel view synthesis.
- Score: 20.270591069701677
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Soft boundaries, like thin hairs, are commonly observed in natural and computer-generated imagery, but they remain challenging for 3D vision due to the ambiguous mixing of foreground and background cues. This paper introduces Guardians of the Hair (HairGuard), a framework designed to recover fine-grained soft boundary details in 3D vision tasks. Specifically, we first propose a novel data curation pipeline that leverages image matting datasets for training and design a depth fixer network to automatically identify soft boundary regions. With a gated residual module, the depth fixer refines depth precisely around soft boundaries while maintaining global depth quality, allowing plug-and-play integration with state-of-the-art depth models. For view synthesis, we perform depth-based forward warping to retain high-fidelity textures, followed by a generative scene painter that fills disoccluded regions and eliminates redundant background artifacts within soft boundaries. Finally, a color fuser adaptively combines warped and inpainted results to produce novel views with consistent geometry and fine-grained details. Extensive experiments demonstrate that HairGuard achieves state-of-the-art performance across monocular depth estimation, stereo image/video conversion, and novel view synthesis, with significant improvements in soft boundary regions.
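The view-synthesis stage described above, depth-based forward warping followed by inpainting of disoccluded regions, can be illustrated with a minimal sketch. This is not the paper's implementation: the disparity scaling (`max_disp`), the inverse-depth normalization, and the per-pixel splatting loop are simplifying assumptions made for illustration; the paper's generative scene painter and color fuser are represented here only by the returned `hole` mask that marks regions left for inpainting.

```python
import numpy as np

def forward_warp(image, depth, max_disp=8.0):
    """Warp `image` (H, W, 3) to a horizontally shifted view using `depth` (H, W).

    Pixels are splatted sideways by a disparity proportional to inverse
    depth; a z-buffer keeps the nearest surface when several source pixels
    land on the same target. `max_disp` (a hypothetical parameter, not from
    the paper) stands in for the baseline/focal-length scaling.
    """
    h, w, _ = image.shape
    inv_depth = 1.0 / np.clip(depth, 1e-6, None)
    disparity = max_disp * inv_depth / inv_depth.max()  # shift in pixels

    warped = np.zeros_like(image)
    zbuf = np.full((h, w), np.inf)       # nearest-depth-wins buffer
    hole = np.ones((h, w), dtype=bool)   # disoccluded pixels, left for inpainting

    ys, xs = np.mgrid[0:h, 0:w]
    xt = np.round(xs + disparity).astype(int)
    valid = (xt >= 0) & (xt < w)
    for y, x, tx in zip(ys[valid], xs[valid], xt[valid]):
        if depth[y, x] < zbuf[y, tx]:    # closer surfaces overwrite farther ones
            zbuf[y, tx] = depth[y, x]
            warped[y, tx] = image[y, x]
            hole[y, tx] = False
    return warped, hole
```

In the full pipeline, `hole` would be filled by a generative inpainter and the result blended with `warped` by an adaptive color fuser; the sketch stops at the geometric warp itself.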
Related papers
- LaFiTe: A Generative Latent Field for 3D Native Texturing [72.05710323154288]
Existing native approaches are hampered by the absence of a powerful and versatile representation, which severely limits the fidelity and generality of their generated textures. We introduce LaFiTe, which generates high-quality textures constrained by a sparse color representation and UV parameterization.
arXiv Detail & Related papers (2025-12-04T13:33:49Z)
- Learning Fine-Grained Geometry for Sparse-View Splatting via Cascade Depth Loss [15.425094458647933]
We introduce Hierarchical Depth-Guided Splatting (HDGS), a depth supervision framework that progressively refines geometry from coarse to fine levels. By enforcing multi-scale depth consistency, our method substantially improves structural fidelity in sparse-view scenarios.
arXiv Detail & Related papers (2025-05-28T12:16:42Z)
- Improving Geometric Consistency for 360-Degree Neural Radiance Fields in Indoor Scenarios [3.5229503563299915]
Photo-realistic rendering and novel view synthesis play a crucial role in human-computer interaction tasks. NeRFs often struggle in large, low-textured areas, producing cloudy artifacts known as "floaters". We introduce a novel depth loss function to enhance rendering quality in challenging, low-feature regions.
arXiv Detail & Related papers (2025-03-17T20:30:48Z)
- DepthLab: From Partial to Complete [80.58276388743306]
Missing values remain a common challenge for depth data across its wide range of applications. This work bridges this gap with DepthLab, a foundation depth inpainting model powered by image diffusion priors. Our approach proves its worth in various downstream tasks, including 3D scene inpainting, text-to-3D scene generation, sparse-view reconstruction with DUST3R, and LiDAR depth completion.
arXiv Detail & Related papers (2024-12-24T04:16:38Z)
- Depth-aware Volume Attention for Texture-less Stereo Matching [67.46404479356896]
We propose a lightweight volume refinement scheme to tackle the texture deterioration in practical outdoor scenarios.
We introduce a depth volume supervised by the ground-truth depth map, capturing the relative hierarchy of image texture.
Local fine structure and context are emphasized to mitigate ambiguity and redundancy during volume aggregation.
arXiv Detail & Related papers (2024-02-14T04:07:44Z)
- Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement [78.48648360358193]
We present a novel framework that generates textured surface meshes from images.
Our approach begins by efficiently initializing the geometry and view-dependent appearance with a NeRF.
We jointly refine the appearance with geometry and bake it into texture images for real-time rendering.
arXiv Detail & Related papers (2023-03-03T17:14:44Z)
- High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
- DepthGAN: GAN-based Depth Generation of Indoor Scenes from Semantic Layouts [8.760217259912231]
We propose DepthGAN, a novel method of generating depth maps with only semantic layouts as input.
We show that DepthGAN achieves superior performance both on quantitative results and visual effects in the depth generation task.
We also show that 3D indoor scenes can be reconstructed by our generated depth maps with reasonable structure and spatial coherency.
arXiv Detail & Related papers (2022-03-22T04:18:45Z)
- SelfDeco: Self-Supervised Monocular Depth Completion in Challenging Indoor Environments [50.761917113239996]
We present a novel algorithm for self-supervised monocular depth completion.
Our approach is based on training a neural network that requires only sparse depth measurements and corresponding monocular video sequences without dense depth labels.
Our self-supervised algorithm is designed for challenging indoor environments with textureless regions, glossy and transparent surfaces, non-Lambertian surfaces, moving people, long and diverse depth ranges, and scenes captured with complex ego-motion.
arXiv Detail & Related papers (2020-11-10T08:55:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.