NeuralFur: Animal Fur Reconstruction From Multi-View Images
- URL: http://arxiv.org/abs/2601.12481v1
- Date: Sun, 18 Jan 2026 16:46:38 GMT
- Title: NeuralFur: Animal Fur Reconstruction From Multi-View Images
- Authors: Vanessa Sklyarova, Berna Kabadayi, Anastasios Yiannakidis, Giorgio Becherini, Michael J. Black, Justus Thies,
- Abstract summary: Reconstructing realistic animal fur geometry from images is a challenging task due to the fine-scale details, self-occlusion, and view-dependent appearance of fur. We present the first multi-view-based method for high-fidelity 3D fur modeling of animals using a strand-based representation.
- Score: 56.497408146667205
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reconstructing realistic animal fur geometry from images is a challenging task due to the fine-scale details, self-occlusion, and view-dependent appearance of fur. In contrast to human hairstyle reconstruction, there are also no datasets that can be leveraged to learn a fur prior for different animals. In this work, we present the first multi-view-based method for high-fidelity 3D fur modeling of animals using a strand-based representation, leveraging the general knowledge of a vision language model. Given multi-view RGB images, we first reconstruct a coarse surface geometry using traditional multi-view stereo techniques. We then use a vision language model (VLM) system to retrieve information about the realistic length structure of the fur for each part of the body. We use this knowledge to construct the animal's furless geometry and grow strands atop it. The fur reconstruction is supervised with both geometric and photometric losses computed from multi-view images. To mitigate orientation ambiguities stemming from the Gabor filters that are applied to the input images, we additionally utilize the VLM to guide the strands' growth direction and their relation to the gravity vector, which we incorporate as a loss. With this new schema of using a VLM to guide 3D reconstruction from multi-view inputs, we show generalization across a variety of animals with different fur types. For additional results and code, please refer to https://neuralfur.is.tue.mpg.de.
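The abstract mentions a loss relating strand growth direction to the gravity vector, but the paper's exact formulation is not given here. A minimal sketch of one plausible form of such a gravity-alignment penalty on strand segment directions is shown below; all function names, the array layout, and the per-strand weighting (e.g. VLM-derived) are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def gravity_alignment_loss(strands, weights, gravity=np.array([0.0, -1.0, 0.0])):
    """Penalize strand segments that deviate from the gravity direction.

    strands : (S, P, 3) array of S strands with P points each.
    weights : (S,) per-strand weights (hypothetically VLM-derived;
              1.0 = fur that should hang downward, 0.0 = ignore).
    Returns a scalar: the weighted mean of (1 - cos) between each
    segment direction and gravity, 0 when perfectly aligned.
    """
    # Direction vectors of consecutive segments along each strand, normalized.
    seg = strands[:, 1:] - strands[:, :-1]                          # (S, P-1, 3)
    seg = seg / (np.linalg.norm(seg, axis=-1, keepdims=True) + 1e-8)
    # Cosine between each segment and the gravity direction.
    cos = seg @ gravity                                             # (S, P-1)
    per_strand = (1.0 - cos).mean(axis=-1)                          # (S,)
    return float((weights * per_strand).mean())
```

Under this sketch, a strand hanging straight down incurs zero loss, while one pointing straight up incurs the maximum of 2 per weighted segment; in an optimization setting the same expression would be written with a differentiable framework rather than NumPy.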
Related papers
- BigMaQ: A Big Macaque Motion and Animation Dataset Bridging Image and 3D Pose Representations [38.868479054644354]
Recognition of dynamic and social behavior in animals is fundamental for advancing ethology, ecology, medicine, and neuroscience. Recent progress in deep learning has enabled automated behavior recognition from video, yet accurate reconstruction of the three-dimensional (3D) pose and shape has not been integrated into this process. BigMaQ establishes the first dataset that integrates dynamic 3D pose-shape representations into the learning task of animal action recognition.
arXiv Detail & Related papers (2026-02-23T14:21:15Z) - Reconstructing Animals and the Wild [51.98009864071166]
We propose a method to reconstruct natural scenes from single images. We base our approach on advances leveraging the strong world priors in Large Language Models. We propose a synthetic dataset comprising one million images and thousands of assets.
arXiv Detail & Related papers (2024-11-27T23:24:27Z) - Learning the 3D Fauna of the Web [70.01196719128912]
We develop 3D-Fauna, an approach that learns a pan-category deformable 3D animal model for more than 100 animal species jointly.
One crucial bottleneck of modeling animals is the limited availability of training data.
We show that prior category-specific attempts fail to generalize to rare species with limited training images.
arXiv Detail & Related papers (2024-01-04T18:32:48Z) - MagicPony: Learning Articulated 3D Animals in the Wild [81.63322697335228]
We present a new method, dubbed MagicPony, that learns this predictor purely from in-the-wild single-view images of the object category.
At its core is an implicit-explicit representation of articulated shape and appearance, combining the strengths of neural fields and meshes.
arXiv Detail & Related papers (2022-11-22T18:59:31Z) - LASSIE: Learning Articulated Shapes from Sparse Image Ensemble via 3D Part Discovery [72.3681707384754]
We propose a practical problem setting to estimate 3D pose and shape of animals given only a few in-the-wild images of a particular animal species.
We do not assume any form of 2D or 3D ground-truth annotations, nor do we leverage any multi-view or temporal information.
Following these insights, we propose LASSIE, a novel optimization framework which discovers 3D parts in a self-supervised manner.
arXiv Detail & Related papers (2022-07-07T17:00:07Z) - Coarse-to-fine Animal Pose and Shape Estimation [67.39635503744395]
We propose a coarse-to-fine approach to reconstruct 3D animal mesh from a single image.
The coarse estimation stage first estimates the pose, shape and translation parameters of the SMAL model.
The estimated meshes are then used as a starting point by a graph convolutional network (GCN) to predict a per-vertex deformation in the refinement stage.
arXiv Detail & Related papers (2021-11-16T01:27:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.