Related papers: IntrinsicReal: Adapting IntrinsicAnything from Synthetic to Real Objects

URL: http://arxiv.org/abs/2509.00777v1
Date: Sun, 31 Aug 2025 10:15:31 GMT
Title: IntrinsicReal: Adapting IntrinsicAnything from Synthetic to Real Objects
Authors: Xiaokang Wei, Zizheng Yan, Zhangyang Xiong, Yiming Hao, Yipeng Qin, Xiaoguang Han
Abstract summary: Estimating albedo (a.k.a., intrinsic image decomposition) from single RGB images captured in real-world environments presents a significant challenge. We propose IntrinsicReal, a novel domain adaptation framework that bridges the domain gap for real-world intrinsic image decomposition. Our IntrinsicReal significantly outperforms existing methods, achieving state-of-the-art results for albedo estimation on both synthetic and real-world datasets.
Score: 26.907664563000257
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Estimating albedo (a.k.a., intrinsic image decomposition) from single RGB images captured in real-world environments (e.g., the MVImgNet dataset) presents a significant challenge due to the absence of paired images and their ground truth albedos. Therefore, while recent methods (e.g., IntrinsicAnything) have achieved breakthroughs by harnessing powerful diffusion priors, they remain predominantly trained on large-scale synthetic datasets (e.g., Objaverse) and applied directly to real-world RGB images, which ignores the large domain gap between synthetic and real-world data and leads to suboptimal generalization performance. In this work, we address this gap by proposing IntrinsicReal, a novel domain adaptation framework that bridges the above-mentioned domain gap for real-world intrinsic image decomposition. Specifically, our IntrinsicReal adapts IntrinsicAnything to the real domain by fine-tuning it using its high-quality output albedos selected by a novel dual pseudo-labeling strategy: i) pseudo-labeling with an absolute confidence threshold on classifier predictions, and ii) pseudo-labeling using the relative preference ranking of classifier predictions for individual input objects. This strategy is inspired by human evaluation, where identifying the highest-quality outputs is straightforward, but absolute scores become less reliable for sub-optimal cases. In these situations, relative comparisons of outputs become more accurate. To implement this, we propose a novel two-phase pipeline that sequentially applies these pseudo-labeling techniques to effectively adapt IntrinsicAnything to the real domain. Experimental results show that our IntrinsicReal significantly outperforms existing methods, achieving state-of-the-art results for albedo estimation on both synthetic and real-world datasets.
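The dual pseudo-labeling strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the score range, the threshold `tau`, the `top_k` cutoff, the function names, and the choice to apply ranking only to objects left unlabeled by the threshold step are all assumptions made for illustration. The abstract states only that an absolute confidence threshold and a per-object relative ranking of classifier predictions are applied sequentially.

```python
from typing import Dict, List, Tuple

# (candidate_id, classifier_score): hypothetical placeholders for a predicted
# albedo and the quality score a classifier assigns to it.
Candidate = Tuple[str, float]


def select_by_threshold(
    candidates: Dict[str, List[Candidate]], tau: float = 0.9
) -> Dict[str, List[Candidate]]:
    """Phase 1 (assumed form): keep candidates whose absolute score is >= tau."""
    selected: Dict[str, List[Candidate]] = {}
    for obj, cands in candidates.items():
        kept = [c for c in cands if c[1] >= tau]
        if kept:
            selected[obj] = kept
    return selected


def select_by_ranking(
    candidates: Dict[str, List[Candidate]],
    already_labeled: Dict[str, List[Candidate]],
    top_k: int = 1,
) -> Dict[str, List[Candidate]]:
    """Phase 2 (assumed form): for objects with no phase-1 label, keep the
    top_k candidates by relative score, ignoring the absolute value."""
    selected: Dict[str, List[Candidate]] = {}
    for obj, cands in candidates.items():
        if obj in already_labeled or not cands:
            continue
        ranked = sorted(cands, key=lambda c: c[1], reverse=True)
        selected[obj] = ranked[:top_k]
    return selected


# Toy scores for two input objects, each with two candidate albedos.
candidates = {
    "chair": [("chair_a", 0.95), ("chair_b", 0.50)],
    "mug": [("mug_a", 0.60), ("mug_b", 0.40)],
}
phase1 = select_by_threshold(candidates, tau=0.9)
phase2 = select_by_ranking(candidates, phase1, top_k=1)
```

In this toy run, the chair has a candidate that clears the absolute threshold, while the mug does not and is instead labeled by picking its best-ranked candidate. In a full pipeline each selected set would serve as pseudo-labels for a fine-tuning round; that step is omitted here.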

Related papers

Coarse-to-Fine Hierarchical Alignment for UAV-based Human Detection using Diffusion Models [14.696438400081114]
We introduce a three-stage diffusion-based framework designed to transform synthetic data for UAV-based human detection. Cwd explicitly decouples global style and local content domain discrepancies and bridges those gaps using three modules. Our method achieves up to $+14.1$ improvement of mAP50 on Semantic-Drone benchmark.
arXiv Detail & Related papers (2025-12-15T19:57:36Z)
Bridging the Synthetic-Real Gap: Supervised Domain Adaptation for Robust Spacecraft 6-DoF Pose Estimation [13.83897333268682]
Spacecraft Pose Estimation (SPE) is a fundamental capability for autonomous space operations such as rendezvous, docking, and in-orbit docking. Existing domain adaptation approaches aim to mitigate this issue but often underperform when a modest number of labeled target samples are available. We propose the first Supervised Domain Adaptation (SDA) framework tailored for SPE keypoint regression.
arXiv Detail & Related papers (2025-09-17T08:03:05Z)
From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios [66.57089888022414]
We introduce DenseWorld, a benchmark spanning a broad set of 25 dense prediction tasks that correspond to urgent real-world applications. We then propose DenseDiT, which exploits generative models' visual priors to perform diverse real-world dense prediction tasks through a unified strategy. DenseDiT achieves superior results using less than 0.01% training data of baselines, underscoring its practical value for real-world deployment.
arXiv Detail & Related papers (2025-06-25T09:40:50Z)
Adaptive Face Recognition Using Adversarial Information Network [57.29464116557734]
Face recognition models often degenerate when training data are different from testing data. We propose a novel adversarial information network (AIN) to address it.
arXiv Detail & Related papers (2023-05-23T02:14:11Z)
One-Shot Domain Adaptive and Generalizable Semantic Segmentation with Class-Aware Cross-Domain Transformers [96.51828911883456]
Unsupervised sim-to-real domain adaptation (UDA) for semantic segmentation aims to improve the real-world test performance of a model trained on simulated data. Traditional UDA often assumes that there are abundant unlabeled real-world data samples available during training for the adaptation. We explore the one-shot unsupervised sim-to-real domain adaptation (OSUDA) and generalization problem, where only one real-world data sample is available.
arXiv Detail & Related papers (2022-12-14T15:54:15Z)
Unsupervised Domain Adaptive Salient Object Detection Through Uncertainty-Aware Pseudo-Label Learning [104.00026716576546]
We propose to learn saliency from synthetic but clean labels, which naturally has higher pixel-labeling quality without the effort of manual annotations. We show that our proposed method outperforms the existing state-of-the-art deep unsupervised SOD methods on several benchmark datasets.
arXiv Detail & Related papers (2022-02-26T16:03:55Z)
Low-confidence Samples Matter for Domain Adaptation [47.552605279925736]
Domain adaptation (DA) aims to transfer knowledge from a label-rich source domain to a related but label-scarce target domain. We propose a novel contrastive learning method by processing low-confidence samples. We evaluate the proposed method in both unsupervised and semi-supervised DA settings.
arXiv Detail & Related papers (2022-02-06T15:45:45Z)
Domain Adaptation for Underwater Image Enhancement [51.71570701102219]
We propose a novel Two-phase Underwater Domain Adaptation network (TUDA) to minimize the inter-domain and intra-domain gap. In the first phase, a new dual-alignment network is designed, including a translation part for enhancing realism of input images, followed by an enhancement part. In the second phase, we perform an easy-hard classification of real data according to the assessed quality of enhanced images, where a rank-based underwater quality assessment method is embedded.
arXiv Detail & Related papers (2021-08-22T06:38:19Z)
Content Disentanglement for Semantically Consistent Synthetic-to-Real Domain Adaptation in Urban Traffic Scenes [39.38387505091648]
Synthetic data generation is an appealing approach to generate novel traffic scenarios in autonomous driving. Deep learning techniques trained solely on synthetic data encounter dramatic performance drops when they are tested on real data. We propose a new, unsupervised, end-to-end domain adaptation network architecture that enables semantically consistent domain adaptation between synthetic and real data.
arXiv Detail & Related papers (2021-05-18T17:42:26Z)
Gradient Matching for Domain Generalization [93.04545793814486]
A critical requirement of machine learning systems is their ability to generalize to unseen domains. We propose an inter-domain gradient matching objective that targets domain generalization. We derive a simpler first-order algorithm named Fish that approximates its optimization.
arXiv Detail & Related papers (2021-04-20T12:55:37Z)
Phase Consistent Ecological Domain Adaptation [76.75730500201536]
We focus on the task of semantic segmentation, where annotated synthetic data are aplenty, but annotating real data is laborious. The first criterion, inspired by visual psychophysics, is that the map between the two image domains be phase-preserving. The second criterion aims to leverage ecological statistics, or regularities in the scene which are manifest in any image of it, regardless of the characteristics of the illuminant or the imaging sensor.
arXiv Detail & Related papers (2020-04-10T06:58:03Z)
Domain Decluttering: Simplifying Images to Mitigate Synthetic-Real Domain Shift and Improve Depth Estimation [16.153683223016973]
We develop an attention module that learns to identify and remove difficult out-of-domain regions in real images. Visualizing the removed regions provides interpretable insights into the synthetic-real domain gap.
arXiv Detail & Related papers (2020-02-27T14:28:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.