Human-Aligned Evaluation of a Pixel-wise DNN Color Constancy Model
- URL: http://arxiv.org/abs/2602.13887v1
- Date: Sat, 14 Feb 2026 21:03:29 GMT
- Title: Human-Aligned Evaluation of a Pixel-wise DNN Color Constancy Model
- Authors: Hamed Heidari-Gorji, Raquel Gil Rodriguez, Karl R. Gegenfurtner
- Abstract summary: We compare and study a model and human performance with respect to established color constancy mechanisms. Model performance was assessed using the same achromatic object selection task employed in the human experiments. Results show a strong correspondence between the model and human behavior.
- Score: 0.06554326244334864
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We previously investigated color constancy in photorealistic virtual reality (VR) and developed a Deep Neural Network (DNN) that predicts reflectance from rendered images. Here, we combine both approaches to compare model and human performance with respect to established color constancy mechanisms: local surround, maximum flux, and spatial mean. Rather than evaluating the model against physical ground truth, we assessed model performance using the same achromatic object selection task employed in the human experiments. The model, a ResNet-based U-Net from our previous work, was pre-trained on rendered images to predict surface reflectance. We then applied transfer learning, fine-tuning only the network's decoder on images from the baseline VR condition. To parallel the human experiment, the model's output was used to perform the same achromatic object selection task across all conditions. Results show a strong correspondence between model and human behavior. Both achieved high constancy under baseline conditions and showed similar, condition-dependent performance declines when the local surround or spatial mean color cues were removed.
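The achromatic object selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each candidate object is represented by a list of predicted reflectance values as RGB triplets, and it treats "most achromatic" as "smallest distance from the gray axis in RGB", which is a simplifying stand-in for whatever color space and criterion the paper actually uses. The names `mean_rgb`, `chroma`, `select_achromatic`, and the candidate labels are hypothetical.

```python
from statistics import fmean


def mean_rgb(pixels):
    """Average a list of (r, g, b) reflectance triplets channel by channel."""
    return tuple(fmean(channel) for channel in zip(*pixels))


def chroma(rgb):
    """Euclidean distance of an RGB triplet from the gray axis (r = g = b).

    Assumption: RGB distance from gray is used here only as a simple proxy
    for perceived chromaticity.
    """
    gray = fmean(rgb)
    return sum((c - gray) ** 2 for c in rgb) ** 0.5


def select_achromatic(candidates):
    """Return the label of the candidate whose mean predicted reflectance
    lies closest to the gray axis."""
    return min(candidates, key=lambda name: chroma(mean_rgb(candidates[name])))


# Hypothetical predicted reflectances for three candidate objects.
candidates = {
    "obj_a": [(0.50, 0.50, 0.50), (0.52, 0.50, 0.48)],  # nearly gray
    "obj_b": [(0.70, 0.40, 0.30)],                      # reddish
    "obj_c": [(0.30, 0.45, 0.60)],                      # bluish
}
choice = select_achromatic(candidates)
```

Applying the same selection rule to the model's output in every condition is what would allow its choices to be compared directly with the choices human observers made in the same task.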
Related papers
- Human-level 3D shape perception emerges from multi-view learning [63.048728487674815]
We develop a modeling framework that predicts human 3D shape inferences for arbitrary objects. We achieve this with a novel class of neural networks trained using a visual-spatial objective over naturalistic sensory data. We find that human-level 3D perception can emerge from a simple, scalable learning objective over naturalistic visual-spatial data.
arXiv Detail & Related papers (2026-02-19T18:56:05Z) - Differentiable Inverse Graphics for Zero-shot Scene Reconstruction and Robot Grasping [0.820984376071696]
We introduce a differentiable neuro-graphics model that combines neural foundation models with physics-based differentiable rendering to perform zero-shot scene reconstruction and robot grasping. Our approach offers a pathway towards more data efficient, interpretable and generalizable robot autonomy in novel environments.
arXiv Detail & Related papers (2026-02-04T20:33:50Z) - COLIBRI Fuzzy Model: Color Linguistic-Based Representation and Interpretation [0.0]
This paper introduces the Human Perception-Based Fuzzy Color Model, COLIBRI, to bridge the gap between computational color representations and human visual perception. The proposed model uses fuzzy sets and logic to create a framework for color categorization. Our findings are significant for fields such as design, artificial intelligence, marketing, and human-computer interaction.
arXiv Detail & Related papers (2025-07-15T17:01:45Z) - Object Pose Estimation Using Implicit Representation For Transparent Objects [0.0]
The render-and-compare method renders the object from multiple views and compares it against the given 2D image.
We show that if the object is represented as an implicit (neural) representation in the form of Neural Radiance Field (NeRF), it exhibits a more realistic rendering of the actual scene.
We evaluated our NeRF implementation of the render-and-compare method on transparent datasets and found that it surpassed the current state-of-the-art results.
arXiv Detail & Related papers (2024-10-17T11:51:12Z) - Divergences in Color Perception between Deep Neural Networks and Humans [3.0315685825606633]
We develop experiments for evaluating the perceptual coherence of color embeddings in deep neural networks (DNNs).
We assess how well these algorithms predict human color similarity judgments collected via an online survey.
We compare DNN performance against an interpretable and cognitively plausible model of color perception based on wavelet decomposition.
arXiv Detail & Related papers (2023-09-11T20:26:40Z) - Relightify: Relightable 3D Faces from a Single Image via Diffusion Models [86.3927548091627]
We present the first approach to use diffusion models as a prior for highly accurate 3D facial BRDF reconstruction from a single image.
In contrast to existing methods, we directly acquire the observed texture from the input image, resulting in a more faithful and consistent estimation.
arXiv Detail & Related papers (2023-05-10T11:57:49Z) - NeRF-DS: Neural Radiance Fields for Dynamic Specular Objects [63.04781030984006]
Dynamic Neural Radiance Field (NeRF) is a powerful algorithm capable of rendering photo-realistic novel view images from a monocular RGB video of a dynamic scene.
We address the limitation by reformulating the neural radiance field function to be conditioned on surface position and orientation in the observation space.
We evaluate our model based on the novel view synthesis quality with a self-collected dataset of different moving specular objects in realistic environments.
arXiv Detail & Related papers (2023-03-25T11:03:53Z) - DeepDC: Deep Distance Correlation as a Perceptual Image Quality Evaluator [53.57431705309919]
ImageNet pre-trained deep neural networks (DNNs) show notable transferability for building effective image quality assessment (IQA) models.
We develop a novel full-reference IQA (FR-IQA) model based exclusively on pre-trained DNN features.
We conduct comprehensive experiments to demonstrate the superiority of the proposed quality model on five standard IQA datasets.
arXiv Detail & Related papers (2022-11-09T14:57:27Z) - CONVIQT: Contrastive Video Quality Estimator [63.749184706461826]
Perceptual video quality assessment (VQA) is an integral component of many streaming and video sharing platforms.
Here we consider the problem of learning perceptually relevant video quality representations in a self-supervised manner.
Our results indicate that compelling representations with perceptual bearing can be obtained using self-supervised learning.
arXiv Detail & Related papers (2022-06-29T15:22:01Z) - Bayesian deep learning of affordances from RGB images [5.939410304994348]
We present a deep learning method to predict the affordances available in the environment directly from RGB images.
Building on previous work on socially accepted affordances, our model uses a multiscale CNN that combines local and global information from the object and the full image.
Our results show marginally better performance of deep ensembles compared to MC-dropout on the Brier score and the Expected Error.
arXiv Detail & Related papers (2021-09-27T07:39:47Z) - Appearance Consensus Driven Self-Supervised Human Mesh Recovery [67.20942777949793]
We present a self-supervised human mesh recovery framework to infer human pose and shape from monocular images.
We achieve state-of-the-art results on the standard model-based 3D pose estimation benchmarks.
The resulting colored mesh prediction opens up the usage of our framework for a variety of appearance-related tasks beyond the pose and shape estimation.
arXiv Detail & Related papers (2020-08-04T05:40:39Z)