A novel illumination condition varied image dataset-Food Vision Dataset
(FVD) for fair and reliable consumer acceptability predictions from food
- URL: http://arxiv.org/abs/2209.06967v1
- Date: Wed, 14 Sep 2022 22:46:42 GMT
- Title: A novel illumination condition varied image dataset-Food Vision Dataset
(FVD) for fair and reliable consumer acceptability predictions from food
- Authors: Swarna Sethu (1), Dongyi Wang (1 and 2) ((1) Department of Biological
& Agricultural Engineering, University of Arkansas, Fayetteville; (2)
Department of Food Science and Department of Biological & Agricultural
Engineering, University of Arkansas, Fayetteville)
- Abstract summary: The authors present a novel dataset, the Food Vision Dataset (FVD), to quantify illumination effects on human and computer perception.
FVD consists of 675 images captured under 3 different power and 5 different temperature settings, every alternate day for five such days.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advances in artificial intelligence promote a wide range of computer
vision applications in many different domains. Digital cameras, acting as human
eyes, can perceive fundamental object properties, such as shapes and colors,
and can be further used for conducting high-level tasks, such as image
classification and object detection. Human perception has been widely
recognized as the ground truth for training and evaluating computer vision
models. However, in some cases, humans can be deceived by what they see.
Well-functioning human vision relies on stable external lighting, and unnatural
illumination can distort human perception of the essential characteristics of
goods. To evaluate the illumination effects on human and computer perception,
the group presents a novel dataset, the Food Vision Dataset (FVD), to establish an
evaluation benchmark for quantifying illumination effects and to push forward
the development of illumination estimation methods for fair and reliable consumer
acceptability prediction from food appearances. FVD consists of 675 images
captured under 3 different power settings and 5 different temperature settings,
every alternate day for five such days.
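Reading the reported counts together: 3 power levels × 5 temperature settings gives 15 illumination conditions, and 675 images / 15 conditions = 45 images per condition, i.e. 9 per condition on each of the 5 capture days. Below is a minimal Python sketch of how such a condition grid could be enumerated and the total sanity-checked; the specific power labels, temperature values, day indices, and file-naming scheme are illustrative assumptions, not details taken from the paper.

```python
from itertools import product

# Hypothetical illumination grid for an FVD-style dataset. The paper
# reports 3 power settings and 5 temperature settings; the concrete
# values below are illustrative placeholders, not from the paper.
POWER_LEVELS = ["low", "medium", "high"]       # 3 power settings (assumed labels)
TEMPERATURES = [2700, 3500, 4100, 5000, 6500]  # 5 temperature settings (assumed values)
CAPTURE_DAYS = [1, 3, 5, 7, 9]                 # every alternate day, five capture days

conditions = list(product(POWER_LEVELS, TEMPERATURES))
assert len(conditions) == 15  # 3 x 5 illumination conditions

# 675 total images / 15 conditions = 45 images per condition,
# i.e. 9 images per condition per capture day (45 / 5 days).
TOTAL_IMAGES = 675
per_condition = TOTAL_IMAGES // len(conditions)             # 45
per_condition_per_day = per_condition // len(CAPTURE_DAYS)  # 9

# Enumerate hypothetical file names for the full grid; the nested loop
# reproduces the 675-image total exactly, which makes it a convenient
# consistency check when indexing or splitting the dataset.
names = [
    f"day{day:02d}_{power}_{temp}K_{shot:02d}.png"
    for day in CAPTURE_DAYS
    for power, temp in conditions
    for shot in range(per_condition_per_day)
]
assert len(names) == TOTAL_IMAGES
```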
Related papers
- Human-level 3D shape perception emerges from multi-view learning [63.048728487674815]
We develop a modeling framework that predicts human 3D shape inferences for arbitrary objects.
We achieve this with a novel class of neural networks trained using a visual-spatial objective over naturalistic sensory data.
We find that human-level 3D perception can emerge from a simple, scalable learning objective over naturalistic visual-spatial data.
arXiv Detail & Related papers (2026-02-19T18:56:05Z)
- Surveillance Facial Image Quality Assessment: A Multi-dimensional Dataset and Lightweight Model [59.39390911456143]
We propose the first comprehensive study on surveillance facial image quality assessment (SFIQA).
SFIQA-Bench consists of 5,004 surveillance facial images captured by three widely deployed surveillance cameras in real-world scenarios.
A subjective experiment is conducted to collect quality ratings along six dimensions: noise, sharpness, colorfulness, contrast, fidelity, and overall quality.
arXiv Detail & Related papers (2026-02-07T06:51:03Z)
- Testing the Limits of Fine-Tuning for Improving Visual Cognition in Vision Language Models [51.58859621164201]
We introduce visual stimuli and human judgments on visual cognition tasks to evaluate performance across cognitive domains.
We fine-tune models on ground truth data for intuitive physics and causal reasoning.
We find that task-specific fine-tuning does not contribute to robust human-like generalization to data with other visual characteristics.
arXiv Detail & Related papers (2025-02-21T18:58:30Z)
- When Does Perceptual Alignment Benefit Vision Representations? [76.32336818860965]
We investigate how aligning vision model representations to human perceptual judgments impacts their usability.
We find that aligning models to perceptual judgments yields representations that improve upon the original backbones across many downstream tasks.
Our results suggest that injecting an inductive bias about human perceptual knowledge into vision models can contribute to better representations.
arXiv Detail & Related papers (2024-10-14T17:59:58Z)
- Estimating the distribution of numerosity and non-numerical visual magnitudes in natural scenes using computer vision [0.08192907805418582]
We show that in natural visual scenes the frequency of appearance of different numerosities follows a power law distribution.
We show that the correlational structure for numerosity and continuous magnitudes is stable across datasets and scene types.
arXiv Detail & Related papers (2024-09-17T09:49:29Z)
- Evaluating Multiview Object Consistency in Humans and Image Models [68.36073530804296]
We leverage an experimental design from the cognitive sciences which requires zero-shot visual inferences about object shape.
We collect 35K trials of behavioral data from over 500 participants.
We then evaluate the performance of common vision models.
arXiv Detail & Related papers (2024-09-09T17:59:13Z)
- BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation [57.40024206484446]
We introduce the BEHAVIOR Vision Suite (BVS), a set of tools and assets to generate fully customized synthetic data for systematic evaluation of computer vision models.
BVS supports a large number of adjustable parameters at the scene level.
We showcase three example application scenarios.
arXiv Detail & Related papers (2024-05-15T17:57:56Z)
- Relightable Neural Actor with Intrinsic Decomposition and Pose Control [80.06094206522668]
We propose Relightable Neural Actor, a new video-based method for learning a pose-driven neural human model that can be relighted.
For training, our method solely requires a multi-view recording of the human under a known, but static lighting condition.
To evaluate our approach in real-world scenarios, we collect a new dataset with four identities recorded under different light conditions, indoors and outdoors.
arXiv Detail & Related papers (2023-12-18T14:30:13Z)
- Self-Supervised Visual Representation Learning on Food Images [6.602838826255494]
Existing deep learning-based methods learn the visual representation for downstream tasks based on human annotation of each food image.
Most food images in real life are obtained without labels, and data annotation requires plenty of time and human effort.
In this paper, we focus on the implementation and analysis of existing representative self-supervised learning methods on food images.
arXiv Detail & Related papers (2023-03-16T02:31:51Z)
- ColorSense: A Study on Color Vision in Machine Visual Recognition [57.916512479603064]
We collect 110,000 non-trivial human annotations of foreground and background color labels from visual recognition benchmarks.
We validate the use of our datasets by demonstrating that the level of color discrimination has a dominating effect on the performance of machine perception models.
Our findings suggest that object recognition tasks such as classification and localization are susceptible to color vision bias.
arXiv Detail & Related papers (2022-12-16T18:51:41Z)
- Improving generalization by mimicking the human visual diet [34.32585612888424]
We present a new perspective on bridging the generalization gap between biological and computer vision.
Our results demonstrate that incorporating variations and contextual cues ubiquitous in the human visual training data (visual diet) significantly improves generalization to real-world transformations.
arXiv Detail & Related papers (2022-06-15T20:32:24Z)
- HSPACE: Synthetic Parametric Humans Animated in Complex Environments [67.8628917474705]
We build a large-scale photo-realistic dataset, Human-SPACE, of animated humans placed in complex indoor and outdoor environments.
We combine a hundred diverse individuals of varying ages, genders, proportions, and ethnicities, with hundreds of motions and scenes, in order to generate an initial dataset of over 1 million frames.
Assets are generated automatically, at scale, and are compatible with existing real-time rendering and game engines.
arXiv Detail & Related papers (2021-12-23T22:27:55Z)
- HANDS: A Multimodal Dataset for Modeling Towards Human Grasp Intent Inference in Prosthetic Hands [3.7886097009023376]
Advanced prosthetic hands of the future are anticipated to benefit from improved shared control between a robotic hand and its human user.
Multimodal sensor data may include various environment sensors, including vision, as well as human physiology and behavior sensors.
A fusion methodology for environmental state and human intent estimation can combine these sources of evidence in order to help prosthetic hand motion planning and control.
arXiv Detail & Related papers (2021-03-08T15:51:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.