Enriching ImageNet with Human Similarity Judgments and Psychological
Embeddings
- URL: http://arxiv.org/abs/2011.11015v1
- Date: Sun, 22 Nov 2020 13:41:54 GMT
- Title: Enriching ImageNet with Human Similarity Judgments and Psychological
Embeddings
- Authors: Brett D. Roads, Bradley C. Love
- Abstract summary: We introduce a dataset that embodies the task-general capabilities of human perception and reasoning.
The Human Similarity Judgments extension to ImageNet (ImageNet-HSJ) is composed of human similarity judgments.
The new dataset supports a range of task and performance metrics, including the evaluation of unsupervised learning algorithms.
- Score: 7.6146285961466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Advances in object recognition flourished in part because of the availability
of high-quality datasets and associated benchmarks. However, these
benchmarks---such as ILSVRC---are relatively task-specific, focusing
predominantly on predicting class labels. We introduce a publicly-available
dataset that embodies the task-general capabilities of human perception and
reasoning. The Human Similarity Judgments extension to ImageNet (ImageNet-HSJ)
is composed of human similarity judgments that supplement the ILSVRC validation
set. The new dataset supports a range of task and performance metrics,
including the evaluation of unsupervised learning algorithms. We demonstrate
two methods of assessment: using the similarity judgments directly and using a
psychological embedding trained on the similarity judgments. This embedding
space contains an order of magnitude more points (i.e., images) than previous
efforts based on human judgments. Scaling to the full 50,000-image set was made
possible through a selective sampling process that used variational Bayesian
inference and model ensembles to sample aspects of the embedding space that
were most uncertain. This methodological innovation not only enables scaling,
but should also improve the quality of solutions by focusing sampling where it
is needed. To demonstrate the utility of ImageNet-HSJ, we used the similarity
ratings and the embedding space to evaluate how well several popular models
conform to human similarity judgments. One finding is that more complex models
that perform better on task-specific benchmarks do not better conform to human
semantic judgments. In addition to the human similarity judgments, pre-trained
psychological embeddings and code for inferring variational embeddings are made
publicly available. Collectively, ImageNet-HSJ assets support the appraisal of
internal representations and the development of more human-like models.
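The abstract's selective-sampling idea — query the trials where the model ensemble is most uncertain — can be sketched in a simplified form. The sketch below stands in a toy ensemble of random embeddings for the paper's actual variational posteriors, and ranks image pairs by the variance of their pairwise distances across ensemble members; it is an illustration of ensemble-uncertainty sampling, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in: n_models independently inferred embeddings of
# n_items images (the paper infers these with variational Bayes).
n_models, n_items, n_dim = 5, 100, 2
ensemble = rng.normal(size=(n_models, n_items, n_dim))

def pairwise_dists(emb):
    # Euclidean distance matrix (n_items x n_items) for one embedding.
    diff = emb[:, None, :] - emb[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

# One distance matrix per ensemble member, then the per-pair variance
# across members: high variance means the ensemble disagrees about that
# pair, so judgments involving it are the most informative to collect.
dists = np.stack([pairwise_dists(e) for e in ensemble])
uncertainty = dists.var(axis=0)

# Pick the k most uncertain (upper-triangle) pairs to query next.
k = 10
iu = np.triu_indices(n_items, k=1)
order = np.argsort(uncertainty[iu])[::-1][:k]
queries = list(zip(iu[0][order], iu[1][order]))
print(queries)
```

In an active-learning loop, the selected pairs would be sent to human annotators, the embeddings re-fit, and the uncertainty map recomputed.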
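The evaluation the abstract describes — checking how well a model's representational geometry conforms to human similarity judgments — commonly reduces to a rank correlation between model similarities and human ratings. A minimal sketch with synthetic data follows; the arrays `human_sim`, `feats_a`, and `feats_b` are hypothetical placeholders for ImageNet-HSJ judgments and model features.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-ins: human similarity ratings for n image pairs, and the
# model features for each image in the pair.
n_pairs = 200
human_sim = rng.uniform(size=n_pairs)
feats_a = rng.normal(size=(n_pairs, 64))
feats_b = rng.normal(size=(n_pairs, 64))

def cosine_sim(a, b):
    # Row-wise cosine similarity between paired feature vectors.
    num = (a * b).sum(axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return num / den

model_sim = cosine_sim(feats_a, feats_b)

def rankdata(v):
    # Rank transform (0..n-1); adequate here because ties are absent
    # in continuous random data.
    return np.argsort(np.argsort(v)).astype(float)

# Spearman rho = Pearson correlation of the rank-transformed scores.
# Higher rho means the model's similarity structure better matches
# the human judgments.
rho = np.corrcoef(rankdata(human_sim), rankdata(model_sim))[0, 1]
print(f"Spearman rho: {rho:.3f}")
```

With real data, each model under comparison would contribute its own `model_sim` vector, and models would be ranked by rho against the same human judgments.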
Related papers
- Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction [54.23208041792073] (2024-06-26)
  Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review. A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods. We propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels.
- Beyond MOS: Subjective Image Quality Score Preprocessing Method Based on Perceptual Similarity [2.290956583394892] (2024-04-30)
  ITU-R BT.500, ITU-T P.910, and ITU-T P.913 have been standardized to clean up the original opinion scores. PSP exploits the perceptual similarity between images to alleviate subjective bias in less-annotated scenarios.
- Evaluating the Fairness of Discriminative Foundation Models in Computer Vision [51.176061115977774] (2023-10-18)
  We propose a novel taxonomy for bias evaluation of discriminative foundation models, such as Contrastive Language-Image Pretraining (CLIP). We then systematically evaluate existing methods for mitigating bias in these models with respect to our taxonomy. Specifically, we evaluate OpenAI's CLIP and OpenCLIP models for key applications, such as zero-shot classification, image retrieval, and image captioning.
- Efficient Discovery and Effective Evaluation of Visual Perceptual Similarity: A Benchmark and Beyond [20.035369732786407] (2023-08-28)
  We introduce the first large-scale fashion visual similarity benchmark dataset, consisting of more than 110K expert-annotated image pairs. We propose a novel and efficient labeling procedure that can be applied to any dataset.
- Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475] (2023-04-04)
  This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models. In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
- Fairness meets Cross-Domain Learning: a new perspective on Models and Metrics [80.07271410743806] (2023-03-25)
  We study the relationship between cross-domain (CD) learning and model fairness. We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks. Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
- Self-similarity Driven Scale-invariant Learning for Weakly Supervised Person Search [66.95134080902717] (2023-02-25)
  We propose a novel one-step framework, named Self-similarity driven Scale-invariant Learning (SSL). We introduce a Multi-scale Exemplar Branch to guide the network in concentrating on the foreground and learning scale-invariant features. Experiments on the PRW and CUHK-SYSU databases demonstrate the effectiveness of our method.
- TISE: A Toolbox for Text-to-Image Synthesis Evaluation [9.092600296992925] (2021-12-02)
  We conduct a study on state-of-the-art methods for single- and multi-object text-to-image synthesis and propose a common framework for evaluating these methods.
- SDD-FIQA: Unsupervised Face Image Quality Assessment with Similarity Distribution Distance [25.109321001368496] (2021-03-10)
  Face Image Quality Assessment (FIQA) has become an indispensable part of the face recognition system. We propose a novel unsupervised FIQA method that incorporates Similarity Distribution Distance for Face Image Quality Assessment (SDD-FIQA). Our method generates quality pseudo-labels by calculating the Wasserstein Distance between the intra-class and inter-class similarity distributions.
- Validation and generalization of pixel-wise relevance in convolutional neural networks trained for face classification [0.0] (2020-06-16)
  We show how relevance measures vary with and generalize across key model parameters. Using relevance-based image masking, we find that relevance maps for face classification prove generally stable. Fine-grained analyses of relevance maps across models revealed asymmetries in generalization that point to specific benefits of choice parameters.
- Gradient-Induced Co-Saliency Detection [81.54194063218216] (2020-04-28)
  Co-saliency detection (Co-SOD) aims to segment the common salient foreground in a group of relevant images. In this paper, inspired by human behavior, we propose a gradient-induced co-saliency detection method.
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.