Measuring Representational Harms in Image Captioning
- URL: http://arxiv.org/abs/2206.07173v1
- Date: Tue, 14 Jun 2022 21:08:01 GMT
- Title: Measuring Representational Harms in Image Captioning
- Authors: Angelina Wang and Solon Barocas and Kristen Laird and Hanna Wallach
- Abstract summary: We present a set of techniques for measuring five types of representational harms, as well as the resulting measurements.
Our goal was not to audit this image captioning system, but rather to develop normatively grounded measurement techniques.
We discuss the assumptions underlying our measurement approach and point out when they do not hold.
- Score: 5.543867614999908
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Previous work has largely considered the fairness of image captioning systems
through the underspecified lens of "bias." In contrast, we present a set of
techniques for measuring five types of representational harms, as well as the
resulting measurements obtained for two of the most popular image captioning
datasets using a state-of-the-art image captioning system. Our goal was not to
audit this image captioning system, but rather to develop normatively grounded
measurement techniques, in turn providing an opportunity to reflect on the many
challenges involved. We propose multiple measurement techniques for each type
of harm. We argue that by doing so, we are better able to capture the
multi-faceted nature of each type of harm, in turn improving the (collective)
validity of the resulting measurements. Throughout, we discuss the assumptions
underlying our measurement approach and point out when they do not hold.
Related papers
- Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy and Novel Ensemble Method [35.71703501731081]
We present the first survey and taxonomy of over 70 different image captioning metrics.
We find that despite the diversity of proposed metrics, the vast majority of studies rely on only five popular metrics.
We propose EnsembEval -- an ensemble of evaluation methods achieving the highest reported correlation with human judgements.
arXiv Detail & Related papers (2024-08-09T07:31:06Z) - Information Theoretic Text-to-Image Alignment [49.396917351264655]
We present a novel method that relies on an information-theoretic alignment measure to steer image generation.
Our method is on par with or superior to the state-of-the-art, yet requires nothing but a pre-trained denoising network to estimate mutual information (MI).
arXiv Detail & Related papers (2024-05-31T12:20:02Z) - Introspective Deep Metric Learning [91.47907685364036]
We propose an introspective deep metric learning framework for uncertainty-aware comparisons of images.
The proposed IDML framework improves the performance of deep metric learning through uncertainty modeling.
arXiv Detail & Related papers (2023-09-11T16:21:13Z) - Taxonomizing and Measuring Representational Harms: A Look at Image Tagging [12.576454410948292]
We identify four types of representational harms that can be caused by image tagging systems.
We show that attempts to mitigate some of these types of harms may be in tension with one another.
arXiv Detail & Related papers (2023-05-02T20:36:30Z) - Are metrics measuring what they should? An evaluation of image captioning task metrics [0.21301560294088315]
Image Captioning is the task of describing the content of an image in terms of the objects in the scene and their relationships. It draws on two major research areas: computer vision and natural language processing.
We present an evaluation of several kinds of Image Captioning metrics and a comparison between them using the well-known MS COCO dataset.
arXiv Detail & Related papers (2022-07-04T21:51:47Z) - Introspective Deep Metric Learning for Image Retrieval [80.29866561553483]
We argue that a good similarity model should consider the semantic discrepancies with caution to better deal with ambiguous images for more robust training.
We propose to represent an image using not only a semantic embedding but also an accompanying uncertainty embedding, which describes the semantic characteristics and ambiguity of an image, respectively.
The proposed IDML framework improves the performance of deep metric learning through uncertainty modeling and attains state-of-the-art results on the widely used CUB-200-2011, Cars196, and Stanford Online Products datasets.
arXiv Detail & Related papers (2022-05-09T17:51:44Z) - A Prompt Array Keeps the Bias Away: Debiasing Vision-Language Models
with Adversarial Learning [55.96577490779591]
Vision-language models can encode societal biases and stereotypes.
There are challenges to measuring and mitigating these multimodal harms.
We investigate bias measures and apply ranking metrics for image-text representations.
arXiv Detail & Related papers (2022-03-22T17:59:04Z) - Few-Shot Learning with Part Discovery and Augmentation from Unlabeled Images [79.34600869202373]
We show that inductive bias can be learned from a flat collection of unlabeled images, and instantiated as transferable representations among seen and unseen classes.
Specifically, we propose a novel part-based self-supervised representation learning scheme to learn transferable representations.
Our method yields impressive results, outperforming the previous best unsupervised methods by 7.74% and 9.24%, respectively.
arXiv Detail & Related papers (2021-05-25T12:22:11Z) - Intrinsic Image Captioning Evaluation [53.51379676690971]
We propose a learning-based metric for image captioning, which we call Intrinsic Image Captioning Evaluation (I2CE).
Experimental results show that our proposed method maintains robust performance and assigns more flexible scores to candidate captions when faced with semantically similar expressions or less-aligned semantics.
arXiv Detail & Related papers (2020-12-14T08:36:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.