Pinpointing Why Object Recognition Performance Degrades Across Income
Levels and Geographies
- URL: http://arxiv.org/abs/2304.05391v1
- Date: Tue, 11 Apr 2023 17:59:52 GMT
- Title: Pinpointing Why Object Recognition Performance Degrades Across Income
Levels and Geographies
- Authors: Laura Gustafson, Megan Richards, Melissa Hall, Caner Hazirbas, Diane
Bouchacourt, Mark Ibrahim
- Abstract summary: Deep learning systems' performance degrades significantly across geographies and lower income levels.
We take a step in this direction by annotating images from Dollar Street, a popular benchmark of geographically and economically diverse images.
These annotations unlock a new granular view into how objects differ across incomes and regions.
We then use these object differences to pinpoint model vulnerabilities across incomes and regions.
- Score: 8.408398153073096
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite impressive advances in object recognition, deep learning systems'
performance degrades significantly across geographies and lower income levels,
raising pressing concerns of inequity. Addressing such performance gaps remains
a challenge, as little is understood about why performance degrades across
incomes or geographies. We take a step in this direction by annotating images
from Dollar Street, a popular benchmark of geographically and economically
diverse images, labeling each image with factors such as color, shape, and
background. These annotations unlock a new granular view into how objects
differ across incomes and regions. We then use these object differences to
pinpoint model vulnerabilities across incomes and regions. We study a range of
modern vision models, finding that performance disparities are most associated
with differences in texture, occlusion, and images with darker lighting. We
illustrate how insights from our factor labels can surface mitigations to
improve models' performance disparities. As an example, we show that mitigating
a model's vulnerability to texture can improve performance on the lower income
level. We release all the factor annotations along with an interactive
dashboard to facilitate research into more equitable vision systems.
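As an illustration of the kind of analysis such factor annotations enable, the sketch below computes per-factor accuracy gaps between income groups. The record fields, factor names, and example data are hypothetical assumptions for illustration, not the paper's actual Dollar Street annotation schema.

```python
# Hypothetical sketch: pinpointing which factor labels (e.g. texture,
# occlusion, darker lighting) are most associated with an accuracy gap
# between income groups. Field names and records are illustrative only.

from collections import defaultdict

def accuracy_gap_by_factor(records):
    """For each factor tag, compute accuracy(high income) - accuracy(low income)."""
    # factor -> income group -> [num correct, num total]
    stats = defaultdict(lambda: {"low": [0, 0], "high": [0, 0]})
    for r in records:
        for factor in r["factors"]:
            bucket = stats[factor][r["income_group"]]
            bucket[0] += int(r["correct"])
            bucket[1] += 1
    gaps = {}
    for factor, groups in stats.items():
        if groups["low"][1] and groups["high"][1]:
            acc_low = groups["low"][0] / groups["low"][1]
            acc_high = groups["high"][0] / groups["high"][1]
            gaps[factor] = acc_high - acc_low
    return gaps

# Toy records standing in for annotated images with model predictions.
records = [
    {"income_group": "low",  "correct": False, "factors": ["texture", "dark_lighting"]},
    {"income_group": "low",  "correct": True,  "factors": ["texture"]},
    {"income_group": "high", "correct": True,  "factors": ["texture"]},
    {"income_group": "high", "correct": True,  "factors": ["dark_lighting"]},
]
gaps = accuracy_gap_by_factor(records)
# The factor with the largest gap surfaces the strongest model vulnerability.
worst = max(gaps, key=gaps.get)
```

Ranking factors by this gap is one simple way to surface which object differences a mitigation (e.g. texture augmentation) should target first.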
Related papers
- A review of advancements in low-light image enhancement using deep learning [1.7930949972761197]
In low-light environments, degraded image quality adversely affects key computer vision tasks such as segmentation, detection, and classification. With the rapid advancement of deep learning, its application to low-light image processing has attracted widespread attention. This review elaborates in detail on how various recent approaches (from 2020 onward) operate and on their enhancement mechanisms.
arXiv Detail & Related papers (2025-05-09T03:39:23Z) - Semi-LLIE: Semi-supervised Contrastive Learning with Mamba-based Low-light Image Enhancement [59.17372460692809]
This work proposes a mean-teacher-based semi-supervised low-light enhancement (Semi-LLIE) framework that integrates the unpaired data into model training.
We introduce a semantic-aware contrastive loss to faithfully transfer the illumination distribution, helping produce enhanced images with natural colors.
We also propose a novel perceptual loss based on the large-scale vision-language Recognize Anything Model (RAM) to help generate enhanced images with richer texture details.
arXiv Detail & Related papers (2024-09-25T04:05:32Z) - Indoor scene recognition from images under visual corruptions [3.4861209026118836]
This paper presents an innovative approach to indoor scene recognition that leverages multimodal data fusion.
We examine two multimodal networks that synergize visual features from CNN models with semantic captions via a Graph Convolutional Network (GCN).
Our study shows that this fusion markedly improves model performance, with notable gains in Top-1 accuracy when evaluated against a corrupted subset of the Places365 dataset.
arXiv Detail & Related papers (2024-08-23T12:35:45Z) - Bridging the Digital Divide: Performance Variation across Socio-Economic
Factors in Vision-Language Models [31.868468221653025]
We evaluate the performance of a vision-language model (CLIP) on a geo-diverse dataset containing household images associated with different income values.
Our results indicate that performance for the poorer groups is consistently lower than for the wealthier groups across various topics and countries.
arXiv Detail & Related papers (2023-11-09T21:10:52Z) - Mitigating Bias: Enhancing Image Classification by Improving Model
Explanations [9.791305104409057]
Deep learning models tend to rely heavily on simple and easily discernible features in the background of images.
We introduce a mechanism that encourages the model to allocate sufficient attention to the foreground.
Our findings highlight the importance of foreground attention in enhancing model understanding and representation of the main concepts within images.
arXiv Detail & Related papers (2023-07-04T04:46:44Z) - Coarse-to-Fine Contrastive Learning in Image-Text-Graph Space for
Improved Vision-Language Compositionality [50.48859793121308]
Contrastively trained vision-language models have achieved remarkable progress in vision and language representation learning.
Recent research has highlighted severe limitations in their ability to perform compositional reasoning over objects, attributes, and relations.
arXiv Detail & Related papers (2023-05-23T08:28:38Z) - Towards Reliable Assessments of Demographic Disparities in Multi-Label
Image Classifiers [11.973749734226852]
We consider multi-label image classification and, specifically, object categorization tasks.
Design choices and trade-offs for measurement involve more nuance than discussed in prior computer vision literature.
We identify several design choices that look merely like implementation details but significantly impact the conclusions of assessments.
arXiv Detail & Related papers (2023-02-16T20:34:54Z) - Mitigating Urban-Rural Disparities in Contrastive Representation Learning with Satellite Imagery [19.93324644519412]
We consider the risk of urban-rural disparities in identification of land-cover features.
We propose fair dense representation with contrastive learning (FairDCL) as a method for de-biasing the multi-level latent space of convolution neural network models.
The obtained image representation mitigates downstream urban-rural prediction disparities and outperforms state-of-the-art baselines on real-world satellite images.
arXiv Detail & Related papers (2022-11-16T04:59:46Z) - Perceptual Grouping in Contrastive Vision-Language Models [59.1542019031645]
We show how vision-language models are able to understand where objects reside within an image and group together visually related parts of the imagery.
We propose a minimal set of modifications that results in models that uniquely learn both semantic and spatial information.
arXiv Detail & Related papers (2022-10-18T17:01:35Z) - Model-Agnostic Graph Regularization for Few-Shot Learning [60.64531995451357]
We present a comprehensive study on graph embedded few-shot learning.
We introduce a graph regularization approach that allows a deeper understanding of the impact of incorporating graph information between labels.
Our approach improves the performance of strong base learners by up to 2% on Mini-ImageNet and 6.7% on ImageNet-FS.
arXiv Detail & Related papers (2021-02-14T05:28:13Z) - Stereopagnosia: Fooling Stereo Networks with Adversarial Perturbations [71.00754846434744]
We show that imperceptible additive perturbations can significantly alter the disparity map.
We show that, when used for adversarial data augmentation, our perturbations result in trained models that are more robust.
arXiv Detail & Related papers (2020-09-21T19:20:09Z) - Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.