Related papers: Flickr Africa: Examining Geo-Diversity in Large-Scale, Human-Centric Visual Data

URL: http://arxiv.org/abs/2308.08656v1
Date: Wed, 16 Aug 2023 20:12:01 GMT
Title: Flickr Africa: Examining Geo-Diversity in Large-Scale, Human-Centric Visual Data
Authors: Keziah Naggita, Julienne LaChance, Alice Xiang
Abstract summary: We analyze human-centric image geo-diversity on a massive scale using geotagged Flickr images associated with each nation in Africa. We report the quantity and content of available data with comparisons to population-matched nations in Europe. We present findings for an "othering" phenomenon as evidenced by a substantial number of images from Africa being taken by non-local photographers.
Score: 3.4022338837261525
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Biases in large-scale image datasets are known to influence the performance of computer vision models as a function of geographic context. To investigate the limitations of standard Internet data collection methods in low- and middle-income countries, we analyze human-centric image geo-diversity on a massive scale using geotagged Flickr images associated with each nation in Africa. We report the quantity and content of available data with comparisons to population-matched nations in Europe as well as the distribution of data according to fine-grained intra-national wealth estimates. Temporal analyses are performed at two-year intervals to expose emerging data trends. Furthermore, we present findings for an "othering" phenomenon as evidenced by a substantial number of images from Africa being taken by non-local photographers. The results of our study suggest that further work is required to capture image data representative of African people and their environments and, ultimately, to improve the applicability of computer vision models in a global context.

Related papers

Vision-Language Models under Cultural and Inclusive Considerations [53.614528867159706]
Large vision-language models (VLMs) can assist visually impaired people by describing images from their daily lives. Current evaluation datasets may not reflect diverse cultural user backgrounds or the situational context of this use case. We create a survey to determine caption preferences and propose a culture-centric evaluation benchmark by filtering VizWiz, an existing dataset with images taken by people who are blind. We then evaluate several VLMs, investigating their reliability as visual assistants in a culturally diverse setting.
arXiv Detail & Related papers (2024-07-08T17:50:00Z)
Decomposed evaluations of geographic disparities in text-to-image models [22.491466809896867]
We introduce a new set of metrics, Decomposed Indicators of Disparities in Image Generation (Decomposed-DIG), that allows us to measure geographic disparities in the depiction of objects and backgrounds in generated images. Using Decomposed-DIG, we audit a widely used latent diffusion model and find that generated images depict objects with better realism than backgrounds. We use Decomposed-DIG to pinpoint specific examples of disparities, such as stereotypical background generation in Africa, struggling to generate modern vehicles in Africa, and unrealistically placing some objects in outdoor settings.
arXiv Detail & Related papers (2024-06-17T18:04:23Z)
Towards Geographic Inclusion in the Evaluation of Text-to-Image Models [25.780536950323683]
We study how much annotators in Africa, Europe, and Southeast Asia vary in their perception of geographic representation, visual appeal, and consistency in real and generated images. For example, annotators in different locations often disagree on whether exaggerated, stereotypical depictions of a region are considered geographically representative. We recommend steps for improved automatic and human evaluations.
arXiv Detail & Related papers (2024-05-07T16:23:06Z)
Regional biases in image geolocation estimation: a case study with the SenseCity Africa dataset [0.0]
We apply a state-of-the-art image geolocation estimation model (ISNs) to a crowd-sourced dataset of geolocated images from the African continent (SCA100). Our findings show that the ISNs model tends to over-predict image locations in high-income countries of the Western world. Our results suggest that using IM2GPS3k as a training set and benchmark for image geolocation estimation and other computer vision models overlooks its potential application in the African context.
arXiv Detail & Related papers (2024-04-03T08:27:24Z)
Data Augmentation in Human-Centric Vision [54.97327269866757]
This survey presents a comprehensive analysis of data augmentation techniques in human-centric vision tasks. It delves into a wide range of research areas including person ReID, human parsing, human pose estimation, and pedestrian detection. Our work categorizes data augmentation methods into two main types: data generation and data perturbation.
arXiv Detail & Related papers (2024-03-13T16:05:18Z)
Granularity at Scale: Estimating Neighborhood Socioeconomic Indicators from High-Resolution Orthographic Imagery and Hybrid Learning [1.8369448205408005]
Overhead images can help fill in the gaps where community information is sparse. Recent advancements in machine learning and computer vision have made it possible to quickly extract features from and detect patterns in image data. In this work, we explore how well two approaches, a supervised convolutional neural network and semi-supervised clustering, can estimate population density, median household income, and educational attainment.
arXiv Detail & Related papers (2023-09-28T19:30:26Z)
Inspecting the Geographical Representativeness of Images from Text-to-Image Models [52.80961012689933]
We measure the geographical representativeness of generated images using a crowdsourced study comprising 540 participants across 27 countries. For deliberately underspecified inputs without country names, the generated images most reflect the surroundings of the United States followed by India. The overall scores for many countries still remain low, highlighting the need for future models to be more geographically inclusive.
arXiv Detail & Related papers (2023-05-18T16:08:11Z)
GeoNet: Benchmarking Unsupervised Adaptation across Geographies [71.23141626803287]
We study the problem of geographic robustness and make three main contributions. First, we introduce a large-scale dataset, GeoNet, for geographic adaptation. Second, we hypothesize that the major source of domain shifts arises from significant variations in scene context. Third, we conduct an extensive evaluation of several state-of-the-art unsupervised domain adaptation algorithms and architectures.
arXiv Detail & Related papers (2023-03-27T17:59:34Z)
Studying Bias in GANs through the Lens of Race [91.95264864405493]
We study how the performance and evaluation of generative image models are impacted by the racial composition of their training datasets. Our results show that the racial compositions of generated images successfully preserve that of the training data. However, we observe that truncation, a technique used to generate higher quality images during inference, exacerbates racial imbalances in the data.
arXiv Detail & Related papers (2022-09-06T22:25:56Z)
Predicting Livelihood Indicators from Community-Generated Street-Level Imagery [70.5081240396352]
We propose an inexpensive, scalable, and interpretable approach to predict key livelihood indicators from public crowd-sourced street-level imagery. By comparing our results against ground data collected in nationally-representative household surveys, we demonstrate the performance of our approach in accurately predicting indicators of poverty, population, and health.
arXiv Detail & Related papers (2020-06-15T18:12:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.