Related papers: The World Wide Recipe: A community-centred framework for fine-grained data collection and regional bias operationalisation

Related papers

Bias and Generalizability of Foundation Models across Datasets in Breast Mammography [4.117899774444893]
We explore the fairness and bias of foundation models (FMs) for breast mammography classification. We leverage a large pool of datasets from diverse sources, including data from underrepresented regions and an in-house dataset. Our experiments show that while modality-specific pre-training of FMs enhances performance, classifiers trained on features from individual datasets fail to generalize across domains.
arXiv Detail & Related papers (2025-05-14T06:56:17Z)
Food Delivery Time Prediction in Indian Cities Using Machine Learning Models [0.4893345190925178]
This research addresses gaps by integrating real-time contextual variables into predictive models. We systematically compare various machine learning algorithms, including Linear Regression, Decision Trees, Bagging, Random Forest, XGBoost, and LightGBM. Experimental results demonstrate that the LightGBM model achieves superior predictive accuracy, with an R² score of 0.76 and a Mean Squared Error (MSE) of 20.59, outperforming traditional baseline approaches.
arXiv Detail & Related papers (2025-03-19T13:02:23Z)
Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias [76.85949078144098]
This paper focuses on textual hallucinations, where diffusion models correctly generate individual symbols but assemble them in a nonsensical manner. We observe that this phenomenon is attributable to the network's local generation bias. We also theoretically analyze the training dynamics for a specific case involving a two-layer MLP learning parity points on a hypercube.
arXiv Detail & Related papers (2025-03-05T15:28:50Z)
Biased Heritage: How Datasets Shape Models in Facial Expression Recognition [13.77824359359967]
We study bias propagation from datasets to trained models in image-based Facial Expression Recognition systems. We introduce new bias metrics specifically designed for multiclass problems with multiple demographic groups. Our findings suggest that preventing emotion-specific demographic patterns should be prioritized over general demographic balance in FER datasets.
arXiv Detail & Related papers (2025-03-05T12:25:22Z)
Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations [63.52709761339949]
We first contribute a dedicated dataset called the Fair Forgery Detection (FairFD) dataset, where we prove the racial bias of public state-of-the-art (SOTA) methods. We design novel metrics including Approach Averaged Metric and Utility Regularized Metric, which can avoid deceptive results. We also present an effective and robust post-processing technique, Bias Pruning with Fair Activations (BPFA), which improves fairness without requiring retraining or weight updates.
arXiv Detail & Related papers (2024-07-19T14:53:18Z)
Multilingual Diversity Improves Vision-Language Representations [66.41030381363244]
Pre-training on this dataset outperforms using English-only or English-dominated datasets on ImageNet. On a geographically diverse task like GeoDE, we also observe improvements across all regions, with the biggest gain coming from Africa.
arXiv Detail & Related papers (2024-05-27T08:08:51Z)
Rethinking Debiasing: Real-World Bias Analysis and Mitigation [17.080528126651977]
We revisit biased distributions in existing benchmarks and real-world datasets. We empirically and theoretically identify key characteristics of real-world biases poorly represented by existing benchmarks. We propose a simple yet effective approach, named Debias in Destruction (DiD), that can be easily applied to existing debiasing methods.
arXiv Detail & Related papers (2024-05-24T06:06:41Z)
Diverse Perspectives, Divergent Models: Cross-Cultural Evaluation of Depression Detection on Twitter [4.462334751640166]
We evaluate the generalization of benchmark datasets to build AI models on cross-cultural Twitter data. Our results show that depression detection models do not generalize globally. Pre-trained language models achieve the best generalization compared to Logistic Regression, though still show significant gaps in performance on depressed and non-Western users.
arXiv Detail & Related papers (2024-04-01T03:59:12Z)
Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You [64.74707085021858]
We show that multilingual models suffer from significant gender biases just as monolingual models do. We propose a novel benchmark, MAGBIG, intended to foster research on gender bias in multilingual models. Our results show that not only do models exhibit strong gender biases but they also behave differently across languages.
arXiv Detail & Related papers (2024-01-29T12:02:28Z)
FoodFusion: A Latent Diffusion Model for Realistic Food Image Generation [69.91401809979709]
Current state-of-the-art image generation models such as Latent Diffusion Models (LDMs) have demonstrated the capacity to produce visually striking food-related images. We introduce FoodFusion, a Latent Diffusion model engineered specifically for the faithful synthesis of realistic food images from textual descriptions. The development of the FoodFusion model involves harnessing an extensive array of open-source food datasets, resulting in over 300,000 curated image-caption pairs.
arXiv Detail & Related papers (2023-12-06T15:07:12Z)
Leveraging Diffusion Perturbations for Measuring Fairness in Computer Vision [25.414154497482162]
We demonstrate that diffusion models can be leveraged to create such a dataset. We benchmark several vision-language models on a multi-class occupation classification task. We find that images generated with non-Caucasian labels have a significantly higher occupation misclassification rate than images generated with Caucasian labels.
arXiv Detail & Related papers (2023-11-25T19:40:13Z)
All Should Be Equal in the Eyes of Language Models: Counterfactually Aware Fair Text Generation [16.016546693767403]
We propose a framework that dynamically compares the model understanding of diverse demographics to generate more equitable sentences. CAFIE produces fairer text and strikes the best balance between fairness and language modeling capability.
arXiv Detail & Related papers (2023-11-09T15:39:40Z)
Computer Vision Datasets and Models Exhibit Cultural and Linguistic Diversity in Perception [28.716435050743957]
We study how people from different cultural backgrounds observe vastly different concepts even when viewing the same visual stimuli. By comparing textual descriptions generated across 7 languages for the same images, we find significant differences in the semantic content and linguistic expression. Our work points towards the need to account for and embrace the diversity of human perception in the computer vision community.
arXiv Detail & Related papers (2023-10-22T16:51:42Z)
CBBQ: A Chinese Bias Benchmark Dataset Curated with Human-AI Collaboration for Large Language Models [52.25049362267279]
We present a Chinese Bias Benchmark dataset that consists of over 100K questions jointly constructed by human experts and generative language models. The testing instances in the dataset are automatically derived from 3K+ high-quality templates manually authored with stringent quality control. Extensive experiments demonstrate the effectiveness of the dataset in detecting model bias, with all 10 publicly available Chinese large language models exhibiting strong bias in certain categories.
arXiv Detail & Related papers (2023-06-28T14:14:44Z)
Exposing Bias in Online Communities through Large-Scale Language Models [3.04585143845864]
This work uses the flaw of bias in language models to explore the biases of six different online communities. The bias of the resulting models is evaluated by prompting the models with different demographics and comparing the sentiment and toxicity values of these generations. This work not only affirms how easily bias is absorbed from training data but also presents a scalable method to identify and compare the bias of different datasets or communities.
arXiv Detail & Related papers (2023-06-04T08:09:26Z)
Inspecting the Geographical Representativeness of Images from Text-to-Image Models [52.80961012689933]
We measure the geographical representativeness of generated images using a crowdsourced study comprising 540 participants across 27 countries. For deliberately underspecified inputs without country names, the generated images most reflect the surroundings of the United States followed by India. The overall scores for many countries still remain low, highlighting the need for future models to be more geographically inclusive.
arXiv Detail & Related papers (2023-05-18T16:08:11Z)
Analyzing Bias in Diffusion-based Face Generation Models [75.80072686374564]
Diffusion models are increasingly popular in synthetic data generation and image editing applications. We investigate the presence of bias in diffusion-based face generation models with respect to attributes such as gender, race, and age. We examine how dataset size affects the attribute composition and perceptual quality of both diffusion and Generative Adversarial Network (GAN) based face generation models.
arXiv Detail & Related papers (2023-05-10T18:22:31Z)
Assessing Demographic Bias Transfer from Dataset to Model: A Case Study in Facial Expression Recognition [1.5340540198612824]
Two metrics focus on the representational and stereotypical bias of the dataset, and the third one on the residual bias of the trained model. We demonstrate the usefulness of the metrics by applying them to a FER problem based on the popular AffectNet dataset.
arXiv Detail & Related papers (2022-05-20T09:40:42Z)
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models [73.12069620086311]
We investigate the visual reasoning capabilities and social biases of text-to-image models. First, we measure three visual reasoning skills: object recognition, object counting, and spatial relation understanding. Second, we assess the gender and skin tone biases by measuring the gender/skin tone distribution of generated images.
arXiv Detail & Related papers (2022-02-08T18:36:52Z)
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics [118.75207687144817]
We introduce Data Maps, a model-based tool to characterize and diagnose datasets. We leverage a largely ignored source of information: the behavior of the model on individual instances during training. Our results indicate that a shift in focus from quantity to quality of data could lead to robust models and improved out-of-distribution generalization.
arXiv Detail & Related papers (2020-09-22T20:19:41Z)
REVISE: A Tool for Measuring and Mitigating Bias in Visual Datasets [64.76453161039973]
REVISE (REvealing VIsual biaSEs) is a tool that assists in the investigation of a visual dataset. It surfaces potential biases along three dimensions: (1) object-based, (2) person-based, and (3) geography-based.
arXiv Detail & Related papers (2020-04-16T23:54:37Z)
Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction. We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data. Compared to the most competitive baselines, we show significant improvements in classification accuracy, under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.