Investigating Associational Biases in Inter-Model Communication of Large Generative Models
- URL: http://arxiv.org/abs/2601.22093v1
- Date: Thu, 29 Jan 2026 18:29:55 GMT
- Title: Investigating Associational Biases in Inter-Model Communication of Large Generative Models
- Authors: Fethiye Irmak Dogan, Yuval Weiss, Kajal Patel, Jiaee Cheong, Hatice Gunes
- Abstract summary: Social bias in generative AI can manifest as performance disparities and as associational bias. We study how associations evolve within an inter-model communication pipeline that alternates between image generation and image description. Our results reveal demographic drifts toward younger representations for both actions and emotions, as well as toward more female-presenting representations.
- Score: 8.394205333688165
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Social bias in generative AI can manifest not only as performance disparities but also as associational bias, whereby models learn and reproduce stereotypical associations between concepts and demographic groups, even in the absence of explicit demographic information (e.g., associating doctors with men). These associations can persist, propagate, and potentially amplify across repeated exchanges in inter-model communication pipelines, where one generative model's output becomes another's input. This is especially salient for human-centred perception tasks, such as human activity recognition and affect prediction, where inferences about behaviour and internal states can lead to errors or stereotypical associations that propagate into unequal treatment. In this work, focusing on human activity and affective expression, we study how such associations evolve within an inter-model communication pipeline that alternates between image generation and image description. Using the RAF-DB and PHASE datasets, we quantify demographic distribution drift induced by model-to-model information exchange and assess whether these drifts are systematic using an explainability pipeline. Our results reveal demographic drifts toward younger representations for both actions and emotions, as well as toward more female-presenting representations, primarily for emotions. We further find evidence that some predictions are supported by spurious visual regions (e.g., background or hair) rather than concept-relevant cues (e.g., body or face). We also examine whether these demographic drifts translate into measurable differences in downstream behaviour, i.e., when predicting activity and emotion labels. Finally, we outline mitigation strategies spanning data-centric, training, and deployment interventions, and emphasise the need for careful safeguards when deploying interconnected models in human-centred AI systems.
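The core loop the abstract describes, alternating text-to-image generation with image description while tracking who appears in the generated images, can be summarized in a short sketch. The callables below (`generate_image`, `describe_image`, `predict_demographics`) are hypothetical stand-ins for the unnamed models; this illustrates the pipeline shape, not the authors' implementation.

```python
from collections import Counter
from typing import Callable, List, Sequence

def run_intermodel_pipeline(
    seed_captions: Sequence[str],
    generate_image: Callable[[str], object],        # text -> image (e.g., a T2I model)
    describe_image: Callable[[object], str],        # image -> caption (e.g., a VLM)
    predict_demographics: Callable[[object], str],  # image -> perceived group label
    n_rounds: int = 5,
) -> List[Counter]:
    """Alternate image generation and image description for several rounds,
    recording the demographic label distribution after each generation step."""
    captions = list(seed_captions)
    drift: List[Counter] = []
    for _ in range(n_rounds):
        images = [generate_image(c) for c in captions]     # model A: text-to-image
        drift.append(Counter(predict_demographics(im) for im in images))
        captions = [describe_image(im) for im in images]   # model B: image-to-text
    return drift
```

Comparing the first and last entries of the returned list with a distributional distance gives the kind of round-over-round drift statistic the abstract quantifies.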
Related papers
- Interpretable Debiasing of Vision-Language Models for Social Fairness [55.85977929985967]
We introduce an interpretable, model-agnostic bias mitigation framework, DeBiasLens, that localizes social attribute neurons in Vision-Language models. We train SAEs on facial image or caption datasets without corresponding social attribute labels to uncover neurons highly responsive to specific demographics. Our research lays the groundwork for future auditing tools, prioritizing social fairness in emerging real-world AI systems.
arXiv Detail & Related papers (2026-02-27T13:37:11Z)
- Enhancing Personality Recognition by Comparing the Predictive Power of Traits, Facets, and Nuances [37.83859643892549]
Personality recognition models aim to infer personality traits from different sources of behavioral data. We trained a transformer-based model including cross-modal (audiovisual) and cross-subject (dyad-aware) attention mechanisms. Results show that nuance-level models consistently outperform facet- and trait-level models, reducing mean squared error by up to 74% across interaction scenarios.
arXiv Detail & Related papers (2026-02-05T13:35:04Z)
- Race, Ethnicity and Their Implication on Bias in Large Language Models [9.202525724606188]
We study how race and ethnicity are represented and operationalized within large language models (LLMs). We find that demographic information is distributed across internal units with substantial cross-model variation. Interventions suppressing such neurons reduce bias but leave substantial residual effects.
arXiv Detail & Related papers (2026-01-19T09:24:24Z)
- Interpreting Social Bias in LVLMs via Information Flow Analysis and Multi-Round Dialogue Evaluation [1.7997395646080083]
Large Vision Language Models (LVLMs) have achieved remarkable progress in multimodal tasks, yet they also exhibit notable social biases. We propose an explanatory framework that combines information flow analysis with multi-round dialogue evaluation. Experiments reveal that LVLMs exhibit systematic disparities in information usage when processing images of different demographic groups.
arXiv Detail & Related papers (2025-05-27T12:28:44Z)
- BiasConnect: Investigating Bias Interactions in Text-to-Image Models [73.76853483463836]
We introduce BiasConnect, a novel tool designed to analyze and quantify bias interactions in Text-to-Image models. Our method provides empirical estimates that indicate how other bias dimensions shift toward or away from an ideal distribution when a given bias is modified. We demonstrate the utility of BiasConnect for selecting optimal bias mitigation axes, comparing different TTI models on the dependencies they learn, and understanding the amplification of intersectional societal biases in TTI models.
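BiasConnect's estimator is not specified in this snippet, but the quantity it describes, how far an attribute distribution sits from an ideal one before and after an intervention, can be illustrated with a total variation distance. The sketch below illustrates that kind of metric only; it is not the tool's implementation, and the example labels are made up.

```python
from collections import Counter
from typing import Mapping, Sequence

def tv_from_ideal(labels: Sequence[str], ideal: Mapping[str, float]) -> float:
    """Total variation distance between the observed label distribution
    and an ideal (e.g., uniform) distribution over the same categories."""
    counts = Counter(labels)
    n = len(labels)
    categories = set(ideal) | set(counts)
    return 0.5 * sum(abs(counts.get(c, 0) / n - ideal.get(c, 0.0)) for c in categories)

# Hypothetical example: perceived-gender labels of generated images
# before and after modifying a different bias axis in the prompts.
before = ["male"] * 70 + ["female"] * 30
after = ["male"] * 55 + ["female"] * 45
ideal = {"male": 0.5, "female": 0.5}
print(tv_from_ideal(before, ideal))  # 0.20: far from the ideal distribution
print(tv_from_ideal(after, ideal))   # 0.05: the intervention moved it closer
```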
arXiv Detail & Related papers (2025-03-12T19:01:41Z)
- Diffusion-Based Imitation Learning for Social Pose Generation [0.0]
Intelligent agents, such as robots and virtual agents, must understand the dynamics of complex social interactions to interact with humans. We explore how a single modality, the pose behavior of multiple individuals in a social interaction, can be used to generate nonverbal social cues for the facilitator of that interaction.
arXiv Detail & Related papers (2025-01-18T20:31:55Z)
- Comparing Fairness of Generative Mobility Models [3.699135947901772]
This work examines the fairness of generative mobility models, addressing the often overlooked dimension of equity in model performance across geographic regions.
Predictive models built on crowd flow data are instrumental in understanding urban structures and movement patterns.
We propose a novel framework for assessing fairness by measuring utility and equity of generated traces.
arXiv Detail & Related papers (2024-11-07T06:01:12Z)
- Decoding Susceptibility: Modeling Misbelief to Misinformation Through a Computational Approach [61.04606493712002]
Susceptibility to misinformation describes the degree of belief in unverifiable claims, a quantity that is not directly observable.
Existing susceptibility studies heavily rely on self-reported beliefs.
We propose a computational approach to model users' latent susceptibility levels.
arXiv Detail & Related papers (2023-11-16T07:22:56Z)
- Stable Bias: Analyzing Societal Representations in Diffusion Models [72.27121528451528]
We propose a new method for exploring the social biases in Text-to-Image (TTI) systems.
Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts.
We leverage this method to analyze images generated by 3 popular TTI systems and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents.
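The enumeration this summary describes amounts to a cross product of identity markers and prompt templates. A minimal sketch follows; the marker lists and template below are illustrative, not the paper's exact vocabulary.

```python
from itertools import product

# Illustrative marker lists; the paper's actual vocabularies may differ.
gender_markers = ["woman", "man", "non-binary person"]
ethnicity_markers = ["Black", "East Asian", "Hispanic", "White"]
professions = ["doctor", "teacher", "engineer"]

prompts = [
    f"A photo of a {ethnicity} {gender} working as a {profession}"
    for ethnicity, gender, profession in product(
        ethnicity_markers, gender_markers, professions
    )
]
print(len(prompts))  # 4 * 3 * 3 = 36 prompts to feed to each TTI system
```

Each prompt is sent to a TTI system, and the variation across the resulting images is then characterized, e.g., against US labor demographics as the summary notes.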
arXiv Detail & Related papers (2023-03-20T19:32:49Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
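The interaction this summary describes, weakening or deleting a biased edge in a causal network, can be pictured as a graph edit. The sketch below uses networkx with made-up attribute names; the paper's dataset simulation method is its own contribution and is not reproduced here.

```python
import networkx as nx

# A toy causal network over dataset attributes (edges are hypothetical).
causal_graph = nx.DiGraph()
causal_graph.add_edges_from([
    ("gender", "hiring_decision"),      # the biased edge a user might flag
    ("experience", "hiring_decision"),
    ("education", "experience"),
])

# User interaction: delete the unfair causal edge.
causal_graph.remove_edge("gender", "hiring_decision")

# D-BIAS would then simulate a new (debiased) dataset consistent with the
# edited graph; that simulation step is the paper's novel method.
print(list(causal_graph.edges()))
```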
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- Towards Unbiased Visual Emotion Recognition via Causal Intervention [63.74095927462]
We propose a novel Interventional Emotion Recognition Network (IERN) to alleviate the negative effects brought by the dataset bias.
A series of designed tests validate the effectiveness of IERN, and experiments on three emotion benchmarks demonstrate that IERN outperforms other state-of-the-art approaches.
arXiv Detail & Related papers (2021-07-26T10:40:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.