Improving Diversity of Demographic Representation in Large Language
Models via Collective-Critiques and Self-Voting
- URL: http://arxiv.org/abs/2310.16523v1
- Date: Wed, 25 Oct 2023 10:17:17 GMT
- Title: Improving Diversity of Demographic Representation in Large Language
Models via Collective-Critiques and Self-Voting
- Authors: Preethi Lahoti, Nicholas Blumm, Xiao Ma, Raghavendra Kotikalapudi,
Sahitya Potluri, Qijun Tan, Hansa Srinivasan, Ben Packer, Ahmad Beirami, Alex
Beutel, Jilin Chen
- Abstract summary: This paper formalizes diversity of representation in generative large language models.
We present evaluation datasets and propose metrics to measure diversity in generated responses along people and culture axes.
We find that LLMs understand the notion of diversity, and that they can reason and critique their own responses for that goal.
- Score: 19.79214899011072
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A crucial challenge for generative large language models (LLMs) is diversity:
when a user's prompt is under-specified, models may follow implicit assumptions
while generating a response, which may result in homogenization of the
responses, as well as certain demographic groups being under-represented or
even erased from the generated responses. In this paper, we formalize diversity
of representation in generative LLMs. We present evaluation datasets and
propose metrics to measure diversity in generated responses along people and
culture axes. We find that LLMs understand the notion of diversity, and that
they can reason and critique their own responses for that goal. This finding
motivated a new prompting technique called collective-critique and self-voting
(CCSV) to self-improve people diversity of LLMs by tapping into its diversity
reasoning capabilities, without relying on handcrafted examples or prompt
tuning. Extensive empirical experiments with both human and automated
evaluations show that our proposed approach is effective at improving people
and culture diversity, and outperforms all baseline methods by a large margin.
Related papers
- One fish, two fish, but not the whole sea: Alignment reduces language models' conceptual diversity [2.5975241792179378]
Researchers have proposed using large language models (LLMs) as replacements for humans in behavioral research.
It is debated whether post-training alignment (RLHF or RLAIF) affects models' internal diversity.
We use a new way of measuring the conceptual diversity of synthetically-generated LLM "populations" by relating the internal variability of simulated individuals to the population-level variability.
arXiv Detail & Related papers (2024-11-07T04:38:58Z) - Unified Generative and Discriminative Training for Multi-modal Large Language Models [88.84491005030316]
Generative training has enabled Vision-Language Models (VLMs) to tackle various complex tasks.
Discriminative training, exemplified by models like CLIP, excels in zero-shot image-text classification and retrieval.
This paper proposes a unified approach that integrates the strengths of both paradigms.
arXiv Detail & Related papers (2024-11-01T01:51:31Z) - Capturing Bias Diversity in LLMs [1.9685736810241874]
This paper presents research on enhancements to Large Language Models (LLMs) through the addition of diversity in its generated outputs.
By developing multiple customised instances of a GPT model, each reflecting biases in specific demographic characteristics including gender, age, and race, we propose, develop and evaluate a framework for a more nuanced and representative AI dialogue which we call BiasGPT.
In this paper, through experiments, we demonstrate the capabilities of a GPT model to embed different biases which, when combined, can open the possibilities of more inclusive AI technologies.
arXiv Detail & Related papers (2024-10-09T17:07:50Z) - PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z) - The Factuality Tax of Diversity-Intervened Text-to-Image Generation: Benchmark and Fact-Augmented Intervention [61.80236015147771]
We quantify the trade-off between using diversity interventions and preserving demographic factuality in T2I models.
Experiments on DoFaiR reveal that diversity-oriented instructions increase the number of different gender and racial groups.
We propose Fact-Augmented Intervention (FAI) to reflect on verbalized or retrieved factual information about gender and racial compositions of generation subjects in history.
arXiv Detail & Related papers (2024-06-29T09:09:42Z) - Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning [28.654890118684957]
Generative Commonsense Reasoning (GCR) requires a model to reason about a situation using commonsense knowledge.
The diversity of the generation is equally important because it reflects the model's ability to use a range of commonsense knowledge facts.
We propose a simple method that diversifies the LLM generations, while preserving their quality.
arXiv Detail & Related papers (2024-04-25T17:52:39Z) - Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment [84.32768080422349]
Alignment with human preference prevents large language models from generating misleading or toxic content.
We propose a new formulation of prompt diversity, implying a linear correlation with the final performance of LLMs after fine-tuning.
arXiv Detail & Related papers (2024-03-17T07:08:55Z) - On the steerability of large language models toward data-driven personas [98.9138902560793]
Large language models (LLMs) are known to generate biased responses where the opinions of certain groups and populations are underrepresented.
Here, we present a novel approach to achieve controllable generation of specific viewpoints using LLMs.
arXiv Detail & Related papers (2023-11-08T19:01:13Z) - Improving Factuality and Reasoning in Language Models through Multiagent
Debate [95.10641301155232]
We present a complementary approach to improve language responses where multiple language model instances propose and debate their individual responses and reasoning processes over multiple rounds to arrive at a common final answer.
Our findings indicate that this approach significantly enhances mathematical and strategic reasoning across a number of tasks.
Our approach may be directly applied to existing black-box models and uses identical procedure and prompts for all tasks we investigate.
arXiv Detail & Related papers (2023-05-23T17:55:11Z) - Measuring and Improving Semantic Diversity of Dialogue Generation [21.59385143783728]
We introduce a new automatic evaluation metric to measure the semantic diversity of generated responses.
We show that our proposed metric captures human judgments on response diversity better than existing lexical-level diversity metrics.
We also propose a simple yet effective learning method that improves the semantic diversity of generated responses.
arXiv Detail & Related papers (2022-10-11T18:36:54Z) - Evaluating the Evaluation of Diversity in Natural Language Generation [43.05127848086264]
We propose a framework for evaluating diversity metrics in natural language generation systems.
Our framework can advance the understanding of different diversity metrics, an essential step on the road towards better NLG systems.
arXiv Detail & Related papers (2020-04-06T20:44:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.