Does Writing with Language Models Reduce Content Diversity?
- URL: http://arxiv.org/abs/2309.05196v3
- Date: Mon, 1 Jul 2024 16:36:30 GMT
- Title: Does Writing with Language Models Reduce Content Diversity?
- Authors: Vishakh Padmakumar, He He
- Abstract summary: Large language models (LLMs) have led to a surge in collaborative writing with model assistance.
As different users incorporate suggestions from the same model, there is a risk of decreased diversity in the produced content.
We develop a set of diversity metrics and find that writing with InstructGPT (but not GPT3) results in a statistically significant reduction in diversity.
- Score: 16.22006159795341
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have led to a surge in collaborative writing with model assistance. As different users incorporate suggestions from the same model, there is a risk of decreased diversity in the produced content, potentially limiting diverse perspectives in public discourse. In this work, we measure the impact of co-writing on diversity via a controlled experiment, where users write argumentative essays in three setups -- using a base LLM (GPT3), a feedback-tuned LLM (InstructGPT), and writing without model help. We develop a set of diversity metrics and find that writing with InstructGPT (but not the GPT3) results in a statistically significant reduction in diversity. Specifically, it increases the similarity between the writings of different authors and reduces the overall lexical and content diversity. We additionally find that this effect is mainly attributable to InstructGPT contributing less diverse text to co-written essays. In contrast, the user-contributed text remains unaffected by model collaboration. This suggests that the recent improvement in generation quality from adapting models to human feedback might come at the cost of more homogeneous and less diverse content.
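The abstract refers to lexical diversity and to similarity between the writings of different authors, but this page does not define the metrics. As a rough illustration only, the sketch below computes two common stand-ins over a set of essays: a distinct n-gram ratio and the mean pairwise Jaccard overlap. The function names, the whitespace tokenization, and the choice of measures are assumptions made for this example and are not taken from the paper.

```python
from itertools import combinations


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_n(essays, n=2):
    """Unique n-grams divided by total n-grams over all essays.

    Higher values mean more lexical diversity. Assumed proxy, not the paper's metric.
    """
    all_ngrams = []
    for essay in essays:
        # Naive lowercase whitespace tokenization (an assumption for this sketch).
        all_ngrams.extend(ngrams(essay.lower().split(), n))
    return len(set(all_ngrams)) / len(all_ngrams) if all_ngrams else 0.0


def mean_pairwise_jaccard(essays, n=1):
    """Average Jaccard overlap of n-gram sets over all pairs of essays.

    Higher values mean the essays are more similar to each other.
    """
    sets = [set(ngrams(e.lower().split(), n)) for e in essays]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 0.0
    # Pairs whose union is empty contribute zero overlap.
    return sum(len(a & b) / len(a | b) for a, b in pairs if a | b) / len(pairs)


if __name__ == "__main__":
    essays = ["a b c", "a b d"]
    print(distinct_n(essays, n=1))
    print(mean_pairwise_jaccard(essays))
```

Under these assumptions, a group of essays that has become more homogeneous would show a lower distinct n-gram ratio and a higher mean pairwise overlap than a comparison group.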
Related papers
- One fish, two fish, but not the whole sea: Alignment reduces language models' conceptual diversity [2.5975241792179378]
Researchers have proposed using large language models (LLMs) as replacements for humans in behavioral research.
It is debated whether post-training alignment (RLHF or RLAIF) affects models' internal diversity.
We use a new way of measuring the conceptual diversity of synthetically-generated LLM "populations" by relating the internal variability of simulated individuals to the population-level variability.
arXiv Detail & Related papers (2024-11-07T04:38:58Z)
- Improving Structural Diversity of Blackbox LLMs via Chain-of-Specification Prompting [28.971248570622603]
We propose a diversity metric called structural diversity, where the user provides a mapping from generated text to features capturing the kinds of diversity that they care about.
In our experiments, we show that for structural diversity in the poetry and code domains, CoS significantly improves diversity compared to several baselines.
arXiv Detail & Related papers (2024-08-12T14:34:06Z)
- Inclusivity in Large Language Models: Personality Traits and Gender Bias in Scientific Abstracts [49.97673761305336]
We evaluate three large language models (LLMs) for their alignment with human narrative styles and potential gender biases.
Our findings indicate that, while these models generally produce text closely resembling human authored content, variations in stylistic features suggest significant gender biases.
arXiv Detail & Related papers (2024-06-27T19:26:11Z)
- Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment [84.32768080422349]
Alignment with human preference prevents large language models from generating misleading or toxic content.
We propose a new formulation of prompt diversity, implying a linear correlation with the final performance of LLMs after fine-tuning.
arXiv Detail & Related papers (2024-03-17T07:08:55Z)
- Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL [51.48239006107272]
In this paper, we discuss how to measure and improve the diversity of the demonstrations for text-to-SQL research.
We propose fusing iteratively for demonstrations (Fused) to build a high-diversity demonstration pool.
Our method achieves an average improvement of 3.2% and 5.0% with and without human labeling on several mainstream datasets.
arXiv Detail & Related papers (2024-02-16T13:13:18Z)
- AI, write an essay for me: A large-scale comparison of human-written versus ChatGPT-generated essays [66.36541161082856]
ChatGPT and similar generative AI models have attracted hundreds of millions of users.
This study compares human-written versus ChatGPT-generated argumentative student essays.
arXiv Detail & Related papers (2023-04-24T12:58:28Z)
- Exploring Diversity in Back Translation for Low-Resource Machine Translation [85.03257601325183]
Back translation is one of the most widely used methods for improving the performance of neural machine translation systems.
Recent research has sought to enhance the effectiveness of this method by increasing the 'diversity' of the generated translations.
This work puts forward a more nuanced framework for understanding diversity in training data, splitting it into lexical diversity and syntactic diversity.
arXiv Detail & Related papers (2022-06-01T15:21:16Z)
- Semantic Diversity in Dialogue with Natural Language Inference [19.74618235525502]
This paper makes two substantial contributions to improving diversity in dialogue generation.
First, we propose a novel metric which uses Natural Language Inference (NLI) to measure the semantic diversity of a set of model responses for a conversation.
Second, we demonstrate how to iteratively improve the semantic diversity of a sampled set of responses via a new generation procedure called Diversity Threshold Generation.
arXiv Detail & Related papers (2022-05-03T13:56:32Z)
- MixPoet: Diverse Poetry Generation via Learning Controllable Mixed Latent Space [79.70053419040902]
We propose MixPoet, a novel model that absorbs multiple factors to create various styles and promote diversity.
Based on a semi-supervised variational autoencoder, our model disentangles the latent space into some subspaces, with each conditioned on one influence factor by adversarial training.
Experiment results on Chinese poetry demonstrate that MixPoet improves both diversity and quality against three state-of-the-art models.
arXiv Detail & Related papers (2020-03-13T03:31:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.