Related papers: Improving Linguistic Diversity of Large Language Models with Possibility Exploration Fine-Tuning

URL: http://arxiv.org/abs/2412.03343v1
Date: Wed, 04 Dec 2024 14:23:16 GMT
Title: Improving Linguistic Diversity of Large Language Models with Possibility Exploration Fine-Tuning
Authors: Long Mai, Julie Carson-Berndsen
Abstract summary: Possibility Exploration Fine-Tuning (PEFT) is a task-agnostic framework that enhances the text diversity of Large Language Models (LLMs) without increasing latency or computational cost. PEFT significantly enhances the diversity of LLM outputs, as evidenced by lower similarity between candidate responses. It can also notably reduce demographic bias in dialogue systems.
Score: 23.456302461693053
License: http://creativecommons.org/licenses/by/4.0/
Abstract: While Large Language Models (LLMs) have made significant strides in replicating human-like abilities, there are concerns about a reduction in the linguistic diversity of their outputs. This results in the homogenization of viewpoints and perspectives, as well as the underrepresentation of specific demographic groups. Although several fine-tuning and prompting techniques have been suggested to tackle the issue, they are often tailored to specific tasks or come with a substantial increase in computational cost and latency. This makes them challenging to apply to applications that demand very low latency, such as chatbots and virtual assistants. We propose Possibility Exploration Fine-Tuning (PEFT), a task-agnostic framework that enhances the text diversity of LLMs without increasing latency or computational cost. Given the same prompt, models fine-tuned with PEFT can simultaneously generate multiple diverse responses, each corresponding with a controllable possibility number. Experiments on dialogue and story generation tasks demonstrate that PEFT significantly enhances the diversity of LLM outputs, as evidenced by lower similarity between candidate responses. Since PEFT emphasizes semantic diversity over lexical diversity, it can also notably reduce demographic bias in dialogue systems. The implementations and datasets are available in our repository: https://github.com/mailong25/peft_diversity

Related papers

MultiCaption: Detecting disinformation using multilingual visual claims [10.69065586825833]
We present MultiCaption, a dataset specifically designed for detecting contradictions in visual claims. The resulting dataset comprises 11,088 visual claims in 64 languages. The gains from multilingual training and testing highlight the dataset's potential for building effective multilingual fact-checking pipelines.
arXiv Detail & Related papers (2026-01-16T11:57:07Z)
Mind the Gap: Conformative Decoding to Improve Output Diversity of Instruction-Tuned Large Language Models [0.0]
This paper investigates the "diversity gap" for a writing prompt narrative generation task. Results show significant decreases in diversity due to instruction-tuning. We present a new decoding strategy, conformative decoding, which guides an instruct model using its more diverse base model to reintroduce output diversity.
arXiv Detail & Related papers (2025-07-28T16:04:25Z)
Rethinking Hate Speech Detection on Social Media: Can LLMs Replace Traditional Models? [3.611706857555358]
Hate speech detection across contemporary social media presents unique challenges due to linguistic diversity and the informal nature of online discourse. These challenges are further amplified in settings involving code-mixing, transliteration, and culturally nuanced expressions. We argue that recent large language models (LLMs) not only surpass them but also redefine the landscape of hate speech detection more broadly.
arXiv Detail & Related papers (2025-06-15T06:48:47Z)
Evaluating the Diversity and Quality of LLM Generated Content [72.84945252821908]
We introduce a framework for measuring effective semantic diversity--diversity among outputs that meet quality thresholds. Although preference-tuned models exhibit reduced lexical and syntactic diversity, they produce greater effective semantic diversity than SFT or base models. These findings have important implications for applications that require diverse yet high-quality outputs.
arXiv Detail & Related papers (2025-04-16T23:02:23Z)
BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages [93.92804151830744]
We present BRIGHTER -- a collection of multi-labeled datasets in 28 different languages. We describe the data collection and annotation processes and the challenges of building these datasets. We show that BRIGHTER datasets are a step towards bridging the gap in text-based emotion recognition.
arXiv Detail & Related papers (2025-02-17T15:39:50Z)
Can xLLMs Understand the Structure of Dialog? Exploring Multilingual Response Generation in Complex Scenarios [8.131774353504472]
We introduce XMP, a high-quality parallel Multilingual dataset sourced from Multi-party Podcast dialogues. Each sample in the dataset features at least three participants discussing a wide range of topics, including society, culture, politics, and entertainment. We uncover significant limitations in previously recognized multilingual capabilities of LLMs when applied to such complex dialogue scenarios.
arXiv Detail & Related papers (2025-01-20T04:33:03Z)
P-MMEval: A Parallel Multilingual Multitask Benchmark for Consistent Evaluation of LLMs [84.24644520272835]
We introduce P-MMEval, a large-scale benchmark covering effective fundamental and capability-specialized datasets. P-MMEval delivers consistent language coverage across various datasets and provides parallel samples. We conduct extensive experiments on representative multilingual model series to compare performances across models and tasks.
arXiv Detail & Related papers (2024-11-14T01:29:36Z)
Textualized and Feature-based Models for Compound Multimodal Emotion Recognition in the Wild [45.29814349246784]
Multimodal large language models (LLMs) rely on explicit non-verbal cues that may be translated from different non-textual modalities into text. This paper compares the potential of text- and feature-based approaches for compound multimodal ER in videos.
arXiv Detail & Related papers (2024-07-17T18:01:25Z)
CIVICS: Building a Dataset for Examining Culturally-Informed Values in Large Language Models [59.22460740026037]
The "CIVICS: Culturally-Informed & Values-Inclusive Corpus for Societal impacts" dataset is designed to evaluate the social and cultural variation of Large Language Models (LLMs). We create a hand-crafted, multilingual dataset of value-laden prompts which address specific socially sensitive topics, including LGBTQI rights, social welfare, immigration, disability rights, and surrogacy.
arXiv Detail & Related papers (2024-05-22T20:19:10Z)
Scaling Data Diversity for Fine-Tuning Language Models in Human Alignment [84.32768080422349]
Alignment with human preference prevents large language models from generating misleading or toxic content. We propose a new formulation of prompt diversity, implying a linear correlation with the final performance of LLMs after fine-tuning.
arXiv Detail & Related papers (2024-03-17T07:08:55Z)
How do Large Language Models Handle Multilingualism? [81.15060972112563]
This study explores how large language models (LLMs) handle multilingualism. LLMs initially understand the query, converting multilingual inputs into English for task-solving. In the intermediate layers, they employ English for thinking and incorporate multilingual knowledge with self-attention and feed-forward structures.
arXiv Detail & Related papers (2024-02-29T02:55:26Z)
How Far Can We Extract Diverse Perspectives from Large Language Models? [16.16678226707335]
We show that large language models (LLMs) can generate diverse perspectives on subjective topics. We propose a criteria-based prompting technique to ground diverse opinions. Our methods, applied to various tasks, show that LLMs can indeed produce diverse opinions according to the degree of task subjectivity.
arXiv Detail & Related papers (2023-11-16T11:23:38Z)
Improving Diversity of Demographic Representation in Large Language Models via Collective-Critiques and Self-Voting [19.79214899011072]
This paper formalizes diversity of representation in generative large language models. We present evaluation datasets and propose metrics to measure diversity in generated responses along people and culture axes. We find that LLMs understand the notion of diversity, and that they can reason and critique their own responses for that goal.
arXiv Detail & Related papers (2023-10-25T10:17:17Z)
OverPrompt: Enhancing ChatGPT through Efficient In-Context Learning [49.38867353135258]
We propose OverPrompt, leveraging the in-context learning capability of LLMs to handle multiple task inputs. Our experiments show that OverPrompt can achieve cost-efficient zero-shot classification without causing significant detriment to task performance.
arXiv Detail & Related papers (2023-05-24T10:08:04Z)
TextMI: Textualize Multimodal Information for Integrating Non-verbal Cues in Pre-trained Language Models [5.668457303716451]
We propose TextMI as a general, competitive baseline for multimodal behavioral analysis tasks. Our approach significantly reduces model complexity, adds interpretability to the model's decision, and can be applied for a diverse set of tasks.
arXiv Detail & Related papers (2023-03-27T17:54:32Z)
Improving Classifier Training Efficiency for Automatic Cyberbullying Detection with Feature Density [58.64907136562178]
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods. We hypothesise that estimating dataset complexity allows for the reduction of the number of required experiments. The difference in linguistic complexity of datasets allows us to additionally discuss the efficacy of linguistically-backed word preprocessing.
arXiv Detail & Related papers (2021-11-02T15:48:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.