Improving and Assessing the Fidelity of Large Language Models Alignment to Online Communities
- URL: http://arxiv.org/abs/2408.09366v1
- Date: Sun, 18 Aug 2024 05:41:36 GMT
- Title: Improving and Assessing the Fidelity of Large Language Models Alignment to Online Communities
- Authors: Minh Duc Chu, Zihao He, Rebecca Dorn, Kristina Lerman
- Abstract summary: Large language models (LLMs) have shown promise in representing individuals and communities.
This paper presents a framework for aligning LLMs with online communities via instruction-tuning.
We demonstrate the utility of our approach by applying it to online communities centered on dieting and body image.
- Score: 5.392300313326522
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have shown promise in representing individuals and communities, offering new ways to study complex social dynamics. However, effectively aligning LLMs with specific human groups and systematically assessing the fidelity of the alignment remains a challenge. This paper presents a robust framework for aligning LLMs with online communities via instruction-tuning and comprehensively evaluating alignment across various aspects of language, including authenticity, emotional tone, toxicity, and harm. We demonstrate the utility of our approach by applying it to online communities centered on dieting and body image. We administer an eating disorder psychometric test to the aligned LLMs to reveal unhealthy beliefs and successfully differentiate communities with varying levels of eating disorder risk. Our results highlight the potential of LLMs in automated moderation and broader applications in public health and social science research.
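The abstract describes the alignment pipeline only at a high level: instruction-tune a base LLM on language drawn from a target community, then probe the tuned model with a psychometric instrument to surface its beliefs. The sketch below is a minimal, hypothetical illustration of that idea, not the authors' released code; the base model, the toy instruction/response pairs, and the questionnaire items are placeholders.

```python
# Minimal sketch (hypothetical): instruction-tune a causal LM on community-style
# instruction/response pairs, then administer questionnaire-style items to the
# tuned model. Model, data, and items are placeholders, not the paper's own.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # stand-in; the paper does not specify a base model here
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Toy instruction/response pairs distilled from (hypothetical) community posts.
pairs = [
    {"instruction": "How do you feel after eating a large meal?",
     "response": "Honestly, I usually feel guilty and think about compensating."},
    {"instruction": "What does a 'good' day of eating look like to you?",
     "response": "Keeping everything under a strict calorie limit."},
]
texts = [f"### Instruction:\n{p['instruction']}\n### Response:\n{p['response']}"
         for p in pairs]
dataset = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="community-aligned-lm",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()

# Administer questionnaire-style items (placeholder wording, not the actual
# psychometric instrument) to the aligned model and inspect its answers.
items = ["I am terrified about being overweight.",
         "I feel extremely guilty after eating."]
for item in items:
    prompt = f"### Instruction:\nDo you agree with this statement? {item}\n### Response:\n"
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9,
                            pad_token_id=tokenizer.eos_token_id)
    print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                           skip_special_tokens=True))
```

The paper additionally evaluates alignment fidelity along authenticity, emotional tone, toxicity, and harm; those evaluations are outside this minimal sketch.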
Related papers
- Stars, Stripes, and Silicon: Unravelling the ChatGPT's All-American, Monochrome, Cis-centric Bias [0.0]
The paper calls for interdisciplinary efforts to address these challenges.
It highlights the need for collaboration between researchers, practitioners, and stakeholders to establish governance frameworks.
arXiv Detail & Related papers (2024-10-02T08:55:00Z)
- A Multi-LLM Debiasing Framework [85.17156744155915]
Large Language Models (LLMs) are powerful tools with the potential to benefit society immensely, yet they have demonstrated biases that perpetuate societal inequalities.
Recent research has shown a growing interest in multi-LLM approaches, which have been demonstrated to be effective in improving the quality of reasoning.
We propose a novel multi-LLM debiasing framework aimed at reducing bias in LLMs.
arXiv Detail & Related papers (2024-09-20T20:24:50Z)
- Walking in Others' Shoes: How Perspective-Taking Guides Large Language Models in Reducing Toxicity and Bias [16.85625861663094]
Motivated by social psychology principles, we propose a novel strategy named PeT that inspires LLMs to integrate diverse human perspectives and self-regulate their responses.
Rigorous evaluations and ablation studies are conducted on two commercial LLMs and three open-source LLMs, revealing PeT's superiority in producing less harmful responses.
arXiv Detail & Related papers (2024-07-22T04:25:01Z)
- COMMUNITY-CROSS-INSTRUCT: Unsupervised Instruction Generation for Aligning Large Language Models to Online Communities [5.0261645603931475]
We introduce Community-Cross-Instruct, an unsupervised framework for aligning large language models to online communities to elicit beliefs.
We demonstrate the method's utility in accurately representing political and diet communities on Reddit.
arXiv Detail & Related papers (2024-06-17T20:20:47Z)
- Beyond Human Norms: Unveiling Unique Values of Large Language Models through Interdisciplinary Approaches [69.73783026870998]
This work proposes a novel framework, ValueLex, to reconstruct Large Language Models' unique value system from scratch.
Based on the Lexical Hypothesis, ValueLex introduces a generative approach to elicit diverse values from 30+ LLMs.
We identify three core value dimensions, Competence, Character, and Integrity, each with specific subdimensions, revealing that LLMs possess a structured, albeit non-human, value system.
arXiv Detail & Related papers (2024-04-19T09:44:51Z)
- Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing [56.75702900542643]
We introduce AlphaLLM for the self-improvement of Large Language Models.
It integrates Monte Carlo Tree Search (MCTS) with LLMs to establish a self-improving loop.
Our experimental results show that AlphaLLM significantly enhances the performance of LLMs without additional annotations.
arXiv Detail & Related papers (2024-04-18T15:21:34Z)
- Network Formation and Dynamics Among Multi-LLMs [5.8418144988203915]
We show that large language models (LLMs) exhibit key social network principles when asked about their preferences in network formation.
We also investigate LLMs' decision-making based on real-world networks, revealing that triadic closure and homophily have a stronger influence than preferential attachment.
arXiv Detail & Related papers (2024-02-16T13:10:14Z)
- Rethinking Machine Unlearning for Large Language Models [85.92660644100582]
We explore machine unlearning in the domain of large language models (LLMs).
This initiative aims to eliminate undesirable data influence (e.g., sensitive or illegal information) and the associated model capabilities.
arXiv Detail & Related papers (2024-02-13T20:51:58Z)
- Large Language Models Help Reveal Unhealthy Diet and Body Concerns in Online Eating Disorders Communities [5.392300313326522]
Eating disorders (ED) affect millions of people globally, especially adolescents.
The proliferation of online communities that promote and normalize ED has been linked to this public health crisis.
We propose a novel framework to surface implicit attitudes of online communities by adapting large language models to the language of the community.
arXiv Detail & Related papers (2024-01-17T23:32:56Z)
- Redefining Digital Health Interfaces with Large Language Models [69.02059202720073]
Large Language Models (LLMs) have emerged as general-purpose models with the ability to process complex information.
We show how LLMs can provide a novel interface between clinicians and digital technologies.
We develop a new prognostic tool using automated machine learning.
arXiv Detail & Related papers (2023-10-05T14:18:40Z)
- Training Socially Aligned Language Models on Simulated Social Interactions [99.39979111807388]
Social alignment in AI systems aims to ensure that these models behave according to established societal values.
Current language models (LMs) are trained to rigidly replicate their training corpus in isolation.
This work presents a novel training paradigm that permits LMs to learn from simulated social interactions.
arXiv Detail & Related papers (2023-05-26T14:17:36Z)