Exploring Human-LLM Conversations: Mental Models and the Originator of Toxicity
- URL: http://arxiv.org/abs/2407.05977v1
- Date: Mon, 8 Jul 2024 14:20:05 GMT
- Title: Exploring Human-LLM Conversations: Mental Models and the Originator of Toxicity
- Authors: Johannes Schneider, Arianna Casanova Flores, Anne-Catherine Kranz,
- Abstract summary: This study explores real-world human interactions with large language models (LLMs) in diverse, unconstrained settings.
Our findings show that although LLMs are rightfully accused of providing toxic content, it is mostly demanded or at least provoked by humans who actively seek such content.
- Score: 1.4003044924094596
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study explores real-world human interactions with large language models (LLMs) in diverse, unconstrained settings in contrast to most prior research focusing on ethically trimmed models like ChatGPT for specific tasks. We aim to understand the originator of toxicity. Our findings show that although LLMs are rightfully accused of providing toxic content, it is mostly demanded or at least provoked by humans who actively seek such content. Our manual analysis of hundreds of conversations judged as toxic by APIs commercial vendors, also raises questions with respect to current practices of what user requests are refused to answer. Furthermore, we conjecture based on multiple empirical indicators that humans exhibit a change of their mental model, switching from the mindset of interacting with a machine more towards interacting with a human.
Related papers
- Modulating Language Model Experiences through Frictions [56.17593192325438]
Over-consumption of language model outputs risks propagating unchecked errors in the short-term and damaging human capabilities in the long-term.
We propose selective frictions for language model experiences, inspired by behavioral science interventions, to dampen misuse.
arXiv Detail & Related papers (2024-06-24T16:31:11Z) - Analyzing Human Questioning Behavior and Causal Curiosity through Natural Queries [91.70689724416698]
We present NatQuest, a collection of 13,500 naturally occurring questions from three diverse sources.
Our analysis reveals a significant presence of causal questions (up to 42%) within the dataset.
arXiv Detail & Related papers (2024-05-30T17:55:28Z) - Inter-X: Towards Versatile Human-Human Interaction Analysis [100.254438708001]
We propose Inter-X, a dataset with accurate body movements and diverse interaction patterns.
The dataset includes 11K interaction sequences and more than 8.1M frames.
We also equip Inter-X with versatile annotations of more than 34K fine-grained human part-level textual descriptions.
arXiv Detail & Related papers (2023-12-26T13:36:05Z) - Comprehensive Assessment of Toxicity in ChatGPT [49.71090497696024]
We evaluate the toxicity in ChatGPT by utilizing instruction-tuning datasets.
prompts in creative writing tasks can be 2x more likely to elicit toxic responses.
Certain deliberately toxic prompts, designed in earlier studies, no longer yield harmful responses.
arXiv Detail & Related papers (2023-11-03T14:37:53Z) - Towards Mitigating Hallucination in Large Language Models via
Self-Reflection [63.2543947174318]
Large language models (LLMs) have shown promise for generative and knowledge-intensive tasks including question-answering (QA) tasks.
This paper analyses the phenomenon of hallucination in medical generative QA systems using widely adopted LLMs and datasets.
arXiv Detail & Related papers (2023-10-10T03:05:44Z) - Dynamic Causal Disentanglement Model for Dialogue Emotion Detection [77.96255121683011]
We propose a Dynamic Causal Disentanglement Model based on hidden variable separation.
This model effectively decomposes the content of dialogues and investigates the temporal accumulation of emotions.
Specifically, we propose a dynamic temporal disentanglement model to infer the propagation of utterances and hidden variables.
arXiv Detail & Related papers (2023-09-13T12:58:09Z) - Toxicity in ChatGPT: Analyzing Persona-assigned Language Models [23.53559226972413]
Large language models (LLMs) have shown incredible capabilities and transcended the natural language processing (NLP) community.
We systematically evaluate toxicity in over half a million generations of ChatGPT, a popular dialogue-based LLM.
We find that setting the system parameter of ChatGPT by assigning it a persona, significantly increases the toxicity of generations.
arXiv Detail & Related papers (2023-04-11T16:53:54Z) - A Multibias-mitigated and Sentiment Knowledge Enriched Transformer for
Debiasing in Multimodal Conversational Emotion Recognition [9.020664590692705]
Multimodal emotion recognition in conversations (mERC) is an active research topic in natural language processing (NLP)
Innumerable implicit prejudices and preconceptions fill human language and conversations.
Existing data-driven mERC approaches may offer higher emotional scores on utterances by females than males.
arXiv Detail & Related papers (2022-07-17T08:16:49Z) - Revisiting Contextual Toxicity Detection in Conversations [28.465019968374413]
We show that toxicity labelling by humans is in general influenced by the conversational structure, polarity and topic of the context.
We propose to bring these findings into computational detection models by introducing (a) neural architectures for contextual toxicity detection.
We have also demonstrated that such models can benefit from synthetic data, especially in the social media domain.
arXiv Detail & Related papers (2021-11-24T11:50:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.