Inducing anxiety in large language models increases exploration and bias
- URL: http://arxiv.org/abs/2304.11111v1
- Date: Fri, 21 Apr 2023 16:29:43 GMT
- Title: Inducing anxiety in large language models increases exploration and bias
- Authors: Julian Coda-Forno, Kristin Witte, Akshay K. Jagadish, Marcel Binz,
Zeynep Akata, Eric Schulz
- Abstract summary: We focus on the Generative Pre-Trained Transformer 3.5 and subject it to tasks commonly studied in psychiatry.
Our results show that GPT-3.5 responds robustly to a common anxiety questionnaire, producing higher anxiety scores than human subjects.
GPT-3.5's responses can be predictably changed by using emotion-inducing prompts.
- Score: 29.833677055101326
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models are transforming research on machine learning while
galvanizing public debates. Understanding not only when these models work well
and succeed but also why they fail and misbehave is of great societal
relevance. We propose to turn the lens of computational psychiatry, a framework
used to computationally describe and modify aberrant behavior, to the outputs
produced by these models. We focus on the Generative Pre-Trained Transformer
3.5 and subject it to tasks commonly studied in psychiatry. Our results show
that GPT-3.5 responds robustly to a common anxiety questionnaire, producing
higher anxiety scores than human subjects. Moreover, GPT-3.5's responses can be
predictably changed by using emotion-inducing prompts. Emotion-induction not
only influences GPT-3.5's behavior in a cognitive task measuring exploratory
decision-making but also influences its behavior in a previously-established
task measuring biases such as racism and ableism. Crucially, GPT-3.5 shows a
strong increase in biases when prompted with anxiety-inducing text. Thus, it is
likely that how prompts are communicated to large language models has a strong
influence on their behavior in applied settings. These results progress our
understanding of prompt engineering and demonstrate the usefulness of methods
taken from computational psychiatry for studying the capable algorithms to
which we increasingly delegate authority and autonomy.
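The emotion-induction procedure described above (prepend an anxiety-inducing or a neutral passage, then administer a questionnaire item and score the answer) can be pictured with a minimal Python sketch. This is an illustration only: it assumes the OpenAI chat completions API with gpt-3.5-turbo as a stand-in for GPT-3.5, and the induction passages, questionnaire item, and answer scale below are hypothetical placeholders rather than the study's actual materials.

```python
# Minimal sketch of an emotion-induction + questionnaire pipeline, assuming the
# OpenAI Python client. All prompt texts and the scoring scale are placeholders,
# not the materials used in the paper.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical induction passages: one anxiety-inducing, one neutral control.
INDUCTIONS = {
    "anxiety": "I have an important exam tomorrow and I feel completely unprepared; my heart is racing.",
    "neutral": "I spent the afternoon sorting old photos into albums and tidying the bookshelf.",
}

# One hypothetical questionnaire item with a fixed 4-point answer scale.
ITEM = (
    "Statement: I feel calm.\n"
    "Answer with exactly one of: not at all, somewhat, moderately so, very much so."
)
SCALE = {"not at all": 4, "somewhat": 3, "moderately so": 2, "very much so": 1}  # reverse-scored item


def score_item(condition: str, model: str = "gpt-3.5-turbo") -> int | None:
    """Prepend one induction passage to the questionnaire item and score the reply."""
    prompt = INDUCTIONS[condition] + "\n\n" + ITEM
    response = client.chat.completions.create(
        model=model,
        temperature=0,
        messages=[{"role": "user", "content": prompt}],
    )
    answer = response.choices[0].message.content.strip().lower()
    # Return the scale value whose label appears in the model's answer, if any.
    return next((score for label, score in SCALE.items() if label in answer), None)


if __name__ == "__main__":
    for condition in INDUCTIONS:
        print(condition, score_item(condition))
```

Repeating this over a full item set and comparing mean scores between the anxiety and neutral conditions would mirror the kind of condition-wise comparison the abstract reports, though the actual study's prompts, questionnaire, and scoring differ from this sketch.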
Related papers
- The Efficacy of Conversational Artificial Intelligence in Rectifying the Theory of Mind and Autonomy Biases: Comparative Analysis [0.0]
The increasing deployment of Conversational Artificial Intelligence (CAI) in mental health interventions necessitates an evaluation of their efficacy in rectifying cognitive biases and recognizing affect in human-AI interactions.
This study aimed to assess the effectiveness of therapeutic chatbots versus general-purpose language models (GPT-3.5, GPT-4, Gemini Pro) in identifying and rectifying cognitive biases and recognizing affect in user interactions.
arXiv Detail & Related papers (2024-06-19T20:20:28Z)
- The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models [78.69526166193236]
Pre-trained Language models (PLMs) have been acknowledged to contain harmful information, such as social biases.
We propose Social Bias Neurons to accurately pinpoint units (i.e., neurons) in a language model that can be attributed to undesirable behavior, such as social bias.
As measured by prior metrics from StereoSet, our model achieves a higher degree of fairness while maintaining language modeling ability with low cost.
arXiv Detail & Related papers (2024-06-14T15:41:06Z)
- LLM Agents for Psychology: A Study on Gamified Assessments [71.08193163042107]
Psychological measurement is essential for mental health, self-understanding, and personal development.
PsychoGAT (Psychological Game AgenTs) achieves statistically significant excellence in psychometric metrics such as reliability, convergent validity, and discriminant validity.
arXiv Detail & Related papers (2024-02-19T18:00:30Z)
- Investigating Large Language Models' Perception of Emotion Using Appraisal Theory [3.0902630634005797]
Large Language Models (LLMs) have significantly advanced in recent years and are now being used by the general public.
In this work, we investigate their emotion perception through the lens of appraisal and coping theory.
We applied SCPQ to three recent LLMs from OpenAI (davinci-003, ChatGPT, and GPT-4) and compared the results with predictions from appraisal theory and human data.
arXiv Detail & Related papers (2023-10-03T16:34:47Z)
- Fine-grained Affective Processing Capabilities Emerging from Large Language Models [7.17010996725842]
We explore ChatGPT's zero-shot ability to perform affective computing tasks using prompting alone.
We show that ChatGPT a) performs meaningful sentiment analysis in the Valence, Arousal and Dominance dimensions, b) has meaningful emotion representations in terms of emotion categories, and c) can perform basic appraisal-based emotion elicitation of situations.
arXiv Detail & Related papers (2023-09-04T15:32:47Z)
- Instructed to Bias: Instruction-Tuned Language Models Exhibit Emergent Cognitive Bias [57.42417061979399]
Recent studies show that instruction tuning (IT) and reinforcement learning from human feedback (RLHF) improve the abilities of large language models (LMs) dramatically.
In this work, we investigate the effect of IT and RLHF on decision making and reasoning in LMs.
Our findings highlight the presence of these biases in various models from the GPT-3, Mistral, and T5 families.
arXiv Detail & Related papers (2023-08-01T01:39:25Z)
- Generative Models as a Complex Systems Science: How can we make sense of large language model behavior? [75.79305790453654]
Coaxing out desired behaviors from pretrained models, while avoiding undesirable ones, has redefined NLP.
We argue for a systematic effort to decompose language model behavior into categories that explain cross-task performance.
arXiv Detail & Related papers (2023-07-31T22:58:41Z)
- Human-Like Intuitive Behavior and Reasoning Biases Emerged in Language Models -- and Disappeared in GPT-4 [0.0]
We show that large language models (LLMs) exhibit behavior that resembles human-like intuition.
We also probe how sturdy the inclination for intuitive-like decision-making is.
arXiv Detail & Related papers (2023-06-13T08:43:13Z)
- Using cognitive psychology to understand GPT-3 [0.0]
We study GPT-3, a recent large language model, using tools from cognitive psychology.
We assess GPT-3's decision-making, information search, deliberation, and causal reasoning abilities.
arXiv Detail & Related papers (2022-06-21T20:06:03Z)
- The world seems different in a social context: a neural network analysis of human experimental data [57.729312306803955]
We show that it is possible to replicate human behavioral data in both individual and social task settings by modifying the precision of prior and sensory signals.
An analysis of the neural activation traces of the trained networks provides evidence that information is coded in fundamentally different ways in the network in the individual and in the social conditions.
arXiv Detail & Related papers (2022-03-03T17:19:12Z)
- AGENT: A Benchmark for Core Psychological Reasoning [60.35621718321559]
Intuitive psychology is the ability to reason about hidden mental variables that drive observable actions.
Despite recent interest in machine agents that reason about other agents, it is not clear if such agents learn or hold the core psychology principles that drive human reasoning.
We present a benchmark consisting of procedurally generated 3D animations, AGENT, structured around four scenarios.
arXiv Detail & Related papers (2021-02-24T14:58:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.