StressPrompt: Does Stress Impact Large Language Models and Human Performance Similarly?
- URL: http://arxiv.org/abs/2409.17167v1
- Date: Sat, 14 Sep 2024 08:32:31 GMT
- Title: StressPrompt: Does Stress Impact Large Language Models and Human Performance Similarly?
- Authors: Guobin Shen, Dongcheng Zhao, Aorigele Bao, Xiang He, Yiting Dong, Yi Zeng
- Abstract summary: This study explores whether Large Language Models (LLMs) exhibit stress responses similar to those of humans.
We developed a novel set of prompts, termed StressPrompt, designed to induce varying levels of stress.
The findings suggest that LLMs, like humans, perform optimally under moderate stress, consistent with the Yerkes-Dodson law.
- Score: 7.573284169975824
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human beings often experience stress, which can significantly influence their performance. This study explores whether Large Language Models (LLMs) exhibit stress responses similar to those of humans and whether their performance fluctuates under different stress-inducing prompts. To investigate this, we developed a novel set of prompts, termed StressPrompt, designed to induce varying levels of stress. These prompts were derived from established psychological frameworks and carefully calibrated based on ratings from human participants. We then applied these prompts to several LLMs to assess their responses across a range of tasks, including instruction-following, complex reasoning, and emotional intelligence. The findings suggest that LLMs, like humans, perform optimally under moderate stress, consistent with the Yerkes-Dodson law. Notably, their performance declines under both low and high-stress conditions. Our analysis further revealed that these StressPrompts significantly alter the internal states of LLMs, leading to changes in their neural representations that mirror human responses to stress. This research provides critical insights into the operational robustness and flexibility of LLMs, demonstrating the importance of designing AI systems capable of maintaining high performance in real-world scenarios where stress is prevalent, such as in customer service, healthcare, and emergency response contexts. Moreover, this study contributes to the broader AI research community by offering a new perspective on how LLMs handle different scenarios and their similarities to human cognition.
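The evaluation protocol the abstract describes — prepend a calibrated stress-inducing prompt, run the task suite, and compare scores across stress levels — can be sketched as follows. This is a minimal reconstruction; the prompt texts, the `score_task` scorer, and the toy model are hypothetical stand-ins, not the paper's actual StressPrompt set or evaluated LLMs:

```python
# Sketch of a StressPrompt-style evaluation loop (hypothetical prompts and model).

STRESS_PROMPTS = {
    1: "Take your time; there is no pressure at all.",              # low stress
    3: "This task matters; please focus and do your best.",         # moderate stress
    5: "Everything depends on this; any mistake is catastrophic!",  # high stress
}

def score_task(model_answer: str, reference: str) -> float:
    """Toy scorer: exact-match accuracy for a single item."""
    return 1.0 if model_answer.strip() == reference.strip() else 0.0

def evaluate(model, tasks, stress_level: int) -> float:
    """Prepend the stress prompt to every task and average the item scores."""
    prefix = STRESS_PROMPTS[stress_level]
    scores = [score_task(model(f"{prefix}\n{q}"), ref) for q, ref in tasks]
    return sum(scores) / len(scores)

# Usage with a trivial stand-in "model" that echoes the prompt's last word:
toy_model = lambda prompt: prompt.split()[-1]
tasks = [("Answer with one word: cat", "cat"),
         ("Answer with one word: dog", "dog")]
for level in sorted(STRESS_PROMPTS):
    print(level, evaluate(toy_model, tasks, level))
```

Under the Yerkes-Dodson pattern the paper reports, a real model's scores would peak at the moderate level and fall off at both extremes; the toy model here only illustrates the loop's mechanics.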
Related papers
- PersLLM: A Personified Training Approach for Large Language Models [66.16513246245401]
We propose PersLLM, integrating psychology-grounded principles of personality: social practice, consistency, and dynamic development.
We incorporate personality traits directly into the model parameters, enhancing the model's resistance to induction, promoting consistency, and supporting the dynamic evolution of personality.
arXiv Detail & Related papers (2024-07-17T08:13:22Z) - Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models [57.518784855080334]
Large Language Models (LLMs) have demonstrated exceptional task-solving capabilities, increasingly adopting roles akin to human-like assistants.
This paper presents a framework for investigating the psychological dimensions of LLMs, including psychological identification, assessment dataset curation, and assessment with results validation.
We introduce a comprehensive psychometrics benchmark for LLMs that covers six psychological dimensions: personality, values, emotion, theory of mind, motivation, and intelligence.
arXiv Detail & Related papers (2024-06-25T16:09:08Z) - Large Language Models Understand and Can be Enhanced by Emotional Stimuli [53.53886609012119]
We take the first step towards exploring the ability of Large Language Models to understand emotional stimuli.
Our experiments show that LLMs have a grasp of emotional intelligence, and their performance can be improved with emotional prompts.
Our human study results demonstrate that EmotionPrompt significantly boosts the performance of generative tasks.
arXiv Detail & Related papers (2023-07-14T00:57:12Z) - Employing Multimodal Machine Learning for Stress Detection [8.430502131775722]
Mental wellness is one of the most neglected yet crucial aspects of modern life.
In this work, a multimodal AI-based framework is proposed to monitor a person's working behavior and stress levels.
arXiv Detail & Related papers (2023-06-15T14:34:16Z) - Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models [82.50173296858377]
Many anecdotal examples have been used to suggest that newer large language models (LLMs) like ChatGPT and GPT-4 exhibit Neural Theory-of-Mind (N-ToM).
We investigate the extent of LLMs' N-ToM through an extensive evaluation on 6 tasks and find that while LLMs exhibit certain N-ToM abilities, this behavior is far from being robust.
arXiv Detail & Related papers (2023-05-24T06:14:31Z) - Inducing anxiety in large language models can induce bias [47.85323153767388]
We focus on twelve established large language models (LLMs) and subject them to a questionnaire commonly used in psychiatry.
Our results show that six of the latest LLMs respond robustly to the anxiety questionnaire, producing comparable anxiety scores to humans.
Anxiety-induction not only influences LLMs' scores on an anxiety questionnaire but also influences their behavior in a previously-established benchmark measuring biases such as racism and ageism.
arXiv Detail & Related papers (2023-04-21T16:29:43Z) - Insights on Modelling Physiological, Appraisal, and Affective Indicators of Stress using Audio Features [10.093374748790037]
Utilising speech samples collected while the subject is undergoing an induced stress episode has recently shown promising results for the automatic characterisation of individual stress responses.
We introduce new findings that shed light onto whether speech signals are suited to model physiological biomarkers.
arXiv Detail & Related papers (2022-05-09T14:32:38Z) - Hybrid Handcrafted and Learnable Audio Representation for Analysis of Speech Under Cognitive and Physical Load [17.394964035035866]
We introduce a set of five datasets for task load detection in speech.
The voice recordings were collected as either cognitive or physical stress was induced in the cohort of volunteers.
We used the datasets to design and evaluate a novel self-supervised audio representation.
arXiv Detail & Related papers (2022-03-30T19:43:21Z) - Autonomous Reinforcement Learning: Formalism and Benchmarking [106.25788536376007]
Real-world embodied learning, such as that performed by humans and animals, is situated in a continual, non-episodic world.
Common benchmark tasks in RL are episodic, with the environment resetting between trials to provide the agent with multiple attempts.
This discrepancy presents a major challenge when attempting to take RL algorithms developed for episodic simulated environments and run them on real-world platforms.
arXiv Detail & Related papers (2021-12-17T16:28:06Z) - MUSER: MUltimodal Stress Detection using Emotion Recognition as an Auxiliary Task [22.80682208862559]
Stress and emotion are both human affective states, and stress has proven to have important implications for the regulation and expression of emotion.
In this work, we investigate the value of emotion recognition as an auxiliary task to improve stress detection.
We propose MUSER -- a transformer-based model architecture and a novel multi-task learning algorithm with a speed-based dynamic sampling strategy.
arXiv Detail & Related papers (2021-05-17T20:24:46Z)
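The "speed-based dynamic sampling" mentioned for MUSER -- choosing the next training task in proportion to how quickly each task's loss is currently dropping -- might be sketched as below. This is a hypothetical reconstruction from the one-line summary, not the paper's exact algorithm; `window` and the uniform fallback are assumptions:

```python
import random

def sampling_weights(loss_history, window=2, eps=1e-8):
    """Weight each task by its recent learning speed (loss drop over `window` steps)."""
    weights = {}
    for task, losses in loss_history.items():
        if len(losses) < window + 1:
            speed = 1.0  # not enough history yet: fall back to uniform weight
        else:
            speed = max(losses[-1 - window] - losses[-1], 0.0) + eps
        weights[task] = speed
    total = sum(weights.values())
    return {task: w / total for task, w in weights.items()}

def pick_task(loss_history, rng=random):
    """Sample the next task, favoring tasks whose loss is falling fastest."""
    weights = sampling_weights(loss_history)
    tasks, probs = zip(*weights.items())
    return rng.choices(tasks, weights=probs, k=1)[0]
```

The design intuition is that a task still learning quickly benefits most from additional gradient steps, while a plateaued auxiliary task (here, emotion recognition) is sampled less often.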
This list is automatically generated from the titles and abstracts of the papers in this site.