Behind the Screen: Investigating ChatGPT's Dark Personality Traits and
Conspiracy Beliefs
- URL: http://arxiv.org/abs/2402.04110v1
- Date: Tue, 6 Feb 2024 16:03:57 GMT
- Title: Behind the Screen: Investigating ChatGPT's Dark Personality Traits and
Conspiracy Beliefs
- Authors: Erik Weber, Jérôme Rutinowski, Markus Pauly
- Abstract summary: This paper analyzes the dark personality traits and conspiracy beliefs of GPT-3.5 and GPT-4.
Dark personality traits and conspiracy beliefs were not particularly pronounced in either model.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: ChatGPT is notorious for its opaque behavior. This paper aims to shed
light on this, providing an in-depth analysis of the dark personality traits
and conspiracy beliefs of GPT-3.5 and GPT-4. Different psychological tests and
questionnaires were employed, including the Dark Factor Test, the Mach-IV
Scale, the Generic Conspiracy Belief Scale, and the Conspiracy Mentality Scale.
The responses were analyzed computing average scores, standard deviations, and
significance tests to investigate differences between GPT-3.5 and GPT-4. For
traits that have been shown to be interdependent in human studies, correlations were
considered. Additionally, system roles corresponding to groups that have shown
distinct answering behavior in the corresponding questionnaires were applied to
examine the models' ability to reflect characteristics associated with these
roles in their responses. Dark personality traits and conspiracy beliefs were
not particularly pronounced in either model, with few differences between
GPT-3.5 and GPT-4. However, GPT-4 showed a pronounced tendency to believe in
information withholding. This is particularly intriguing given that GPT-4 is
trained on a significantly larger dataset than GPT-3.5. Apparently, in this
case, increased data exposure correlates with a greater belief in the control
of information. An assignment of extreme political affiliations increased the
belief in conspiracy theories. Test sequencing affected the models' responses
and the observed correlations, indicating a form of contextual memory.
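The comparison of average questionnaire scores between GPT-3.5 and GPT-4 described above can be sketched with a minimal Welch's t-test. The scores below are hypothetical illustration values, not results from the paper; in practice a library routine such as scipy.stats.ttest_ind(equal_var=False) would also provide the p-value.

```python
import statistics

def welch_t(a, b):
    """Welch's t-statistic and degrees of freedom for two
    independent samples with possibly unequal variances."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    na, nb = len(a), len(b)
    se2 = va / na + vb / nb          # squared standard error of the mean difference
    t = (ma - mb) / se2 ** 0.5
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical per-run mean scores on a 5-point questionnaire scale
gpt35 = [2.1, 2.4, 2.0, 2.3, 2.2]
gpt4 = [2.6, 2.8, 2.5, 2.9, 2.7]

t, df = welch_t(gpt35, gpt4)
print(f"mean GPT-3.5 = {statistics.mean(gpt35):.2f}, "
      f"mean GPT-4 = {statistics.mean(gpt4):.2f}")
print(f"Welch t = {t:.2f}, df = {df:.1f}")
```

A large |t| relative to the t-distribution with the computed degrees of freedom would indicate a significant difference between the two models' average scores on a given test.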
Related papers
- An Empirical Analysis on Large Language Models in Debate Evaluation (2024-05-28)
  We investigate the capabilities and inherent biases of advanced large language models (LLMs) such as GPT-3.5 and GPT-4 in the context of debate evaluation.
  We uncover a consistent bias in both GPT-3.5 and GPT-4 towards the second candidate response presented.
  We also uncover lexical biases in both GPT-3.5 and GPT-4, especially when label sets carry connotations such as numerical or sequential ordering.
- Unveiling Divergent Inductive Biases of LLMs on Temporal Data (2024-04-01)
  This research evaluates the performance of GPT-3.5 and GPT-4 in the analysis of temporal data.
  Biases toward specific temporal relationships come to light, with GPT-3.5 demonstrating a preference for "AFTER" in the QA format for both implicit and explicit events, while GPT-4 leans towards "BEFORE".
- Is GPT-4 a reliable rater? Evaluating Consistency in GPT-4 Text Ratings (2023-08-03)
  This study investigates the consistency of feedback ratings generated by OpenAI's GPT-4.
  The model rated responses to tasks within the Higher Education subject domain of macroeconomics in terms of their content and style.
- Instructed to Bias: Instruction-Tuned Language Models Exhibit Emergent Cognitive Bias (2023-08-01)
  Recent studies show that instruction tuning (IT) and reinforcement learning from human feedback (RLHF) dramatically improve the abilities of large language models (LMs).
  In this work, we investigate the effect of IT and RLHF on decision making and reasoning in LMs.
  Our findings highlight the presence of these biases in various models from the GPT-3, Mistral, and T5 families.
- How is ChatGPT's behavior changing over time? (2023-07-18)
  We evaluate the March 2023 and June 2023 versions of GPT-3.5 and GPT-4.
  We find that the performance and behavior of both GPT-3.5 and GPT-4 can vary greatly over time.
- DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models (2023-06-20)
  This work proposes a comprehensive trustworthiness evaluation for large language models, with a focus on GPT-4 and GPT-3.5.
  We find that GPT models can be easily misled to generate toxic and biased outputs and to leak private information.
  Our work illustrates a comprehensive trustworthiness evaluation of GPT models and sheds light on their trustworthiness gaps.
- Is GPT-4 a Good Data Analyst? (2023-05-24)
  We consider GPT-4 as a data analyst performing end-to-end data analysis with databases from a wide range of domains.
  We design several task-specific evaluation metrics to systematically compare the performance of several professional human data analysts and GPT-4.
  Experimental results show that GPT-4 can achieve performance comparable to humans.
- The Self-Perception and Political Biases of ChatGPT (2023-04-14)
  This contribution analyzes the self-perception and political biases of OpenAI's Large Language Model ChatGPT.
  The political compass test revealed a bias towards progressive and libertarian views.
  Political questionnaires for the G7 member states indicated a bias towards progressive views but no significant bias between authoritarian and libertarian views.
- Humans in Humans Out: On GPT Converging Toward Common Sense in both Success and Failure (2023-03-30)
  GPT-3, GPT-3.5, and GPT-4 were trained on large quantities of human-generated text.
  We show that GPT-3 produced ETR-predicted outputs for 59% of these examples.
  Remarkably, the production of human-like fallacious judgments increased from 18% in GPT-3 to 33% in GPT-3.5 and 34% in GPT-4.
- Evaluating Psychological Safety of Large Language Models (2022-12-20)
  We designed unbiased prompts to evaluate the psychological safety of large language models (LLMs).
  We tested five different LLMs using two personality tests: the Short Dark Triad (SD-3) and the Big Five Inventory (BFI).
  Despite being instruction fine-tuned with safety metrics to reduce toxicity, InstructGPT, GPT-3.5, and GPT-4 still showed dark personality patterns.
  Fine-tuning Llama-2-chat-7B with responses from the BFI using direct preference optimization effectively reduced the psychological toxicity of the model.
This list is automatically generated from the titles and abstracts of the papers listed on this site.