Who is GPT-3? An Exploration of Personality, Values and Demographics
- URL: http://arxiv.org/abs/2209.14338v1
- Date: Wed, 28 Sep 2022 18:07:02 GMT
- Title: Who is GPT-3? An Exploration of Personality, Values and Demographics
- Authors: Marilù Miotto, Nicola Rossberg, Bennett Kleinberg
- Abstract summary: Language models such as GPT-3 have caused a furore in the research community.
This paper answers a related question: who is GPT-3?
- Score: 0.4791233143264229
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Language models such as GPT-3 have caused a furore in the research community.
Some studies found that GPT-3 has some creative abilities and makes mistakes
that are on par with human behaviour. This paper answers a related question:
who is GPT-3? We administered two validated measurement tools to GPT-3 to
assess its personality, the values it holds and its self-reported demographics.
Our results show that GPT-3 scores similarly to human samples in terms of
personality and, when provided with a model response memory, in terms of the
values it holds. We provide the first psychological assessment of the GPT-3
model and thereby add to our understanding of its behaviour. We
close with suggestions for future research that moves social science closer to
language models and vice versa.
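
The following is a minimal sketch (not the authors' released code) of how such an assessment could be run: each questionnaire item is sent to GPT-3 through the legacy OpenAI completions API, and a simple "model response memory" replays earlier item/answer pairs in every prompt. The item texts, scale wording, and engine name are illustrative assumptions, not the paper's exact materials.

```python
# Minimal sketch, not the authors' code: administer questionnaire items to
# GPT-3 via the legacy OpenAI completions API (openai-python < 1.0), keeping
# a "model response memory" that replays earlier answers in each prompt.
# Item texts and scale wording are illustrative placeholders.
import openai

openai.api_key = "sk-..."  # your API key

SCALE = "Answer with a number from 1 (disagree strongly) to 5 (agree strongly)."
ITEMS = [  # placeholder items in the style of a personality inventory
    "I see myself as someone who is talkative.",
    "I see myself as someone who tends to find fault with others.",
]

memory = []  # earlier (item, answer) pairs, replayed as context
for item in ITEMS:
    context = "\n".join(f"Statement: {q}\nAnswer: {a}" for q, a in memory)
    prompt = f"{SCALE}\n\n{context}\nStatement: {item}\nAnswer:"
    response = openai.Completion.create(
        engine="text-davinci-002",  # assumed GPT-3 engine of that era
        prompt=prompt,
        max_tokens=4,
        temperature=0,  # deterministic answers aid test-retest stability
    )
    answer = response["choices"][0]["text"].strip()
    memory.append((item, answer))
    print(f"{item} -> {answer}")
```

With temperature set to 0, repeated administrations return the same answers, which simplifies comparing the model's scores against human norms.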
Related papers
- Behind the Screen: Investigating ChatGPT's Dark Personality Traits and Conspiracy Beliefs [0.0]
This paper analyzes the dark personality traits and conspiracy beliefs of GPT-3.5 and GPT-4.
Dark personality traits and conspiracy beliefs were not particularly pronounced in either model.
arXiv Detail & Related papers (2024-02-06T16:03:57Z) - Cognitive Effects in Large Language Models [14.808777775761753]
Large Language Models (LLMs) have received enormous attention over the past year and are now used by hundreds of millions of people every day.
We tested one of these models (GPT-3) on a range of cognitive effects, which are systematic patterns that are usually found in human cognitive tasks.
Specifically, we show that the priming, distance, SNARC, and size congruity effects are present in GPT-3, while the anchoring effect is absent.
arXiv Detail & Related papers (2023-08-28T06:30:33Z) - DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models [92.6951708781736]
This work proposes a comprehensive trustworthiness evaluation for large language models with a focus on GPT-4 and GPT-3.5.
We find that GPT models can be easily misled to generate toxic and biased outputs and leak private information.
Our work illustrates a comprehensive trustworthiness evaluation of GPT models and sheds light on the trustworthiness gaps.
arXiv Detail & Related papers (2023-06-20T17:24:23Z) - Reliability Check: An Analysis of GPT-3's Response to Sensitive Topics and Prompt Wording [0.0]
We analyze what confuses GPT-3: how the model responds to certain sensitive topics and what effects the prompt wording has on the model response.
We find that GPT-3 correctly disagrees with obvious Conspiracies and Stereotypes but makes mistakes with common Misconceptions and Controversies.
The model responses are inconsistent across prompts and settings, highlighting GPT-3's unreliability.
arXiv Detail & Related papers (2023-06-09T19:07:31Z) - Can GPT-3 Perform Statutory Reasoning? [37.66486350122862]
We explore the capabilities of the most capable GPT-3 model, text-davinci-003, on an established statutory-reasoning dataset called SARA.
We find GPT-3 performs poorly at answering straightforward questions about simple synthetic statutes.
arXiv Detail & Related papers (2023-02-13T04:56:11Z) - Evaluating Psychological Safety of Large Language Models [72.88260608425949]
We designed unbiased prompts to evaluate the psychological safety of large language models (LLMs).
We tested five different LLMs using two personality tests: the Short Dark Triad (SD-3) and the Big Five Inventory (BFI); a minimal scoring sketch for such inventories follows this list.
Despite being instruction fine-tuned with safety metrics to reduce toxicity, InstructGPT, GPT-3.5, and GPT-4 still showed dark personality patterns.
Fine-tuning Llama-2-chat-7B on BFI responses with direct preference optimization could effectively reduce the model's psychological toxicity.
arXiv Detail & Related papers (2022-12-20T18:45:07Z) - Prompting GPT-3 To Be Reliable [117.23966502293796]
This work decomposes reliability into four facets: generalizability, fairness, calibration, and factuality.
We find that GPT-3 outperforms smaller-scale supervised models by large margins on all these facets.
arXiv Detail & Related papers (2022-10-17T14:52:39Z) - News Summarization and Evaluation in the Era of GPT-3 [73.48220043216087]
We study how GPT-3 compares against fine-tuned models trained on large summarization datasets.
We show that not only do humans overwhelmingly prefer GPT-3 summaries, prompted using only a task description, but these also do not suffer from common dataset-specific issues such as poor factuality.
arXiv Detail & Related papers (2022-09-26T01:04:52Z) - Elaboration-Generating Commonsense Question Answering at Scale [77.96137534751445]
In question answering requiring common sense, language models (e.g., GPT-3) have been used to generate text expressing background knowledge.
We finetune smaller language models to generate useful intermediate context, referred to here as elaborations.
Our framework alternates between updating two language models -- an elaboration generator and an answer predictor -- allowing each to influence the other.
arXiv Detail & Related papers (2022-09-02T18:32:09Z) - Using cognitive psychology to understand GPT-3 [0.0]
We study GPT-3, a recent large language model, using tools from cognitive psychology.
We assess GPT-3's decision-making, information search, deliberation, and causal reasoning abilities.
arXiv Detail & Related papers (2022-06-21T20:06:03Z) - Memory-assisted prompt editing to improve GPT-3 after deployment [55.62352349324132]
We show how a (simulated) user can interactively teach a deployed GPT-3, doubling its accuracy on basic lexical tasks.
Our simple idea is a first step towards strengthening deployed models, potentially broadening their utility.
arXiv Detail & Related papers (2022-01-16T10:11:37Z)
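
Several of the papers above administer Likert-style inventories such as the BFI or SD-3 and aggregate raw answers into trait scores. The sketch below shows one conventional way to do that scoring (reverse-keying negatively worded items, then averaging per trait); the item keys and trait assignments are illustrative assumptions, not any paper's published scoring key.

```python
# Minimal scoring sketch (illustrative, not any paper's released key):
# convert raw 1..5 Likert answers from a personality inventory into
# per-trait scores, reverse-keying negatively worded items.

# Each item maps to (trait, reverse_keyed). Reverse-keyed items are flipped
# so that a higher score always means "more of the trait".
ITEM_KEY = {
    "talkative": ("extraversion", False),
    "reserved": ("extraversion", True),
    "finds_fault": ("agreeableness", True),
    "helpful": ("agreeableness", False),
}

def score(answers: dict[str, int], scale_max: int = 5) -> dict[str, float]:
    """Average raw 1..scale_max answers into a mean score per trait."""
    per_trait: dict[str, list[int]] = {}
    for item, raw in answers.items():
        trait, reverse = ITEM_KEY[item]
        value = (scale_max + 1 - raw) if reverse else raw
        per_trait.setdefault(trait, []).append(value)
    return {trait: sum(v) / len(v) for trait, v in per_trait.items()}

print(score({"talkative": 4, "reserved": 2, "finds_fault": 1, "helpful": 5}))
# -> {'extraversion': 4.0, 'agreeableness': 5.0}
```

Numeric answers parsed from model completions (as in the earlier sketch) can be fed straight into such a scorer and compared against published human norms.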
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.