Related papers: AI Text-to-Behavior: A Study In Steerability

URL: http://arxiv.org/abs/2308.07326v1
Date: Mon, 7 Aug 2023 18:14:24 GMT
Title: AI Text-to-Behavior: A Study In Steerability
Authors: David Noever and Sam Hyams
Abstract summary: The research explores the steerability of Large Language Models (LLMs). We quantitatively gauged the model's responsiveness to tailored prompts using a behavioral psychology framework called OCEAN. Our findings underscore GPT's versatility and ability to discern and adapt to nuanced instructions.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The research explores the steerability of Large Language Models (LLMs), particularly OpenAI's ChatGPT iterations. By employing a behavioral psychology framework called OCEAN (Openness, Conscientiousness, Extroversion, Agreeableness, Neuroticism), we quantitatively gauged the model's responsiveness to tailored prompts. When asked to generate text mimicking an extroverted personality, OCEAN scored the language alignment to that behavioral trait. In our analysis, while "openness" presented linguistic ambiguity, "conscientiousness" and "neuroticism" were distinctly evoked in the OCEAN framework, with "extroversion" and "agreeableness" showcasing a notable overlap yet distinct separation from other traits. Our findings underscore GPT's versatility and ability to discern and adapt to nuanced instructions. Furthermore, historical figure simulations highlighted the LLM's capacity to internalize and project instructible personas, precisely replicating their philosophies and dialogic styles. However, the rapid advancements in LLM capabilities and the opaque nature of some training techniques make metric proposals degrade rapidly. Our research emphasizes a quantitative role to describe steerability in LLMs, presenting both its promise and areas for further refinement in aligning its progress to human intentions.

Related papers

What Makes a Good Natural Language Prompt? [72.3282960118995]
We conduct a meta-analysis surveying more than 150 prompting-related papers from leading NLP and AI conferences from 2022 to 2025. We propose a property- and human-centric framework for evaluating prompt quality, encompassing 21 properties categorized into six dimensions. We then empirically explore multi-property prompt enhancements in reasoning tasks, observing that single-property enhancements often have the greatest impact.
arXiv Detail & Related papers (2025-06-07T23:19:27Z)
Prompt Sentiment: The Catalyst for LLM Change [0.29998889086656577]
This study systematically examines how sentiment variations in prompts affect large language models (LLMs). Our analysis spans six AI-driven applications, including content generation, conversational AI, legal and financial analysis, healthcare AI, creative writing, and technical documentation. Our findings reveal that prompt sentiment significantly influences model responses, with negative prompts often reducing factual accuracy and amplifying bias, while positive prompts tend to increase verbosity and sentiment propagation.
arXiv Detail & Related papers (2025-03-14T06:25:21Z)
Sparse Auto-Encoder Interprets Linguistic Features in Large Language Models [40.12943080113246]
We present a systematic and comprehensive causal investigation using sparse auto-encoders (SAEs). We extract a wide range of linguistic features from six dimensions. We introduce two indices, Feature Representation Confidence (FRC) and Feature Intervention Confidence (FIC), to measure the ability of linguistic features to capture and control linguistic phenomena.
arXiv Detail & Related papers (2025-02-27T18:16:47Z)
Beyond Profile: From Surface-Level Facts to Deep Persona Simulation in LLMs [50.0874045899661]
We introduce CharacterBot, a model designed to replicate both the linguistic patterns and distinctive thought patterns as manifested in the textual works of a character. Using Lu Xun, a renowned Chinese writer, as a case study, we propose four training tasks derived from his 17 essay collections. These include a pre-training task focused on mastering external linguistic structures and knowledge, as well as three fine-tuning tasks. We evaluate CharacterBot on three tasks for linguistic accuracy and opinion comprehension, demonstrating that it significantly outperforms the baselines on our adapted metrics.
arXiv Detail & Related papers (2025-02-18T16:11:54Z)
Large Language Models as Neurolinguistic Subjects: Discrepancy in Performance and Competence for Form and Meaning [49.60849499134362]
This study investigates the linguistic understanding of Large Language Models (LLMs) regarding signifier (form) and signified (meaning). We introduce a neurolinguistic approach, utilizing a novel method that combines minimal pair and diagnostic probing to analyze activation patterns across model layers. We found: (1) Psycholinguistic and neurolinguistic methods reveal that language performance and competence are distinct; (2) Direct probability measurement may not accurately assess linguistic competence; and (3) Instruction tuning does not change competence much but improves performance.
arXiv Detail & Related papers (2024-11-12T04:16:44Z)
LMLPA: Language Model Linguistic Personality Assessment [11.599282127259736]
Large Language Models (LLMs) are increasingly used in everyday life and research. Measuring the personality of a given LLM is currently a challenge. This paper introduces the Language Model Linguistic Personality Assessment (LMLPA), a system designed to evaluate the linguistic personalities of LLMs.
arXiv Detail & Related papers (2024-10-23T07:48:51Z)
Exploring the LLM Journey from Cognition to Expression with Linear Representations [10.92882688742428]
This paper presents an in-depth examination of the evolution and interplay of cognitive and expressive capabilities in large language models (LLMs). We define and explore the model's cognitive and expressive capabilities through linear representations across three critical phases: Pretraining, Supervised Fine-Tuning (SFT), and Reinforcement Learning from Human Feedback (RLHF). Our findings unveil a sequential development pattern, where cognitive abilities are largely established during Pretraining, whereas expressive abilities predominantly advance during SFT and RLHF.
arXiv Detail & Related papers (2024-05-27T08:57:04Z)
Rethinking ChatGPT's Success: Usability and Cognitive Behaviors Enabled by Auto-regressive LLMs' Prompting [5.344199202349884]
We analyze the structure of modalities within the two types of Large Language Models and six task-specific channels during deployment. We examine the stimulation of diverse cognitive behaviors in LLMs through the adoption of free-form text and verbal contexts.
arXiv Detail & Related papers (2024-05-17T00:19:41Z)
Can large language models understand uncommon meanings of common words? [30.527834781076546]
Large language models (LLMs) have shown significant advancements across diverse natural language understanding (NLU) tasks. Yet, without widely acknowledged testing mechanisms, it remains unclear whether LLMs are parrots or genuinely comprehend the world. This paper presents an innovative construction of a Lexical Semantic dataset with novel evaluation metrics.
arXiv Detail & Related papers (2024-05-09T12:58:22Z)
Tuning-Free Accountable Intervention for LLM Deployment -- A Metacognitive Approach [55.613461060997004]
Large Language Models (LLMs) have catalyzed transformative advances across a spectrum of natural language processing tasks. We propose an innovative metacognitive approach, dubbed CLEAR, to equip LLMs with capabilities for self-aware error identification and correction.
arXiv Detail & Related papers (2024-03-08T19:18:53Z)
Towards Uncovering How Large Language Model Works: An Explainability Perspective [38.07611356855978]
Large language models (LLMs) have led to breakthroughs in language tasks, yet the internal mechanisms that enable their remarkable generalization and reasoning abilities remain opaque. This paper aims to uncover the mechanisms underlying LLM functionality through the lens of explainability.
arXiv Detail & Related papers (2024-02-16T13:46:06Z)
Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents [80.5213198675411]
Large language models (LLMs) have dramatically enhanced the field of language intelligence. LLMs leverage the intriguing chain-of-thought (CoT) reasoning techniques, obliging them to formulate intermediate steps en route to deriving an answer. Recent research endeavors have extended CoT reasoning methodologies to nurture the development of autonomous language agents.
arXiv Detail & Related papers (2023-11-20T14:30:55Z)
Are Large Language Models Really Robust to Word-Level Perturbations? [68.60618778027694]
We propose a novel rational evaluation approach that leverages pre-trained reward models as diagnostic tools. Longer conversations manifest the comprehensive grasp of language models in terms of their proficiency in understanding questions. Our results demonstrate that LLMs frequently exhibit vulnerability to word-level perturbations that are commonplace in daily language usage.
arXiv Detail & Related papers (2023-09-20T09:23:46Z)
Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners [75.85554779782048]
Large Language Models (LLMs) have excited the natural language and machine learning community over recent years. Despite numerous successful applications, the underlying mechanism of such in-context capabilities still remains unclear. In this work, we hypothesize that the learned semantics of language tokens do most of the heavy lifting during the reasoning process.
arXiv Detail & Related papers (2023-05-24T07:33:34Z)
ChatABL: Abductive Learning via Natural Language Interaction with ChatGPT [72.83383437501577]
Large language models (LLMs) have recently demonstrated significant potential in mathematical abilities. LLMs currently have difficulty in bridging perception, language understanding and reasoning capabilities. This paper presents a novel method for integrating LLMs into the abductive learning framework.
arXiv Detail & Related papers (2023-04-21T16:23:47Z)
A Latent-Variable Model for Intrinsic Probing [93.62808331764072]
We propose a novel latent-variable formulation for constructing intrinsic probes. We find empirical evidence that pre-trained representations develop a cross-lingually entangled notion of morphosyntax.
arXiv Detail & Related papers (2022-01-20T15:01:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.