Towards the Scalable Evaluation of Cooperativeness in Language Models
- URL: http://arxiv.org/abs/2303.13360v1
- Date: Thu, 16 Mar 2023 15:34:23 GMT
- Title: Towards the Scalable Evaluation of Cooperativeness in Language Models
- Authors: Alan Chan, Maxime Riché, Jesse Clifton
- Abstract summary: We aim to understand and shape the multi-agent behaviors of PLMs in a pro-social manner.
We generate scenarios with particular structures using both crowdworkers and a language model.
We find that instruct-tuned models tend to act in a way that could be perceived as cooperative when scaled up.
- Score: 1.7875811547963403
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is likely that AI systems driven by pre-trained language models (PLMs)
will increasingly be used to assist humans in high-stakes interactions with
other agents, such as negotiation or conflict resolution. Consistent with the
goals of Cooperative AI (Dafoe et al., 2020), we wish to understand and
shape the multi-agent behaviors of PLMs in a pro-social manner. An important
first step is the evaluation of model behaviour across diverse cooperation
problems. Since desired behaviour in an interaction depends upon precise
game-theoretic structure, we focus on generating scenarios with particular
structures using both crowdworkers and a language model. Our work proceeds as
follows. First, we discuss key methodological issues in the generation of
scenarios corresponding to particular game-theoretic structures. Second, we
employ both crowdworkers and a language model to generate such scenarios. We
find that the quality of generations tends to be mediocre in both cases. We
additionally get both crowdworkers and a language model to judge whether given
scenarios align with their intended game-theoretic structure, finding mixed
results depending on the game. Third, we provide a dataset of scenarios based on
our generated data. We provide both quantitative and qualitative evaluations of
UnifiedQA and GPT-3 on this dataset. We find that instruct-tuned models tend to
act in a way that could be perceived as cooperative when scaled up, while other
models seem to have flat scaling trends.
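The evaluation setup is only named in the abstract ("quantitative and qualitative evaluations of UnifiedQA and GPT-3"); as a minimal sketch of one plausible quantitative protocol, the snippet below scores which of two continuations of a scenario, a cooperative action versus a non-cooperative one, a causal language model assigns higher likelihood. The scenario text, the option wording, and the use of GPT-2 as a stand-in model are illustrative assumptions, not the paper's actual protocol.

```python
# Hedged sketch (not the paper's protocol): score which continuation of a
# scenario a causal LM prefers. GPT-2 is a stand-in for GPT-3/UnifiedQA.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def completion_logprob(prompt: str, completion: str) -> float:
    """Sum of token log-probabilities of `completion` given `prompt`.
    (Scoring from the prompt's token count onward is a standard
    approximation; tokenization at the boundary may shift by one token.)"""
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prompt + completion, return_tensors="pt").input_ids
    with torch.no_grad():
        log_probs = torch.log_softmax(model(full_ids).logits, dim=-1)
    total = 0.0
    for pos in range(prompt_len, full_ids.shape[1]):
        # Token at `pos` is predicted from position `pos - 1`.
        total += log_probs[0, pos - 1, full_ids[0, pos]].item()
    return total

# Invented scenario with a rough Prisoner's-Dilemma-like structure.
scenario = ("Two firms share a fishing ground. Each season, each firm chooses "
            "to limit its catch or to overfish. You are Firm A. You choose to")
coop = completion_logprob(scenario, " limit your catch.")
defect = completion_logprob(scenario, " overfish.")
print("cooperative choice" if coop > defect else "non-cooperative choice")
```

Aggregating the cooperative-choice rate over a dataset of such scenarios, at several model sizes, would yield the kind of scaling trend the abstract describes.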
Related papers
- Towards "Differential AI Psychology" and in-context Value-driven Statement Alignment with Moral Foundations Theory [0.0]
This work investigates the alignment between personalized language models and survey participants on a Moral Foundation questionnaire.
We adapt text-to-text models to different political personas and survey the questionnaire repetitively to generate a synthetic population of persona and model combinations.
Our findings indicate that adapted models struggle to represent the survey-based assessment of political ideologies.
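As a rough illustration of the setup this summary describes, the sketch below conditions a text-to-text model on a persona and repeatedly poses questionnaire items; the persona wording, the two sample items, and the Flan-T5 stand-in are assumptions rather than the paper's materials.

```python
# Hedged sketch: persona-conditioned repeated surveying of a text-to-text
# model. Personas and items are invented, not the paper's questionnaire.
from transformers import pipeline

generator = pipeline("text2text-generation", model="google/flan-t5-small")

personas = ["a progressive voter", "a conservative voter"]
items = [
    "Compassion for those who are suffering is the most crucial virtue.",
    "Respect for authority is something all children need to learn.",
]

responses = {}
for persona in personas:
    for item in items:
        prompt = (f"You are {persona}. Rate your agreement with the statement "
                  f"on a scale from 1 (strongly disagree) to 6 (strongly "
                  f"agree): '{item}' Answer with a single number.")
        answer = generator(prompt, max_new_tokens=4)[0]["generated_text"]
        responses[(persona, item)] = answer.strip()

# The resulting table of (persona, item) -> rating forms one member of the
# synthetic population of persona/model combinations.
print(responses)
```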
arXiv Detail & Related papers (2024-08-21T08:20:41Z)
- Foundational Models Defining a New Era in Vision: A Survey and Outlook [151.49434496615427]
Vision systems to see and reason about the compositional nature of visual scenes are fundamental to understanding our world.
The models learned to bridge the gap between such modalities coupled with large-scale training data facilitate contextual reasoning, generalization, and prompt capabilities at test time.
The output of such models can be modified through human-provided prompts without retraining, e.g., segmenting a particular object by providing a bounding box, having interactive dialogues by asking questions about an image or video scene or manipulating the robot's behavior through language instructions.
arXiv Detail & Related papers (2023-07-25T17:59:18Z)
- Feature Interactions Reveal Linguistic Structure in Language Models [2.0178765779788495]
We study feature interactions in the context of feature attribution methods for post-hoc interpretability.
We work out a grey box methodology, in which we train models to perfection on a formal language classification task.
We show that under specific configurations, some methods are indeed able to uncover the grammatical rules acquired by a model.
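One common way to measure a pairwise feature interaction post hoc is a second-order occlusion difference; the hedged sketch below applies that idea to a toy balanced-parentheses scorer standing in for the trained formal-language classifier, so the masking token and the scorer are illustrative assumptions, not the paper's method.

```python
# Hedged sketch: estimate the interaction between input positions i and j by
# occlusion: f(x) - f(x \ i) - f(x \ j) + f(x \ {i, j}).
from typing import Callable, List

MASK = "_"

def occlude(tokens: List[str], positions: List[int]) -> List[str]:
    return [MASK if k in positions else t for k, t in enumerate(tokens)]

def interaction(f: Callable[[List[str]], float],
                tokens: List[str], i: int, j: int) -> float:
    return (f(tokens) - f(occlude(tokens, [i])) - f(occlude(tokens, [j]))
            + f(occlude(tokens, [i, j])))

# Toy "classifier": scores whether parentheses are balanced, so matched
# brackets should interact strongly.
def toy_score(tokens: List[str]) -> float:
    depth = 0
    for t in tokens:
        depth += {"(": 1, ")": -1}.get(t, 0)
        if depth < 0:
            return 0.0
    return 1.0 if depth == 0 else 0.0

x = list("(())")
print(interaction(toy_score, x, 0, 3))  # matched outer pair -> nonzero
```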
arXiv Detail & Related papers (2023-06-21T11:24:41Z)
- Structured Like a Language Model: Analysing AI as an Automated Subject [0.0]
We argue the intentional fictional projection of subjectivity onto large language models can yield an alternate frame through which AI behaviour can be analysed.
We trace a brief history of language models, culminating in the releases of systems that realise state-of-the-art natural language processing performance.
We conclude that critical media methods and psychoanalytic theory together offer a productive frame for grasping the powerful new capacities of AI-driven language systems.
arXiv Detail & Related papers (2022-12-08T21:58:43Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
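A minimal way to probe this interaction is to ask the same question with and without a counterfactual passage; in the sketch below, the passage, the question, and the Flan-T5 stand-in are illustrative assumptions, not the paper's benchmark.

```python
# Hedged sketch: does the model answer from the given context or from its
# pretraining knowledge? The counterfactual passage is invented.
from transformers import pipeline

qa = pipeline("text2text-generation", model="google/flan-t5-small")

counterfactual_context = "The Eiffel Tower is located in Rome."
question = "Where is the Eiffel Tower located?"

with_context = qa(f"Context: {counterfactual_context}\nQuestion: {question}",
                  max_new_tokens=8)[0]["generated_text"]
without_context = qa(f"Question: {question}",
                     max_new_tokens=8)[0]["generated_text"]

# A context-controllable model should say "Rome" with the passage and
# "Paris" without it.
print(with_context, "|", without_context)
```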
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- Unifying Language Learning Paradigms [96.35981503087567]
We present a unified framework for pre-training models that are universally effective across datasets and setups.
We show how different pre-training objectives can be cast as one another and how interpolating between different objectives can be effective.
Our model also achieves strong results at in-context learning, outperforming 175B GPT-3 on zero-shot SuperGLUE and tripling the performance of T5-XXL on one-shot summarization.
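As a hedged illustration of casting objectives as one another, the sketch below implements a single span-corruption routine whose span-length and corruption-rate settings move it between short-span masking and long-span denoising; the parameter values and sentinel format are assumptions, not the paper's exact mixture.

```python
# Hedged sketch: one parameterized denoiser; different (span length,
# corruption rate) settings approximate different pre-training objectives.
import random

def span_corrupt(tokens, mean_span_len, corruption_rate,
                 rng=random.Random(0)):
    n_corrupt = max(1, int(len(tokens) * corruption_rate))
    inputs, targets, i, sentinel = [], [], 0, 0
    while i < len(tokens):
        if n_corrupt > 0 and rng.random() < corruption_rate:
            span = max(1, int(rng.gauss(mean_span_len, 1)))
            span = min(span, n_corrupt, len(tokens) - i)
            inputs.append(f"<extra_id_{sentinel}>")       # sentinel in input
            targets += [f"<extra_id_{sentinel}>"] + tokens[i:i + span]
            sentinel += 1
            n_corrupt -= span
            i += span
        else:
            inputs.append(tokens[i])
            i += 1
    return inputs, targets

toks = "the quick brown fox jumps over the lazy dog".split()
print(span_corrupt(toks, mean_span_len=1, corruption_rate=0.15))  # short spans
print(span_corrupt(toks, mean_span_len=4, corruption_rate=0.5))   # long spans
```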
arXiv Detail & Related papers (2022-05-10T19:32:20Z)
- A Generative Language Model for Few-shot Aspect-Based Sentiment Analysis [90.24921443175514]
We focus on aspect-based sentiment analysis, which involves extracting aspect term, category, and predicting their corresponding polarities.
We propose to reformulate the extraction and prediction tasks into the sequence generation task, using a generative language model with unidirectional attention.
Our approach outperforms the previous state-of-the-art (based on BERT) on average performance by a large margin in few-shot and full-shot settings.
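A sketch of the reformulation might serialize the triples as plain text for a sequence-to-sequence model; the prompt format and Flan-T5 stand-in below are assumptions (the paper trains a generative LM with unidirectional attention rather than prompting zero-shot).

```python
# Hedged sketch: ABSA extraction recast as text generation. The
# serialization format is invented, not the paper's scheme.
from transformers import pipeline

absa = pipeline("text2text-generation", model="google/flan-t5-small")

review = "The pasta was delicious but the service was painfully slow."
prompt = ("Extract aspect term, category, and sentiment polarity as "
          "'term | category | polarity' entries separated by ';': " + review)

print(absa(prompt, max_new_tokens=40)[0]["generated_text"])
# Supervised fine-tuning would teach the model to emit targets such as
# "pasta | food | positive ; service | service | negative".
```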
arXiv Detail & Related papers (2022-04-11T18:31:53Z)
- Towards Generalized Models for Task-oriented Dialogue Modeling on Spoken Conversations [22.894541507068933]
This paper presents our approach to build generalized models for the Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations Challenge of DSTC-10.
We employ extensive data augmentation strategies on written data, including artificial error injection and round-trip text-speech transformation.
Our approach ranks third on the objective evaluation and second on the final official human evaluation.
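Of the two augmentations named, artificial error injection is the easier to sketch: perturb written text with character-level edits that loosely mimic ASR noise. The error types and rates below are illustrative assumptions; the round-trip text-speech step is not reproduced.

```python
# Hedged sketch: character-level error injection over written training text.
import random

def inject_errors(text: str, rate: float = 0.05,
                  rng=random.Random(0)) -> str:
    out = []
    for ch in text:
        r = rng.random()
        if r < rate / 3:
            continue                                          # deletion
        elif r < 2 * rate / 3:
            out.append(rng.choice("abcdefghijklmnopqrstuvwxyz "))  # substitution
        elif r < rate:
            out += [ch, ch]                                   # duplication
        else:
            out.append(ch)
    return "".join(out)

print(inject_errors("i would like to book a table for two tonight"))
```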
arXiv Detail & Related papers (2022-03-08T12:26:57Z)
- Probing Task-Oriented Dialogue Representation from Language Models [106.02947285212132]
This paper investigates pre-trained language models to find out which model intrinsically carries the most informative representation for task-oriented dialogue tasks.
We fine-tune a feed-forward layer as the classifier probe on top of a fixed pre-trained language model with annotated labels in a supervised way.
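The probing recipe itself is simple enough to sketch: freeze the pre-trained model, pool a representation, and train only a small feed-forward classifier on labeled dialogue data. The BERT backbone, [CLS] pooling, and three-way label set below are stand-in assumptions.

```python
# Hedged sketch: a feed-forward probe on top of a frozen pre-trained LM.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
backbone = AutoModel.from_pretrained("bert-base-uncased")
for p in backbone.parameters():
    p.requires_grad = False  # the PLM stays fixed; only the probe learns

probe = nn.Sequential(nn.Linear(768, 128), nn.ReLU(), nn.Linear(128, 3))
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)

def train_step(utterance: str, label: int) -> float:
    enc = tokenizer(utterance, return_tensors="pt")
    with torch.no_grad():
        hidden = backbone(**enc).last_hidden_state[:, 0]  # [CLS] embedding
    loss = nn.functional.cross_entropy(probe(hidden), torch.tensor([label]))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

print(train_step("i want to book a flight to boston", label=0))
```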
arXiv Detail & Related papers (2020-10-26T21:34:39Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
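Dynamic Blocking aims to keep the output from copying the source's surface form; a rough sketch of the idea is below, where whenever the decoder has just emitted a token that also occurs in the source, the source's following token is blocked at the next step. GPT-2 as the paraphraser, greedy decoding, and the prompt format are simplifying assumptions, not the paper's trained pipeline.

```python
# Hedged sketch of a Dynamic-Blocking-style decoding loop.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

source = "the quick brown fox jumps over the lazy dog"
src_ids = tok(source).input_ids

ids = tok(f"Paraphrase: {source}\nParaphrase:",
          return_tensors="pt").input_ids

for _ in range(15):
    with torch.no_grad():
        logits = model(ids).logits[0, -1]
    last = ids[0, -1].item()
    # Wherever the just-emitted token occurs in the source, block the
    # source's successor token so the output diverges from the source.
    for i, s in enumerate(src_ids[:-1]):
        if s == last:
            logits[src_ids[i + 1]] = float("-inf")
    ids = torch.cat([ids, logits.argmax().view(1, 1)], dim=1)

print(tok.decode(ids[0]))
```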
arXiv Detail & Related papers (2020-10-24T11:55:28Z)