The Curious Case of Control
- URL: http://arxiv.org/abs/2205.12113v1
- Date: Tue, 24 May 2022 14:45:16 GMT
- Title: The Curious Case of Control
- Authors: Elias Stengel-Eskin and Benjamin Van Durme
- Abstract summary: Children make systematic errors on subject control sentences even after they have reached near-adult competence.
We find that models can be categorized by behavior into three separate groups, with broad differences between the groups.
We examine to what degree the models are sensitive to prompting with agent-patient information, finding that raising the salience of agent and patient relations results in significant changes in the outputs of most models.
- Score: 37.28245521206576
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Children acquiring English make systematic errors on subject control
sentences even after they have reached near-adult competence (C. Chomsky,
1969), possibly due to heuristics based on semantic roles (Maratsos, 1974).
Given the advanced fluency of large generative language models, we ask whether
model outputs are consistent with these heuristics, and to what degree
different models are consistent with each other. We find that models can be
categorized by behavior into three separate groups, with broad differences
between the groups. The outputs of models in the largest group are consistent
with positional heuristics that succeed on subject control but fail on object
control. This result is surprising, given that object control is orders of
magnitude more frequent in the text data used to train such models. We examine
to what degree the models are sensitive to prompting with agent-patient
information, finding that raising the salience of agent and patient relations
results in significant changes in the outputs of most models. Based on this
observation, we leverage an existing dataset of semantic proto-role annotations
(White et al., 2020) to explore the connections between control and labeling
event participants with properties typically associated with agents and
patients.
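The subject/object control contrast and the positional heuristics discussed in the abstract can be made concrete with a small sketch (illustrative Python, not the paper's code; the verbs, names, and heuristic functions are assumptions for exposition):

```python
# Contrast two positional heuristics against the gold "controller" (the
# participant who performs the embedded action) for a subject-control verb
# ("promised") and an object-control verb ("told").

CONTROL_VERBS = {
    "promised": "subject",  # subject control: the matrix subject acts
    "told": "object",       # object control: the matrix object acts
}

def gold_controller(subject, obj, verb):
    """Who actually performs the embedded action, per the verb's control type."""
    return subject if CONTROL_VERBS[verb] == "subject" else obj

def first_np_heuristic(subject, obj, verb):
    """Always pick the first noun phrase (the matrix subject)."""
    return subject

def nearest_np_heuristic(subject, obj, verb):
    """Minimal Distance Principle: pick the NP closest to the infinitive."""
    return obj

for subj, obj, verb in [("Mary", "John", "promised"), ("Mary", "John", "told")]:
    sentence = f"{subj} {verb} {obj} to leave."
    print(sentence,
          "| gold:", gold_controller(subj, obj, verb),
          "| first-NP:", first_np_heuristic(subj, obj, verb),
          "| nearest-NP:", nearest_np_heuristic(subj, obj, verb))
```

A first-NP heuristic succeeds on subject control ("promised") but fails on object control ("told"), matching the behavior the abstract attributes to the largest group of models; the nearest-NP heuristic fails in the opposite direction.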
Related papers
- Corpus Considerations for Annotator Modeling and Scaling [9.263562546969695]
We show that the commonly used user token model consistently outperforms more complex models.
Our findings shed light on the relationship between corpus statistics and annotator modeling performance.
arXiv Detail & Related papers (2024-04-02T22:27:24Z)
- Stubborn Lexical Bias in Data and Models [50.79738900885665]
We use a new statistical method to examine whether spurious patterns in data appear in models trained on the data.
We apply an optimization approach to *reweight* the training data, reducing thousands of spurious correlations.
Surprisingly, though this method can successfully reduce lexical biases in the training data, we still find strong evidence of corresponding bias in the trained models.
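The reweighting idea can be illustrated with a toy sketch (a hedged illustration, not the paper's optimization approach; the data and the single-word bias are invented): weight examples containing a biased word so that its label distribution becomes uniform.

```python
# Reweight training examples so one word ("great") stops predicting the label.
from collections import Counter

data = [("great movie", 1), ("great plot twist", 1), ("great mess", 0),
        ("dull movie", 0), ("dull but great cast", 1), ("boring", 0)]

word = "great"
with_word = [y for x, y in data if word in x.split()]
counts = Counter(with_word)        # Counter({1: 3, 0: 1}): spurious correlation
total = len(with_word)             # 4 examples contain the word

# Inverse-frequency weights so both labels contribute equal mass.
weights = []
for x, y in data:
    if word in x.split():
        weights.append(total / (2 * counts[y]))
    else:
        weights.append(1.0)

pos = sum(w for (x, y), w in zip(data, weights) if word in x.split() and y == 1)
neg = sum(w for (x, y), w in zip(data, weights) if word in x.split() and y == 0)
print(round(pos, 6), round(neg, 6))  # 2.0 2.0
```

After reweighting, the word's weighted co-occurrence with each label is balanced; the paper's surprising finding is that removing such correlations from the data does not necessarily remove the corresponding bias from trained models.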
arXiv Detail & Related papers (2023-06-03T20:12:27Z)
- Even Small Correlation and Diversity Shifts Pose Dataset-Bias Issues [19.4921353136871]
We study two types of distribution shifts: diversity shifts, which occur when test samples exhibit patterns unseen during training, and correlation shifts, which occur when test data present a different correlation between seen invariant and spurious features.
We propose an integrated protocol to analyze both types of shifts using datasets where they co-exist in a controllable manner.
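A correlation shift of the kind described can be made controllable with a synthetic generator (a minimal sketch in the spirit of the protocol; `rho` and the feature names are illustrative assumptions):

```python
# A spurious feature agrees with the label with probability rho at train time
# and with a different probability at test time, while the invariant feature
# stays predictive throughout.
import random

def sample(n, rho, seed):
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        invariant = label                               # truly predictive
        spurious = label if rng.random() < rho else 1 - label
        data.append((invariant, spurious, label))
    return data

train = sample(10_000, rho=0.95, seed=0)  # spurious feature mostly agrees
test = sample(10_000, rho=0.05, seed=1)   # correlation reversed at test time

agree = lambda d: sum(s == y for _, s, y in d) / len(d)
print(round(agree(train), 2), round(agree(test), 2))
```

A model that latches onto the spurious feature does well on `train` but fails on `test`, which is the failure mode a controllable protocol like this is designed to expose.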
arXiv Detail & Related papers (2023-05-09T23:40:23Z)
- Training Trajectories of Language Models Across Scales [99.38721327771208]
Scaling up language models has led to unprecedented performance gains.
How do language models of different sizes learn during pre-training?
Why do larger language models demonstrate more desirable behaviors?
arXiv Detail & Related papers (2022-12-19T19:16:29Z)
- Does Your Model Classify Entities Reasonably? Diagnosing and Mitigating Spurious Correlations in Entity Typing [29.820473012776283]
Existing entity typing models are subject to the problem of spurious correlations.
We identify six types of existing model biases, including mention-context bias, lexical overlapping bias, named entity bias, pronoun bias, dependency bias, and overgeneralization bias.
Augmenting the original training set with bias-free counterparts forces models to fully comprehend the sentences.
arXiv Detail & Related papers (2022-05-25T10:34:22Z)
- Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use copulas, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents.
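The separation of marginals from dependence structure that copulas provide can be sketched with a toy Gaussian copula (an illustrative example, not the paper's model; the two "agent" marginals are arbitrary choices):

```python
# Sample correlated uniforms via a Gaussian copula, then push each through a
# different marginal, so dependence and per-agent behavior are set separately.
import math
import random

def gaussian_copula_pair(rho, rng):
    # Correlated standard normals via a 2x2 Cholesky factor.
    z1 = rng.gauss(0, 1)
    z2 = rho * z1 + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
    # Map to uniforms with the standard normal CDF.
    phi = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return phi(z1), phi(z2)

def sample_actions(rho, n, seed=0):
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        u1, u2 = gaussian_copula_pair(rho, rng)
        # Marginals chosen independently of the dependence structure:
        # agent 1's action is exponential, agent 2's is uniform on [0, 10].
        out.append((-math.log(1 - u1), 10 * u2))
    return out

for a1, a2 in sample_actions(rho=0.9, n=5):
    print(round(a1, 2), round(a2, 2))
```

With `rho` near 1 the two agents' actions co-move despite their different marginal distributions, which is the coordination structure the copula isolates.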
arXiv Detail & Related papers (2021-07-10T03:49:41Z)
- On the Efficacy of Adversarial Data Collection for Question Answering: Results from a Large-Scale Randomized Study [65.17429512679695]
In adversarial data collection (ADC), a human workforce interacts with a model in real time, attempting to produce examples that elicit incorrect predictions.
Despite ADC's intuitive appeal, it remains unclear when training on adversarial datasets produces more robust models.
arXiv Detail & Related papers (2021-06-02T00:48:33Z)
- A comprehensive comparative evaluation and analysis of Distributional Semantic Models [61.41800660636555]
We perform a comprehensive evaluation of type distributional vectors, either produced by static DSMs or obtained by averaging the contextualized vectors generated by BERT.
The results show that the alleged superiority of predict-based models is more apparent than real, and surely not ubiquitous.
We borrow from cognitive neuroscience the methodology of Representational Similarity Analysis (RSA) to inspect the semantic spaces generated by distributional models.
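RSA as borrowed here compares the pairwise-similarity geometry of two spaces rather than individual vectors; a minimal sketch with toy vectors (not real DSM output; the word list and embeddings are invented):

```python
# Build each space's pairwise cosine-similarity profile over the same words,
# then correlate the two profiles: a high score means the spaces agree on
# relational structure even if their dimensionalities differ.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def sim_vector(space, words):
    # Flattened upper triangle of the pairwise similarity matrix.
    return [cosine(space[a], space[b])
            for i, a in enumerate(words) for b in words[i + 1:]]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

words = ["cat", "dog", "car"]
space_a = {"cat": [1.0, 0.1], "dog": [0.9, 0.2], "car": [0.1, 1.0]}
space_b = {"cat": [0.8, 0.0, 0.1], "dog": [0.7, 0.1, 0.2], "car": [0.0, 0.9, 0.8]}

rsa_score = pearson(sim_vector(space_a, words), sim_vector(space_b, words))
print(round(rsa_score, 2))  # near 1: the two spaces share relational structure
```

The same comparison applies unchanged to a static DSM space and an averaged-BERT space, since RSA only needs each space's internal similarity profile.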
arXiv Detail & Related papers (2021-05-20T15:18:06Z)
- Recommendations for Bayesian hierarchical model specifications for case-control studies in mental health [0.0]
Researchers must choose whether to assume all subjects are drawn from a common population, or to model them as deriving from separate populations.
We ran systematic simulations on synthetic multi-group behavioural data from a commonly used bandit task.
We found that fitting groups separately provided the most accurate and robust inference across all conditions.
arXiv Detail & Related papers (2020-11-03T14:19:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.