The Extraordinary Failure of Complement Coercion Crowdsourcing
- URL: http://arxiv.org/abs/2010.05971v1
- Date: Mon, 12 Oct 2020 19:04:04 GMT
- Title: The Extraordinary Failure of Complement Coercion Crowdsourcing
- Authors: Yanai Elazar, Victoria Basmov, Shauli Ravfogel, Yoav Goldberg, Reut Tsarfaty
- Abstract summary: Crowdsourcing has eased and scaled up the collection of linguistic annotation in recent years.
We aim to collect annotated data for this phenomenon by reducing it to either of two known tasks: Explicit Completion and Natural Language Inference.
In both cases, crowdsourcing resulted in low agreement scores, even though we followed the same methodologies as in previous work.
- Score: 50.599433903377374
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Crowdsourcing has eased and scaled up the collection of linguistic annotation
in recent years. In this work, we follow known methodologies of collecting
labeled data for the complement coercion phenomenon. These are constructions
with an implied action -- e.g., "I started a new book I bought last week",
where the implied action is reading. We aim to collect annotated data for this
phenomenon by reducing it to either of two known tasks: Explicit Completion and
Natural Language Inference. However, in both cases, crowdsourcing resulted in
low agreement scores, even though we followed the same methodologies as in
previous work. Why does the same process fail to yield high agreement scores?
We specify our modeling schemes, highlight the differences with previous work
and provide some insights about the task and possible explanations for the
failure. We conclude that specific phenomena require tailored solutions, not
only in specialized algorithms, but also in data collection methods.
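
To make the NLI reduction concrete, each coerced sentence can be paired with a hypothesis that spells out a candidate implied action, and the crowd labels collected for such pairs can then be checked for inter-annotator agreement. The sketch below is illustrative only: the label set, the invented ratings, and the choice of Fleiss' kappa as the agreement statistic are assumptions, not details taken from the paper.

```python
from collections import Counter

# Hypothetical NLI framing of a complement coercion item (the premise is
# the paper's running example; the hypothesis and labels are invented).
premise = "I started a new book I bought last week."
hypothesis = "I started reading a new book I bought last week."

# Assumed ratings from 5 annotators on 3 such items -- toy data chosen
# to show the computation, not results from the paper.
ratings = [
    ["entailment"] * 4 + ["neutral"],
    ["entailment"] * 2 + ["neutral"] * 3,
    ["neutral"] * 3 + ["entailment", "contradiction"],
]

def fleiss_kappa(items, categories):
    """Fleiss' kappa for items that each carry the same number of ratings."""
    n = len(items[0])   # raters per item
    N = len(items)      # number of items
    counts = Counter(label for item in items for label in item)
    p = {c: counts[c] / (N * n) for c in categories}           # category shares
    P_i = [(sum(item.count(c) ** 2 for c in categories) - n) / (n * (n - 1))
           for item in items]                                  # per-item agreement
    P_bar = sum(P_i) / N
    P_e = sum(v ** 2 for v in p.values())                      # chance agreement
    return (P_bar - P_e) / (1 - P_e)

cats = ["entailment", "neutral", "contradiction"]
print(f"Fleiss' kappa: {fleiss_kappa(ratings, cats):.3f}")
```

Values well below the commonly cited 0.6-0.8 range would signal the kind of low agreement the authors report for both reductions.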
Related papers
- Annotator in the Loop: A Case Study of In-Depth Rater Engagement to Create a Bridging Benchmark Dataset [1.825224193230824]
We describe a novel, collaborative, and iterative annotator-in-the-loop methodology for annotation.
Our findings indicate that collaborative engagement with annotators can enhance annotation methods.
arXiv Detail & Related papers (2024-08-01T19:11:08Z)
- Conjunct Resolution in the Face of Verbal Omissions [51.220650412095665]
We propose a conjunct resolution task that operates directly on the text and makes use of a split-and-rephrase paradigm in order to recover the missing elements in the coordination structure.
We curate a large dataset, containing over 10K examples of naturally-occurring verbal omissions with crowd-sourced annotations.
We train various neural baselines for this task, and show that while our best method obtains decent performance, it leaves ample space for improvement.
arXiv Detail & Related papers (2023-05-26T08:44:02Z)
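
As a rough illustration of the split-and-rephrase paradigm in the entry above: a coordination with a verbal omission is rewritten into standalone clauses in which the omitted verb is made explicit. The heuristic below is a deliberately naive sketch with an invented example, not the learned method or data from that paper.

```python
def naive_resolve(first: str, second: str) -> list[str]:
    """Toy heuristic: copy the verb of the first conjunct into the second
    conjunct, where it was omitted. Real systems learn this mapping."""
    subj1, verb, *rest1 = first.split()
    subj2, *rest2 = second.split()
    return [" ".join([subj1, verb, *rest1]) + ".",
            " ".join([subj2, verb, *rest2]) + "."]

# "Avi ordered pasta, and Batya a salad." -> two explicit clauses.
print(naive_resolve("Avi ordered pasta", "Batya a salad"))
# ['Avi ordered pasta.', 'Batya ordered a salad.']
```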
- Explanation Selection Using Unlabeled Data for Chain-of-Thought Prompting [80.9896041501715]
Explanations that have not been "tuned" for a task, such as off-the-shelf explanations written by nonexperts, may lead to mediocre performance.
This paper tackles the problem of how to optimize explanation-infused prompts in a blackbox fashion.
arXiv Detail & Related papers (2023-02-09T18:02:34Z)
- Revisiting text decomposition methods for NLI-based factuality scoring of summaries [9.044665059626958]
We show that fine-grained decomposition is not always a winning strategy for factuality scoring.
We also show that small changes to previously proposed entailment-based scoring methods can result in better performance.
arXiv Detail & Related papers (2022-11-30T09:54:37Z)
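
A minimal sketch of the entailment-based scoring that the entry above builds on, assuming the Hugging Face transformers library and the public roberta-large-mnli checkpoint; the sentence-level decomposition shown here is the naive variant, and the texts are invented.

```python
from transformers import pipeline

# Off-the-shelf NLI model used as a factuality scorer.
nli = pipeline("text-classification", model="roberta-large-mnli")

source = ("The company reported record profits in 2019. "
          "Its CEO resigned the following year.")
summary_sentences = [
    "The company reported record profits in 2019.",
    "The CEO resigned in 2019.",  # inconsistent with the source
]

# Score each summary sentence (the decomposition unit) against the source.
for hyp in summary_sentences:
    result = nli([{"text": source, "text_pair": hyp}])[0]
    print(hyp, "->", result["label"], round(result["score"], 3))
```

Finer-grained decomposition replaces the sentence units with smaller spans; the paper's finding is that this is not always the winning strategy.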
- Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future [63.99570204416711]
We reimplement 18 methods for detecting potential annotation errors and evaluate them on 9 English datasets.
We define a uniform evaluation setup including a new formalization of the annotation error detection task.
We release our datasets and implementations in an easy-to-use and open source software package.
arXiv Detail & Related papers (2022-06-05T22:31:45Z)
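
One simple baseline in this family, sketched below with invented inputs, scores each instance by the probability a trained model assigns to its gold label and flags the lowest-scoring instances as candidate annotation errors; the function and data are illustrative, not one of the paper's 18 reimplemented methods.

```python
import numpy as np

def flag_suspects(gold_labels, probas, k=2):
    """Rank instances by model probability of the gold label; return the
    indices of the k least-supported (most suspicious) annotations."""
    gold_prob = probas[np.arange(len(gold_labels)), gold_labels]
    return np.argsort(gold_prob)[:k]

# Toy predictions over two classes for four instances.
probas = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.05, 0.95]])
gold = np.array([0, 0, 0, 1])            # instance 1's gold label looks off
print(flag_suspects(gold, probas))       # -> [1 2]
```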
- Ensemble Distillation Approaches for Grammatical Error Correction [18.81579562876076]
Ensemble distillation (EnD) and ensemble distribution distillation (EnDD) have been proposed to compress an ensemble into a single model.
This paper examines the application of both distillation approaches to a sequence prediction task, grammatical error correction (GEC).
GEC is, however, more challenging than the standard tasks investigated for distillation, as the prediction of any grammatical correction to a word depends strongly on both the input sequence and the output history generated so far.
arXiv Detail & Related papers (2020-11-24T15:00:45Z)
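
A generic formulation of the token-level ensemble distillation (EnD) objective, assuming PyTorch; the shapes, temperature, and loss form illustrate the standard technique rather than that paper's exact setup.

```python
import torch
import torch.nn.functional as F

def end_loss(student_logits, ensemble_logits, T=1.0):
    """KL(ensemble average || student), computed per output token.

    student_logits:  (batch, seq_len, vocab)
    ensemble_logits: (n_models, batch, seq_len, vocab)
    """
    # Average the ensemble members' predictive distributions per token.
    teacher = torch.softmax(ensemble_logits / T, dim=-1).mean(dim=0)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    # Standard distillation scaling by T^2 keeps gradients comparable.
    return F.kl_div(log_student, teacher, reduction="batchmean") * T * T

student = torch.randn(2, 7, 100)          # toy student logits
ensemble = torch.randn(4, 2, 7, 100)      # toy 4-model ensemble logits
print(end_loss(student, ensemble).item())
```

For GEC the per-token teacher distribution is conditioned on the generated prefix, which is exactly what makes this sequence setting harder than the classification tasks usually studied for distillation.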
- Evaluating Factuality in Generation with Dependency-level Entailment [57.5316011554622]
We propose a new formulation of entailment that decomposes it at the level of dependency arcs.
We show that our dependency arc entailment model trained on this data can identify factual inconsistencies in paraphrasing and summarization better than sentence-level methods.
arXiv Detail & Related papers (2020-10-12T06:43:10Z)
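
The decomposition step in the entry above can be pictured with an off-the-shelf parser: each dependency arc becomes a unit whose consistency with the source is judged separately. The sketch assumes spaCy with the en_core_web_sm model installed and omits the arc-level entailment classifier itself.

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes this model is installed
doc = nlp("The senator criticized the bill during the hearing.")

# Each (head, relation, dependent) triple is a candidate unit whose
# factual support in the source can be checked independently.
arcs = [(tok.head.text, tok.dep_, tok.text)
        for tok in doc if tok.dep_ != "ROOT"]
print(arcs)
```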
- Knowledge-Aware Procedural Text Understanding with Multi-Stage Training [110.93934567725826]
We focus on the task of procedural text understanding, which aims to comprehend procedural documents and track entities' states and locations during a process.
Two challenges, the difficulty of commonsense reasoning and data insufficiency, remain unsolved.
We propose a novel KnOwledge-Aware proceduraL text understAnding (KOALA) model, which effectively leverages multiple forms of external knowledge.
arXiv Detail & Related papers (2020-09-28T10:28:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.