Conjunct Resolution in the Face of Verbal Omissions
- URL: http://arxiv.org/abs/2305.16740v1
- Date: Fri, 26 May 2023 08:44:02 GMT
- Title: Conjunct Resolution in the Face of Verbal Omissions
- Authors: Royi Rassin, Yoav Goldberg, Reut Tsarfaty
- Abstract summary: We propose a conjunct resolution task that operates directly on the text and uses a split-and-rephrase paradigm to recover the missing elements in the coordination structure.
We curate a large dataset, containing over 10K examples of naturally-occurring verbal omissions with crowd-sourced annotations.
We train various neural baselines for this task, and show that while our best method obtains decent performance, it leaves ample space for improvement.
- Score: 51.220650412095665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Verbal omissions are complex syntactic phenomena in VP coordination
structures. They occur when verbs and (some of) their arguments are omitted
from subsequent clauses after being explicitly stated in an initial clause.
Recovering these omitted elements is necessary for accurate interpretation of
the sentence, and while humans easily and intuitively fill in the missing
information, state-of-the-art models continue to struggle with this task.
Previous work is limited to small-scale datasets, synthetic data creation
methods, and resolution methods at the dependency-graph level. In this work
we propose a conjunct resolution task that operates directly on the text and
makes use of a split-and-rephrase paradigm in order to recover the missing
elements in the coordination structure. To this end, we first formulate a
pragmatic framework of verbal omissions which describes the different types of
omissions, and develop an automatic scalable collection method. Based on this
method, we curate a large dataset, containing over 10K examples of
naturally-occurring verbal omissions with crowd-sourced annotations of the
resolved conjuncts. We train various neural baselines for this task, and show
that while our best method obtains decent performance, it leaves ample space
for improvement. We propose our dataset, metrics and models as a starting point
for future research on this topic.
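To make the task format concrete, here is a minimal toy sketch of split-and-rephrase conjunct resolution. The example sentence and the naive verb-copying heuristic are invented for illustration only; they are not the paper's model or data.

```python
# A toy illustration of the split-and-rephrase output format, NOT the
# paper's system. The sentence and the naive heuristic (split on " and ",
# copy the first conjunct's verb into the verbless second clause) are
# invented purely to show what conjunct resolution asks a model to produce.

def resolve_gapping(sentence: str, verb: str) -> list[str]:
    """Split a VP coordination with a gapped verb into two full clauses."""
    first, _, second = sentence.rstrip(".").partition(" and ")
    subject, _, rest = second.partition(" ")
    return [first + ".", f"{subject} {verb} {rest}."]

print(resolve_gapping("Dana bought a novel and Lee a cookbook.", "bought"))
# ['Dana bought a novel.', 'Lee bought a cookbook.']
```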
Related papers
- CAST: Corpus-Aware Self-similarity Enhanced Topic modelling [16.562349140796115]
We introduce CAST: Corpus-Aware Self-similarity Enhanced Topic modelling, a novel topic modelling method.
We find self-similarity to be an effective metric to prevent functional words from acting as candidate topic words.
Our approach significantly enhances the coherence and diversity of generated topics, as well as the topic model's ability to handle noisy data.
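As a rough illustration, the sketch below assumes self-similarity is computed as the mean pairwise cosine similarity of a word's contextual embeddings across its occurrences; the random vectors and the 0.5 cutoff are placeholders, not CAST's actual encoder or threshold.

```python
# A hedged sketch of a self-similarity filter for candidate topic words.
# Assumption: self-similarity = mean pairwise cosine similarity of a word's
# contextual embeddings across occurrences. Random vectors and the 0.5
# threshold are placeholders, not CAST's encoder or cutoff.
import numpy as np

def self_similarity(vectors: np.ndarray) -> float:
    """Mean pairwise cosine similarity over one word's contextual vectors."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = unit @ unit.T
    n = len(vectors)
    return float((sims.sum() - n) / (n * (n - 1)))  # drop the diagonal

rng = np.random.default_rng(0)
# In practice each row would come from a contextual encoder, one per occurrence.
contexts = {w: rng.normal(size=(8, 32)) for w in ["economy", "the", "market"]}
candidates = [w for w, v in contexts.items() if self_similarity(v) > 0.5]
```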
arXiv Detail & Related papers (2024-10-19T15:27:11Z)
- Conversational Semantic Parsing using Dynamic Context Graphs [68.72121830563906]
We consider the task of conversational semantic parsing over general purpose knowledge graphs (KGs) with millions of entities and thousands of relation types.
We focus on models which are capable of interactively mapping user utterances into executable logical forms.
arXiv Detail & Related papers (2023-05-04T16:04:41Z)
- Revisiting text decomposition methods for NLI-based factuality scoring of summaries [9.044665059626958]
We show that fine-grained decomposition is not always a winning strategy for factuality scoring.
We also show that small changes to previously proposed entailment-based scoring methods can result in better performance.
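The sketch below illustrates the granularity choice at issue: scoring the whole summary versus scoring per-sentence units. `nli_entail` is a stand-in for any NLI model returning P(entailment); the substring check is a dummy so the example runs without model weights.

```python
# A sketch of entailment-based factuality scoring at two granularities.
# `nli_entail` is a placeholder for a real NLI model's entailment probability.

def nli_entail(premise: str, hypothesis: str) -> float:
    return 1.0 if hypothesis.rstrip(".") in premise else 0.0  # dummy stand-in

def factuality(source: str, summary: str, decompose: bool) -> float:
    """Average entailment over summary units (whole summary vs. sentences)."""
    units = summary.split(". ") if decompose else [summary]
    return sum(nli_entail(source, u) for u in units) / len(units)

src = "The council approved the budget. The mayor praised the decision."
hyp = "The council approved the budget. The mayor resigned."
print(factuality(src, hyp, decompose=False))  # 0.0: whole summary fails
print(factuality(src, hyp, decompose=True))   # 0.5: decomposition localizes the error
```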
arXiv Detail & Related papers (2022-11-30T09:54:37Z)
- Compositional Semantic Parsing with Large Language Models [27.627684573915147]
We identify challenges in more realistic semantic parsing tasks with larger vocabularies.
Our best method is based on least-to-most prompting.
We expect similar efforts will lead to new results in other tasks and domains.
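A schematic of the least-to-most control flow follows: decompose a question into easier subquestions, solve them in order, and feed each answer back into the context. In the actual method the model produces the decomposition itself; here the subquestions are supplied by hand, and `llm` is an echo stub standing in for any completion API.

```python
# A schematic of least-to-most prompting, not a faithful reproduction of the
# paper's prompts. `llm` is a placeholder model call; subquestions are given
# by hand rather than generated, purely to show the control flow.

def llm(prompt: str) -> str:
    return f"<model answer to: {prompt.splitlines()[-1]}>"  # stand-in call

def least_to_most(question: str, subquestions: list[str]) -> str:
    context = "Question: " + question
    for sub in subquestions:  # easiest subproblem first
        answer = llm(context + "\n" + sub)
        context += "\n" + sub + "\n" + answer  # accumulate solved steps
    return llm(context + "\nTherefore, " + question)

print(least_to_most(
    "How many trips carry 9 boxes at 2 per trip?",
    ["How many full trips carry 8 boxes?", "Is one extra trip needed?"],
))
```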
arXiv Detail & Related papers (2022-09-29T17:58:28Z)
- Leveraging Natural Supervision for Language Representation Learning and Generation [8.083109555490475]
We describe three lines of work that seek to improve the training and evaluation of neural models using naturally-occurring supervision.
We first investigate self-supervised training losses to help enhance the performance of pretrained language models for various NLP tasks.
We propose a framework that uses paraphrase pairs to disentangle semantics and syntax in sentence representations.
arXiv Detail & Related papers (2022-07-21T17:26:03Z)
- Lexically-constrained Text Generation through Commonsense Knowledge Extraction and Injection [62.071938098215085]
We focus on the CommonGen benchmark, wherein the aim is to generate a plausible sentence for a given set of input concepts.
We propose strategies for enhancing the semantic correctness of the generated text.
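To illustrate the lexical constraint itself, here is a minimal coverage check in the CommonGen style: every input concept must surface in the generated sentence. The crude suffix-stripping "stemmer" is an invented stand-in for real morphological matching, not the paper's method.

```python
# A minimal sketch of a CommonGen-style lexical constraint check. The
# suffix-stripping stemmer below is a toy placeholder for proper
# lemmatization; it is not part of the paper's approach.

def covers_concepts(sentence: str, concepts: set[str]) -> bool:
    def stem(w: str) -> str:
        for suf in ("ing", "ed", "es", "s"):
            if w.endswith(suf) and len(w) > len(suf) + 2:
                return w[: -len(suf)]
        return w
    tokens = {stem(w.strip(".,").lower()) for w in sentence.split()}
    return all(stem(c) in tokens for c in concepts)

print(covers_concepts("A dog is catching a frisbee in the park.",
                      {"dog", "catch", "frisbee", "park"}))  # True
```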
arXiv Detail & Related papers (2020-12-19T23:23:40Z)
- The Extraordinary Failure of Complement Coercion Crowdsourcing [50.599433903377374]
Crowdsourcing has eased and scaled up the collection of linguistic annotation in recent years.
We aim to collect annotated data for this phenomenon by reducing it to either of two known tasks: Explicit Completion and Natural Language Inference.
In both cases, crowdsourcing resulted in low agreement scores, even though we followed the same methodologies as in previous work.
arXiv Detail & Related papers (2020-10-12T19:04:04Z)
- Semantically Driven Sentence Fusion: Modeling and Evaluation [27.599227950466442]
Sentence fusion is the task of joining related sentences into coherent text.
Current training and evaluation schemes for this task are based on single reference ground-truths.
We show that this hinders models from robustly capturing the semantic relationship between input sentences.
arXiv Detail & Related papers (2020-10-06T10:06:01Z)
- Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary, typically selected before training and permanently fixed afterwards, affects its size.
We propose a fully compositional output embedding layer for language models.
To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
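The sketch below conveys the general idea of a vocabulary-independent output layer: each word's output embedding is composed on the fly from hashed character trigrams, so the parameter count depends on the hash-table size rather than the vocabulary. The hashing trick and mean pooling are generic choices for illustration, not necessarily the paper's exact architecture.

```python
# A hedged sketch of a compositional, vocabulary-independent output
# embedding. Hashed character trigrams + mean pooling are illustrative
# choices; the paper's actual composition function may differ.
import numpy as np

N_BUCKETS, DIM = 4096, 64
rng = np.random.default_rng(1)
trigram_table = rng.normal(scale=0.1, size=(N_BUCKETS, DIM))  # fixed-size parameters

def word_embedding(word: str) -> np.ndarray:
    """Compose an embedding from a word's character trigrams."""
    padded = "#" + word + "#"
    grams = [padded[i:i + 3] for i in range(len(padded) - 2)]
    rows = [hash(g) % N_BUCKETS for g in grams]  # hashing trick
    return trigram_table[rows].mean(axis=0)

# Any word, even one never seen in training, gets a scorable embedding:
hidden_state = rng.normal(size=DIM)
logit = word_embedding("unfathomable") @ hidden_state
```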
arXiv Detail & Related papers (2020-09-24T07:21:14Z)
- Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift in how we build neural extractive summarization systems.
We formulate the extractive summarization task as a semantic text matching problem.
We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1).
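The matching formulation can be sketched as follows: score whole candidate summaries against the document in a shared embedding space and keep the best match. `encode` is a stand-in for a real sentence encoder; bag-of-words counts are used here only so the example runs without model weights.

```python
# A sketch of summarization-as-matching: rank candidate extracts by semantic
# similarity to the full document. The bag-of-words `encode` is a placeholder
# for a learned encoder, not the paper's model.
from collections import Counter
from itertools import combinations
import math

def encode(text: str) -> Counter:
    return Counter(text.lower().split())  # placeholder encoder

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    return dot / (math.sqrt(sum(v * v for v in a.values())) *
                  math.sqrt(sum(v * v for v in b.values())))

def best_extract(sentences: list[str], k: int) -> tuple[str, ...]:
    """Pick the k-sentence candidate whose embedding best matches the document."""
    doc_vec = encode(" ".join(sentences))
    return max(combinations(sentences, k),
               key=lambda cand: cosine(encode(" ".join(cand)), doc_vec))

doc = ["Storms hit the coast.", "Power was lost for days.", "Markets were calm."]
print(best_extract(doc, k=2))
```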
arXiv Detail & Related papers (2020-04-19T08:27:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.