QA-Align: Representing Cross-Text Content Overlap by Aligning
Question-Answer Propositions
- URL: http://arxiv.org/abs/2109.12655v1
- Date: Sun, 26 Sep 2021 17:19:48 GMT
- Title: QA-Align: Representing Cross-Text Content Overlap by Aligning
Question-Answer Propositions
- Authors: Daniela Brook Weiss, Paul Roit, Ayal Klein, Ori Ernst, Ido Dagan
- Abstract summary: We propose to align predicate-argument relations across texts, providing a scaffold for information consolidation.
Our setting exploits QA-SRL, utilizing question-answer pairs to capture predicate-argument relations.
Analyses show that our new task is semantically challenging, capturing content overlap beyond lexical similarity.
- Score: 12.264795812337153
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-text applications, such as multi-document summarization, are typically
required to model redundancies across related texts. Current methods
confronting consolidation struggle to fuse overlapping information. In order to
explicitly represent content overlap, we propose to align predicate-argument
relations across texts, providing a potential scaffold for information
consolidation. Going beyond clustering coreferring mentions, we model overlap
at the propositional level, rather than merely detecting shared referents. Our
setting exploits QA-SRL, utilizing
question-answer pairs to capture predicate-argument relations, facilitating
laymen annotation of cross-text alignments. We employ crowd-workers for
constructing a dataset of QA-based alignments, and present a baseline QA
alignment model trained over our dataset. Analyses show that our new task is
semantically challenging, capturing content overlap beyond lexical similarity
and that it complements cross-document coreference with proposition-level links,
offering potential use for downstream tasks.
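To make the QA-based alignment setting concrete, here is a minimal sketch (our own illustration, not the authors' code or dataset format): each QA-SRL proposition is a (predicate, question, answer) triple, and a naive lexical baseline aligns triples across two texts when their predicates match and their answer spans overlap sufficiently. The `QAPair` class, the Jaccard threshold, and the `align` function are all hypothetical simplifications.

```python
# Minimal sketch of QA-SRL proposition alignment (illustrative only).
from dataclasses import dataclass


@dataclass(frozen=True)
class QAPair:
    predicate: str   # the predicate the question targets, e.g. "announced"
    question: str    # e.g. "Who announced something?"
    answer: str      # the argument span, e.g. "the company"


def align(qas_a, qas_b, threshold=0.5):
    """Naive baseline: align QA pairs whose predicates match and whose
    answer spans share enough tokens (Jaccard overlap >= threshold)."""
    alignments = []
    for qa1 in qas_a:
        for qa2 in qas_b:
            if qa1.predicate != qa2.predicate:
                continue
            t1 = set(qa1.answer.lower().split())
            t2 = set(qa2.answer.lower().split())
            if t1 and t2 and len(t1 & t2) / len(t1 | t2) >= threshold:
                alignments.append((qa1, qa2))
    return alignments
```

The paper's actual baseline is a trained model; as the abstract notes, the task requires capturing overlap beyond exactly this kind of lexical similarity, so a threshold baseline like this would miss paraphrased arguments.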
Related papers
- Localizing Factual Inconsistencies in Attributable Text Generation [91.981439746404]
We introduce QASemConsistency, a new formalism for localizing factual inconsistencies in attributable text generation.
We first demonstrate the effectiveness of the QASemConsistency methodology for human annotation.
We then implement several methods for automatically detecting localized factual inconsistencies.
arXiv Detail & Related papers (2024-10-09T22:53:48Z)
- Leveraging Inter-Chunk Interactions for Enhanced Retrieval in Large Language Model-Based Question Answering [12.60063463163226]
IIER captures the internal connections between document chunks by considering three types of interactions: structural, keyword, and semantic.
It identifies multiple seed nodes based on the target question and iteratively searches for relevant chunks to gather supporting evidence.
It refines the context and reasoning chain, aiding the large language model in reasoning and answer generation.
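The chunk-graph idea summarized above can be sketched as follows. This is our own schematic (class and method names are hypothetical, not IIER's implementation): document chunks are nodes, typed edges record structural, keyword, and semantic interactions, and evidence gathering expands iteratively outward from seed chunks.

```python
# Schematic chunk graph with typed edges and iterative evidence expansion.
from collections import defaultdict, deque


class ChunkGraph:
    def __init__(self):
        # chunk_id -> set of (neighbor_id, edge_type)
        self.edges = defaultdict(set)

    def add_edge(self, a, b, edge_type):
        # edge_type is one of "structural", "keyword", "semantic"
        self.edges[a].add((b, edge_type))
        self.edges[b].add((a, edge_type))

    def gather_evidence(self, seeds, max_hops=2):
        """Breadth-first expansion from seed chunks to collect supporting
        context, bounded by a hop limit."""
        seen = set(seeds)
        frontier = deque((s, 0) for s in seeds)
        while frontier:
            node, depth = frontier.popleft()
            if depth == max_hops:
                continue
            for neighbor, _etype in self.edges[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    frontier.append((neighbor, depth + 1))
        return seen
```

In the described system, the seeds would be chosen per target question and the collected chunks would form the reasoning context handed to the LLM; the hop-bounded expansion here only illustrates that retrieval step.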
arXiv Detail & Related papers (2024-08-06T02:39:55Z) - The Power of Summary-Source Alignments [62.76959473193149]
Multi-document summarization (MDS) is a challenging task, often decomposed into subtasks of salience and redundancy detection.
Alignment of corresponding sentences between a reference summary and its source documents has been leveraged to generate training data.
This paper proposes extending the summary-source alignment framework by applying it at the more fine-grained proposition span level.
arXiv Detail & Related papers (2024-06-02T19:35:19Z)
- Revisiting Sentence Union Generation as a Testbed for Text Consolidation [17.594941316215838]
We propose revisiting the sentence union generation task as an effective, well-defined testbed for assessing text consolidation capabilities.
We present a refined annotation methodology and tools for crowdsourcing sentence unions, and create the largest union dataset to date.
We then propose a comprehensive evaluation protocol for union generation, including both human and automatic evaluation.
arXiv Detail & Related papers (2023-05-24T22:34:01Z)
- Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering [49.85790367128085]
We pre-train a generic multi-document model with a novel cross-document question-answering pre-training objective.
This novel multi-document QA formulation directs the model to better recover cross-text informational relations.
Unlike prior multi-document models that focus on either classification or summarization tasks, our pre-training objective formulation enables the model to perform tasks that involve both short text generation and long text generation.
arXiv Detail & Related papers (2023-05-24T17:48:40Z)
- QASem Parsing: Text-to-text Modeling of QA-based Semantics [19.42681342441062]
We consider three QA-based semantic tasks, namely, QA-SRL, QANom and QADiscourse.
We release the first unified QASem parsing tool, practical for downstream applications.
arXiv Detail & Related papers (2022-05-23T15:56:07Z)
- Extending Multi-Text Sentence Fusion Resources via Pyramid Annotations [12.394777121890925]
This paper revisits and substantially extends previous dataset creation efforts.
We show that our extended version uses more representative texts for multi-document tasks and provides a larger and more diverse training set.
arXiv Detail & Related papers (2021-10-09T09:15:05Z)
- Relation Clustering in Narrative Knowledge Graphs [71.98234178455398]
Relational sentences in the original text are embedded (with SBERT) and clustered in order to merge together semantically similar relations.
Preliminary tests show that such clustering might successfully detect similar relations, and provide a valuable preprocessing for semi-supervised approaches.
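The embed-and-cluster step described above can be illustrated with a small sketch. This is our own simplification: the paper uses SBERT sentence embeddings, whereas here a trivial bag-of-words vectorizer stands in so the example is self-contained, and the greedy cosine-threshold grouping is a hypothetical stand-in for the actual clustering algorithm.

```python
# Illustrative embed-and-cluster sketch for merging similar relations.
import math
from collections import Counter


def embed(sentence):
    # Stand-in for an SBERT embedding: a bag-of-words count vector.
    return Counter(sentence.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def cluster(sentences, threshold=0.6):
    """Greedily assign each sentence to the first cluster whose
    representative is similar enough, else start a new cluster."""
    clusters = []  # list of lists of sentences
    for s in sentences:
        v = embed(s)
        for c in clusters:
            if cosine(v, embed(c[0])) >= threshold:
                c.append(s)
                break
        else:
            clusters.append([s])
    return clusters
```

With real SBERT vectors, paraphrases with little lexical overlap would also land in the same cluster, which is the property the summarized paper relies on for merging semantically similar relations.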
arXiv Detail & Related papers (2020-11-27T10:43:04Z)
- Pairwise Representation Learning for Event Coreference [73.10563168692667]
We develop a Pairwise Representation Learning (PairwiseRL) scheme for the event mention pairs.
Our representation supports a finer, structured representation of the text snippet to facilitate encoding events and their arguments.
We show that PairwiseRL, despite its simplicity, outperforms the prior state-of-the-art event coreference systems on both cross-document and within-document event coreference benchmarks.
arXiv Detail & Related papers (2020-10-24T06:55:52Z)
- Query Focused Multi-Document Summarization with Distant Supervision [88.39032981994535]
Existing work relies heavily on retrieval-style methods for estimating the relevance between queries and text segments.
We propose a coarse-to-fine modeling framework which introduces separate modules for estimating whether segments are relevant to the query.
We demonstrate that our framework outperforms strong comparison systems on standard QFS benchmarks.
arXiv Detail & Related papers (2020-04-06T22:35:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.