Explaining Interactions Between Text Spans
- URL: http://arxiv.org/abs/2310.13506v1
- Date: Fri, 20 Oct 2023 13:52:37 GMT
- Title: Explaining Interactions Between Text Spans
- Authors: Sagnik Ray Choudhury, Pepa Atanasova, Isabelle Augenstein
- Abstract summary: Reasoning over spans of tokens from different parts of the input is essential for natural language understanding.
We introduce SpanEx, a dataset of human span interaction explanations for two NLU tasks: NLI and FC.
We then investigate the decision-making processes of multiple fine-tuned large language models in terms of the employed connections between spans.
- Score: 50.70253702800355
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Reasoning over spans of tokens from different parts of the input is essential
for natural language understanding (NLU) tasks such as fact-checking (FC),
machine reading comprehension (MRC) or natural language inference (NLI).
However, existing highlight-based explanations primarily focus on identifying
individual important tokens or interactions only between adjacent tokens or
tuples of tokens. Most notably, there is a lack of annotations capturing the
human decision-making process w.r.t. the necessary interactions for informed
decision-making in such tasks. To bridge this gap, we introduce SpanEx, a
multi-annotator dataset of human span interaction explanations for two NLU
tasks: NLI and FC. We then investigate the decision-making processes of
multiple fine-tuned large language models in terms of the employed connections
between spans in separate parts of the input and compare them to the human
reasoning processes. Finally, we present a novel community-detection-based
unsupervised method to extract such interaction explanations from a model's
inner workings.
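
A minimal sketch of the community-detection idea is given below in Python, assuming head-averaged attention weights serve as the token-interaction signal; the 0.1 threshold, the symmetrization, and the use of networkx's greedy modularity algorithm are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch: span-interaction extraction via community detection over a token
# graph. The attention-based interaction signal, the threshold, and the
# modularity algorithm are illustrative assumptions.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def extract_span_interactions(attn, premise_len, threshold=0.1):
    """attn: (seq_len, seq_len) attention matrix, averaged over heads/layers.
    Tokens [0, premise_len) form the premise; the rest form the hypothesis."""
    sym = (attn + attn.T) / 2  # treat interaction as symmetric
    g = nx.Graph()
    seq_len = sym.shape[0]
    for i in range(seq_len):
        for j in range(i + 1, seq_len):
            if sym[i, j] > threshold:
                g.add_edge(i, j, weight=float(sym[i, j]))
    if g.number_of_edges() == 0:
        return []
    interactions = []
    for community in greedy_modularity_communities(g, weight="weight"):
        premise_span = sorted(t for t in community if t < premise_len)
        hypothesis_span = sorted(t for t in community if t >= premise_len)
        # Only communities bridging both input parts count as interactions.
        if premise_span and hypothesis_span:
            interactions.append((premise_span, hypothesis_span))
    return interactions
```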
Related papers
- Narrative Action Evaluation with Prompt-Guided Multimodal Interaction [60.281405999483]
Narrative action evaluation (NAE) aims to generate professional commentary that evaluates the execution of an action.
Compared to conventional score-based action quality assessment, NAE is more challenging because it requires both narrative flexibility and evaluation rigor.
We propose a prompt-guided multimodal interaction framework to facilitate the interaction between different modalities of information.
arXiv Detail & Related papers (2024-04-22T17:55:07Z)
- mCL-NER: Cross-Lingual Named Entity Recognition via Multi-view Contrastive Learning [54.523172171533645]
Cross-lingual named entity recognition (CrossNER) faces challenges stemming from uneven performance due to the scarcity of multilingual corpora.
We propose Multi-view Contrastive Learning for Cross-lingual Named Entity Recognition (mCL-NER); a sketch of the contrastive objective follows this entry.
Our experiments on the XTREME benchmark, spanning 40 languages, demonstrate the superiority of mCL-NER over prior data-driven and model-based approaches.
arXiv Detail & Related papers (2023-08-17T16:02:29Z)
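The multi-view contrastive idea can be illustrated with a generic InfoNCE-style objective over parallel sentence representations; the pooling, the in-batch negatives, and the temperature below are assumptions rather than mCL-NER's exact loss.

```python
import torch
import torch.nn.functional as F

def info_nce(src_repr, tgt_repr, temperature=0.1):
    """src_repr, tgt_repr: (batch, d) pooled representations of parallel
    source/target sentences. Matched rows are positives; every other row
    in the batch serves as an in-batch negative."""
    src = F.normalize(src_repr, dim=-1)
    tgt = F.normalize(tgt_repr, dim=-1)
    logits = src @ tgt.T / temperature  # (batch, batch) similarity matrix
    labels = torch.arange(src.size(0), device=src.device)
    # Symmetric loss: align source-to-target and target-to-source views.
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.T, labels)) / 2
```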
- Coreference-aware Double-channel Attention Network for Multi-party Dialogue Reading Comprehension [7.353227696624305]
We tackle Multi-party Dialogue Reading Comprehension (MDRC), an extractive reading comprehension task grounded in dialogues among multiple interlocutors.
We propose a coreference-aware attention modeling method to strengthen the reasoning ability.
arXiv Detail & Related papers (2023-05-15T05:01:29Z)
- Bridging the Gap between Language Models and Cross-Lingual Sequence Labeling [101.74165219364264]
Large-scale cross-lingual pre-trained language models (xPLMs) have shown effectiveness in cross-lingual sequence labeling (xSL) tasks.
Despite this success, we draw an empirical observation that there is a training objective gap between the pre-training and fine-tuning stages.
In this paper, we first design a pre-training task tailored for xSL named Cross-lingual Language Informative Span Masking (CLISM) to eliminate the objective gap.
Second, we present ContrAstive-Consistency Regularization (CACR), which utilizes contrastive learning to encourage consistency between the representations of input parallel sequences; a generic span-masking sketch follows this entry.
arXiv Detail & Related papers (2022-04-11T15:55:20Z)
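Informative span masking can be illustrated with a generic span-corruption routine: contiguous spans are replaced with a mask token so the model must recover them from context (and, in the xSL setting, from the aligned translation). The span length, masking rate, and [MASK] token below are illustrative assumptions, not CLISM's published recipe.

```python
import random

MASK = "[MASK]"

def mask_spans(tokens, mask_rate=0.15, max_span=3, seed=None):
    """Corrupt roughly mask_rate of the tokens with contiguous [MASK] spans.
    Returns the corrupted sequence plus gold (position, token) targets."""
    rng = random.Random(seed)
    tokens = list(tokens)
    budget = max(1, int(len(tokens) * mask_rate))
    targets, masked, attempts = [], set(), 0
    while budget > 0 and attempts < 100:
        attempts += 1
        span = rng.randint(1, min(max_span, budget))
        start = rng.randrange(0, len(tokens) - span + 1)
        if any(p in masked for p in range(start, start + span)):
            continue  # skip placements that overlap an existing span
        for p in range(start, start + span):
            targets.append((p, tokens[p]))
            tokens[p] = MASK
            masked.add(p)
        budget -= span
    return tokens, targets
```

For example, `mask_spans("the cat sat on the mat".split(), seed=0)` returns the corrupted token list together with the positions and gold tokens the model is trained to predict.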
- DEIM: An effective deep encoding and interaction model for sentence matching [0.0]
We propose a sentence matching method based on deep encoding and interaction to extract deep semantic information.
In the encoder layer, we refer to the information of the other sentence while encoding a single sentence, and then use an algorithm to fuse the two sources of information.
In the interaction layer, we use a bidirectional attention mechanism and a self-attention mechanism to obtain deep semantic information; a sketch of the bidirectional step follows this entry.
arXiv Detail & Related papers (2022-03-20T07:59:42Z)
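The bidirectional attention step can be sketched as a pair of softmax-weighted cross-readings of the two sentences' contextual embeddings; dot-product scoring and the absence of a fusion step are simplifying assumptions, not DEIM's exact interaction layer.

```python
import torch
import torch.nn.functional as F

def bidirectional_attention(a, b):
    """a: (len_a, d) and b: (len_b, d) contextual token embeddings of the
    two sentences. Each side attends over the other and returns the
    other-sentence summary aligned to its own tokens."""
    scores = a @ b.T                              # (len_a, len_b) similarities
    a_attends_b = F.softmax(scores, dim=1) @ b    # (len_a, d)
    b_attends_a = F.softmax(scores.T, dim=1) @ a  # (len_b, d)
    return a_attends_b, b_attends_a
```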
- Explaining Neural Network Predictions on Sentence Pairs via Learning Word-Group Masks [21.16662651409811]
We propose the Group Mask (GMASK) method to implicitly detect word correlations by grouping correlated words from the input text pair together; a simplified ablation-style stand-in is sketched after this entry.
The proposed method is evaluated with two different model architectures (decomposable attention model and BERT) across four datasets.
arXiv Detail & Related papers (2021-04-09T17:14:34Z)
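GMASK learns its group masks; the sketch below is a much simpler ablation-style stand-in that clusters correlated words and scores each group by the prediction drop when it is masked. It illustrates the interface only, not GMASK's variational training; `predict_fn` is a hypothetical hook returning the predicted-class probability.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def group_importance(words, embeddings, predict_fn, n_groups=4):
    """words: list of tokens; embeddings: (n_words, d) array;
    predict_fn: maps a token list to the predicted-class probability.
    Returns (group_words, importance) pairs."""
    # Cluster words by embedding similarity as a proxy for correlation.
    labels = fcluster(linkage(embeddings, method="average"),
                      t=n_groups, criterion="maxclust")
    base = predict_fn(words)
    scores = []
    for g in sorted(set(labels)):
        group = [w for w, l in zip(words, labels) if l == g]
        # Mask the whole group and measure the drop in model confidence.
        ablated = [w if l != g else "[MASK]" for w, l in zip(words, labels)]
        scores.append((group, base - predict_fn(ablated)))
    return scores
```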
- Contextual Biasing of Language Models for Speech Recognition in Goal-Oriented Conversational Agents [11.193867567895353]
Goal-oriented conversational interfaces are designed to accomplish specific tasks.
We propose a new architecture that utilizes context embeddings derived from BERT on sample utterances provided at inference time.
Our experiments show a 7% relative word error rate (WER) reduction over non-contextual utterance-level neural language model (NLM) rescorers on goal-oriented audio datasets.
arXiv Detail & Related papers (2021-03-18T15:38:08Z)
- VECO: Variable and Flexible Cross-lingual Pre-training for Language Understanding and Generation [77.82373082024934]
We plug a cross-attention module into the Transformer encoder to explicitly build the interdependence between languages; a minimal sketch of such a block follows this entry.
It can effectively avoid the degeneration of predicting masked words only conditioned on the context in its own language.
The proposed cross-lingual model delivers new state-of-the-art results on various cross-lingual understanding tasks of the XTREME benchmark.
arXiv Detail & Related papers (2020-10-30T03:41:38Z)
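A plug-in cross-attention block admits a short sketch: queries come from one language's hidden states while keys and values come from the parallel sequence, so masked-word prediction is no longer conditioned only on same-language context. The layer sizes, residual connection, and normalization below are assumptions, not VECO's published configuration.

```python
import torch.nn as nn

class CrossAttentionBlock(nn.Module):
    """Sketch of a cross-attention block inserted into a Transformer
    encoder; hyperparameters are illustrative."""

    def __init__(self, d_model=768, n_heads=12):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x, parallel):
        # x: (batch, len_x, d) hidden states of one language;
        # parallel: (batch, len_p, d) states of the aligned sequence in the
        # other language. Queries from x, keys/values from the parallel side.
        attended, _ = self.attn(query=x, key=parallel, value=parallel)
        return self.norm(x + attended)
```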
- Multi-View Sequence-to-Sequence Models with Conversational Structure for Abstractive Dialogue Summarization [72.54873655114844]
Text summarization is one of the most challenging and interesting problems in NLP.
This work proposes a multi-view sequence-to-sequence model by first extracting conversational structures of unstructured daily chats from different views to represent conversations.
Experiments on a large-scale dialogue summarization corpus demonstrated that our methods significantly outperformed previous state-of-the-art models via both automatic evaluations and human judgment.
arXiv Detail & Related papers (2020-10-04T20:12:44Z)
- Generating Hierarchical Explanations on Text Classification via Feature Interaction Detection [21.02924712220406]
We build hierarchical explanations by detecting feature interactions; a greedy sketch of the idea follows this entry.
Such explanations visualize how words and phrases are combined at different levels of the hierarchy.
Experiments show the effectiveness of the proposed method in providing explanations both faithful to models and interpretable to humans.
arXiv Detail & Related papers (2020-04-04T20:56:37Z)
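Hierarchy building via interaction detection can be approximated by greedily merging adjacent spans with the strongest interaction. The sketch below scores interaction as the non-additivity of ablation effects (masking two spans jointly versus separately); `predict_fn` is a hypothetical hook returning the predicted-class probability, and this greedy scheme stands in for the paper's actual detection method.

```python
def build_hierarchy(words, predict_fn, mask="[MASK]"):
    """Greedily merge adjacent spans with the strongest interaction.
    Returns the merge order as a list of (left_span, right_span) pairs,
    where each span is a (start, end) token range."""
    def ablate(spans_to_mask):
        toks = list(words)
        for s, e in spans_to_mask:
            toks[s:e] = [mask] * (e - s)
        return predict_fn(toks)

    base = predict_fn(words)
    spans = [(i, i + 1) for i in range(len(words))]
    merges = []
    while len(spans) > 1:
        # Interaction = joint ablation effect minus the sum of individual
        # effects; large magnitude means the spans act together.
        def interaction(k):
            a, b = spans[k], spans[k + 1]
            return abs((base - ablate([a, b]))
                       - (base - ablate([a])) - (base - ablate([b])))
        k = max(range(len(spans) - 1), key=interaction)
        merges.append((spans[k], spans[k + 1]))
        spans[k:k + 2] = [(spans[k][0], spans[k + 1][1])]
    return merges
```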
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.