Distributed Marker Representation for Ambiguous Discourse Markers and
Entangled Relations
- URL: http://arxiv.org/abs/2306.10658v1
- Date: Mon, 19 Jun 2023 00:49:51 GMT
- Authors: Dongyu Ru, Lin Qiu, Xipeng Qiu, Yue Zhang, Zheng Zhang
- Abstract summary: We learn a Distributed Marker Representation (DMR) by utilizing the unlimited discourse marker data with a latent discourse sense.
Our method also offers a valuable tool to understand complex ambiguity and entanglement among discourse markers and manually defined discourse relations.
- Score: 50.31129784616845
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Discourse analysis is an important task because it models intrinsic semantic
structures between sentences in a document. Discourse markers are natural
representations of discourse in our daily language. One challenge is that the
markers as well as pre-defined and human-labeled discourse relations can be
ambiguous when describing the semantics between sentences. We believe that a
better approach is to use a context-dependent distribution over the markers
to express discourse information. In this work, we propose to learn a
Distributed Marker Representation (DMR) by utilizing the (potentially)
unlimited discourse marker data with a latent discourse sense, thereby bridging
markers with sentence pairs. Such representations can be learned automatically
from data without supervision, and in turn provide insights into the data
itself. Experiments show that our DMR achieves SOTA performance on the implicit
discourse relation recognition task and offers strong interpretability. Our method
also offers a valuable tool to understand complex ambiguity and entanglement
among discourse markers and manually defined discourse relations.
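
The idea of a context-dependent distribution over markers, bridged to the sentence pair by a latent discourse sense, can be illustrated with a small sketch. This is only one plausible reading of the abstract, not the authors' implementation: it assumes the marker distribution is obtained by marginalizing over a latent sense z, i.e. p(m | s1, s2) = sum_z p(z | s1, s2) * p(m | z). The function names, the marker list, and all logit values below are hypothetical; in a real model the sense logits would come from a sentence-pair encoder.

```python
import math


def softmax(scores):
    """Turn a list of real-valued scores into a probability distribution."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def marker_distribution(sense_logits, marker_logits_per_sense):
    """Marginalize over latent senses: p(m) = sum_z p(z) * p(m | z).

    sense_logits: one score per latent sense for a given sentence pair
        (hypothetical; assumed to come from some encoder).
    marker_logits_per_sense: one row of marker scores per latent sense.
    """
    p_sense = softmax(sense_logits)
    p_marker_given_sense = [softmax(row) for row in marker_logits_per_sense]
    n_markers = len(p_marker_given_sense[0])
    return [
        sum(p_sense[z] * p_marker_given_sense[z][m] for z in range(len(p_sense)))
        for m in range(n_markers)
    ]


# Illustrative values only: two latent senses, three markers.
MARKERS = ["because", "but", "since"]
sense_logits = [1.2, -0.3]
marker_logits_per_sense = [
    [2.0, -1.0, 1.5],  # a sense that favors causal markers
    [-1.0, 2.5, 0.0],  # a sense that favors contrastive markers
]

dist = marker_distribution(sense_logits, marker_logits_per_sense)
for marker, prob in zip(MARKERS, dist):
    print(f"{marker}: {prob:.3f}")
```

Under this reading, an ambiguous sentence pair yields a spread-out distribution across several markers instead of a single hard label, which is the kind of ambiguity the abstract says the representation is meant to expose.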
Related papers
- Self-Supervised Representation Learning with Spatial-Temporal Consistency for Sign Language Recognition [96.62264528407863]
We propose a self-supervised contrastive learning framework to excavate rich context via spatial-temporal consistency.
Inspired by the complementary property of motion and joint modalities, we first introduce first-order motion information into sign language modeling.
Our method is evaluated with extensive experiments on four public benchmarks, and achieves new state-of-the-art performance with a notable margin.
arXiv Detail & Related papers (2024-06-15T04:50:19Z)
- Label Aware Speech Representation Learning For Language Identification [49.197215416945596]
We propose a novel framework of combining self-supervised representation learning with the language label information for the pre-training task.
This framework, termed as Label Aware Speech Representation (LASR) learning, uses a triplet based objective function to incorporate language labels along with the self-supervised loss function.
arXiv Detail & Related papers (2023-06-07T12:14:16Z)
- Cross-Genre Argument Mining: Can Language Models Automatically Fill in Missing Discourse Markers? [17.610382230820395]
We propose to automatically augment a given text with discourse markers such that all relations are explicitly signaled.
Our analysis unveils that popular language models taken out-of-the-box fail on this task.
We demonstrate the impact of our approach on an Argument Mining downstream task, evaluated on different corpora.
arXiv Detail & Related papers (2023-06-07T10:19:50Z)
- PropSegmEnt: A Large-Scale Corpus for Proposition-Level Segmentation and Entailment Recognition [63.51569687229681]
We argue for the need to recognize the textual entailment relation of each proposition in a sentence individually.
We propose PropSegmEnt, a corpus of over 45K propositions annotated by expert human raters.
Our dataset structure resembles the tasks of (1) segmenting sentences within a document to the set of propositions, and (2) classifying the entailment relation of each proposition with respect to a different yet topically-aligned document.
arXiv Detail & Related papers (2022-12-21T04:03:33Z)
- Saliency Map Verbalization: Comparing Feature Importance Representations from Model-free and Instruction-based Methods [6.018950511093273]
Saliency maps can explain a neural model's predictions by identifying important input features.
We formalize the underexplored task of translating saliency maps into natural language.
We compare two novel methods (search-based and instruction-based verbalizations) against conventional feature importance representations.
arXiv Detail & Related papers (2022-10-13T17:48:15Z)
- Transition-based Abstract Meaning Representation Parsing with Contextual Embeddings [0.0]
We study a way of combining two of the most successful routes to meaning of language--statistical language models and symbolic semantics formalisms--in the task of semantic parsing.
We explore the utility of incorporating pretrained context-aware word embeddings--such as BERT and RoBERTa--in the problem of parsing.
arXiv Detail & Related papers (2022-06-13T15:05:24Z)
- Context Matters: Self-Attention for Sign Language Recognition [1.005130974691351]
This paper proposes an attentional network for the task of Continuous Sign Language Recognition.
We exploit co-independent streams of data to model the sign language modalities.
We find that the model is able to identify the essential Sign Language components that revolve around the dominant hand and the face areas.
arXiv Detail & Related papers (2021-01-12T17:40:19Z)
- R$^2$-Net: Relation of Relation Learning Network for Sentence Semantic Matching [58.72111690643359]
We propose a Relation of Relation Learning Network (R2-Net) for sentence semantic matching.
We first employ BERT to encode the input sentences from a global perspective.
Then a CNN-based encoder is designed to capture keywords and phrase information from a local perspective.
To fully leverage labels for better relation information extraction, we introduce a self-supervised relation of relation classification task.
arXiv Detail & Related papers (2020-12-16T13:11:30Z)
- Paragraph-level Commonsense Transformers with Recurrent Memory [77.4133779538797]
We train a discourse-aware model that incorporates paragraph-level information to generate coherent commonsense inferences from narratives.
Our results show that PARA-COMET outperforms the sentence-level baselines, particularly in generating inferences that are both coherent and novel.
arXiv Detail & Related papers (2020-10-04T05:24:12Z)
- DiscSense: Automated Semantic Analysis of Discourse Markers [9.272765183222967]
We study the link between discourse markers and the semantic relations annotated in classification datasets.
By using an automatic prediction method over existing semantically annotated datasets, we provide a bottom-up characterization of discourse markers in English.
arXiv Detail & Related papers (2020-06-02T13:39:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.