Unsupervised Inference of Data-Driven Discourse Structures using a Tree
Auto-Encoder
- URL: http://arxiv.org/abs/2210.09559v1
- Date: Tue, 18 Oct 2022 03:28:39 GMT
- Title: Unsupervised Inference of Data-Driven Discourse Structures using a Tree
Auto-Encoder
- Authors: Patrick Huber and Giuseppe Carenini
- Abstract summary: We propose a new strategy to generate tree structures in a task-agnostic, unsupervised fashion by extending a latent tree induction framework with an auto-encoding objective.
The proposed approach can be applied to any tree-structured objective, such as syntactic parsing, discourse parsing and others.
- Score: 30.615883375573432
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: With a growing need for robust and general discourse structures in many
downstream tasks and real-world applications, the current lack of high-quality,
high-quantity discourse trees poses a severe shortcoming. In order to
alleviate this limitation, we propose a new strategy to generate tree
structures in a task-agnostic, unsupervised fashion by extending a latent tree
induction framework with an auto-encoding objective. The proposed approach can
be applied to any tree-structured objective, such as syntactic parsing,
discourse parsing and others. However, due to the especially difficult
annotation process to generate discourse trees, we initially develop such a
method to complement task-specific models in generating much larger and more
diverse discourse treebanks.
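The auto-encoding idea can be illustrated with a toy sketch: induce a binary tree bottom-up over leaf embeddings, then score the tree by how well the original leaves can be reconstructed. The code below is a hypothetical simplification; the paper's model uses learned neural composition, scoring and decoding functions, which this sketch replaces with inner products, averaging, and a trivial root-based decoder.

```python
# Toy latent tree induction with an auto-encoding objective
# (hypothetical simplification, not the paper's actual model).

def similarity(a, b):
    # Stand-in scorer: inner product of two "embedding" vectors.
    return sum(x * y for x, y in zip(a, b))

def compose(a, b):
    # Stand-in composition: element-wise average of child embeddings.
    return [(x + y) / 2 for x, y in zip(a, b)]

def induce_tree(leaves):
    """Greedy bottom-up induction: repeatedly merge the adjacent pair
    with the highest score, yielding a binary tree (structure, vector)."""
    nodes = [(("leaf", i), v) for i, v in enumerate(leaves)]
    while len(nodes) > 1:
        best = max(range(len(nodes) - 1),
                   key=lambda i: similarity(nodes[i][1], nodes[i + 1][1]))
        (lt, lv), (rt, rv) = nodes[best], nodes[best + 1]
        nodes[best:best + 2] = [((lt, rt), compose(lv, rv))]
    return nodes[0]

def reconstruction_loss(tree, leaves):
    """Auto-encoding objective: measure how well the leaves can be
    recovered from the tree's root embedding (trivial decoder here:
    predict every leaf as the root vector)."""
    structure, root_vec = tree
    return sum(sum((r - x) ** 2 for r, x in zip(root_vec, leaf))
               for leaf in leaves)

leaves = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
tree = induce_tree(leaves)
loss = reconstruction_loss(tree, leaves)
```

In a trained model the scorer, composer and decoder would all be parameterised and optimised jointly so that low reconstruction loss favours informative tree structures.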
Related papers
- GrootVL: Tree Topology is All You Need in State Space Model [66.36757400689281]
GrootVL is a versatile multimodal framework that can be applied to both visual and textual tasks.
Our method significantly outperforms existing structured state space models on image classification, object detection and segmentation.
By fine-tuning large language models, our approach achieves consistent improvements in multiple textual tasks at minor training cost.
arXiv Detail & Related papers (2024-06-04T15:09:29Z)
- Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees.
We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z)
- RLET: A Reinforcement Learning Based Approach for Explainable QA with Entailment Trees [47.745218107037786]
We propose RLET, a Reinforcement Learning based Entailment Tree generation framework.
RLET iteratively performs single step reasoning with sentence selection and deduction generation modules.
Experiments on three settings of the EntailmentBank dataset demonstrate the strength of using RL framework.
arXiv Detail & Related papers (2022-10-31T06:45:05Z)
- Large Discourse Treebanks from Scalable Distant Supervision [30.615883375573432]
We propose a framework to generate "silver-standard" discourse trees from distant supervision on the auxiliary task of sentiment analysis.
"Silver-standard" discourse trees are generated from larger, more diverse and domain-independent datasets.
arXiv Detail & Related papers (2022-10-18T03:33:43Z)
- A Tree-structured Transformer for Program Representation Learning [27.31416015946351]
Long-term/global dependencies widely exist in programs, and most neural networks fail to capture these dependencies.
In this paper, we propose Tree-Transformer, a novel tree-structured neural network which aims to overcome the above limitations.
By combining bottom-up and top-down propagation, Tree-Transformer can learn both global contexts and meaningful node features.
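The two-pass idea can be sketched on a binary tree given as nested tuples of leaf indices. This is hypothetical code: Tree-Transformer's actual propagation uses attention-based layers, which plain vector averaging stands in for here.

```python
# Hypothetical sketch of combined bottom-up and top-down propagation.
# The real model uses attention layers; averaging stands in for them.

def bottom_up(tree, leaf_vecs, states, path=()):
    """Compose each node's state from its children, leaves first."""
    if isinstance(tree, int):                      # leaf: look up its vector
        states[path] = leaf_vecs[tree]
    else:
        left, right = tree
        bottom_up(left, leaf_vecs, states, path + (0,))
        bottom_up(right, leaf_vecs, states, path + (1,))
        l, r = states[path + (0,)], states[path + (1,)]
        states[path] = [(a + b) / 2 for a, b in zip(l, r)]
    return states

def top_down(tree, states, context=None, path=()):
    """Mix each node's bottom-up state with its parent's context, so
    leaf states end up carrying global information about the tree."""
    if context is not None:
        states[path] = [(a + b) / 2 for a, b in zip(states[path], context)]
    if not isinstance(tree, int):
        top_down(tree[0], states, states[path], path + (0,))
        top_down(tree[1], states, states[path], path + (1,))
    return states

tree = ((0, 1), 2)                                 # ((leaf0, leaf1), leaf2)
states = bottom_up(tree, [[1.0], [0.0], [0.5]], {})
states = top_down(tree, states)
```

After the second pass, each leaf state mixes its own features with context inherited from every ancestor, which is the sense in which the model sees both local node features and global structure.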
arXiv Detail & Related papers (2022-08-18T05:42:01Z)
- Tree Reconstruction using Topology Optimisation [0.685316573653194]
We present a general method for extracting the branch structure of trees from point cloud data.
We discuss the benefits and drawbacks of this novel approach to tree structure reconstruction.
Our method generates detailed and accurate tree structures in most cases.
arXiv Detail & Related papers (2022-05-26T07:08:32Z)
- Incorporating Constituent Syntax for Coreference Resolution [50.71868417008133]
We propose a graph-based method to incorporate constituent syntactic structures.
We also explore to utilise higher-order neighbourhood information to encode rich structures in constituent trees.
Experiments on the English and Chinese portions of OntoNotes 5.0 benchmark show that our proposed model either beats a strong baseline or achieves new state-of-the-art performance.
arXiv Detail & Related papers (2022-02-22T07:40:42Z)
- Unsupervised Learning of Discourse Structures using a Tree Autoencoder [8.005512864082126]
We propose a new strategy to generate tree structures in a task-agnostic, unsupervised fashion by extending a latent tree induction framework with an auto-encoding objective.
The proposed approach can be applied to any tree objective, such as syntactic parsing, discourse parsing and others.
In this paper we are inferring general tree structures of natural text in multiple domains, showing promising results on a diverse set of tasks.
arXiv Detail & Related papers (2020-12-17T08:40:34Z)
- MEGA RST Discourse Treebanks with Structure and Nuclearity from Scalable Distant Sentiment Supervision [30.615883375573432]
We present a novel methodology to automatically generate discourse treebanks using distant supervision from sentiment-annotated datasets.
Our approach generates trees incorporating structure and nuclearity for documents of arbitrary length by relying on an efficient beam-search strategy.
Experiments indicate that a discourse parser trained on our MEGA-DT treebank delivers promising inter-domain performance gains.
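A beam search over bottom-up tree construction can be sketched as follows. This is a hypothetical toy: the scorer here merely prefers balanced merges, whereas MEGA-DT scores candidate merges using signals derived from distant sentiment supervision.

```python
# Hypothetical sketch of beam-search binary tree construction over a
# document's span sequence. Each beam state is (cumulative score, list
# of (structure, span_length) nodes); every step extends each state by
# merging one adjacent pair and keeps only the top `beam_size` states.

def merge_score(left_len, right_len):
    # Toy scorer standing in for a learned one: prefer balanced merges.
    return -abs(left_len - right_len)

def beam_search_tree(n_leaves, beam_size=2):
    start = [(("leaf", i), 1) for i in range(n_leaves)]
    beam = [(0.0, start)]
    while len(beam[0][1]) > 1:
        candidates = []
        for score, nodes in beam:
            for i in range(len(nodes) - 1):
                (lt, ll), (rt, rl) = nodes[i], nodes[i + 1]
                merged = nodes[:i] + [((lt, rt), ll + rl)] + nodes[i + 2:]
                candidates.append((score + merge_score(ll, rl), merged))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_size]
    score, (root,) = beam[0]
    return score, root[0]
```

Because only `beam_size` partial trees are kept per step, the cost grows linearly in document length rather than exploring the full space of binary trees, which is what makes arbitrary-length documents tractable.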
arXiv Detail & Related papers (2020-11-05T18:22:38Z)
- MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
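The dynamic-programming idea can be illustrated in miniature, assuming binary features and binary labels. This is hypothetical code: MurTree's actual algorithm adds specialised bounds and a dedicated depth-two subroutine that this sketch omits.

```python
# Hypothetical miniature of DP for optimal classification trees:
# for a set of (binary feature vector, label) examples and a depth
# budget, compute the minimum number of misclassifications, memoising
# on the (example subset, depth) subproblem.
from functools import lru_cache

def optimal_tree_error(examples, depth):
    return _solve(tuple(sorted(examples)), depth)

@lru_cache(maxsize=None)
def _solve(examples, depth):
    labels = [y for _, y in examples]
    # Error if we stop here and predict the majority label.
    leaf_error = len(labels) - max(labels.count(0), labels.count(1))
    if depth == 0 or leaf_error == 0:
        return leaf_error
    best = leaf_error
    n_features = len(examples[0][0])
    for f in range(n_features):
        left = tuple(e for e in examples if e[0][f] == 0)
        right = tuple(e for e in examples if e[0][f] == 1)
        if not left or not right:
            continue  # split does not separate anything
        best = min(best, _solve(left, depth - 1) + _solve(right, depth - 1))
    return best
```

For example, XOR-labelled data is inseparable at depth 1 (2 errors at best) but perfectly classified at depth 2; memoisation is what lets the real algorithm reuse subproblems shared across different parent splits.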
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
- Tree-structured Attention with Hierarchical Accumulation [103.47584968330325]
"Hierarchical Accumulation" encodes parse tree structures into self-attention at constant time complexity.
Our approach outperforms SOTA methods in four IWSLT translation tasks and the WMT'14 English-German translation task.
arXiv Detail & Related papers (2020-02-19T08:17:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.