Learning from Non-Binary Constituency Trees via Tensor Decomposition
- URL: http://arxiv.org/abs/2011.00860v1
- Date: Mon, 2 Nov 2020 10:06:59 GMT
- Title: Learning from Non-Binary Constituency Trees via Tensor Decomposition
- Authors: Daniele Castellana, Davide Bacciu
- Abstract summary: We introduce a new approach to deal with non-binary constituency trees.
We show how a powerful composition function based on the canonical tensor decomposition can exploit such a rich structure.
We experimentally assess its performance on different NLP tasks.
- Score: 12.069862650316262
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Processing sentence constituency trees in binarised form is a
common and popular approach in the literature. However, constituency trees
are non-binary by nature. The binarisation procedure deeply changes the
structure, pushing apart constituents that are in fact close. In this work,
we introduce a new approach to handling non-binary constituency trees that
leverages tensor-based models.
In particular, we show how a powerful composition function based on the
canonical tensor decomposition can exploit such a rich structure. A key point
of our approach is the weight sharing constraint imposed on the factor
matrices, which allows limiting the number of model parameters. Finally, we
introduce a Tree-LSTM model which takes advantage of this composition function
and we experimentally assess its performance on different NLP tasks.
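To make the composition function concrete, here is a minimal sketch of a CP-decomposition-based aggregation with a shared factor matrix, in the spirit of the abstract; the class and parameter names (CPComposition, hidden, rank) and the tanh output nonlinearity are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CPComposition(nn.Module):
    """Combine a variable number of child states into one parent state.

    The full composition tensor of order (n_children + 1) is never
    materialised; only its CP factors are stored. Weight sharing applies
    the same factor matrix U to every child position, so the parameter
    count does not grow with the node outdegree.
    """

    def __init__(self, hidden: int, rank: int):
        super().__init__()
        self.U = nn.Linear(hidden, rank, bias=False)  # shared child factor
        self.V = nn.Linear(rank, hidden, bias=False)  # output factor

    def forward(self, children: torch.Tensor) -> torch.Tensor:
        # children: (n_children, hidden) -- any outdegree is accepted.
        projected = self.U(children)       # (n_children, rank)
        fused = projected.prod(dim=0)      # Hadamard product over children
        return torch.tanh(self.V(fused))   # (hidden,)

comp = CPComposition(hidden=64, rank=32)
parent = comp(torch.randn(3, 64))          # a node with 3 children
```

With the shared factor U, the model has 2 * hidden * rank parameters regardless of how many children a node has, which is the role the abstract assigns to the weight-sharing constraint on the factor matrices.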
Related papers
- Bayesian post-hoc regularization of random forests [0.0]
Random Forests are powerful ensemble learning algorithms widely used in various machine learning tasks.
We propose post-hoc regularization to leverage the reliable patterns captured by leaf nodes closer to the root.
We have evaluated the performance of our method on various machine learning data sets.
arXiv Detail & Related papers (2023-06-06T14:15:29Z) - On marginal feature attributions of tree-based models [0.11184789007828977]
Local feature attributions based on marginal expectations, e.g. marginal Shapley, Owen or Banzhaf values, may be employed to explain tree-based models.
We present two (statistically similar) decision trees that compute the exact same function for which the "path-dependent" TreeSHAP yields different rankings of features.
We exploit the symmetry to derive an explicit formula, with improved complexity and only in terms of the internal model parameters, for marginal Shapley (and Banzhaf and Owen) values of CatBoost models.
arXiv Detail & Related papers (2023-02-16T17:18:03Z) - Learning compositional structures for semantic graph parsing [81.41592892863979]
We show how AM dependency parsing can be trained directly on a neural latent-variable model.
Our model picks up on several linguistic phenomena on its own and achieves comparable accuracy to supervised training.
arXiv Detail & Related papers (2021-06-08T14:20:07Z) - Robustifying Algorithms of Learning Latent Trees with Vector Variables [92.18777020401484]
We present the sample complexities of Recursive Grouping (RG) and Chow-Liu Recursive Grouping (CLRG).
We robustify RG, CLRG, Neighbor Joining (NJ) and Spectral NJ (SNJ) by using the truncated inner product.
We derive the first known instance-dependent impossibility result for structure learning of latent trees.
arXiv Detail & Related papers (2021-06-02T01:37:52Z) - Spectral Top-Down Recovery of Latent Tree Models [13.681975313065477]
Spectral Top-Down Recovery (STDR) is a divide-and-conquer approach for inference of large latent tree models.
STDR's partitioning step is non-random. Instead, it is based on the Fiedler vector of a suitable Laplacian matrix related to the observed nodes (a generic sketch of Fiedler-vector partitioning appears after this list).
We prove that STDR is statistically consistent, and bound the number of samples required to accurately recover the tree with high probability.
arXiv Detail & Related papers (2021-02-26T02:47:42Z) - SGA: A Robust Algorithm for Partial Recovery of Tree-Structured Graphical Models with Noisy Samples [75.32013242448151]
We consider learning Ising tree models when the observations from the nodes are corrupted by independent but non-identically distributed noise.
Katiyar et al. (2020) showed that although the exact tree structure cannot be recovered, one can recover a partial tree structure.
We propose Symmetrized Geometric Averaging (SGA), a more statistically robust algorithm for partial tree recovery.
arXiv Detail & Related papers (2021-01-22T01:57:35Z) - Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a. soft routing, rather than hard binary decisions (a minimal sketch of soft routing appears after this list).
Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve performance better than or comparable to [1],[3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z) - Convex Polytope Trees [57.56078843831244]
Convex polytope trees (CPT) are proposed to expand the family of decision trees via an interpretable generalization of their decision boundary.
We develop a greedy method to efficiently construct CPT and scalable end-to-end training algorithms for the tree parameters when the tree structure is given.
arXiv Detail & Related papers (2020-10-21T19:38:57Z) - ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computations (a sketch of atom-coefficient weight sharing appears after this list).
arXiv Detail & Related papers (2020-09-04T20:41:47Z) - Tensor Decompositions in Recursive Neural Networks for Tree-Structured Data [12.069862650316262]
We introduce two new aggregation functions to encode structural knowledge from tree-structured data.
We test them on two tree classification tasks, showing the advantage of the proposed models as tree outdegree increases.
arXiv Detail & Related papers (2020-06-18T15:40:32Z) - Generalising Recursive Neural Models by Tensor Decomposition [12.069862650316262]
We introduce a general approach to model aggregation of structural context leveraging a tensor-based formulation.
We show how the exponential growth in the size of the parameter space can be controlled through an approximation based on the Tucker decomposition.
By this means, we can effectively regulate the trade-off between the expressivity of the encoding (controlled by the hidden size), computational complexity, and model generalisation (a parameter-count sketch of the Tucker approach appears after this list).
arXiv Detail & Related papers (2020-06-17T17:28:19Z)
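The Spectral Top-Down Recovery entry above partitions the observed nodes with the Fiedler vector of a Laplacian matrix. Below is a generic sketch of Fiedler-vector partitioning; the affinity matrix and the sign-based split are illustrative assumptions, and the paper's actual Laplacian construction differs.

```python
import numpy as np

def fiedler_split(affinity: np.ndarray):
    """Split nodes into two groups by the sign of the Fiedler vector."""
    # Unnormalised graph Laplacian L = D - A.
    L = np.diag(affinity.sum(axis=1)) - affinity
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    fiedler = eigvecs[:, 1]         # vector of the 2nd-smallest eigenvalue
    return np.where(fiedler >= 0)[0], np.where(fiedler < 0)[0]

# Two loosely coupled groups of nodes separate cleanly (up to sign).
A = np.block([[np.ones((3, 3)), 0.05 * np.ones((3, 3))],
              [0.05 * np.ones((3, 3)), np.ones((3, 3))]])
np.fill_diagonal(A, 0)
print(fiedler_split(A))  # e.g. (array([0, 1, 2]), array([3, 4, 5]))
```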
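The deep-forest entry mentions probabilistic (soft) routing. A minimal sketch of a soft decision tree follows; the sigmoid gates, heap-style node indexing and leaf parameterisation are assumptions for illustration, not the paper's model.

```python
import torch
import torch.nn as nn

class SoftTree(nn.Module):
    """A depth-d binary tree where every node routes probabilistically."""

    def __init__(self, in_dim: int, n_classes: int, depth: int = 3):
        super().__init__()
        self.depth = depth
        n_inner, n_leaves = 2 ** depth - 1, 2 ** depth
        self.routers = nn.Linear(in_dim, n_inner)  # one gate per inner node
        self.leaves = nn.Parameter(torch.zeros(n_leaves, n_classes))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        right = torch.sigmoid(self.routers(x))  # P(go right) at each node
        prob = x.new_ones(x.shape[0], 1)        # all mass starts at the root
        for d in range(self.depth):
            lo, hi = 2 ** d - 1, 2 ** (d + 1) - 1   # inner nodes at depth d
            gate = right[:, lo:hi]                  # (batch, 2**d)
            # Each node's mass splits between its left and right child.
            prob = torch.stack([prob * (1 - gate), prob * gate], dim=-1)
            prob = prob.flatten(1)                  # (batch, 2**(d+1))
        # Prediction = mixture of leaf distributions, weighted by path mass.
        return prob @ torch.softmax(self.leaves, dim=-1)

tree = SoftTree(in_dim=784, n_classes=10)
out = tree(torch.randn(5, 784))  # (5, 10); rows sum to 1
```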
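The ACDC entry shares weights across convolutional kernels by decomposing them into atoms and coefficients. Here is a hypothetical sketch of that general idea, where every kernel is a linear combination of a small shared dictionary of spatial atoms; the class name, atom count and initialisation are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AtomConv2d(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, k: int = 3, n_atoms: int = 8):
        super().__init__()
        # Shared spatial atoms, reused by every filter in the layer.
        self.atoms = nn.Parameter(torch.randn(n_atoms, k, k) * 0.1)
        # Per-filter mixing coefficients: the only per-filter parameters.
        self.coeff = nn.Parameter(torch.randn(out_ch, in_ch, n_atoms) * 0.1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assemble the kernels on the fly: (out_ch, in_ch, k, k).
        weight = torch.einsum('oin,nkl->oikl', self.coeff, self.atoms)
        return F.conv2d(x, weight, padding=weight.shape[-1] // 2)

conv = AtomConv2d(in_ch=3, out_ch=16)
y = conv(torch.randn(2, 3, 32, 32))  # (2, 16, 32, 32)
```

Each filter channel now costs n_atoms coefficients instead of k*k free weights, and the atom dictionary is amortised across the whole layer; the same sharing-over-structure idea motivates the factor-matrix constraint in the main paper.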
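Finally, the Tucker-decomposition entry (by the same authors) controls the exponential growth of the composition tensor. A small back-of-the-envelope check, with illustrative sizes rather than the paper's settings, shows the effect:

```python
# An order-(n+1) composition tensor for n children with hidden size h has
# h**(n+1) entries; Tucker keeps (n+1) factor matrices of size h*r plus an
# r**(n+1) core, so the core rank r regulates cost and expressivity.
h, n, r = 64, 4, 8
full = h ** (n + 1)                      # unconstrained tensor
tucker = (n + 1) * h * r + r ** (n + 1)  # factor matrices + core
print(f"full: {full:,}  tucker: {tucker:,}")  # full: 1,073,741,824  tucker: 35,328
```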
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.