Efficient Second-Order TreeCRF for Neural Dependency Parsing
- URL: http://arxiv.org/abs/2005.00975v2
- Date: Mon, 29 Jun 2020 06:35:00 GMT
- Title: Efficient Second-Order TreeCRF for Neural Dependency Parsing
- Authors: Yu Zhang, Zhenghua Li, Min Zhang
- Abstract summary: In the deep learning (DL) era, parsing models have been greatly simplified with little loss in performance.
This paper presents a second-order TreeCRF extension to the biaffine parser.
We propose an effective way to batchify the inside and Viterbi algorithms for direct large matrix operation.
- Score: 23.426500262860777
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the deep learning (DL) era, parsing models have been greatly simplified
with little loss in performance, thanks to the remarkable capability of multi-layer
BiLSTMs in context representation. As the most popular graph-based dependency
parser due to its high efficiency and performance, the biaffine parser directly
scores single dependencies under the arc-factorization assumption, and adopts a
very simple local token-wise cross-entropy training loss. This paper presents,
for the first time, a second-order TreeCRF extension to the biaffine parser.
For a long time, the complexity and inefficiency of the inside-outside
algorithm have hindered the popularity of TreeCRF. To address this issue, we propose
an effective way to batchify the inside and Viterbi algorithms for direct large
matrix operation on GPUs, and to avoid the complex outside algorithm via
efficient back-propagation. Experiments and analysis on 27 datasets from 13
languages clearly show that techniques developed before the DL era, such as
structural learning (global TreeCRF loss) and high-order modeling, are still
useful, and can further boost parsing performance over the state-of-the-art
biaffine parser, especially for partially annotated training data. We release
our code at https://github.com/yzhangcs/crfpar.
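The key efficiency idea in the abstract, batchifying the inside algorithm and replacing the outside pass with back-propagation, is shared with the "Fast and Accurate Neural CRF Constituency Parsing" entry below. The following is a minimal PyTorch sketch of that idea, not the released crfpar code: for readability it uses a first-order, span-factored (CKY-style) chart, whereas the paper batchifies Eisner's algorithm with second-order sibling factors; the function names, the [B, n+1, n+1] score layout, and the gold_mask indicator are illustrative assumptions.

import torch

def inside_log_partition(scores):
    # scores: [B, n+1, n+1]; scores[b, i, j] is the log-score of span (i, j), i < j.
    # Returns logZ of shape [B], the log-partition over all binary trees.
    n = scores.shape[1] - 1
    pos = torch.arange(n)
    # chart[w][b, i] holds the inside log-score of span (i, i + w); every span
    # of the same width is handled in one large tensor operation on the GPU.
    chart = {1: scores[:, pos, pos + 1]}
    for w in range(2, n + 1):                        # widths, shortest first
        S = n - w + 1                                # number of width-w spans
        # split at i + u: left child (i, i + u), right child (i + u, i + w)
        parts = [chart[u][:, :S] + chart[w - u][:, u:u + S] for u in range(1, w)]
        starts = torch.arange(S)
        chart[w] = (torch.logsumexp(torch.stack(parts, dim=1), dim=1)
                    + scores[:, starts, starts + w])
    return chart[n].squeeze(1)                       # whole-sentence span (0, n)

def crf_loss(scores, gold_mask):
    # Global TreeCRF loss: logZ minus the score of the gold tree, whose spans
    # are marked by the 0/1 indicator gold_mask of shape [B, n+1, n+1].
    gold_score = (scores * gold_mask).sum(dim=(1, 2))
    return (inside_log_partition(scores) - gold_score).mean()

scores = torch.randn(2, 9, 9, requires_grad=True)    # batch of 2, n = 8
logZ = inside_log_partition(scores)                  # shape [2]
# Span marginals via back-propagation, in place of the outside algorithm:
# d logZ / d scores[b, i, j] equals the marginal probability of span (i, j).
marginals, = torch.autograd.grad(logZ.sum(), scores)

Obtaining marginals and gradients from a single backward pass is exactly the "avoid the complex outside algorithm via efficient back-propagation" point; production implementations typically also vectorize the remaining split-point loop (e.g., with diagonal stripe views of the chart), which the sketch keeps as a plain Python loop for clarity.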
Related papers
- Optimizing Tensor Computation Graphs with Equality Saturation and Monte Carlo Tree Search [0.0]
We present a tensor graph rewriting approach that uses Monte Carlo tree search to build superior representations.
Our approach improves the inference speedup of neural networks by up to 11% compared to existing methods.
arXiv Detail & Related papers (2024-10-07T22:22:02Z) - Efficient Heterogeneous Graph Learning via Random Projection [58.4138636866903]
Heterogeneous Graph Neural Networks (HGNNs) are powerful tools for deep learning on heterogeneous graphs.
Recent pre-computation-based HGNNs use one-time message passing to transform a heterogeneous graph into regular-shaped tensors.
We propose a hybrid pre-computation-based HGNN, named Random Projection Heterogeneous Graph Neural Network (RpHGNN).
arXiv Detail & Related papers (2023-10-23T01:25:44Z) - Scalable Learning of Latent Language Structure With Logical Offline
Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z) - Biaffine Discourse Dependency Parsing [0.0]
We use the biaffine model for neural discourse dependency parsing and achieve significant performance improvement compared with the baselines.
We compare the Eisner algorithm and the Chu-Liu-Edmonds algorithm in the task and find that using the Chu-Liu-Edmonds algorithm generates deeper trees.
arXiv Detail & Related papers (2022-01-12T12:56:13Z) - Structural Optimization Makes Graph Classification Simpler and Better [5.770986723520119]
We investigate the feasibility of improving graph classification performance while simplifying the model learning process.
Inspired by progress in structural information assessment, we optimize the given data sample from graphs to encoding trees.
We present an implementation of the scheme in a tree kernel and a convolutional network to perform graph classification.
arXiv Detail & Related papers (2021-09-05T08:54:38Z) - Comparative Code Structure Analysis using Deep Learning for Performance
Prediction [18.226950022938954]
This paper aims to assess the feasibility of using purely static information (e.g., abstract syntax tree or AST) of applications to predict performance change based on the change in code structure.
Our evaluations of several deep embedding learning methods demonstrate that tree-based Long Short-Term Memory (LSTM) models can leverage the hierarchical structure of source code to discover latent representations and achieve up to 84% (individual problem) and 73% (combined dataset with multiple problems) accuracy in predicting the change in performance.
arXiv Detail & Related papers (2021-02-12T16:59:12Z) - Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification.
Our strategy enables important aspects of the base learner objective to be learned during meta-training.
We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z) - A Simple Global Neural Discourse Parser [61.728994693410954]
We propose a simple chart-based neural discourse parser that does not require any manually-crafted features and is based on learned span representations only.
We empirically demonstrate that our model achieves the best performance among global parsers, and performance comparable to state-of-the-art greedy parsers.
arXiv Detail & Related papers (2020-09-02T19:28:40Z) - Fast and Accurate Neural CRF Constituency Parsing [16.90190521285297]
This work presents a fast and accurate neural CRF constituency parser.
We batchify the inside algorithm for loss computation via direct large tensor operations on GPUs, and avoid the outside algorithm for gradient computation via efficient back-propagation.
Experiments on PTB, CTB5.1, and CTB7 show that our two-stage CRF parser achieves new state-of-the-art performance in both settings, w/ and w/o BERT.
arXiv Detail & Related papers (2020-08-09T14:38:48Z) - PolyDL: Polyhedral Optimizations for Creation of High Performance DL
primitives [55.79741270235602]
We present compiler algorithms to automatically generate high performance implementations of Deep Learning primitives.
We develop novel data reuse analysis algorithms using the polyhedral model.
We also show that such a hybrid compiler plus a minimal library-use approach results in state-of-the-art performance.
arXiv Detail & Related papers (2020-06-02T06:44:09Z) - Learning to Hash with Graph Neural Networks for Recommender Systems [103.82479899868191]
Graph representation learning has attracted much attention in supporting high quality candidate search at scale.
Despite its effectiveness in learning embedding vectors for objects in the user-item interaction network, the computational costs to infer users' preferences in continuous embedding space are tremendous.
We propose a simple yet effective discrete representation learning framework to jointly learn continuous and discrete codes.
arXiv Detail & Related papers (2020-03-04T06:59:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.