pyBART: Evidence-based Syntactic Transformations for IE
- URL: http://arxiv.org/abs/2005.01306v2
- Date: Thu, 4 Jun 2020 10:25:10 GMT
- Title: pyBART: Evidence-based Syntactic Transformations for IE
- Authors: Aryeh Tiktinsky, Yoav Goldberg, Reut Tsarfaty
- Abstract summary: We present pyBART, an easy-to-use open-source Python library for converting English UD trees to Enhanced UD graphs or to our representation.
When evaluated in a pattern-based relation extraction scenario, our representation results in higher extraction scores than Enhanced UD, while requiring fewer patterns.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Syntactic dependencies can be predicted with high accuracy, and are useful
for both machine-learned and pattern-based information extraction tasks.
However, their utility can be improved. These representations are designed to
accurately reflect syntactic relations, and they do not make semantic relations
explicit. As a result, they lack many explicit connections between content words
that would be useful for downstream applications. Proposals such as English
Enhanced UD improve the situation by extending universal dependency trees with
additional explicit arcs, but they are not available to Python users and are
also limited in coverage. We introduce a broad-coverage, data-driven, and
linguistically sound set of transformations that makes event structure and many
lexical relations explicit. We present pyBART, an easy-to-use open-source Python
library for converting English UD trees either to Enhanced UD graphs or to our
representation. The library can work as a standalone package or be integrated
within a spaCy NLP pipeline. When evaluated in a pattern-based relation
extraction scenario, our representation yields higher extraction scores than
Enhanced UD while requiring fewer patterns.
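Since the abstract describes two usage modes (standalone conversion and a spaCy pipeline component), a minimal sketch of each may help. The entry points below (`convert_bart_conllu`, the `pybart_spacy_pipe` component name, the UD-trained model name `en_ud_model_sm`, and the `parent_list` token extension) follow my reading of the pyBART README and should be treated as assumptions that may vary across versions.

```python
# A minimal sketch of the two usage modes the abstract describes.
# NOTE: entry-point names are assumptions based on the pyBART README
# and may differ between library versions.

# --- Standalone: convert an existing UD parse in CoNLL-U format ---
from pybart.api import convert_bart_conllu  # assumed entry point

with open("sentences.conllu") as f:  # any English UD parse in CoNLL-U
    ud_text = f.read()

bart_text = convert_bart_conllu(ud_text)  # returns the enhanced/BART graph
print(bart_text)

# --- Integrated: register the converter as a spaCy pipeline component ---
import spacy

nlp = spacy.load("en_ud_model_sm")            # UD-trained spaCy model (assumed name)
nlp.add_pipe("pybart_spacy_pipe", last=True)  # assumed component name
doc = nlp("The founder of the company, who resigned in 2019, was reinstated.")

# In the BART representation a token may have several parents; the library
# exposes them via a custom token extension (attribute name assumed):
for tok in doc:
    print(tok.text, tok._.parent_list)
```

The extra arcs this produces (e.g., connecting "founder" to "resigned" through the relative clause) are the kind of explicit content-word connections that let pattern-based extractors get by with fewer patterns.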
Related papers
- pyvene: A Library for Understanding and Improving PyTorch Models via Interventions (arXiv, 2024-03-12)
  pyvene is an open-source library that supports customizable interventions on a range of different PyTorch modules.
  We show how pyvene provides a unified framework for performing interventions on neural models and for sharing the intervened-upon models with others.
- Data Augmentation for Machine Translation via Dependency Subtree Swapping (arXiv, 2023-07-13)
  We present a generic framework for data augmentation via dependency subtree swapping (see the sketch after this list).
  We extract corresponding subtrees from the dependency parse trees of the source and target sentences and swap them across bisentences to create augmented samples.
  We conduct resource-constrained experiments on 4 language pairs in both directions, using the IWSLT text translation datasets and the Hunglish2 corpus.
- SPINDLE: Spinning Raw Text into Lambda Terms with Graph Attention (arXiv, 2023-02-23)
  The module transforms raw text input to programs for meaning composition, expressed as lambda terms.
  Its output consists of hi-res derivations of a multimodal type-logical grammar.
- Syntactic Multi-view Learning for Open Information Extraction (arXiv, 2022-12-05)
  Open Information Extraction (OpenIE) aims to extract relational tuples from open-domain sentences.
  In this paper, we model both constituency and dependency trees as word-level graphs.
- GraphQ IR: Unifying Semantic Parsing of Graph Query Language with Intermediate Representation (arXiv, 2022-05-24)
  We propose a unified intermediate representation (IR) for graph query languages, namely GraphQ IR.
  With the IR's natural-language-like representation, which bridges the semantic gap, and its formally defined syntax, which maintains the graph structure, neural semantic parsing can more effectively convert user queries into GraphQ IR.
  Our approach consistently achieves state-of-the-art performance on KQA Pro, Overnight, and MetaQA.
- Incorporating Constituent Syntax for Coreference Resolution (arXiv, 2022-02-22)
  We propose a graph-based method to incorporate constituent syntactic structures.
  We also explore utilising higher-order neighbourhood information to encode rich structures in constituent trees.
  Experiments on the English and Chinese portions of the OntoNotes 5.0 benchmark show that our proposed model either beats a strong baseline or achieves new state-of-the-art performance.
- Coordinate Constructions in English Enhanced Universal Dependencies: Analysis and Computational Modeling (arXiv, 2021-03-16)
  We address the representation of coordinate constructions in Enhanced Universal Dependencies (UD).
  We create a large-scale dataset of manually edited syntax graphs.
  We identify several systematic errors in the original data and propose to also propagate adjuncts.
- GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction (arXiv, 2020-10-06)
  We introduce graph convolutional networks (GCNs) with universal dependency parses to learn language-agnostic sentence representations.
  GCNs struggle to model words with long-range dependencies or words that are not directly connected in the dependency tree.
  We propose to utilize the self-attention mechanism to learn the dependencies between words at different syntactic distances.
- Coreferential Reasoning Learning for Language Representation (arXiv, 2020-04-15)
  We present CorefBERT, a novel language representation model that can capture coreferential relations in context.
  Experimental results show that, compared with existing baseline models, CorefBERT achieves consistent and significant improvements on various downstream NLP tasks.
- Cross-Lingual Adaptation Using Universal Dependencies (arXiv, 2020-03-24)
  We show that models trained using UD parse trees for complex NLP tasks can characterize very different languages.
  Based on UD parse trees, we develop several models using tree kernels and show that these models, trained on an English dataset, can correctly classify data in other languages.
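Returning to the dependency-subtree-swapping entry above: the augmentation step it summarizes is simple enough to sketch. The toy code below is my own illustration, not the paper's implementation; it swaps a same-labeled (and assumed contiguous, i.e., projective) subtree between two independently parsed sentences, whereas the paper swaps corresponding subtrees across aligned source-target bisentences.

```python
# Toy sketch of dependency-subtree swapping for data augmentation.
# Assumptions (for brevity, not from the paper): subtrees are contiguous
# spans, and we swap within one language side instead of across aligned
# source-target bisentences.
from dataclasses import dataclass

@dataclass
class Token:
    text: str
    head: int    # index of the head token, -1 for the root
    deprel: str  # dependency relation to the head

def subtree_span(sent: list[Token], root_idx: int) -> list[int]:
    """Collect root_idx and all of its descendants' indices."""
    span, frontier = [], [root_idx]
    while frontier:
        i = frontier.pop()
        span.append(i)
        frontier.extend(j for j, t in enumerate(sent) if t.head == i)
    return sorted(span)

def swap_subtrees(s1: list[Token], s2: list[Token], deprel: str):
    """Swap the first `deprel`-headed subtree between two parsed sentences,
    returning the two augmented word sequences as strings."""
    r1 = next(i for i, t in enumerate(s1) if t.deprel == deprel)
    r2 = next(i for i, t in enumerate(s2) if t.deprel == deprel)
    span1, span2 = subtree_span(s1, r1), subtree_span(s2, r2)
    w1 = [t.text for t in s1]
    w2 = [t.text for t in s2]
    chunk1 = [w1[i] for i in span1]
    chunk2 = [w2[i] for i in span2]
    # Splice each chunk into the other sentence (relies on contiguity).
    aug1 = w1[:span1[0]] + chunk2 + w1[span1[-1] + 1:]
    aug2 = w2[:span2[0]] + chunk1 + w2[span2[-1] + 1:]
    return " ".join(aug1), " ".join(aug2)

# "She reads long novels" / "He writes short poems": swap the obj subtrees.
s1 = [Token("She", 1, "nsubj"), Token("reads", -1, "root"),
      Token("long", 3, "amod"), Token("novels", 1, "obj")]
s2 = [Token("He", 1, "nsubj"), Token("writes", -1, "root"),
      Token("short", 3, "amod"), Token("poems", 1, "obj")]
print(swap_subtrees(s1, s2, "obj"))
# -> ('She reads short poems', 'He writes long novels')
```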