The DCU-EPFL Enhanced Dependency Parser at the IWPT 2021 Shared Task
- URL: http://arxiv.org/abs/2107.01982v1
- Date: Mon, 5 Jul 2021 12:42:59 GMT
- Title: The DCU-EPFL Enhanced Dependency Parser at the IWPT 2021 Shared Task
- Authors: James Barry, Alireza Mohammadshahi, Joachim Wagner, Jennifer Foster,
James Henderson
- Abstract summary: We describe the DCU-EPFL submission to the IWPT 2021 Shared Task on Parsing into Enhanced Universal Dependencies.
The task involves parsing Enhanced UD graphs, which are an extension of the basic dependency trees designed to be more facilitative towards representing semantic structure.
Evaluation is carried out on 29 treebanks in 17 languages and participants are required to parse the data from each language starting from raw strings.
- Score: 19.98425994656106
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We describe the DCU-EPFL submission to the IWPT 2021 Shared Task on Parsing
into Enhanced Universal Dependencies. The task involves parsing Enhanced UD
graphs, which are an extension of the basic dependency trees designed to be
more facilitative towards representing semantic structure. Evaluation is
carried out on 29 treebanks in 17 languages and participants are required to
parse the data from each language starting from raw strings. Our approach uses
the Stanza pipeline to preprocess the text files, XLM-RoBERTa to obtain
contextualized token representations, and an edge-scoring and labeling model to
predict the enhanced graph. Finally, we run a post-processing script to ensure
all of our outputs are valid Enhanced UD graphs. Our system places 6th out of 9
participants with a coarse Enhanced Labeled Attachment Score (ELAS) of 83.57.
We carry out additional post-deadline experiments which include using Trankit
for pre-processing, XLM-RoBERTa-LARGE, treebank concatenation, and multitask
learning between a basic and an enhanced dependency parser. All of these
modifications improve our initial score and our final system has a coarse ELAS
of 88.04.
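Below is a minimal sketch, in Python, of the kind of pipeline the abstract describes: Stanza to preprocess raw text, XLM-RoBERTa to obtain contextualized token representations, and a biaffine-style edge scorer over token pairs from which enhanced-graph edges can be read off. The subword-to-word pooling, the scorer dimensions, and the sigmoid thresholding are illustrative assumptions rather than the authors' exact implementation; labeling and the validity post-processing step are omitted.

```python
# Hedged sketch of the pipeline from the abstract: Stanza preprocessing,
# XLM-RoBERTa encoding, and a simple biaffine edge scorer. The pooling strategy
# and scorer dimensions are assumptions, not the authors' exact model.
import stanza
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

# 1) Preprocess raw text with Stanza (tokenization, multiword expansion, POS, lemmas).
stanza.download("en")
nlp = stanza.Pipeline("en", processors="tokenize,mwt,pos,lemma")
doc = nlp("The task involves parsing Enhanced UD graphs.")
words = [w.text for sent in doc.sentences for w in sent.words]

# 2) Contextualized token representations from XLM-RoBERTa.
tok = AutoTokenizer.from_pretrained("xlm-roberta-base")
enc = tok(words, is_split_into_words=True, return_tensors="pt")
with torch.no_grad():
    hidden = AutoModel.from_pretrained("xlm-roberta-base")(**enc).last_hidden_state

# Pool subword vectors back to one vector per word (mean over word pieces;
# an assumption -- first-subword pooling is another common choice).
word_ids = enc.word_ids(0)
pooled = torch.stack([
    hidden[0, [i for i, w in enumerate(word_ids) if w == k]].mean(dim=0)
    for k in range(len(words))
])  # shape: (num_words, hidden_size)

# 3) A minimal biaffine edge scorer: one score per (head, dependent) pair.
class BiaffineEdgeScorer(nn.Module):
    def __init__(self, hidden_size: int, arc_dim: int = 256):
        super().__init__()
        self.head = nn.Linear(hidden_size, arc_dim)
        self.dep = nn.Linear(hidden_size, arc_dim)
        self.bilinear = nn.Parameter(torch.randn(arc_dim, arc_dim) * 0.01)
        self.bias = nn.Linear(arc_dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, d = self.head(x), self.dep(x)   # (num_words, arc_dim) each
        scores = h @ self.bilinear @ d.T   # bilinear head-dependent term
        return scores + self.bias(h)       # (num_words, num_words) score matrix

scorer = BiaffineEdgeScorer(pooled.size(-1))
edge_scores = scorer(pooled)
# In the enhanced (graph) setting, any pair whose sigmoid score clears a
# threshold can be kept as an edge, so a word may receive multiple heads;
# a separate labeler and a validity check would follow in a full system.
print(edge_scores.shape)  # (num_words, num_words)
```

In a trained system the edge scores would typically be optimized with a binary cross-entropy objective per word pair, and the post-processing step mentioned in the abstract would repair any output that is not a valid Enhanced UD graph (for example, nodes unreachable from the root).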
Related papers
- A Tale of Two Languages: Large-Vocabulary Continuous Sign Language Recognition from Spoken Language Supervision [74.972172804514]
We introduce a multi-task Transformer model, CSLR2, that is able to ingest a signing sequence and output in a joint embedding space between signed language and spoken language text.
New dataset annotations provide continuous sign-level annotations for six hours of test videos, and will be made publicly available.
Our model significantly outperforms the previous state of the art on both tasks.
arXiv Detail & Related papers (2024-05-16T17:19:06Z) - Leveraging Linguistically Enhanced Embeddings for Open Information Extraction [0.0]
Open Information Extraction (OIE) is a structured prediction task in Natural Language Processing (NLP).
We are the first to leverage linguistic features with a Seq2Seq PLM for OIE.
Our work can give any neural OIE architecture the key performance boost from both PLMs and linguistic features in one go.
arXiv Detail & Related papers (2024-03-20T18:18:48Z) - Substructure Distribution Projection for Zero-Shot Cross-Lingual
Dependency Parsing [55.69800855705232]
SubDP is a technique that projects a distribution over structures in one domain to another, by projecting substructure distributions separately.
We evaluate SubDP on zero-shot cross-lingual dependency parsing, taking dependency arcs as substructures.
arXiv Detail & Related papers (2021-10-16T10:12:28Z) - TGIF: Tree-Graph Integrated-Format Parser for Enhanced UD with Two-Stage
Generic- to Individual-Language Finetuning [18.71574180551552]
We present our contribution to the IWPT 2021 shared task on parsing into enhanced Universal Dependencies.
Our main system component is a hybrid tree-graph parser that integrates predictions of spanning trees for the enhanced graphs with additional graph edges not present in the spanning trees.
arXiv Detail & Related papers (2021-07-14T18:00:08Z) - Coordinate Constructions in English Enhanced Universal Dependencies:
Analysis and Computational Modeling [1.9950682531209154]
We address the representation of coordinate constructions in Enhanced Universal Dependencies (UD).
We create a large-scale dataset of manually edited syntax graphs.
We identify several systematic errors in the original data, and propose to also propagate adjuncts.
arXiv Detail & Related papers (2021-03-16T10:24:27Z) - Constructing Taxonomies from Pretrained Language Models [52.53846972667636]
We present a method for constructing taxonomic trees (e.g., WordNet) using pretrained language models.
Our approach is composed of two modules, one that predicts parenthood relations and another that reconciles those predictions into trees.
We train our model on subtrees sampled from WordNet, and test on non-overlapping WordNet subtrees.
arXiv Detail & Related papers (2020-10-24T07:16:21Z) - Automatic Extraction of Rules Governing Morphological Agreement [103.78033184221373]
We develop an automated framework for extracting a first-pass grammatical specification from raw text.
We focus on extracting rules describing agreement, a morphosyntactic phenomenon at the core of the grammars of many of the world's languages.
We apply our framework to all languages included in the Universal Dependencies project, with promising results.
arXiv Detail & Related papers (2020-10-02T18:31:45Z) - The ADAPT Enhanced Dependency Parser at the IWPT 2020 Shared Task [12.226699055857182]
We describe the ADAPT system for the 2020 IWPT Shared Task on parsing enhanced Universal Dependencies in 17 languages.
We implement a pipeline approach using UDPipe and UDPipe-future to provide initial levels of annotation.
For the majority of languages, a semantic dependency parser can be successfully applied to the task of parsing enhanced dependencies.
arXiv Detail & Related papers (2020-09-03T14:43:04Z) - Køpsala: Transition-Based Graph Parsing via Efficient Training and
Effective Encoding [13.490365811869719]
We present Køpsala, the Copenhagen-Uppsala system for the Enhanced Universal Dependencies Shared Task at IWPT 2020.
Our system is a pipeline consisting of off-the-shelf models for everything but enhanced parsing, and for the latter, a transition-based graph parser adapted from Che et al.
Our results demonstrate that a unified pipeline is effective for both Meaning Representation Parsing and Enhanced Universal Dependencies, according to average ELAS.
arXiv Detail & Related papers (2020-05-25T13:17:09Z) - pyBART: Evidence-based Syntactic Transformations for IE [52.93947844555369]
We present pyBART, an easy-to-use open-source Python library for converting English UD trees to Enhanced UD graphs or to our representation.
When evaluated in a pattern-based relation extraction scenario, our representation results in higher extraction scores than Enhanced UD, while requiring fewer patterns.
arXiv Detail & Related papers (2020-05-04T07:38:34Z) - Towards Instance-Level Parser Selection for Cross-Lingual Transfer of
Dependency Parsers [59.345145623931636]
We argue for a novel cross-lingual transfer paradigm: instance-level parser selection (ILPS).
We present a proof-of-concept study focused on instance-level selection in the framework of delexicalized transfer.
arXiv Detail & Related papers (2020-04-16T13:18:55Z)