Extracting Headless MWEs from Dependency Parse Trees: Parsing, Tagging, and Joint Modeling Approaches
- URL: http://arxiv.org/abs/2005.03035v1
- Date: Wed, 6 May 2020 18:00:04 GMT
- Title: Extracting Headless MWEs from Dependency Parse Trees: Parsing, Tagging, and Joint Modeling Approaches
- Authors: Tianze Shi, Lillian Lee
- Abstract summary: An interesting and frequent type of multi-word expression (MWE) is the headless MWE.
Current dependency-annotation schemes require treating such flat structures as if they had internal heads.
We empirically compare these two common strategies--parsing and tagging--for predicting flat MWEs.
- Score: 25.981620411958602
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An interesting and frequent type of multi-word expression (MWE) is the
headless MWE, for which there are no true internal syntactic dominance
relations; examples include many named entities ("Wells Fargo") and dates
("July 5, 2020") as well as certain productive constructions ("blow for blow",
"day after day"). Despite their special status and prevalence, current
dependency-annotation schemes require treating such flat structures as if they
had internal syntactic heads, and most current parsers handle them in the same
fashion as headed constructions. Meanwhile, outside the context of parsing,
taggers are typically used for identifying MWEs, but taggers might benefit from
structural information. We empirically compare these two common
strategies--parsing and tagging--for predicting flat MWEs. Additionally, we
propose an efficient joint decoding algorithm that combines scores from both
strategies. Experimental results on the MWE-Aware English Dependency Corpus and
on six non-English dependency treebanks with frequent flat structures show
that: (1) tagging is more accurate than parsing for identifying flat-structure
MWEs, (2) our joint decoder reconciles the two different views and, for
non-BERT features, leads to higher accuracies, and (3) most of the gains result
from feature sharing between the parsers and taggers.
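To make the joint strategy concrete, here is a minimal sketch under assumptions of mine rather than the paper's exact model: a tagger supplies scores for candidate MWE spans, a parser supplies arc scores, a flat span is scored as leftmost-headed (every token attaches to the span's first token, per common annotation conventions), and dynamic programming picks the best set of disjoint spans. All names and the scoring scheme are illustrative.

```python
# Hypothetical sketch of joint decoding for flat MWEs (illustrative only).
# tag_span_score[(i, j)]: tagger's score that tokens i..j-1 form one MWE.
# arc_score[(h, d)]:      parser's score for an arc from head h to dependent d.

def joint_decode(tag_span_score, arc_score, n, max_len=5):
    """Return the best-scoring set of disjoint MWE spans over n tokens."""

    def span_score(i, j):
        # Flat spans are scored as leftmost-headed: every token in the span
        # attaches to token i, so the parser contributes those arc scores.
        s = tag_span_score.get((i, j), float("-inf"))
        return s + sum(arc_score.get((i, d), 0.0) for d in range(i + 1, j))

    best = [0.0] * (n + 1)   # best[j]: best score over tokens 0..j-1
    back = [None] * (n + 1)
    for j in range(1, n + 1):
        best[j], back[j] = best[j - 1], None         # token j-1 outside any MWE
        for i in range(max(0, j - max_len), j - 1):  # span i..j-1, length >= 2
            s = best[i] + span_score(i, j)
            if s > best[j]:
                best[j], back[j] = s, i
    spans, j = [], n
    while j > 0:             # recover the chosen spans
        i = back[j]
        if i is None:
            j -= 1
        else:
            spans.append((i, j))
            j = i
    return list(reversed(spans))

# Toy usage: both models favor tokens 1..2 ("Wells Fargo") as a flat MWE.
print(joint_decode({(1, 3): 2.5}, {(1, 2): 1.0}, n=4))  # -> [(1, 3)]
```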
Related papers
- A General and Flexible Multi-concept Parsing Framework for Multilingual Semantic Matching [60.51839859852572]
We propose resolving the text into multiple concepts for multilingual semantic matching, which liberates the model from its reliance on NER models.
We conduct comprehensive experiments on English datasets QQP and MRPC, and Chinese dataset Medical-SM.
arXiv Detail & Related papers (2024-03-05T13:55:16Z)
- Unsupervised Chunking with Hierarchical RNN [62.15060807493364]
This paper introduces an unsupervised approach to chunking, a syntactic task that involves grouping words in a non-hierarchical manner.
We present a two-layer Hierarchical Recurrent Neural Network (HRNN) designed to model word-to-chunk and chunk-to-sentence compositions.
Experiments on the CoNLL-2000 dataset reveal a notable improvement over existing unsupervised methods, enhancing phrase F1 score by up to 6 percentage points.
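As a rough illustration of the word-to-chunk and chunk-to-sentence design (my reading, not the authors' code; the layer sizes and the hard segmentation step are assumptions), one layer can propose chunk boundaries over word states while a second layer composes pooled chunk vectors:

```python
import torch
import torch.nn as nn

class HierarchicalRNN(nn.Module):
    def __init__(self, vocab_size, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.word_rnn = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.boundary = nn.Linear(2 * dim, 2)     # end-of-chunk decision per token
        self.chunk_rnn = nn.LSTM(2 * dim, dim, batch_first=True)

    def forward(self, tokens):                    # tokens: (1, seq_len) word ids
        h, _ = self.word_rnn(self.emb(tokens))    # word-level states, (1, L, 2*dim)
        logits = self.boundary(h)                 # chunk-boundary logits per token
        cuts = logits.argmax(-1).squeeze(0)       # hard segmentation, for the sketch
        chunks, start = [], 0
        for t in range(tokens.size(1)):           # mean-pool words into chunk vectors
            if cuts[t] == 1 or t == tokens.size(1) - 1:
                chunks.append(h[:, start:t + 1].mean(dim=1))
                start = t + 1
        chunk_seq = torch.stack(chunks, dim=1)    # (1, n_chunks, 2*dim)
        _, (sent, _) = self.chunk_rnn(chunk_seq)  # chunk-to-sentence composition
        return logits, sent.squeeze(0)            # boundary logits, sentence vector

model = HierarchicalRNN(vocab_size=100)
logits, sent = model(torch.randint(0, 100, (1, 7)))
```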
arXiv Detail & Related papers (2023-09-10T02:55:12Z)
- Structural Ambiguity and its Disambiguation in Language Model Based Parsers: the Case of Dutch Clause Relativization [2.9950872478176627]
We study how the presence of a prior sentence can resolve relative clause ambiguities.
Results show that a neurosymbolic approach based on proof nets is more open to data bias correction than an approach based on universal dependencies.
arXiv Detail & Related papers (2023-05-24T09:04:18Z)
- Masked Part-Of-Speech Model: Does Modeling Long Context Help Unsupervised POS-tagging? [94.68962249604749]
We propose a Masked Part-of-Speech Model (MPoSM) to facilitate flexible dependency modeling.
MPoSM can model arbitrary tag dependency and perform POS induction through the objective of masked POS reconstruction.
We achieve competitive results on both the English Penn WSJ dataset and the universal treebank containing 10 diverse languages.
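The masked-reconstruction objective can be pictured with a toy snippet, hedged as my simplification of the general idea rather than MPoSM's architecture: mask one induced tag and train a predictor to recover it from the surrounding tags.

```python
import torch
import torch.nn as nn

n_tags, dim = 10, 32
tag_emb = nn.Embedding(n_tags + 1, dim)   # extra id n_tags acts as [MASK]
context = nn.LSTM(dim, dim, batch_first=True)
out = nn.Linear(dim, n_tags)

tags = torch.randint(0, n_tags, (1, 8))   # current tag assignments, 8 tokens
masked, pos = tags.clone(), 3
masked[0, pos] = n_tags                   # mask one position's tag
h, _ = context(tag_emb(masked))
loss = nn.functional.cross_entropy(out(h[:, pos]), tags[:, pos])
loss.backward()                           # gradients flow into the tag embeddings
print(float(loss))
```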
arXiv Detail & Related papers (2022-06-30T01:43:05Z)
- CPTAM: Constituency Parse Tree Aggregation Method [6.011216641982612]
This paper adopts the truth discovery idea to aggregate constituency parse trees from different parsers.
We formulate the constituency parse tree aggregation problem in two steps, structure aggregation and constituent label aggregation.
Experiments are conducted on benchmark datasets in different languages and domains.
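A hedged sketch of the two-step idea as described, with simple majority voting standing in for the paper's truth-discovery estimator (which would also weight parsers by estimated reliability):

```python
from collections import Counter

def aggregate(trees):
    """trees: list of {(start, end): label} span dicts, one per parser."""
    span_votes = Counter(s for t in trees for s in t)
    kept = {s for s, v in span_votes.items() if v > len(trees) / 2}  # step 1: structure
    labels = {}
    for s in kept:                                                   # step 2: labels
        labels[s] = Counter(t[s] for t in trees if s in t).most_common(1)[0][0]
    return labels

parses = [{(0, 2): "NP", (2, 5): "VP"},
          {(0, 2): "NP", (2, 5): "PP"},
          {(0, 2): "NP", (3, 5): "VP"}]
print(aggregate(parses))   # majority spans with their majority labels
```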
arXiv Detail & Related papers (2022-01-19T23:05:37Z)
- Compositional Generalization in Dependency Parsing [15.953482168182003]
Dependency parsing, however, lacks a compositional parsing benchmark.
We find that increasing compound divergence degrades dependency performance, although not as dramatically as semantic parsing performance.
We identify a number of syntactic structures that drive the dependency parser's lower performance on the most challenging splits.
arXiv Detail & Related papers (2021-10-13T16:32:24Z)
- More Than Words: Collocation Tokenization for Latent Dirichlet Allocation Models [71.42030830910227]
We propose a new metric for measuring the clustering quality in settings where the models differ.
We show that topics trained with merged tokens result in topic keys that are clearer, more coherent, and more effective at distinguishing topics than those of unmerged models.
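The token-merging step can be illustrated with a small PMI-based sketch; the threshold, the greedy left-to-right merging, and PMI itself are my stand-ins for whatever association measure the paper actually uses:

```python
import math
from collections import Counter

def merge_collocations(docs, threshold=1.0):
    """Greedily merge adjacent word pairs whose PMI exceeds a threshold."""
    uni = Counter(w for d in docs for w in d)
    bi = Counter(p for d in docs for p in zip(d, d[1:]))
    n = sum(uni.values())

    def pmi(a, b):
        return math.log(bi[(a, b)] * n / (uni[a] * uni[b]))

    merged_docs = []
    for d in docs:
        out, i = [], 0
        while i < len(d):
            if i + 1 < len(d) and pmi(d[i], d[i + 1]) > threshold:
                out.append(d[i] + "_" + d[i + 1])   # e.g. "wells_fargo" as one token
                i += 2
            else:
                out.append(d[i])
                i += 1
        merged_docs.append(out)
    return merged_docs

docs = [["wells", "fargo", "rose"], ["wells", "fargo", "fell"],
        ["stocks", "rose"], ["stocks", "fell"]]
print(merge_collocations(docs))   # only the recurring pair gets merged
```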
arXiv Detail & Related papers (2021-08-24T14:08:19Z)
- Linguistic dependencies and statistical dependence [76.89273585568084]
We use pretrained language models to estimate probabilities of words in context.
We find that maximum-CPMI trees correspond to linguistic dependencies more often than trees extracted from non-contextual PMI estimates.
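A rough sketch of the extraction step, with the details assumed: treat pairwise (contextual) PMI estimates as edge weights and pull out the maximum spanning tree over the sentence.

```python
import numpy as np

def max_spanning_tree(score):
    """Prim's algorithm for the maximum spanning tree of a symmetric matrix."""
    n = score.shape[0]
    in_tree, edges = {0}, []
    while len(in_tree) < n:
        i, j = max(((a, b) for a in in_tree for b in range(n) if b not in in_tree),
                   key=lambda e: score[e])
        edges.append((i, j))
        in_tree.add(j)
    return edges

rng = np.random.default_rng(0)
m = rng.random((4, 4))
m = (m + m.T) / 2                 # symmetric stand-in for pairwise CPMI scores
print(max_spanning_tree(m))       # three edges linking the four "words"
```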
arXiv Detail & Related papers (2021-04-18T02:43:37Z)
- Multidirectional Associative Optimization of Function-Specific Word Representations [86.87082468226387]
We present a neural framework for learning associations between interrelated groups of words.
Our model induces a joint function-specific word vector space, where vectors of e.g. plausible SVO compositions lie close together.
The model retains information about word group membership even in the joint space and can thereby be applied effectively to a number of tasks that reason over SVO structure.
arXiv Detail & Related papers (2020-05-11T17:07:20Z)
- Is POS Tagging Necessary or Even Helpful for Neural Dependency Parsing? [22.93722845643562]
We show that POS tagging can still significantly improve parsing performance when tagging and parsing are modeled within a joint framework.
Considering that it is much cheaper to annotate POS tags than parse trees, we also investigate the utilization of large-scale heterogeneous POS tag data.
arXiv Detail & Related papers (2020-03-06T13:47:30Z)
- Discontinuous Constituent Parsing with Pointer Networks [0.34376560669160383]
Discontinuous constituent trees are crucial for representing all grammatical phenomena of languages such as German.
Recent advances in dependency parsing have shown that Pointer Networks excel in efficiently parsing syntactic relations between words in a sentence.
We propose a novel neural network architecture that is able to generate the most accurate discontinuous constituent representations.
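A minimal sketch of the pointer mechanism (my simplification, not the proposed architecture): each word attends over all encoder positions and "points" at one, and because the pointed-to position need not be adjacent, discontinuous attachments come for free.

```python
import torch
import torch.nn as nn

dim, seq_len = 32, 6
enc = torch.randn(1, seq_len, dim)          # encoder states for a 6-word sentence
query = nn.Linear(dim, dim)
scores = query(enc) @ enc.transpose(1, 2)   # (1, seq_len, seq_len) attention scores
heads = scores.argmax(-1)                   # each word "points" at one position
print(heads)                                # pointed-to positions, possibly non-adjacent
```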
arXiv Detail & Related papers (2020-02-05T15:12:03Z)