K-UD: Revising Korean Universal Dependencies Guidelines
- URL: http://arxiv.org/abs/2412.00856v1
- Date: Sun, 01 Dec 2024 15:41:05 GMT
- Title: K-UD: Revising Korean Universal Dependencies Guidelines
- Authors: Kyuwon Kim, Yige Chen, Eunkyul Leah Jo, KyungTae Lim, Jungyeul Park, Chulwoo Park
- Abstract summary: We aim to refine the definition of syntactic dependency of UDs within the context of analyzing the Korean language.
Our aim is not only to achieve a consensus within UDs but also to garner agreement beyond the UD framework for analyzing Korean sentences using dependency structure.
- Score: 6.292929354303524
- License:
- Abstract: Critique has surfaced concerning the existing linguistic annotation framework for Korean Universal Dependencies (UDs), particularly in relation to syntactic relationships. In this paper, our primary objective is to refine the definition of syntactic dependency of UDs within the context of analyzing the Korean language. Our aim is not only to achieve a consensus within UDs but also to garner agreement beyond the UD framework for analyzing Korean sentences using dependency structure, by establishing a linguistic consensus model.
Related papers
- Does Incomplete Syntax Influence Korean Language Model? Focusing on Word Order and Case Markers [7.275938266030414]
Syntactic elements, such as word order and case markers, are fundamental in natural language processing, and Korean allows flexible word order along with frequent omission of case markers.
This study explores whether Korean language models can accurately capture this flexibility.
arXiv Detail & Related papers (2024-07-12T11:33:41Z)
- A Compositional Typed Semantics for Universal Dependencies [26.65442947858347]
We introduce UD Type Calculus, a compositional, principled, and language-independent system of semantic types and logical forms for lexical items.
We explain the essential features of UD Type Calculus, which all involve giving dependency relations denotations just like those of words.
We present results on a large existing corpus of sentences and their logical forms, showing that UD-TC can produce meanings comparable with our baseline.
arXiv Detail & Related papers (2024-03-02T11:58:24Z)
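The UD Type Calculus entry above centers on giving dependency relations denotations of their own. As a toy illustration of that idea only (the tiny lexicon, the `nsubj` combinator, and the composition function below are invented for this sketch and are not the paper's formal system):

```python
# Toy illustration: both words and dependency relations receive denotations.
# The lexicon and relation semantics are invented for this sketch; they do
# not reproduce the UD Type Calculus itself.

# Word denotations: entities as strings, predicates as functions.
lexicon = {
    "Kim": "kim",
    "sleeps": lambda subject: f"sleep({subject})",
}

# Dependency-relation denotations: each relation states how the head's
# denotation combines with the dependent's denotation.
relations = {
    "nsubj": lambda head, dep: head(dep),  # apply the predicate to its subject
}

def compose(head_word, rel, dep_word):
    """Combine head and dependent via the denotation of their relation."""
    return relations[rel](lexicon[head_word], lexicon[dep_word])

# "Kim sleeps": sleeps --nsubj--> Kim
print(compose("sleeps", "nsubj", "Kim"))  # sleep(kim)
```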
- Natural Language Decompositions of Implicit Content Enable Better Text Representations [56.85319224208865]
We introduce a method for the analysis of text that takes implicitly communicated content explicitly into account.
We use a large language model to produce sets of propositions that are inferentially related to the text that has been observed.
Our results suggest that modeling the meanings behind observed language, rather than the literal text alone, is a valuable direction for NLP.
arXiv Detail & Related papers (2023-05-23T23:45:20Z)
- We're Afraid Language Models Aren't Modeling Ambiguity [136.8068419824318]
Managing ambiguity is a key part of human language understanding.
We characterize ambiguity in a sentence by its effect on entailment relations with another sentence.
We show that a multilabel NLI model can flag political claims in the wild that are misleading due to ambiguity.
arXiv Detail & Related papers (2023-04-27T17:57:58Z)
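The entry above characterizes ambiguity through the entailment relations a sentence can enter into, flagged by a multilabel NLI model. A minimal sketch of that flagging step, assuming per-label probabilities are already available (the threshold and the scores below are placeholders, not output of any released model):

```python
# Minimal sketch: flag a premise-hypothesis pair as ambiguous when a
# multilabel NLI model assigns high probability to more than one relation.
# The probabilities below are placeholders, not output of any real model.

THRESHOLD = 0.5

def plausible_relations(label_probs, threshold=THRESHOLD):
    """Return every entailment relation whose probability clears the threshold."""
    return [label for label, p in label_probs.items() if p >= threshold]

def is_ambiguous(label_probs):
    """A pair is flagged as ambiguous if more than one relation is plausible."""
    return len(plausible_relations(label_probs)) > 1

# Under one reading the claim entails the hypothesis; under another it
# contradicts it, so both labels clear the threshold.
scores = {"entailment": 0.71, "neutral": 0.22, "contradiction": 0.64}
print(plausible_relations(scores))  # ['entailment', 'contradiction']
print(is_ambiguous(scores))         # True
```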
- Discourse Analysis via Questions and Answers: Parsing Dependency Structures of Questions Under Discussion [57.43781399856913]
This work adopts the linguistic framework of Questions Under Discussion (QUD) for discourse analysis.
We characterize relationships between sentences as free-form questions, in contrast to exhaustive fine-grained questions.
We develop a first-of-its-kind QUD parser that derives a dependency structure of questions over full documents.
arXiv Detail & Related papers (2022-10-12T03:53:12Z)
- Yet Another Format of Universal Dependencies for Korean [4.909210276089872]
The proposed morpheme-based format (morphUD) yields improved parsing results for all Korean UD treebanks.
We develop scripts that automatically convert between the original Universal Dependencies format and the proposed morpheme-based format.
arXiv Detail & Related papers (2022-09-20T14:21:00Z)
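The conversion scripts described in the entry above are not reproduced here; the sketch below only illustrates the general shape of a word-level to morpheme-level CoNLL-U conversion, assuming '+'-joined morpheme segmentations in the LEMMA column and a placeholder attachment scheme for non-initial morphemes.

```python
# Sketch of a word-level to morpheme-level CoNLL-U conversion. The example
# segmentation ('+'-joined morphemes in the LEMMA column) and the attachment
# of non-initial morphemes are placeholders, not the actual morphUD scripts.

WORD_LEVEL_CONLLU = """\
1\t이것은\t이것+은\tPRON\t_\t_\t2\tnsubj\t_\t_
2\t문장이다\t문장+이다\tNOUN\t_\t_\t0\troot\t_\t_
"""

def to_morpheme_rows(conllu_text):
    """Split each word-level token into one CoNLL-U row per morpheme.

    The first morpheme keeps the word's POS, head, and relation; remaining
    morphemes attach to it with a generic 'dep' label (a placeholder choice).
    """
    rows = [line.split("\t") for line in conllu_text.splitlines() if line.strip()]

    # Map word-level IDs to the ID of their first morpheme in the new numbering.
    old_to_new, next_id = {"0": "0"}, 1
    for cols in rows:
        old_to_new[cols[0]] = str(next_id)
        next_id += len(cols[2].split("+"))

    out, new_id = [], 0
    for cols in rows:
        lemma, upos, head, deprel = cols[2], cols[3], cols[6], cols[7]
        first_id = None
        for i, morpheme in enumerate(lemma.split("+")):
            new_id += 1
            if i == 0:
                first_id = new_id
                out.append([str(new_id), morpheme, morpheme, upos, "_", "_",
                            old_to_new[head], deprel, "_", "_"])
            else:
                out.append([str(new_id), morpheme, morpheme, "_", "_", "_",
                            str(first_id), "dep", "_", "_"])
    return "\n".join("\t".join(r) for r in out)

print(to_morpheme_rows(WORD_LEVEL_CONLLU))
```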
- CUGE: A Chinese Language Understanding and Generation Evaluation Benchmark [144.05723617401674]
General-purpose language intelligence evaluation has been a longstanding goal for natural language processing.
We argue that for general-purpose language intelligence evaluation, the benchmark itself needs to be comprehensive and systematic.
We propose CUGE, a Chinese Language Understanding and Generation Evaluation benchmark designed to be comprehensive and systematic.
arXiv Detail & Related papers (2021-12-27T11:08:58Z)
- Cross-linguistically Consistent Semantic and Syntactic Annotation of Child-directed Speech [27.657676278734534]
This paper proposes a methodology for constructing corpora of child-directed speech paired with sentential logical forms.
The approach enforces a cross-linguistically consistent representation, building on recent advances in dependency representation and semantic parsing.
arXiv Detail & Related papers (2021-09-22T18:17:06Z)
- Treebanking User-Generated Content: a UD Based Overview of Guidelines, Corpora and Unified Recommendations [58.50167394354305]
This article presents a discussion on the main linguistic phenomena which cause difficulties in the analysis of user-generated texts found on the web and in social media.
It proposes a set of tentative UD-based annotation guidelines to promote consistent treatment of the particular phenomena found in these types of texts.
arXiv Detail & Related papers (2020-11-03T23:34:42Z)
- GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction [107.8262586956778]
We introduce graph convolutional networks (GCNs) with universal dependency parses to learn language-agnostic sentence representations.
GCNs struggle to model words with long-range dependencies or words that are not directly connected in the dependency tree.
We propose to utilize the self-attention mechanism to learn the dependencies between words with different syntactic distances.
arXiv Detail & Related papers (2020-10-06T20:30:35Z)
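A minimal sketch of the distance-aware self-attention idea in the entry above: attention logits are penalized in proportion to pairwise dependency-tree distance. The additive bias form, the penalty weight, and the toy distances are assumptions for illustration, not the GATE architecture itself.

```python
# Minimal sketch: bias self-attention scores by pairwise dependency-tree
# distance so that syntactically close words attend to each other more.
# The bias form and toy values are assumptions, not the GATE model.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def distance_biased_attention(queries, keys, values, tree_distance, alpha=1.0):
    """Scaled dot-product attention with a penalty proportional to the
    syntactic distance between word i and word j."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)   # (n, n) attention logits
    scores = scores - alpha * tree_distance  # penalize syntactically distant words
    return softmax(scores, axis=-1) @ values # (n, d) contextualized vectors

rng = np.random.default_rng(0)
n, d = 4, 8                  # four words, toy hidden size
x = rng.normal(size=(n, d))  # stand-in word representations
# Pairwise distances in a toy dependency tree: 0 on the diagonal, small
# values for head-dependent pairs, larger values for distant words.
tree_distance = np.array([[0, 1, 2, 3],
                          [1, 0, 1, 2],
                          [2, 1, 0, 1],
                          [3, 2, 1, 0]], dtype=float)
print(distance_biased_attention(x, x, x, tree_distance).shape)  # (4, 8)
```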
- Analysis of the Penn Korean Universal Dependency Treebank (PKT-UD): Manual Revision to Build Robust Parsing Model in Korean [15.899449418195106]
We first raise important issues regarding the Penn Korean Universal Dependency Treebank (PKT-UD).
We address these issues by revising the entire corpus manually with the aim of producing cleaner UD annotations.
For compatibility with the rest of the UD corpora, we extensively revise the part-of-speech tags and the dependency relations.
arXiv Detail & Related papers (2020-05-26T17:46:46Z)