CGELBank Annotation Manual v1.1
- URL: http://arxiv.org/abs/2305.17347v2
- Date: Wed, 5 Jun 2024 02:14:37 GMT
- Title: CGELBank Annotation Manual v1.1
- Authors: Brett Reynolds, Nathan Schneider, Aryaman Arora
- Abstract summary: CGELBank is a treebank and associated tools based on a syntactic formalism for English derived from the Cambridge Grammar of the English Language.
This document lays out the particularities of the CGELBank annotation scheme.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Related papers
- Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions [53.069446715005924]
Graph-based captioning (GBC) describes an image using a labelled graph structure.
Nodes in GBC are created using, in a first stage, object detection and dense captioning tools.
We show that using GBC nodes' annotations results in a significant performance boost on downstream models.
arXiv Detail & Related papers (2024-07-09T09:55:04Z)
- CultureBank: An Online Community-Driven Knowledge Base Towards Culturally Aware Language Technologies [53.2331634010413]
CultureBank is a knowledge base built upon users' self-narratives.
It contains 12K cultural descriptors sourced from TikTok and 11K from Reddit.
We offer recommendations for future culturally aware language technologies.
arXiv Detail & Related papers (2024-04-23T17:16:08Z)
- MaiBaam: A Multi-Dialectal Bavarian Universal Dependency Treebank [56.810282574817414]
We present the first multi-dialect Bavarian treebank (MaiBaam), manually annotated with part-of-speech and syntactic dependency information in Universal Dependencies (UD).
We highlight the morphosyntactic differences between the closely related Bavarian and German and showcase the rich variability of speakers' orthographies.
Our corpus includes 15k tokens, covering dialects from all Bavarian-speaking areas spanning three countries.
arXiv Detail & Related papers (2024-03-15T13:33:10Z)
- Is Japanese CCGBank empirically correct? A case study of passive and causative constructions [18.021287677546958]
We focus on the analysis of passive/causative constructions in the Japanese CCGBank.
We show that, together with the compositional semantics of ccg2lambda, a semantic parsing system, it yields empirically wrong predictions for the nested construction of passives and causatives.
arXiv Detail & Related papers (2023-02-28T16:19:24Z)
- CGELBank: CGEL as a Framework for English Syntax Annotation [11.042037758273226]
We introduce the syntactic formalism of the Cambridge Grammar of the English Language (CGEL) to the world of treebanking through the CGELBank project.
We discuss some issues in linguistic analysis that arose in adapting the formalism to corpus annotation, followed by quantitative and qualitative comparisons with parallel UD and PTB treebanks.
arXiv Detail & Related papers (2022-10-01T23:44:06Z)
- Incorporating Constituent Syntax for Coreference Resolution [50.71868417008133]
We propose a graph-based method to incorporate constituent syntactic structures.
We also explore utilising higher-order neighbourhood information to encode rich structures in constituent trees.
Experiments on the English and Chinese portions of the OntoNotes 5.0 benchmark show that our proposed model either beats a strong baseline or achieves new state-of-the-art performance.
arXiv Detail & Related papers (2022-02-22T07:40:42Z)
- BBC-Oxford British Sign Language Dataset [64.32108826673183]
We introduce the BBC-Oxford British Sign Language (BOBSL) dataset, a large-scale video collection of British Sign Language (BSL).
We describe the motivation for the dataset, together with statistics and available annotations.
We conduct experiments to provide baselines for the tasks of sign recognition, sign language alignment, and sign language translation.
arXiv Detail & Related papers (2021-11-05T17:35:58Z)
- Apurinã Universal Dependencies Treebank [0.4893345190925178]
This paper presents and discusses the first Universal Dependencies treebank for the Apurinã language.
The treebank contains 76 fully annotated sentences, applies 14 parts-of-speech, as well as seven augmented or new features.
arXiv Detail & Related papers (2021-06-07T07:42:00Z)
- Treebanking User-Generated Content: a UD Based Overview of Guidelines, Corpora and Unified Recommendations [58.50167394354305]
This article presents a discussion on the main linguistic phenomena which cause difficulties in the analysis of user-generated texts found on the web and in social media.
It proposes a set of tentative UD-based annotation guidelines to promote consistent treatment of the particular phenomena found in these types of texts.
arXiv Detail & Related papers (2020-11-03T23:34:42Z)
- Establishing a New State-of-the-Art for French Named Entity Recognition [0.0]
The French TreeBank is the main source of morphosyntactic and syntactic annotations for French.
It does not include explicit information related to named entities, which is among the most useful information for several natural language processing tasks and applications.
We have manually annotated the French TreeBank with such information, after an automatic pre-annotation step.
arXiv Detail & Related papers (2020-05-27T08:44:09Z)
- Resources for Turkish Dependency Parsing: Introducing the BOUN Treebank and the BoAT Annotation Tool [0.0]
We introduce the resources that we developed for Turkish dependency parsing, which include a novel manually annotated treebank (BOUN Treebank).
Decisions regarding the annotation of the BOUN Treebank were made in line with the Universal Dependencies (UD) framework.
We report the results of a state-of-the-art dependency annotation obtained over the BOUN Treebank as well as two other treebanks in Turkish.
arXiv Detail & Related papers (2020-02-24T17:59:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.