Discourse Representation Structure Parsing for Chinese
- URL: http://arxiv.org/abs/2306.09725v1
- Date: Fri, 16 Jun 2023 09:47:45 GMT
- Title: Discourse Representation Structure Parsing for Chinese
- Authors: Chunliu Wang, Xiao Zhang, Johan Bos
- Abstract summary: We explore the feasibility of Chinese semantic parsing in the absence of labeled data for Chinese meaning representations.
We propose a test suite designed explicitly for Chinese semantic parsing, which provides fine-grained evaluation for parsing performance.
Our experimental results show that the difficulty of Chinese semantic parsing is mainly caused by adverbs.
- Score: 8.846860617823005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous work has predominantly focused on monolingual English semantic
parsing. We, instead, explore the feasibility of Chinese semantic parsing in
the absence of labeled data for Chinese meaning representations. We describe
the pipeline of automatically collecting the linearized Chinese meaning
representation data for sequence-to-sequence neural networks. We further
propose a test suite designed explicitly for Chinese semantic parsing, which
provides fine-grained evaluation for parsing performance, where we aim to study
Chinese parsing difficulties. Our experimental results show that the difficulty
of Chinese semantic parsing is mainly caused by adverbs. Realizing Chinese
parsing through machine translation and an English parser yields slightly lower
performance than training a model directly on Chinese data.
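To make the sequence-to-sequence setup concrete, here is a minimal sketch of what linearizing a clausal meaning representation into a token sequence might look like; the clause inventory, the toy example, and the separator token are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch: flattening a clausal DRS so a seq2seq model can
# treat semantic parsing as string transduction. The variable-clause
# notation loosely follows Parallel Meaning Bank style; the "|||"
# separator is an assumption chosen for illustration.

def linearize_drs(clauses):
    """Flatten a list of DRS clauses into one token sequence.

    Each clause is a tuple like ("b1", "REF", "x1") or
    ("b1", "Agent", "e1", "x1"); clauses are joined with a separator
    token so clause boundaries can be recovered from model output.
    """
    return " ||| ".join(" ".join(clause) for clause in clauses)

def delinearize_drs(sequence):
    """Invert linearize_drs: split on the separator, then on spaces."""
    return [tuple(chunk.split()) for chunk in sequence.split(" ||| ")]

# Toy meaning representation for "a woman smiles".
drs = [
    ("b1", "REF", "x1"),
    ("b1", "woman", "n.02", "x1"),
    ("b1", "REF", "e1"),
    ("b1", "smile", "v.01", "e1"),
    ("b1", "Agent", "e1", "x1"),
]
assert delinearize_drs(linearize_drs(drs)) == drs
```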
Related papers
- Is Argument Structure of Learner Chinese Understandable: A Corpus-Based Analysis [8.883799596036484]
This paper presents a corpus-based analysis of argument structure errors in learner Chinese.
The data for analysis includes sentences produced by language learners as well as their corrections by native speakers.
We couple the data with semantic role labeling annotations that are manually created by two senior students.
arXiv Detail & Related papers (2023-08-17T21:10:04Z)
- Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing [68.47787275021567]
Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data.
We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between latent variables using Optimal Transport.
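As a generic illustration of the Optimal Transport idea (not the paper's exact objective), an entropic-regularized OT cost between two sets of latent vectors can be computed with Sinkhorn iterations; the shapes, marginals, and hyperparameters below are assumptions.

```python
import numpy as np

def sinkhorn_cost(X, Y, epsilon=0.05, n_iters=200):
    """Entropic-regularized optimal transport cost between point clouds.

    X: (n, d) latent vectors for one language, Y: (m, d) for another.
    Returns <P, C>, the cost of the (approximate) optimal plan P.
    """
    # Pairwise squared Euclidean cost matrix, normalized for stability.
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    C = C / C.max()
    K = np.exp(-C / epsilon)              # Gibbs kernel
    a = np.full(len(X), 1.0 / len(X))     # uniform source marginal
    b = np.full(len(Y), 1.0 / len(Y))     # uniform target marginal
    u = np.ones_like(a)
    for _ in range(n_iters):              # Sinkhorn fixed-point updates
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]       # transport plan
    return float((P * C).sum())

# Toy usage with random stand-ins for English and Chinese latents.
rng = np.random.default_rng(0)
print(sinkhorn_cost(rng.normal(size=(8, 16)), rng.normal(size=(10, 16))))
```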
arXiv Detail & Related papers (2023-07-09T04:52:31Z)
- A New Dataset and Empirical Study for Sentence Simplification in Chinese [50.0624778757462]
This paper introduces CSS, a new dataset for assessing sentence simplification in Chinese.
We collect manual simplifications from human annotators and perform data analysis to show the difference between English and Chinese sentence simplifications.
In the end, we explore whether Large Language Models can serve as high-quality Chinese sentence simplification systems by evaluating them on CSS.
arXiv Detail & Related papers (2023-06-07T06:47:34Z)
- Joint Chinese Word Segmentation and Span-based Constituency Parsing [11.080040070201608]
This work proposes a method for joint Chinese word segmentation and span-based constituency parsing by adding extra labels to individual Chinese characters on the parse trees.
Through experiments, the proposed algorithm outperforms the recent models for joint segmentation and constituency parsing on CTB 5.1.
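As a hedged sketch of the general idea, one way to expose segmentation to a character-level parser is to rewrite each word into per-character leaves with position-augmented tags; the B-/I-/E- labeling below is an assumed stand-in, not necessarily the paper's exact scheme.

```python
# Illustrative only: attach word-boundary information to characters so
# that a character-level constituency parser can recover segmentation.

def word_to_char_nodes(word, pos_tag):
    """Split a word into (label, char) leaves.

    A one-character word keeps its tag unchanged; otherwise characters
    carry B-/I-/E- prefixes so spans can be stitched back into words.
    """
    if len(word) == 1:
        return [(pos_tag, word)]
    nodes = [(f"B-{pos_tag}", word[0])]
    nodes += [(f"I-{pos_tag}", ch) for ch in word[1:-1]]
    nodes.append((f"E-{pos_tag}", word[-1]))
    return nodes

# "我 喜欢 北京" segmented as pronoun / verb / proper noun:
for word, tag in [("我", "PN"), ("喜欢", "VV"), ("北京", "NR")]:
    print(word_to_char_nodes(word, tag))
```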
arXiv Detail & Related papers (2022-11-03T08:19:00Z)
- Improving Chinese Story Generation via Awareness of Syntactic Dependencies and Semantics [17.04903530992664]
We present a new generation framework that enhances the feature mechanism by informing the generation model of dependencies between words.
We conduct a range of experiments, and the results demonstrate that our framework outperforms the state-of-the-art Chinese generation models on all evaluation metrics.
arXiv Detail & Related papers (2022-10-19T15:01:52Z)
- Exploiting Word Semantics to Enrich Character Representations of Chinese Pre-trained Models [12.0190584907439]
We propose a new method to exploit word structure and integrate lexical semantics into character representations of pre-trained models.
We show that our approach achieves superior performance over the basic pre-trained models BERT, BERT-wwm and ERNIE on different Chinese NLP tasks.
arXiv Detail & Related papers (2022-07-13T02:28:08Z)
- Improving Pre-trained Language Models with Syntactic Dependency Prediction Task for Chinese Semantic Error Recognition [52.55136323341319]
Existing Chinese text error detection mainly focuses on spelling and simple grammatical errors.
Chinese semantic errors are understudied and more complex, to the point that even humans cannot easily recognize them.
arXiv Detail & Related papers (2022-04-15T13:55:32Z)
- End-to-End Chinese Parsing Exploiting Lexicons [15.786281545363448]
We propose an end-to-end Chinese parsing model based on character inputs which jointly learns to output word segmentation, part-of-speech tags and dependency structures.
Our parsing model relies on word-char graph attention networks, which can enrich the character inputs with external word knowledge.
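A rough stand-in for the word-char idea follows: each character is linked to the lexicon words that contain it, and those words' features are mixed into the character representation. The paper learns such interactions with graph attention networks; the uniform averaging below is an assumed simplification for illustration.

```python
import numpy as np

def enrich_chars(sentence, lexicon, char_emb, word_emb):
    """Return one enriched vector per character.

    lexicon: set of known words; char_emb/word_emb: dicts mapping
    characters/words to vectors of the same dimensionality.
    """
    enriched = []
    for i, ch in enumerate(sentence):
        vec = char_emb[ch].copy()
        # All lexicon words whose span covers position i.
        matches = [sentence[s:e] for s in range(0, i + 1)
                   for e in range(i + 1, len(sentence) + 1)
                   if sentence[s:e] in lexicon]
        if matches:  # average covering-word embeddings into the char
            vec += np.mean([word_emb[w] for w in matches], axis=0)
        enriched.append(vec)
    return enriched

# Toy usage with random embeddings.
rng = np.random.default_rng(0)
sent = "北京大学"
lex = {"北京", "大学", "北京大学"}
ce = {c: rng.normal(size=4) for c in sent}
we = {w: rng.normal(size=4) for w in lex}
print(len(enrich_chars(sent, lex, ce, we)))  # 4 characters
```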
arXiv Detail & Related papers (2020-12-08T12:24:36Z)
- Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation [71.70562795158625]
Traditional NLP has long held (supervised) syntactic parsing to be necessary for successful higher-level semantic language understanding (LU).
The recent advent of end-to-end neural models self-supervised via language modeling (LM), and their success on a wide range of LU tasks, calls this belief into question.
We empirically investigate the usefulness of supervised parsing for semantic LU in the context of LM-pretrained transformer networks.
arXiv Detail & Related papers (2020-08-15T21:03:36Z)
- Self-Attention with Cross-Lingual Position Representation [112.05807284056337]
Position encoding (PE) is used to preserve the word order information for natural language processing tasks, generating fixed position indices for input sequences.
Due to word order divergences across languages, modeling cross-lingual positional relationships might help self-attention networks (SANs) tackle this problem.
We augment SANs with cross-lingual position representations to model the bilingually aware latent structure for the input sentence.
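As a loose illustration of the idea (not the paper's formulation), one could add a second sinusoidal encoding, computed over hypothetically aligned target-side positions, to the usual source-order encoding; the alignment below is invented for the example.

```python
import numpy as np

def sinusoidal(positions, d_model):
    """Standard sinusoidal encoding for an array of (possibly reordered) positions."""
    pos = np.asarray(positions, dtype=float)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    # Even dimensions use sine, odd dimensions use cosine.
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

src_order = [0, 1, 2, 3]       # positions in the source sentence
aligned_order = [2, 0, 3, 1]   # assumed positions under the other language's order
d = 8
# Each token sees both its monolingual and its cross-lingual position.
pe = sinusoidal(src_order, d) + sinusoidal(aligned_order, d)
print(pe.shape)  # (4, 8)
```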
arXiv Detail & Related papers (2020-04-28T05:23:43Z)
- On the Language Neutrality of Pre-trained Multilingual Representations [70.93503607755055]
We investigate the language-neutrality of multilingual contextual embeddings directly and with respect to lexical semantics.
Our results show that contextual embeddings are more language-neutral and, in general, more informative than aligned static word-type embeddings.
We show how to reach state-of-the-art accuracy on language identification and match the performance of statistical methods for word alignment of parallel sentences.
arXiv Detail & Related papers (2020-04-09T19:50:32Z)