TransCouplet: Transformer-based Chinese Couplet Generation
- URL: http://arxiv.org/abs/2112.01707v1
- Date: Fri, 3 Dec 2021 04:34:48 GMT
- Title: TransCouplet: Transformer-based Chinese Couplet Generation
- Authors: Kuan-Yu Chiang, Shihao Lin, Joe Chen, Qian Yin, Qizhen Jin
- Abstract summary: The Chinese couplet is a form of poetry with complex syntax, composed in ancient Chinese.
This paper presents a transformer-based sequence-to-sequence couplet generation model.
We also evaluate Glyph, PinYin, and Part-of-Speech tagging features against the couplet grammatical rules.
- Score: 1.084959821967413
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Chinese couplet is a special form of poetry with complex
syntax, composed in ancient Chinese. Due to the complexity of its semantic
and grammatical rules, creating a suitable couplet is a formidable
challenge. This paper presents a transformer-based sequence-to-sequence
couplet generation model. By utilizing AnchiBERT, the model captures
ancient Chinese language understanding. Moreover, we evaluate Glyph,
PinYin, and Part-of-Speech tagging features against the couplet
grammatical rules to further improve the model.
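To make the described pipeline concrete, here is a minimal sketch of how such a seq2seq couplet model could be assembled with the Hugging Face transformers library. The checkpoint name "anchi-bert" is a placeholder rather than a confirmed hub ID, and a real model would still need fine-tuning on paired couplet lines before its generations are meaningful.

```python
# A minimal sketch (not the authors' released code) of a seq2seq couplet
# generator warm-started from a BERT-style ancient-Chinese checkpoint.
# "anchi-bert" is a hypothetical path, and the model must still be
# fine-tuned on first-line/second-line couplet pairs.
from transformers import BertTokenizer, EncoderDecoderModel

CHECKPOINT = "anchi-bert"  # placeholder path to an AnchiBERT checkpoint

tokenizer = BertTokenizer.from_pretrained(CHECKPOINT)
# Tie a BERT encoder to a BERT decoder (cross-attention is added
# automatically) so the model maps the first line to the second line.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(CHECKPOINT, CHECKPOINT)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id
model.config.eos_token_id = tokenizer.sep_token_id

first_line = "天增岁月人增寿"  # upper line of a classic couplet
input_ids = tokenizer(first_line, return_tensors="pt").input_ids
out = model.generate(input_ids, max_length=16, num_beams=4)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```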
Related papers
- Bailing-TTS: Chinese Dialectal Speech Synthesis Towards Human-like Spontaneous Representation [3.9166923630129604]
Bailing-TTS is a family of large-scale TTS models capable of generating high-quality Chinese dialectal speech.
Chinese dialectal representations are learned with a specific transformer architecture and multi-stage training processes.
Experiments demonstrate that Bailing-TTS generates Chinese dialectal speech with human-like spontaneity.
arXiv Detail & Related papers (2024-08-01T04:57:31Z)
- Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding [57.22231959529641]
Hunyuan-DiT is a text-to-image diffusion transformer with fine-grained understanding of both English and Chinese.
For fine-grained language understanding, we train a Multimodal Large Language Model to refine the captions of the images.
arXiv Detail & Related papers (2024-05-14T16:33:25Z)
- Shuo Wen Jie Zi: Rethinking Dictionaries and Glyphs for Chinese Language Pre-training [50.100992353488174]
We introduce CDBERT, a new learning paradigm that enhances the semantic understanding ability of Chinese PLMs with dictionary knowledge and the structure of Chinese characters.
We name the two core modules of CDBERT as Shuowen and Jiezi, where Shuowen refers to the process of retrieving the most appropriate meaning from Chinese dictionaries.
Our paradigm demonstrates consistent improvements on previous Chinese PLMs across all tasks.
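As a toy illustration of the retrieval step described above (an assumed mechanism for exposition, not CDBERT's actual implementation), one could embed each dictionary gloss of a character, score it against the sentence context, and keep the best-matching sense:

```python
# Hypothetical gloss-retrieval sketch: pick the dictionary sense whose
# embedding is most similar to the sentence-context embedding.
import torch
import torch.nn.functional as F

def pick_gloss(context_vec, gloss_vecs):
    """context_vec: (D,); gloss_vecs: (G, D), one row per dictionary sense."""
    sims = F.cosine_similarity(context_vec.unsqueeze(0), gloss_vecs, dim=-1)
    return int(sims.argmax())

context = torch.randn(768)      # stand-in for a contextual encoding
glosses = torch.randn(3, 768)   # stand-ins for three senses of one character
print("best sense index:", pick_gloss(context, glosses))
```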
arXiv Detail & Related papers (2023-05-30T05:48:36Z)
- Dual-Decoder Transformer For end-to-end Mandarin Chinese Speech Recognition with Pinyin and Character [15.999657143705045]
Pinyin and characters, as the spelling and writing systems of Mandarin Chinese respectively, mutually reinforce each other.
We propose a novel Mandarin Chinese ASR model with a dual-decoder Transformer, designed around the characteristics of pinyin transcripts and character transcripts.
On the test set of the AISHELL-1 dataset, the proposed Speech-Pinyin-Character-Interaction (SPCI) model achieves a 9.85% character error rate (CER) without a language model.
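The summary gives the dual-decoder layout only at a high level; below is a minimal PyTorch sketch of one plausible arrangement. The layer counts and dimensions are arbitrary, and the way the character decoder consumes the pinyin decoder's hidden states is an assumption for illustration, not the paper's exact interaction mechanism.

```python
import torch
import torch.nn as nn

class DualDecoderASR(nn.Module):
    """Shared acoustic encoder with separate pinyin and character decoders."""
    def __init__(self, d_model=256, nhead=4, pinyin_vocab=1500, char_vocab=4500):
        super().__init__()
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True), 6)
        self.pinyin_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), 3)
        self.char_decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead, batch_first=True), 3)
        self.pinyin_emb = nn.Embedding(pinyin_vocab, d_model)
        self.char_emb = nn.Embedding(char_vocab, d_model)
        self.pinyin_head = nn.Linear(d_model, pinyin_vocab)
        self.char_head = nn.Linear(d_model, char_vocab)

    def forward(self, speech_feats, pinyin_in, char_in):
        memory = self.encoder(speech_feats)  # (B, T, D) acoustic memory
        py = self.pinyin_decoder(self.pinyin_emb(pinyin_in), memory)
        # Assumption: the character decoder attends to the acoustic memory
        # augmented with the pinyin decoder's hidden states.
        ch = self.char_decoder(self.char_emb(char_in),
                               torch.cat([memory, py], dim=1))
        return self.pinyin_head(py), self.char_head(ch)

model = DualDecoderASR()
feats = torch.randn(2, 50, 256)          # dummy features already at model width
py_in = torch.randint(0, 1500, (2, 10))
ch_in = torch.randint(0, 4500, (2, 10))
py_logits, ch_logits = model(feats, py_in, ch_in)
print(py_logits.shape, ch_logits.shape)  # (2, 10, 1500) (2, 10, 4500)
```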
arXiv Detail & Related papers (2022-01-26T07:59:03Z)
- Controlled Evaluation of Grammatical Knowledge in Mandarin Chinese Language Models [22.57309958548928]
We investigate whether structural supervision improves language models' ability to learn grammatical dependencies in typologically different languages.
We train LSTMs, Recurrent Neural Network Grammars, Transformer language models, and generative parsing models on datasets of different sizes.
We find suggestive evidence that structural supervision helps with representing syntactic state across intervening content and improves performance in low-data settings.
arXiv Detail & Related papers (2021-09-22T22:11:30Z)
- ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information [32.70080326854314]
We propose ChineseBERT, which incorporates the glyph and pinyin information of Chinese characters into language model pretraining.
The proposed ChineseBERT model yields a significant performance boost over baseline models while requiring fewer training steps.
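As a rough illustration of this input fusion (the dimensions are assumptions, and ChineseBERT itself runs a small CNN over the pinyin letter sequence rather than the mean-pooling used here), each character embedding can be concatenated with a glyph embedding from flattened font bitmaps and a pinyin embedding, then projected back to the model width:

```python
# Sketch of a ChineseBERT-style fusion embedding layer (illustrative sizes).
import torch
import torch.nn as nn

class FusionEmbedding(nn.Module):
    def __init__(self, vocab=21128, d_model=768, glyph_dim=1728, pinyin_alphabet=32):
        super().__init__()
        self.char_emb = nn.Embedding(vocab, d_model)
        self.glyph_proj = nn.Linear(glyph_dim, d_model)          # flattened font bitmaps
        self.pinyin_emb = nn.Embedding(pinyin_alphabet, d_model)  # letters + tone digits
        self.fuse = nn.Linear(3 * d_model, d_model)

    def forward(self, char_ids, glyph_pixels, pinyin_ids):
        c = self.char_emb(char_ids)              # (B, L, D)
        g = self.glyph_proj(glyph_pixels)        # (B, L, D)
        p = self.pinyin_emb(pinyin_ids).mean(2)  # pool pinyin letters -> (B, L, D)
        return self.fuse(torch.cat([c, g, p], dim=-1))

emb = FusionEmbedding()
chars = torch.randint(0, 21128, (2, 5))
glyphs = torch.randn(2, 5, 1728)           # dummy rendered-glyph pixels
pinyin = torch.randint(0, 32, (2, 5, 8))   # 8 pinyin letters per character
print(emb(chars, glyphs, pinyin).shape)    # torch.Size([2, 5, 768])
```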
arXiv Detail & Related papers (2021-06-30T13:06:00Z)
- Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese Pre-trained Language Models [62.41139712595334]
We propose a novel pre-training paradigm for Chinese -- Lattice-BERT.
We construct a lattice graph from the characters and words in a sentence and feed all these text units into transformers.
We show that our model can bring an average increase of 1.5% under the 12-layer setting.
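To make the lattice idea concrete, here is a toy sketch of input construction; the lexicon and the span bookkeeping are illustrative assumptions. Every character and every lexicon word matched in the sentence becomes one unit tagged with its (start, end) span, and all units are fed to the transformer as a single sequence whose spans can drive lattice-aware position encodings.

```python
# Toy lattice construction: character units plus lexicon-matched word units.
LEXICON = {"北京", "北京大学", "大学"}  # illustrative toy lexicon

def build_lattice(sentence, lexicon, max_word_len=4):
    units = [(ch, i, i + 1) for i, ch in enumerate(sentence)]  # character units
    for i in range(len(sentence)):
        for j in range(i + 2, min(i + max_word_len, len(sentence)) + 1):
            if sentence[i:j] in lexicon:
                units.append((sentence[i:j], i, j))            # word units
    return units

print(build_lattice("北京大学", LEXICON))
# [('北', 0, 1), ('京', 1, 2), ('大', 2, 3), ('学', 3, 4),
#  ('北京', 0, 2), ('北京大学', 0, 4), ('大学', 2, 4)]
```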
arXiv Detail & Related papers (2021-04-15T02:36:49Z)
- GTAE: Graph-Transformer based Auto-Encoders for Linguistic-Constrained Text Style Transfer [119.70961704127157]
Non-parallel text style transfer has attracted increasing research interest in recent years.
Current approaches still lack the ability to preserve the content and even the logic of the original sentences.
We propose Graph-Transformer based Auto-Encoders (GTAE), which model a sentence as a linguistic graph and perform feature extraction and style transfer at the graph level.
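As a toy illustration of treating a sentence as a linguistic graph (the dependency edges here are hand-written stand-ins for a real parse), one can build an adjacency matrix over tokens and run a single mean-aggregation step to obtain graph-level features:

```python
import torch

tokens = ["the", "cat", "sat", "quietly"]
# Hypothetical dependency edges as (head index, dependent index) pairs.
edges = [(2, 1), (1, 0), (2, 3)]  # sat->cat, cat->the, sat->quietly

adj = torch.eye(len(tokens))
for h, d in edges:
    adj[h, d] = adj[d, h] = 1.0   # undirected linguistic graph with self-loops

feats = torch.randn(len(tokens), 8)   # toy token features
deg = adj.sum(-1, keepdim=True)
graph_feats = (adj @ feats) / deg     # one mean-aggregation step over neighbors
print(graph_feats.shape)              # torch.Size([4, 8])
```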
arXiv Detail & Related papers (2021-02-01T11:08:45Z)
- AnchiBERT: A Pre-Trained Model for Ancient Chinese Language Understanding and Generation [22.08457469951396]
AnchiBERT is a pre-trained language model based on the architecture of BERT.
We evaluate AnchiBERT on both language understanding and generation tasks, including poem classification.
arXiv Detail & Related papers (2020-09-24T03:41:13Z)
- Constructing a Family Tree of Ten Indo-European Languages with Delexicalized Cross-linguistic Transfer Patterns [57.86480614673034]
We formalize the delexicalized transfer as interpretable tree-to-string and tree-to-tree patterns.
This allows us to quantitatively probe cross-linguistic transfer and extend inquiries of Second Language Acquisition.
arXiv Detail & Related papers (2020-07-17T15:56:54Z)
- Generating Major Types of Chinese Classical Poetry in a Uniformed Framework [88.57587722069239]
We propose a GPT-2 based framework for generating major types of Chinese classical poems.
Preliminary results show this enhanced model can generate Chinese classical poems of major types with high quality in both form and content.
arXiv Detail & Related papers (2020-03-13T14:16:25Z)