Lexical Complexity Controlled Sentence Generation
- URL: http://arxiv.org/abs/2211.14540v1
- Date: Sat, 26 Nov 2022 11:03:56 GMT
- Title: Lexical Complexity Controlled Sentence Generation
- Authors: Jinran Nie, Liner Yang, Yun Chen, Cunliang Kong, Junhui Zhu, Erhong
Yang
- Abstract summary: We introduce a novel task of lexical complexity controlled sentence generation.
It has enormous potential in domains such as graded reading, language teaching and acquisition.
We propose a simple but effective approach for this task based on complexity embedding.
- Score: 6.298911438929862
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text generation rarely considers the control of lexical complexity, which
limits its broader practical application. We introduce a novel task of lexical
complexity controlled sentence generation, which aims at keywords-to-sentence
generation with desired complexity levels. It has enormous potential in domains
such as graded reading, language teaching, and language acquisition. The
challenge of this task is to generate fluent sentences using only words of the
given complexity levels. We propose a simple but effective approach for this
task based on complexity embedding. Compared with potential solutions, our
approach fuses representations of the word complexity levels into the model to
achieve better control of lexical complexity. We demonstrate the feasibility of
the approach both for training models from scratch and for fine-tuning
pre-trained models. To facilitate research, we develop two datasets, in English
and Chinese respectively, on which extensive experiments are conducted. Results
show that our approach controls lexical complexity better and generates
higher-quality sentences than baseline methods.
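As a rough illustration of the complexity-embedding idea, the fusion can be sketched as adding a per-level vector to each token vector before it enters the model. Everything below (the dimensions, the random lookup tables, the `embed` helper) is an illustrative assumption, not the paper's actual implementation:

```python
import random

random.seed(0)
VOCAB, LEVELS, DIM = 1000, 5, 8

# Lookup tables: one DIM-dimensional vector per token id and per complexity level.
tok_emb = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(VOCAB)]
cpx_emb = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(LEVELS)]

def embed(token_ids, level_ids):
    """Fuse the complexity-level vector into each token vector by addition,
    analogous to how positional embeddings are added in Transformers."""
    return [[t + c for t, c in zip(tok_emb[tid], cpx_emb[lid])]
            for tid, lid in zip(token_ids, level_ids)]

vecs = embed([3, 17, 42], [0, 2, 1])  # three tokens, each with a target level
print(len(vecs), len(vecs[0]))  # 3 8
```

In this sketch the generator would see complexity-conditioned token representations, so the same keyword can be steered toward different output levels by changing only `level_ids`.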
Related papers
- Textualized Agent-Style Reasoning for Complex Tasks by Multiple Round LLM Generation [49.27250832754313]
We present AgentCOT, an LLM-based autonomous agent framework.
At each step, AgentCOT selects an action and executes it to yield an intermediate result with supporting evidence.
We introduce two new strategies to enhance the performance of AgentCOT.
arXiv Detail & Related papers (2024-09-19T02:20:06Z) - Self-Convinced Prompting: Few-Shot Question Answering with Repeated
Introspection [13.608076739368949]
We introduce a novel framework that harnesses the potential of large-scale pre-trained language models.
Our framework processes the output of a typical few-shot chain-of-thought prompt, assesses the correctness of the response, scrutinizes the answer, and ultimately produces a new solution.
arXiv Detail & Related papers (2023-10-08T06:36:26Z) - Can Large Language Models Understand Real-World Complex Instructions? [54.86632921036983]
Large language models (LLMs) can understand human instructions, but struggle with complex instructions.
Existing benchmarks are insufficient to assess LLMs' ability to understand complex instructions.
We propose CELLO, a benchmark for evaluating LLMs' ability to follow complex instructions systematically.
arXiv Detail & Related papers (2023-09-17T04:18:39Z) - When Do Program-of-Thoughts Work for Reasoning? [51.2699797837818]
We propose complexity-impacted reasoning score (CIRS) to measure correlation between code and reasoning abilities.
Specifically, we use the abstract syntax tree to encode the structural information and calculate logical complexity.
Code will be integrated into the EasyInstruct framework at https://github.com/zjunlp/EasyInstruct.
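The AST-based idea can be illustrated with a toy proxy: parse the code, walk the syntax tree, and count control-flow constructs. The `logical_complexity` function below is a hypothetical sketch of that kind of measure, not the actual CIRS formula:

```python
import ast

def logical_complexity(source: str) -> int:
    """Toy proxy for logical complexity: count control-flow and boolean
    nodes in the abstract syntax tree of a Python snippet."""
    tree = ast.parse(source)
    branch_nodes = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp)
    return sum(isinstance(node, branch_nodes) for node in ast.walk(tree))

snippet = """
for i in range(10):
    if i % 2 == 0:
        print(i)
"""
print(logical_complexity(snippet))  # 2 (one For, one If)
```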
arXiv Detail & Related papers (2023-08-29T17:22:39Z) - Natural Language Decomposition and Interpretation of Complex Utterances [47.30126929007346]
We introduce an approach to handle complex-intent-bearing utterances from a user via a process of hierarchical natural language decomposition.
Our approach uses a pre-trained language model to decompose a complex utterance into a sequence of simpler natural language steps.
Experiments show that the proposed approach enables the interpretation of complex utterances with almost no complex training data.
arXiv Detail & Related papers (2023-05-15T14:35:00Z) - Pseudo-Labels Are All You Need [3.52359746858894]
We present our submission to the Text Complexity DE Challenge 2022.
The goal is to predict the complexity of a German sentence for German learners at level B.
We find that the pseudo-label-based approach gives impressive results yet requires little to no adjustment to the specific task.
arXiv Detail & Related papers (2022-08-19T09:52:41Z) - Domain Adaptation in Multilingual and Multi-Domain Monolingual Settings
for Complex Word Identification [0.27998963147546146]
Complex word identification (CWI) is a cornerstone process towards proper text simplification.
CWI is highly dependent on context, and its difficulty is compounded by the scarcity of available datasets.
We propose a novel training technique for the CWI task based on domain adaptation to improve the target character and context representations.
arXiv Detail & Related papers (2022-05-15T13:21:02Z) - Uniform Complexity for Text Generation [4.867923281108005]
We introduce Uniform Complexity for Text Generation (UCTG), a new benchmark test which raises the challenge of making generative models observe uniform linguistic properties with respect to prompts.
We find that models such as GPT-2 struggle to preserve the complexity of input prompts in their generations, even when fine-tuned on professionally written texts.
arXiv Detail & Related papers (2022-04-11T15:19:47Z) - Long Text Generation by Modeling Sentence-Level and Discourse-Level
Coherence [59.51720326054546]
We propose a long text generation model, which can represent the prefix sentences at sentence level and discourse level in the decoding process.
Our model can generate more coherent texts than state-of-the-art baselines.
arXiv Detail & Related papers (2021-05-19T07:29:08Z) - SDA: Improving Text Generation with Self Data Augmentation [88.24594090105899]
We propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation.
Unlike most existing sentence-level augmentation strategies, our method is more general and could be easily adapted to any MLE-based training procedure.
arXiv Detail & Related papers (2021-01-02T01:15:57Z) - BUSTLE: Bottom-Up Program Synthesis Through Learning-Guided Exploration [72.88493072196094]
We present a new synthesis approach that leverages learning to guide a bottom-up search over programs.
In particular, we train a model to prioritize compositions of intermediate values during search conditioned on a set of input-output examples.
We show that the combination of learning and bottom-up search is remarkably effective, even with simple supervised learning approaches.
arXiv Detail & Related papers (2020-07-28T17:46:18Z)
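The bottom-up, value-driven search that BUSTLE's learned model guides can be illustrated with a plain enumerative version over a tiny integer DSL. The `bottom_up_search` function below is a toy sketch without the learned prioritization of intermediate values:

```python
from itertools import product

def bottom_up_search(inputs, target, max_rounds=3):
    """Enumerate expressions bottom-up: keep a map from each reachable
    value to one expression that produces it, and grow it by combining
    existing values with + and *."""
    values = {x: str(x) for x in inputs}
    for _ in range(max_rounds):
        new = {}
        for (a, ea), (b, eb) in product(values.items(), repeat=2):
            for v, e in ((a + b, f"({ea}+{eb})"), (a * b, f"({ea}*{eb})")):
                if v not in values:
                    new[v] = e  # record one expression per new value
        values.update(new)
        if target in values:
            return values[target]
    return None

expr = bottom_up_search([2, 3], 10)
print(expr)  # an expression that evaluates to 10, e.g. (2*(2+3))
```

BUSTLE's contribution, per the abstract, is replacing this blind enumeration order with a model that prioritizes which intermediate values to combine, conditioned on input-output examples.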
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.