From Solving a Problem Boldly to Cutting the Gordian Knot: Idiomatic
Text Generation
- URL: http://arxiv.org/abs/2104.06541v1
- Date: Tue, 13 Apr 2021 22:57:25 GMT
- Title: From Solving a Problem Boldly to Cutting the Gordian Knot: Idiomatic
Text Generation
- Authors: Jianing Zhou, Hongyu Gong, Suma Bhat
- Abstract summary: We study a new application for text generation -- idiomatic sentence generation.
We propose a novel approach for this task, which retrieves the appropriate idiom for a given literal sentence.
We generate the idiomatic sentence by using a neural model to combine the retrieved idiom and the remainder of the sentence.
- Score: 14.360808219541752
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study a new application for text generation -- idiomatic sentence
generation -- which aims to transfer literal phrases in sentences into their
idiomatic counterparts. Inspired by psycholinguistic theories of idiom use in
one's native language, we propose a novel approach for this task, which
retrieves the appropriate idiom for a given literal sentence, extracts the span
of the sentence to be replaced by the idiom, and generates the idiomatic
sentence by using a neural model to combine the retrieved idiom and the
remainder of the sentence. Experiments on a novel dataset created for this task
show that our model is able to effectively transfer literal sentences into
idiomatic ones. Furthermore, automatic and human evaluations show that for this
task, the proposed model outperforms a series of competitive baseline models
for text generation.
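To make the retrieve-extract-combine pipeline described above concrete, here is a minimal sketch of the three steps. It is an illustration only: the toy idiom table, the sentence-transformers retriever, and the word-overlap span heuristic and string-splicing step that stand in for the paper's learned span extractor and neural combination model are all assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the retrieve -> extract -> combine pipeline from the
# abstract. The idiom table, model choice, and heuristics are illustrative
# assumptions, not the authors' released code or data.
from sentence_transformers import SentenceTransformer, util

# Toy idiom dictionary: idiom -> literal gloss used for retrieval.
IDIOMS = {
    "cut the Gordian knot": "solve a difficult problem in a bold, direct way",
    "spill the beans": "reveal a secret",
    "hit the sack": "go to sleep",
}

encoder = SentenceTransformer("all-MiniLM-L6-v2")
idiom_list = list(IDIOMS)
gloss_emb = encoder.encode([IDIOMS[i] for i in idiom_list], convert_to_tensor=True)


def retrieve_idiom(literal_sentence):
    """Step 1: pick the idiom whose literal gloss is most similar to the sentence."""
    sent_emb = encoder.encode(literal_sentence, convert_to_tensor=True)
    scores = util.cos_sim(sent_emb, gloss_emb)[0]
    return idiom_list[int(scores.argmax())]


def extract_span(literal_sentence, idiom):
    """Step 2 (placeholder): locate the literal span to be replaced.

    The paper learns this extraction; here we crudely align the idiom's gloss
    against the sentence by word overlap, purely for illustration.
    """
    gloss_words = {w.strip(".,") for w in IDIOMS[idiom].lower().split()}
    words = literal_sentence.split()
    hits = [i for i, w in enumerate(words) if w.lower().strip(".,") in gloss_words]
    return (min(hits), max(hits) + 1) if hits else (len(words), len(words))


def generate_idiomatic(literal_sentence):
    """Step 3 (placeholder): splice the idiom into the remainder of the sentence.

    The paper uses a neural model to fuse the idiom and its context; a simple
    string substitution stands in for it here.
    """
    idiom = retrieve_idiom(literal_sentence)
    start, end = extract_span(literal_sentence, idiom)
    words = literal_sentence.split()
    return " ".join(words[:start] + [idiom] + words[end:])


# Expected to print something like: "She decided to cut the Gordian knot"
print(generate_idiomatic("She decided to solve the difficult problem in a bold, direct way."))
```

In the paper itself, the extraction and combination steps are learned neural components trained on the authors' idiomatic-sentence dataset; the heuristics above only show how the three stages fit together.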
Related papers
- Neural paraphrasing by automatically crawled and aligned sentence pairs [11.95795974003684]
The main obstacle to neural-network-based paraphrasing is the lack of large datasets with aligned pairs of sentences and paraphrases.
We present a method for the automatic generation of large aligned corpora, based on the assumption that news and blog websites describe the same events in different narrative styles.
We propose a similarity search procedure with linguistic constraints that, given a reference sentence, locates the most similar candidate paraphrases from among millions of indexed sentences.
arXiv Detail & Related papers (2024-02-16T10:40:38Z)
- TwistList: Resources and Baselines for Tongue Twister Generation [17.317550526263183]
We present work on the generation of tongue twisters, a form of language that must be phonetically conditioned to maximise sound overlap.
We present TwistList, a large annotated dataset of tongue twisters, consisting of 2.1K+ human-authored examples.
We additionally present several benchmark systems for the proposed task of tongue twister generation, including models that both do and do not require training on in-domain data.
arXiv Detail & Related papers (2023-06-06T07:20:51Z)
- Training Effective Neural Sentence Encoders from Automatically Mined Paraphrases [0.0]
We propose a method for training effective language-specific sentence encoders without manually labeled data.
Our approach is to automatically construct a dataset of paraphrase pairs from sentence-aligned bilingual text corpora.
Our sentence encoder can be trained in less than a day on a single graphics card, achieving high performance on a diverse set of sentence-level tasks.
arXiv Detail & Related papers (2022-07-26T09:08:56Z)
- Quark: Controllable Text Generation with Reinforced Unlearning [68.07749519374089]
Large-scale language models often learn behaviors that are misaligned with user expectations.
We introduce Quantized Reward Konditioning (Quark), an algorithm for optimizing a reward function that quantifies an (un)wanted property.
For unlearning toxicity, negative sentiment, and repetition, our experiments show that Quark outperforms both strong baselines and state-of-the-art reinforcement learning methods.
arXiv Detail & Related papers (2022-05-26T21:11:51Z)
- Typical Decoding for Natural Language Generation [76.69397802617064]
We study why high-probability texts can be dull or repetitive.
We show that typical sampling offers competitive performance in terms of quality.
arXiv Detail & Related papers (2022-02-01T18:58:45Z)
- Generating Adversarial Examples in Chinese Texts Using Sentence-Pieces [60.58900627906269]
We propose using a pre-trained language model as the substitute generator, with sentence pieces used to craft adversarial examples in Chinese.
The substitutions in the generated adversarial examples are not characters or words but 'pieces', which are more natural to Chinese readers.
arXiv Detail & Related papers (2020-12-29T14:28:07Z)
- Don't Change Me! User-Controllable Selective Paraphrase Generation [45.0436584774495]
In paraphrase generation, source sentences often contain phrases that should not be altered.
Our solution is to provide the user with explicit tags that can be placed around any arbitrary segment of text to mean "don't change me!"
The contribution of this work is a novel data generation technique using distant supervision.
arXiv Detail & Related papers (2020-08-21T03:31:50Z)
- Toward Better Storylines with Sentence-Level Language Models [54.91921545103256]
We propose a sentence-level language model which selects the next sentence in a story from a finite set of fluent alternatives.
We demonstrate the effectiveness of our approach with state-of-the-art accuracy on the unsupervised Story Cloze task.
arXiv Detail & Related papers (2020-05-11T16:54:19Z)
- Neural Syntactic Preordering for Controlled Paraphrase Generation [57.5316011554622]
Our work uses syntactic transformations to softly "reorder" the source sentence and guide our neural paraphrasing model.
First, given an input sentence, we derive a set of feasible syntactic rearrangements using an encoder-decoder model.
Next, we use each proposed rearrangement to produce a sequence of position embeddings, which encourages our final encoder-decoder paraphrase model to attend to the source words in a particular order.
arXiv Detail & Related papers (2020-05-05T09:02:25Z)
- Metaphoric Paraphrase Generation [58.592750281138265]
We use crowdsourcing to evaluate our results and develop an automatic metric for evaluating metaphoric paraphrases.
We show that while the lexical replacement baseline is capable of producing accurate paraphrases, its outputs often lack metaphoricity.
Our metaphor masking model excels in generating metaphoric sentences while performing nearly as well with regard to fluency and paraphrase quality.
arXiv Detail & Related papers (2020-02-28T16:30:33Z)