AdParaphrase v2.0: Generating Attractive Ad Texts Using a Preference-Annotated Paraphrase Dataset
- URL: http://arxiv.org/abs/2505.20826v1
- Date: Tue, 27 May 2025 07:34:44 GMT
- Authors: Soichiro Murakami, Peinan Zhang, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Identifying the factors that make ad text attractive is essential for advertising success. This study proposes AdParaphrase v2.0, a dataset for ad text paraphrasing that contains human preference data, to enable analysis of linguistic factors and to support the development of methods for generating attractive ad texts. Compared with v1.0, this dataset is 20 times larger, comprising 16,460 ad text paraphrase pairs, each annotated with preference data from ten evaluators, thereby enabling a more comprehensive and reliable analysis. Through experiments, we identified multiple linguistic features of engaging ad texts that were not observed in v1.0 and explored various methods for generating attractive ad texts. Furthermore, our analysis demonstrated the relationship between human preference and ad performance and highlighted the potential of reference-free metrics based on large language models for evaluating ad text attractiveness. The dataset is publicly available at: https://github.com/CyberAgentAILab/AdParaphrase-v2.0.
Related papers
- Exploring the Relationship Between Diversity and Quality in Ad Text Generation
Ad text generation differs significantly from other generation tasks owing to its text style and requirements. This research explores the relationship between diversity and ad quality in ad text generation.
arXiv Detail & Related papers (2025-05-22T09:05:44Z)
- AdParaphrase: Paraphrase Dataset for Analyzing Linguistic Features toward Generating Attractive Ad Texts
This study aims to explore the linguistic features of ad texts that influence human preferences. We present AdParaphrase, a paraphrase dataset that contains human preferences for pairs of ad texts. Our analysis revealed that ad texts preferred by human judges have higher fluency, longer length, more nouns, and use of bracket symbols.
arXiv Detail & Related papers (2025-02-07T05:39:55Z)
- Spotting AI's Touch: Identifying LLM-Paraphrased Spans in Text
We propose a novel framework, paraphrased text span detection (PTD).
PTD aims to identify paraphrased text spans within a text.
We construct a dedicated dataset, PASTED, for paraphrased text span detection.
arXiv Detail & Related papers (2024-05-21T11:22:27Z)
- A Benchmark for Text Expansion: Datasets, Metrics, and Baselines
This work presents a new task, Text Expansion (TE), which aims to insert fine-grained modifiers into proper locations in plain text.
We leverage four complementary approaches to construct a dataset with 12 million automatically generated instances and 2K human-annotated references.
On top of a pre-trained text-infilling model, we build both pipelined and joint Locate&Infill models, which demonstrate superiority over the Text2Text baselines.
arXiv Detail & Related papers (2023-09-17T07:54:38Z)
- Natural Language Generation for Advertising: A Survey
Natural language generation methods have emerged as effective tools to help advertisers increase the number of online advertisements they produce.
This survey reviews research trends on this topic over the past decade, from template-based to extractive and abstractive approaches using neural networks.
arXiv Detail & Related papers (2023-06-22T07:52:34Z)
- Natural Language Decompositions of Implicit Content Enable Better Text Representations
We introduce a method for the analysis of text that takes implicitly communicated content explicitly into account. We use a large language model to produce sets of propositions that are inferentially related to the text that has been observed. Our results suggest that modeling the meanings behind observed language, rather than the literal text alone, is a valuable direction for NLP.
arXiv Detail & Related papers (2023-05-23T23:45:20Z)
- Persuasion Strategies in Advertisements
We introduce an extensive vocabulary of persuasion strategies and build the first ad image corpus annotated with persuasion strategies.
We then formulate the task of persuasion strategy prediction with multi-modal learning.
We conduct a real-world case study on 1600 advertising campaigns of 30 Fortune-500 companies.
arXiv Detail & Related papers (2022-08-20T07:33:13Z)
- TRIE++: Towards End-to-End Information Extraction from Visually Rich Documents
This paper proposes a unified end-to-end information extraction framework from visually rich documents.
Text reading and information extraction can reinforce each other via a well-designed multi-modal context block.
The framework can be trained in an end-to-end trainable manner, achieving global optimization.
arXiv Detail & Related papers (2022-07-14T08:52:07Z)
- VisualTextRank: Unsupervised Graph-based Content Extraction for Automating Ad Text to Image Search
We propose VisualTextRank as an unsupervised method to augment input ad text using semantically similar ads.
VisualTextRank builds on prior work on graph-based context extraction.
Online tests with a simplified version led to a 28.7% increase in the usage of stock image search.
arXiv Detail & Related papers (2021-08-05T16:47:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.