Stylized Data-to-Text Generation: A Case Study in the E-Commerce Domain
- URL: http://arxiv.org/abs/2305.03256v1
- Date: Fri, 5 May 2023 03:02:41 GMT
- Title: Stylized Data-to-Text Generation: A Case Study in the E-Commerce Domain
- Authors: Liqiang Jing and Xuemeng Song and Xuming Lin and Zhongzhou Zhao and
Wei Zhou and Liqiang Nie
- Abstract summary: We propose a new task, namely stylized data-to-text generation, whose aim is to generate coherent text according to a specific style.
This task is non-trivial, due to three challenges: the logic of the generated text, unstructured style reference, and biased training samples.
We propose a novel stylized data-to-text generation model, named StyleD2T, comprising three components: logic planning-enhanced data embedding, mask-based style embedding, and unbiased stylized text generation.
- Score: 53.22419717434372
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing data-to-text generation efforts mainly focus on generating a
coherent text from non-linguistic input data, such as tables and
attribute-value pairs, but overlook that different application scenarios may
require texts of different styles. Inspired by this, we define a new task,
namely stylized data-to-text generation, whose aim is to generate coherent text
for the given non-linguistic data according to a specific style. This task is
non-trivial, due to three challenges: the logic of the generated text,
unstructured style reference, and biased training samples. To address these
challenges, we propose a novel stylized data-to-text generation model, named
StyleD2T, comprising three components: logic planning-enhanced data embedding,
mask-based style embedding, and unbiased stylized text generation. In the first
component, we introduce a graph-guided logic planner for attribute organization
to ensure the logic of generated text. In the second component, we devise
feature-level mask-based style embedding to extract the essential style signal
from the given unstructured style reference. In the last one, pseudo triplet
augmentation is utilized to achieve unbiased text generation, and a
multi-condition based confidence assignment function is designed to ensure the
quality of pseudo samples. Extensive experiments on a newly collected dataset
from Taobao have been conducted, and the results show the superiority of our
model over existing methods.
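As an illustration of the second component, the snippet below is a minimal sketch of what a feature-level mask over a style-reference embedding could look like. The abstract does not give the formulation, so the sigmoid gate, the function name `masked_style_embedding`, and the toy values are all assumptions made for illustration, not the StyleD2T implementation.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def masked_style_embedding(
    reference_features: List[float], mask_logits: List[float]
) -> List[float]:
    """Gate each feature of a style-reference embedding with a soft mask.

    Hypothetical helper: each feature is scaled by sigmoid(logit), so features
    with strongly negative logits are suppressed and features with strongly
    positive logits pass through almost unchanged.
    """
    if len(reference_features) != len(mask_logits):
        raise ValueError("features and mask logits must have the same length")
    return [sigmoid(m) * f for f, m in zip(reference_features, mask_logits)]


# Toy example (assumed values): keep the first feature, suppress the second.
style = masked_style_embedding([2.0, -4.0], [50.0, -50.0])
```

In a trained model the mask logits would be learned rather than fixed by hand; here they are hard-coded only to show the gating effect.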
Related papers
- Layout Agnostic Scene Text Image Synthesis with Diffusion Models [42.37340959594495]
SceneTextGen is a novel diffusion-based model specifically designed to circumvent the need for a predefined layout stage.
The novelty of SceneTextGen lies in its integration of three key components: a character-level encoder for capturing detailed typographic properties, plus a character-level instance segmentation model and a word-level spotting model to address the issues of unwanted text generation and minor character inaccuracies.
arXiv Detail & Related papers (2024-06-03T07:20:34Z)
- PixT3: Pixel-based Table-To-Text Generation [66.96636025277536]
We present PixT3, a multimodal table-to-text model that overcomes the challenges of linearization and input size limitations.
Experiments on the ToTTo and Logic2Text benchmarks show that PixT3 is competitive and superior to generators that operate solely on text.
arXiv Detail & Related papers (2023-11-16T11:32:47Z)
- Specializing Small Language Models towards Complex Style Transfer via Latent Attribute Pre-Training [29.143887057933327]
We introduce the concept of complex text style transfer tasks and construct complex text datasets based on two widely applicable scenarios.
Our dataset is the first large-scale dataset of its kind, with 700 rephrased sentences and 1,000 sentences from the game Genshin Impact.
arXiv Detail & Related papers (2023-09-19T21:01:40Z)
- Style Generation: Image Synthesis based on Coarsely Matched Texts [10.939482612568433]
We introduce a novel task called text-based style generation and propose a two-stage generative adversarial network.
The first stage generates the overall image style with a sentence feature, and the second stage refines the generated style with a synthetic feature.
The practical potential of our work is demonstrated by various applications such as text-image alignment and story visualization.
arXiv Detail & Related papers (2023-09-08T21:51:11Z)
- Informative Text Generation from Knowledge Triples [56.939571343797304]
We propose a novel memory-augmented generator that employs a memory network to memorize the useful knowledge learned during training.
We derive a dataset from WebNLG for our new setting and conduct extensive experiments to investigate the effectiveness of our model.
arXiv Detail & Related papers (2022-09-26T14:35:57Z)
- Data-to-text Generation with Variational Sequential Planning [74.3955521225497]
We consider the task of data-to-text generation, which aims to create textual output from non-linguistic input.
We propose a neural model enhanced with a planning component responsible for organizing high-level information in a coherent and meaningful way.
We infer latent plans sequentially with a structured variational model, while interleaving the steps of planning and generation.
arXiv Detail & Related papers (2022-02-28T13:17:59Z)
- Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints [63.84063384518667]
We propose a novel Transformer-based generation framework to achieve the goal.
Core techniques in our method to enforce faithfulness include a new table-text optimal-transport matching loss.
To evaluate faithfulness, we propose a new automatic metric specialized to the table-to-text generation problem.
arXiv Detail & Related papers (2020-05-03T02:54:26Z)
- Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation [50.01708049531156]
We focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer.
In detail, the input is a set of structured records and a reference text describing another record set.
The output is a summary that accurately describes part of the content in the source record set in the same writing style as the reference.
arXiv Detail & Related papers (2020-02-24T12:52:10Z)