DSGPT: Domain-Specific Generative Pre-Training of Transformers for Text
Generation in E-commerce Title and Review Summarization
- URL: http://arxiv.org/abs/2112.08414v1
- Date: Wed, 15 Dec 2021 19:02:49 GMT
- Title: DSGPT: Domain-Specific Generative Pre-Training of Transformers for Text
Generation in E-commerce Title and Review Summarization
- Authors: Xueying Zhang, Yunjiang Jiang, Yue Shang, Zhaomeng Cheng, Chi Zhang,
Xiaochuan Fan, Yun Xiao, Bo Long
- Abstract summary: We propose a novel domain-specific generative pre-training (DS-GPT) method for text generation.
We apply it to the product title and review summarization problems on E-commerce mobile display.
- Score: 14.414693156937782
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a novel domain-specific generative pre-training (DS-GPT) method
for text generation and apply it to the product title and review summarization
problems on E-commerce mobile display. First, we adopt a decoder-only
transformer architecture, which fits fine-tuning tasks well by combining the
input and output into a single sequence. Second, we demonstrate that using only
a small amount of pre-training data from related domains is powerful.
Pre-training a language model from a general corpus such as Wikipedia or
Common Crawl requires a tremendous commitment of time and resources, and can be
wasteful if the downstream tasks are limited in variety. Our DSGPT is
pre-trained on a limited dataset, the Chinese short text summarization dataset
(LCSTS). Third, our model does not require product-related human-labeled data.
For the title summarization task, the state of the art explicitly uses
additional background knowledge in the training and prediction stages. In
contrast, our model implicitly captures this knowledge and achieves significant
improvement over other methods after fine-tuning on the public Taobao.com
dataset. For the review summarization task, we use a JD.com in-house dataset
and observe a similar improvement over standard machine translation methods,
which lack the flexibility of fine-tuning. Our proposed work can be easily
extended to other domains for a wide range of text generation tasks.
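The "combine input and output together" setup described in the abstract can be illustrated with a short fine-tuning sketch. This is a minimal, hypothetical example rather than the authors' code: it assumes a Hugging Face GPT-2 style checkpoint as a stand-in for the domain-pre-trained decoder, an illustrative separator string, and toy in-memory data.

```python
# Minimal sketch of decoder-only fine-tuning for title summarization, in the
# spirit of DS-GPT's concatenated input/output formulation.
# Assumptions (not from the paper): the public "gpt2" checkpoint, an
# illustrative separator, and placeholder data.
import torch
from torch.utils.data import Dataset, DataLoader
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_NAME = "gpt2"   # stand-in; DS-GPT pre-trains its own decoder on LCSTS
SEP = " => "          # illustrative separator between source text and summary

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

class TitleSumDataset(Dataset):
    """Each example is one sequence: long title + separator + short summary."""
    def __init__(self, pairs, max_len=128):
        self.items = []
        for source, target in pairs:
            enc = tokenizer(source + SEP + target + tokenizer.eos_token,
                            truncation=True, max_length=max_len,
                            padding="max_length", return_tensors="pt")
            ids = enc["input_ids"].squeeze(0)
            mask = enc["attention_mask"].squeeze(0)
            labels = ids.clone()
            labels[mask == 0] = -100  # do not compute loss on padding
            self.items.append((ids, mask, labels))

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        return self.items[idx]

toy_pairs = [("a long, keyword-heavy product title ...", "a short display title ...")]
loader = DataLoader(TitleSumDataset(toy_pairs), batch_size=2, shuffle=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for ids, mask, labels in loader:
    loss = model(input_ids=ids, attention_mask=mask, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

At inference time one would feed only the long title plus the separator and let the model continue generating the summary; whether the source tokens are also masked out of the training loss is a design choice this sketch leaves open.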
Related papers
- Leveraging Natural Supervision for Language Representation Learning and Generation [8.083109555490475]
We describe three lines of work that seek to improve the training and evaluation of neural models using naturally-occurring supervision.
We first investigate self-supervised training losses to help enhance the performance of pretrained language models for various NLP tasks.
We propose a framework that uses paraphrase pairs to disentangle semantics and syntax in sentence representations.
arXiv Detail & Related papers (2022-07-21T17:26:03Z)
- Curriculum-Based Self-Training Makes Better Few-Shot Learners for Data-to-Text Generation [56.98033565736974]
We propose Curriculum-Based Self-Training (CBST) to leverage unlabeled data in a rearranged order determined by the difficulty of text generation.
Our method can outperform fine-tuning and task-adaptive pre-training methods, and achieve state-of-the-art performance in the few-shot setting of data-to-text generation.
arXiv Detail & Related papers (2022-06-06T16:11:58Z)
- Domain Adaptation with Pre-trained Transformers for Query Focused Abstractive Text Summarization [18.791701342934605]
The Query Focused Text Summarization (QFTS) task aims at building systems that generate the summary of the text document(s) based on a given query.
A key challenge in addressing this task is the lack of large labeled data for training the summarization model.
We address this challenge by exploring a series of domain adaptation techniques.
arXiv Detail & Related papers (2021-12-22T05:34:56Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation [100.79870384880333]
We propose a knowledge-grounded pre-training (KGPT) method to generate knowledge-enriched text.
We adopt three settings, namely fully-supervised, zero-shot, and few-shot, to evaluate its effectiveness.
Under zero-shot setting, our model achieves over 30 ROUGE-L on WebNLG while all other baselines fail.
arXiv Detail & Related papers (2020-10-05T19:59:05Z)
- Partially-Aligned Data-to-Text Generation with Distant Supervision [69.15410325679635]
We propose a new generation task called Partially-Aligned Data-to-Text Generation (PADTG).
It is more practical since it utilizes automatically annotated data for training and thus considerably expands the application domains.
Our framework outperforms all baseline models and verifies the feasibility of utilizing partially-aligned data.
arXiv Detail & Related papers (2020-10-03T03:18:52Z)
- Text-to-Text Pre-Training for Data-to-Text Tasks [9.690158790639131]
We study the pre-train + fine-tune strategy for data-to-text tasks.
Our experiments indicate that text-to-text pre-training in the form of T5 enables simple, end-to-end transformer-based models to outperform pipelined neural architectures tailored for data-to-text generation.
arXiv Detail & Related papers (2020-05-21T02:46:15Z)
- POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
arXiv Detail & Related papers (2020-05-01T18:11:54Z)
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer [64.22926988297685]
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).
In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format; a minimal illustration of this framing appears after the list.
arXiv Detail & Related papers (2019-10-23T17:37:36Z)
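As a concrete illustration of the text-to-text framing mentioned in the last entry, the sketch below casts two different tasks as plain string-to-string generation. It assumes the public t5-small checkpoint; the prompts are illustrative and not taken from any of the papers above.

```python
# Two tasks, one model: each task is expressed as input text with a task prefix,
# and the answer is produced as output text.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tok = AutoTokenizer.from_pretrained("t5-small")
t5 = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

prompts = [
    "summarize: The long product description or review text goes here ...",
    "translate English to German: The house is wonderful.",
]
for p in prompts:
    ids = tok(p, return_tensors="pt").input_ids
    out = t5.generate(ids, max_new_tokens=32)
    print(tok.decode(out[0], skip_special_tokens=True))
```

Task prefixes such as "summarize:" are how T5 multiplexes many problems through a single sequence-to-sequence model, which is the unified framework the entry refers to.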