Dont Add, dont Miss: Effective Content Preserving Generation from
Pre-Selected Text Spans
- URL: http://arxiv.org/abs/2310.09017v3
- Date: Sun, 25 Feb 2024 14:45:00 GMT
- Title: Dont Add, dont Miss: Effective Content Preserving Generation from
Pre-Selected Text Spans
- Authors: Aviv Slobodkin, Avi Caciularu, Eran Hirsch, Ido Dagan
- Abstract summary: Controlled Text Reduction (CTR) task isolates the text generation step within typical summarization-style tasks.
We introduce a high-quality, open-source CTR model that tackles two prior key limitations.
We substantially improve the silver training data quality via GPT-4 distillation.
- Score: 27.569687461395002
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The recently introduced Controlled Text Reduction (CTR) task isolates the
text generation step within typical summarization-style tasks. It does so by
challenging models to generate coherent text conforming to pre-selected content
within the input text (``highlights''). This framing enables increased
modularity in summarization-like tasks, allowing to couple a single CTR model
with various content-selection setups and modules. However, there are currently
no reliable CTR models, while the performance of the existing baseline for the
task is mediocre, falling short of practical utility. Here, we address this gap
by introducing a high-quality, open-source CTR model that tackles two prior key
limitations: inadequate enforcement of the content-preservation constraint, and
suboptimal silver training data. Addressing these, we amplify the
content-preservation constraint in both training, via RL, and inference, via a
controlled decoding strategy. Further, we substantially improve the silver
training data quality via GPT-4 distillation. Overall, pairing the distilled
dataset with the highlight-adherence strategies yields marked gains over the
current baseline, of up to 30 ROUGE-L points, providing a reliable CTR model
for downstream use.
Related papers
- Continual LLaVA: Continual Instruction Tuning in Large Vision-Language Models [93.5327725085853]
Continual LLaVA is a rehearsal-free method tailored for continual instruction tuning in LVLMs.
Experiments indicate that the proposed Continual LLaVA outperforms previous methods by significantly reducing the forgetting during the continual instruction tuning process.
arXiv Detail & Related papers (2024-11-04T19:55:32Z) - Contextualization Distillation from Large Language Model for Knowledge
Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-in-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
arXiv Detail & Related papers (2024-01-28T08:56:49Z) - FLIP: Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction [49.510163437116645]
Click-through rate (CTR) prediction plays as a core function module in personalized online services.
Traditional ID-based models for CTR prediction take as inputs the one-hot encoded ID features of tabular modality.
Pretrained Language Models(PLMs) has given rise to another paradigm, which takes as inputs the sentences of textual modality.
We propose to conduct Fine-grained feature-level ALignment between ID-based Models and Pretrained Language Models(FLIP) for CTR prediction.
arXiv Detail & Related papers (2023-10-30T11:25:03Z) - Structural Self-Supervised Objectives for Transformers [3.018656336329545]
This thesis focuses on improving the pre-training of natural language models using unsupervised raw data.
In the first part, we introduce three alternative pre-training objectives to BERT's Masked Language Modeling (MLM)
In the second part, we proposes self-supervised pre-training tasks that align structurally with downstream applications.
arXiv Detail & Related papers (2023-09-15T09:30:45Z) - Controllable Data Augmentation for Few-Shot Text Mining with Chain-of-Thought Attribute Manipulation [35.33340453046864]
Chain-of-Thought Attribute Manipulation (CoTAM) is a novel approach that generates new data from existing examples.
We leverage the chain-of-thought prompting to directly edit the text in three steps, (1) attribute decomposition, (2) manipulation proposal, and (3) sentence reconstruction.
arXiv Detail & Related papers (2023-07-14T00:10:03Z) - DELTA: Dynamic Embedding Learning with Truncated Conscious Attention for
CTR Prediction [61.68415731896613]
Click-Through Rate (CTR) prediction is a pivotal task in product and content recommendation.
We propose a model that enables Dynamic Embedding Learning with Truncated Conscious Attention for CTR prediction.
arXiv Detail & Related papers (2023-05-03T12:34:45Z) - Continual Contrastive Finetuning Improves Low-Resource Relation
Extraction [34.76128090845668]
Relation extraction has been particularly challenging in low-resource scenarios and domains.
Recent literature has tackled low-resource RE by self-supervised learning.
We propose to pretrain and finetune the RE model using consistent objectives of contrastive learning.
arXiv Detail & Related papers (2022-12-21T07:30:22Z) - CREATER: CTR-driven Advertising Text Generation with Controlled
Pre-Training and Contrastive Fine-Tuning [14.912117221662054]
We propose CREATER, a CTR-driven advertising text generation approach, to generate ad texts based on high-quality user reviews.
To incorporate CTR objective, our model learns from online A/B test data with contrastive learning, which encourages the model to generate ad texts that obtain higher CTR.
Experiments on industrial datasets show that CREATER significantly outperforms current approaches.
arXiv Detail & Related papers (2022-05-18T14:17:04Z) - Text Generation with Efficient (Soft) Q-Learning [91.47743595382758]
Reinforcement learning (RL) offers a more flexible solution by allowing users to plug in arbitrary task metrics as reward.
We introduce a new RL formulation for text generation from the soft Q-learning perspective.
We apply the approach to a wide range of tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation.
arXiv Detail & Related papers (2021-06-14T18:48:40Z) - KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation [100.79870384880333]
We propose a knowledge-grounded pre-training (KGPT) to generate knowledge-enriched text.
We adopt three settings, namely fully-supervised, zero-shot, few-shot to evaluate its effectiveness.
Under zero-shot setting, our model achieves over 30 ROUGE-L on WebNLG while all other baselines fail.
arXiv Detail & Related papers (2020-10-05T19:59:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.