Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation
- URL: http://arxiv.org/abs/2404.04232v2
- Date: Mon, 3 Jun 2024 12:08:20 GMT
- Title: Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation
- Authors: Tianqi Zhong, Zhaoyi Li, Quan Wang, Linqi Song, Ying Wei, Defu Lian, Zhendong Mao
- Abstract summary: CompMCTG is a benchmark encompassing diverse multi-aspect labeled datasets.
We introduce Meta-MCTG, a training framework incorporating meta-learning.
We demonstrate the effectiveness of Meta-MCTG, which improves compositional testing performance in 94.4% of cases.
- Score: 56.854968623992214
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compositional generalization, the model's ability to generate text with new attribute combinations obtained by recombining single attributes from the training data, is a crucial property for multi-aspect controllable text generation (MCTG) methods. Nonetheless, a comprehensive compositional generalization evaluation benchmark for MCTG is still lacking. We propose CompMCTG, a benchmark encompassing diverse multi-aspect labeled datasets and a crafted three-dimensional evaluation protocol, to holistically evaluate the compositional generalization of MCTG approaches. We observe that existing MCTG works generally suffer a noticeable performance drop in compositional testing. To mitigate this issue, we introduce Meta-MCTG, a training framework incorporating meta-learning, where we enable models to learn how to generalize by simulating compositional generalization scenarios in the training phase. We demonstrate the effectiveness of Meta-MCTG, which improves compositional testing performance (by up to 3.64%) in 94.4% of cases.
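To make the meta-learning idea concrete, here is a minimal first-order sketch in PyTorch, assuming a MAML-style loop: a copy of the model is adapted on attribute combinations seen in the batch and then penalized on a "pseudo-compositional" batch whose combinations are recombined from those attributes but held out. All names, the loss interface, and the hyperparameters are illustrative assumptions, not the authors' code.

```python
# Hypothetical sketch of a first-order, MAML-style step that simulates
# compositional generalization during training. Names and hyperparameters
# are illustrative assumptions, not the paper's implementation.
import copy
import torch

def meta_train_step(model, loss_fn, optimizer, train_batch,
                    pseudo_comp_batch, inner_lr=1e-4, meta_weight=0.5):
    """train_batch: samples whose attribute combinations occur in training.
    pseudo_comp_batch: samples whose combinations are recombined from the
    same single attributes but excluded from train_batch, simulating a
    compositional-generalization scenario."""
    optimizer.zero_grad()

    # Ordinary MCTG objective on seen attribute combinations.
    base_loss = loss_fn(model, train_batch)
    base_loss.backward()

    # Inner step: adapt a throwaway copy of the model on the seen batch.
    adapted = copy.deepcopy(model)
    inner_loss = loss_fn(adapted, train_batch)
    grads = torch.autograd.grad(inner_loss, list(adapted.parameters()))
    with torch.no_grad():
        for p, g in zip(adapted.parameters(), grads):
            p -= inner_lr * g

    # Outer objective: how well the adapted copy handles *unseen* combinations.
    meta_loss = loss_fn(adapted, pseudo_comp_batch)
    meta_grads = torch.autograd.grad(meta_loss, list(adapted.parameters()))

    # First-order approximation: fold the meta-gradients into the base gradients.
    with torch.no_grad():
        for p, g in zip(model.parameters(), meta_grads):
            p.grad = g.mul(meta_weight) if p.grad is None else p.grad + meta_weight * g

    optimizer.step()
    return base_loss.item(), meta_loss.item()
```

The first-order shortcut avoids second-order gradients; the essential ingredient is only that the outer loss is measured on recombined, unseen attribute combinations.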
Related papers
- On the Generalization Ability of Machine-Generated Text Detectors [23.434925348283617]
The rise of large language models (LLMs) has raised concerns about machine-generated text (MGT).
This work investigates the generalization capabilities of MGT detectors along three aspects.
arXiv Detail & Related papers (2024-12-23T03:30:34Z)
- HMGIE: Hierarchical and Multi-Grained Inconsistency Evaluation for Vision-Language Data Cleansing [54.970275599061594]
We design an adaptive evaluation framework called Hierarchical and Multi-Grained Inconsistency Evaluation (HMGIE).
HMGIE can provide multi-grained evaluations covering both accuracy and completeness for various image-caption pairs.
To verify the efficacy and flexibility of the proposed framework, we construct MVTID, an image-caption dataset with diverse types and granularities of inconsistencies.
arXiv Detail & Related papers (2024-12-07T15:47:49Z)
- Automatic Evaluation for Text-to-image Generation: Task-decomposed Framework, Distilled Training, and Meta-evaluation Benchmark [62.58869921806019]
We propose a task-decomposition evaluation framework based on GPT-4o to automatically construct a new training dataset (see the sketch after this entry).
We design innovative training strategies to effectively distill GPT-4o's evaluation capabilities into a 7B open-source MLLM, MiniCPM-V-2.6.
Experimental results demonstrate that our distilled open-source MLLM significantly outperforms the current state-of-the-art GPT-4o-base baseline.
arXiv Detail & Related papers (2024-11-23T08:06:06Z)
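As a rough illustration of how a task-decomposed evaluation framework can construct training data, the sketch below queries GPT-4o with one sub-question at a time about a prompt-image pair and records the answers as distillation targets. The sub-questions, prompt wording, and record format are assumptions for illustration, not the paper's pipeline.

```python
# Hypothetical sketch of task-decomposed evaluation-data construction.
# Sub-questions, prompt wording, and the record format are illustrative
# guesses, not the paper's actual pipeline.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SUB_QUESTIONS = [  # one plausible decomposition of "does the image match?"
    "Does the image contain every object mentioned in the prompt?",
    "Are the attributes (color, count, size) of each object correct?",
    "Are the spatial relations between the objects correct?",
]

def build_distillation_record(prompt_text: str, image_url: str) -> dict:
    """Query GPT-4o once per sub-question; keep the answers as targets
    for fine-tuning a small open-source MLLM."""
    record = {"prompt": prompt_text, "image": image_url, "judgments": []}
    for question in SUB_QUESTIONS:
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[{
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": f'Text prompt: "{prompt_text}"\n{question}\n'
                             "Give a short judgment and a score from 1 to 5."},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }],
        )
        record["judgments"].append(
            {"question": question, "answer": resp.choices[0].message.content})
    return record
```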
- Contextualization Distillation from Large Language Model for Knowledge Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments (see the sketch after this entry).
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
arXiv Detail & Related papers (2024-01-28T08:56:49Z)
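A minimal sketch of the contextualization step, assuming only that some instruction-following LLM is available behind a simple callable; the prompt wording and names are illustrative, not the paper's.

```python
# Hypothetical sketch: instruct any LLM to expand a compact knowledge-graph
# triplet into a context-rich passage. The prompt wording and the llm()
# callable are illustrative assumptions.
from typing import Callable

def contextualize_triplet(head: str, relation: str, tail: str,
                          llm: Callable[[str], str]) -> str:
    """Return a context-rich passage describing the triplet, usable as
    auxiliary supervision for discriminative or generative KGC models."""
    prompt = (
        f"Consider the knowledge-graph triplet ({head}, {relation}, {tail}).\n"
        "Write a short factual paragraph that explains this relationship "
        "and gives relevant background on both entities."
    )
    return llm(prompt)

# Usage with any backend, e.g.:
#   passage = contextualize_triplet("Marie Curie", "award_received",
#                                   "Nobel Prize in Physics", my_llm)
```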
- Compositional Generalization for Multi-label Text Classification: A Data-Augmentation Approach [40.879814474959545]
We assess the compositional generalization ability of existing multi-label text classification models.
Our results show that these models often fail to generalize to compositional concepts encountered infrequently during training.
To address this, we introduce a data augmentation method that leverages two innovative text generation models.
arXiv Detail & Related papers (2023-12-18T15:18:57Z)
- T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation [62.71574695256264]
T2I-CompBench is a comprehensive benchmark for open-world compositional text-to-image generation.
We propose several evaluation metrics specifically designed to evaluate compositional text-to-image generation.
We introduce a new approach, Generative mOdel fine-tuning with Reward-driven Sample selection (GORS), to boost compositional text-to-image generation ability (see the sketch after this entry).
arXiv Detail & Related papers (2023-07-12T17:59:42Z)
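Reward-driven sample selection admits a simple reading: generate several candidates per prompt, score their compositional alignment with a reward model, and fine-tune only on the best. The sketch below shows that selection step; `generate_fn`, `reward_fn`, and the keep rule are assumptions, not GORS's exact procedure.

```python
# Hypothetical sketch of reward-driven sample selection: keep only
# high-reward (prompt, image) pairs for fine-tuning. The reward model,
# threshold rule, and data shapes are illustrative assumptions.

def select_finetuning_samples(prompts, generate_fn, reward_fn,
                              n_candidates=8, keep_fraction=0.25):
    """For each prompt, generate candidates, score their compositional
    alignment, and keep the top fraction as fine-tuning data."""
    selected = []
    for prompt in prompts:
        candidates = [generate_fn(prompt) for _ in range(n_candidates)]
        scored = sorted(((reward_fn(prompt, img), img) for img in candidates),
                        key=lambda pair: pair[0], reverse=True)
        n_keep = max(1, int(keep_fraction * n_candidates))
        selected.extend((prompt, img, r) for r, img in scored[:n_keep])
    # (prompt, image, reward) triples; rewards can also weight the
    # fine-tuning loss instead of hard-filtering.
    return selected
```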
- Seen to Unseen: Exploring Compositional Generalization of Multi-Attribute Controllable Dialogue Generation [23.79168163871952]
Existing controllable dialogue generation work focuses on single-attribute control.
We propose a prompt-based disentangled controllable dialogue generation model, DCG.
arXiv Detail & Related papers (2023-06-17T10:50:19Z)
- TART: Improved Few-shot Text Classification Using Task-Adaptive Reference Transformation [23.02986307143718]
We propose a novel Task-Adaptive Reference Transformation (TART) network to enhance generalization.
Our model surpasses the state-of-the-art method by 7.4% and 5.4% in 1-shot and 5-shot classification, respectively, on the 20 Newsgroups dataset.
arXiv Detail & Related papers (2023-06-03T18:38:02Z)
- GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding [51.37738394062851]
GIFT can adapt various Transformer-based pre-trained language models for universal MPC understanding.
Four types of edges are designed to integrate graph-induced signals into attention mechanisms (see the sketch after this entry).
arXiv Detail & Related papers (2023-05-16T11:35:24Z)
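A common way to integrate graph-induced signals into attention, consistent with (but not necessarily identical to) GIFT's design, is to add a learned scalar bias per edge type to the attention logits. The sketch below shows that pattern; the four-type labeling of position pairs and all names are illustrative assumptions.

```python
# Hypothetical sketch: edge-type-aware attention bias. A learned scalar per
# edge type is added to the attention logits, so utterance pairs connected
# by different graph edges (e.g. reply-to vs. same-speaker) attend
# differently. The four-type scheme and all names are illustrative.
import torch
import torch.nn as nn

class GraphBiasedAttention(nn.Module):
    def __init__(self, dim, num_edge_types=4):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        # One learned scalar bias per edge type.
        self.edge_bias = nn.Embedding(num_edge_types, 1)
        self.scale = dim ** -0.5

    def forward(self, x, edge_types):
        """x: (batch, seq, dim); edge_types: (batch, seq, seq) integer
        tensor labeling the graph edge between every pair of positions."""
        scores = self.q(x) @ self.k(x).transpose(-2, -1) * self.scale
        scores = scores + self.edge_bias(edge_types).squeeze(-1)
        return torch.softmax(scores, dim=-1) @ self.v(x)

# Usage: attn = GraphBiasedAttention(dim=768)
#        out = attn(hidden_states, edge_type_ids)  # ids in {0, 1, 2, 3}
```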