Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation
- URL: http://arxiv.org/abs/2404.04232v2
- Date: Mon, 3 Jun 2024 12:08:20 GMT
- Title: Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation
- Authors: Tianqi Zhong, Zhaoyi Li, Quan Wang, Linqi Song, Ying Wei, Defu Lian, Zhendong Mao
- Abstract summary: CompMCTG is a benchmark encompassing diverse multi-aspect labeled datasets.
We introduce Meta-MCTG, a training framework incorporating meta-learning.
We demonstrate the effectiveness of Meta-MCTG by achieving clear improvements in 94.4% of cases.
- Score: 56.854968623992214
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Compositional generalization, representing the model's ability to generate text with new attribute combinations obtained by recombining single attributes from the training data, is a crucial property for multi-aspect controllable text generation (MCTG) methods. Nonetheless, a comprehensive compositional generalization evaluation benchmark for MCTG is still lacking. We propose CompMCTG, a benchmark encompassing diverse multi-aspect labeled datasets and a crafted three-dimensional evaluation protocol, to holistically evaluate the compositional generalization of MCTG approaches. We observe that existing MCTG works generally suffer a noticeable performance drop in compositional testing. To mitigate this issue, we introduce Meta-MCTG, a training framework incorporating meta-learning, in which models learn how to generalize by simulating compositional generalization scenarios during the training phase. We demonstrate the effectiveness of Meta-MCTG by achieving a clear improvement (of up to 3.64%) in compositional testing performance in 94.4% of cases.
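The core idea the abstract describes, simulating compositional generalization during training, can be sketched as a data-splitting step: from the attribute combinations available at training time, hold out combinations that are novel as a whole but whose individual attribute values all appear in the retained set. The following is a minimal illustrative sketch of that splitting step only; the function name `pseudo_comp_split`, the parameters, and the toy attributes are hypothetical and not taken from the paper:

```python
import itertools
import random

def pseudo_comp_split(train_combos, support_frac=0.6, seed=0):
    """Split training attribute combinations into a 'support' set and a
    pseudo-compositional 'query' set. Every query combination is unseen
    as a whole, yet each of its individual attribute values occurs
    somewhere in the support set -- mimicking compositional testing
    inside the training phase."""
    rng = random.Random(seed)
    combos = list(train_combos)
    rng.shuffle(combos)
    n_support = max(1, int(len(combos) * support_frac))
    support = combos[:n_support]
    # Attribute values seen per aspect within the support set.
    n_aspects = len(support[0])
    seen = [set(c[i] for c in support) for i in range(n_aspects)]
    # Keep only held-out combos whose every attribute is covered.
    query = [c for c in combos[n_support:]
             if all(c[i] in seen[i] for i in range(n_aspects))]
    return support, query

# Toy example: two aspects (sentiment, register) with two values each.
combos = list(itertools.product(["positive", "negative"],
                                ["formal", "informal"]))
support, query = pseudo_comp_split(combos, support_frac=0.75, seed=0)
```

In a meta-learning loop in the style the abstract suggests, a model would take an inner-loop update on the support combinations and be evaluated (and further updated) on the pseudo-compositional query combinations, though the exact training objective is specific to the paper.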
Related papers
- SPOR: A Comprehensive and Practical Evaluation Method for Compositional Generalization in Data-to-Text Generation [21.68354181391989]
We propose SPOR, a comprehensive and practical evaluation method for compositional generalization in data-to-text generation.
We demonstrate SPOR on two different datasets and evaluate some existing language models including LLMs.
arXiv Detail & Related papers (2024-05-17T09:25:30Z)
- Contextualization Distillation from Large Language Model for Knowledge Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-in-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
arXiv Detail & Related papers (2024-01-28T08:56:49Z)
- Compositional Generalization for Multi-label Text Classification: A Data-Augmentation Approach [40.879814474959545]
We assess the compositional generalization ability of existing multi-label text classification models.
Our results show that these models often fail to generalize to compositional concepts encountered infrequently during training.
To address this, we introduce a data augmentation method that leverages two innovative text generation models.
arXiv Detail & Related papers (2023-12-18T15:18:57Z)
- Compositional Generalization for Data-to-Text Generation [86.79706513098104]
We propose a novel model that addresses compositional generalization by clustering predicates into groups.
Our model generates text in a sentence-by-sentence manner, relying on one cluster of predicates at a time.
It significantly outperforms T5 baselines across all evaluation metrics.
arXiv Detail & Related papers (2023-12-05T13:23:15Z)
- T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation [62.71574695256264]
T2I-CompBench is a comprehensive benchmark for open-world compositional text-to-image generation.
We propose several evaluation metrics specifically designed to evaluate compositional text-to-image generation.
We introduce a new approach, Generative mOdel fine-tuning with Reward-driven Sample selection (GORS) to boost the compositional text-to-image generation abilities.
arXiv Detail & Related papers (2023-07-12T17:59:42Z)
- Seen to Unseen: Exploring Compositional Generalization of Multi-Attribute Controllable Dialogue Generation [23.79168163871952]
Existing controllable dialogue generation work focuses on single-attribute control.
We propose a prompt-based disentangled controllable dialogue generation model, DCG.
arXiv Detail & Related papers (2023-06-17T10:50:19Z)
- TART: Improved Few-shot Text Classification Using Task-Adaptive Reference Transformation [23.02986307143718]
We propose a novel Task-Adaptive Reference Transformation (TART) network to enhance generalization.
Our model surpasses the state-of-the-art method by 7.4% and 5.4% in 1-shot and 5-shot classification on the 20 Newsgroups dataset.
arXiv Detail & Related papers (2023-06-03T18:38:02Z)
- GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding [51.37738394062851]
GIFT can adapt various Transformer-based pre-trained language models for universal MPC understanding.
Four types of edges are designed to integrate graph-induced signals into attention mechanisms.
arXiv Detail & Related papers (2023-05-16T11:35:24Z)
- OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization [101.37439352091612]
We describe the effect of instruction-tuning decisions on downstream task performance when scaling both model and benchmark sizes.
We present insights about instruction-tuning decisions as applied to OPT-30B and further exploit these insights to train OPT-IML 30B and 175B, which are instruction-tuned versions of OPT.
arXiv Detail & Related papers (2022-12-22T19:56:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.