Compositional Generalization for Multi-label Text Classification: A Data-Augmentation Approach
- URL: http://arxiv.org/abs/2312.11276v3
- Date: Wed, 20 Dec 2023 09:43:01 GMT
- Title: Compositional Generalization for Multi-label Text Classification: A Data-Augmentation Approach
- Authors: Yuyang Chai, Zhuang Li, Jiahui Liu, Lei Chen, Fei Li, Donghong Ji and
Chong Teng
- Abstract summary: We assess the compositional generalization ability of existing multi-label text classification models.
Our results show that these models often fail to generalize to compositional concepts encountered infrequently during training.
To address this, we introduce a data augmentation method that leverages two innovative text generation models.
- Score: 40.879814474959545
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite significant advancements in multi-label text classification, the
ability of existing models to generalize to novel and seldom-encountered
complex concepts, which are compositions of elementary ones, remains
underexplored. This research addresses this gap. By creating unique data splits
across three benchmarks, we assess the compositional generalization ability of
existing multi-label text classification models. Our results show that these
models often fail to generalize to compositional concepts encountered
infrequently during training, leading to inferior performance on tests with
these new combinations. To address this, we introduce a data augmentation
method that leverages two innovative text generation models designed to enhance
the classification models' capacity for compositional generalization. Our
experiments show that this data augmentation approach significantly improves
the compositional generalization capabilities of classification models on our
benchmarks, with both generation models surpassing other text generation
baselines.
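The abstract describes building unique data splits that hold out infrequently seen label compositions so models must generalize to them at test time. As a minimal illustrative sketch only (not the authors' actual split procedure; the `min_train_count` threshold and the `(text, labels)` data shape are assumptions), such a split could be constructed by counting label combinations and holding out the rare ones:

```python
from collections import Counter

def compositional_split(examples, min_train_count=5):
    """Split multi-label examples so that infrequent label combinations
    (compositions of elementary labels) appear only in the test set.

    `examples` is a list of (text, labels) pairs, where `labels` is an
    iterable of label strings. This is a hypothetical sketch, not the
    paper's benchmark construction.
    """
    # Count how often each exact label combination occurs in the data.
    combo_counts = Counter(frozenset(labels) for _, labels in examples)
    train, test = [], []
    for text, labels in examples:
        combo = frozenset(labels)
        # Frequent compositions stay in training; rare ones are held
        # out, forcing the classifier to generalize compositionally.
        if combo_counts[combo] >= min_train_count:
            train.append((text, labels))
        else:
            test.append((text, labels))
    return train, test
```

Under such a split, a model that has seen the labels "politics" and "sports" separately, but never together, is evaluated on documents carrying both at once.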
Related papers
- How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored.
Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges.
We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z)
- Extractive Text Summarization Using Generalized Additive Models with Interactions for Sentence Selection [0.0]
This work studies the application of two modern Generalized Additive Models with interactions, namely Explainable Boosting Machine and GAMI-Net, to the extractive summarization problem based on linguistic features and binary classification.
arXiv Detail & Related papers (2022-12-21T00:56:50Z)
- Compositional Generalisation with Structured Reordering and Fertility Layers [121.37328648951993]
Seq2seq models have been shown to struggle with compositional generalisation.
We present a flexible end-to-end differentiable neural model that composes two structural operations.
arXiv Detail & Related papers (2022-10-06T19:51:31Z)
- Federated Learning Aggregation: New Robust Algorithms with Guarantees [63.96013144017572]
Federated learning has been recently proposed for distributed model training at the edge.
This paper presents a complete general mathematical convergence analysis to evaluate aggregation strategies in a federated learning framework.
We derive novel aggregation algorithms which are able to modify their model architecture by differentiating client contributions according to the value of their losses.
arXiv Detail & Related papers (2022-05-22T16:37:53Z)
- Revisiting the Compositional Generalization Abilities of Neural Sequence Models [23.665350744415004]
We focus on one-shot primitive generalization as introduced by the popular SCAN benchmark.
We demonstrate that modifying the training distribution in simple and intuitive ways enables standard seq-to-seq models to achieve near-perfect generalization performance.
arXiv Detail & Related papers (2022-03-14T18:03:21Z)
- Improving Compositional Generalization with Self-Training for Data-to-Text Generation [36.973617793800315]
We study the compositional generalization of current generation models in data-to-text tasks.
By simulating structural shifts in the compositional Weather dataset, we show that T5 models fail to generalize to unseen structures.
We propose an approach based on self-training using finetuned BLEURT for pseudo-response selection.
arXiv Detail & Related papers (2021-10-16T04:26:56Z)
- Improving Label Quality by Jointly Modeling Items and Annotators [68.8204255655161]
We propose a fully Bayesian framework for learning ground truth labels from noisy annotators.
Our framework ensures scalability by factoring a generative, Bayesian soft clustering model over label distributions into the classic Dawid and Skene joint annotator-data model.
arXiv Detail & Related papers (2021-06-20T02:15:20Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- A Systematic Assessment of Syntactic Generalization in Neural Language Models [20.589737524626745]
We present a systematic evaluation of the syntactic knowledge of neural language models.
We find substantial differences in syntactic generalization performance by model architecture.
Our results also reveal a dissociation between perplexity and syntactic generalization performance.
arXiv Detail & Related papers (2020-05-07T18:35:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy or quality of the information presented and is not responsible for any consequences of its use.