Compositional Generalization for Multi-label Text Classification: A Data-Augmentation Approach
- URL: http://arxiv.org/abs/2312.11276v3
- Date: Wed, 20 Dec 2023 09:43:01 GMT
- Title: Compositional Generalization for Multi-label Text Classification: A Data-Augmentation Approach
- Authors: Yuyang Chai, Zhuang Li, Jiahui Liu, Lei Chen, Fei Li, Donghong Ji and
Chong Teng
- Abstract summary: We assess the compositional generalization ability of existing multi-label text classification models.
Our results show that these models often fail to generalize to compositional concepts encountered infrequently during training.
To address this, we introduce a data augmentation method that leverages two innovative text generation models.
- Score: 40.879814474959545
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite significant advancements in multi-label text classification, the
ability of existing models to generalize to novel and seldom-encountered
complex concepts, which are compositions of elementary ones, remains
underexplored. This research addresses this gap. By creating unique data splits
across three benchmarks, we assess the compositional generalization ability of
existing multi-label text classification models. Our results show that these
models often fail to generalize to compositional concepts encountered
infrequently during training, leading to inferior performance on tests with
these new combinations. To address this, we introduce a data augmentation
method that leverages two innovative text generation models designed to enhance
the classification models' capacity for compositional generalization. Our
experiments show that this data augmentation approach significantly improves
the compositional generalization capabilities of classification models on our
benchmarks, with both generation models surpassing other text generation
baselines.
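The abstract describes building unique data splits that hold out infrequently seen label compositions so models must generalize to them at test time. As a minimal illustrative sketch only (not the authors' actual split procedure; the `min_train_count` threshold and the `(text, labels)` data shape are assumptions), such a split could be constructed by counting label combinations and holding out the rare ones:

```python
from collections import Counter

def compositional_split(examples, min_train_count=5):
    """Split multi-label examples so that infrequent label combinations
    (compositions of elementary labels) appear only in the test set.

    `examples` is a list of (text, labels) pairs, where `labels` is an
    iterable of label strings. This is a hypothetical sketch, not the
    paper's benchmark construction.
    """
    # Count how often each exact label combination occurs in the data.
    combo_counts = Counter(frozenset(labels) for _, labels in examples)
    train, test = [], []
    for text, labels in examples:
        combo = frozenset(labels)
        # Frequent compositions stay in training; rare ones are held
        # out, forcing the classifier to generalize compositionally.
        if combo_counts[combo] >= min_train_count:
            train.append((text, labels))
        else:
            test.append((text, labels))
    return train, test
```

Under such a split, a model that has seen the labels "politics" and "sports" separately, but never together, is evaluated on documents carrying both at once.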
Related papers
- How Well Do Text Embedding Models Understand Syntax? [50.440590035493074]
The ability of text embedding models to generalize across a wide range of syntactic contexts remains under-explored.
Our findings reveal that existing text embedding models have not sufficiently addressed these syntactic understanding challenges.
We propose strategies to augment the generalization ability of text embedding models in diverse syntactic scenarios.
arXiv Detail & Related papers (2023-11-14T08:51:00Z)
- Extractive Text Summarization Using Generalized Additive Models with Interactions for Sentence Selection [0.0]
This work studies the application of two modern Generalized Additive Models with interactions, namely Explainable Boosting Machine and GAMI-Net, to the extractive summarization problem based on linguistic features and binary classification.
arXiv Detail & Related papers (2022-12-21T00:56:50Z)
- Compositional Generalisation with Structured Reordering and Fertility Layers [121.37328648951993]
Seq2seq models have been shown to struggle with compositional generalisation.
We present a flexible end-to-end differentiable neural model that composes two structural operations.
arXiv Detail & Related papers (2022-10-06T19:51:31Z)
- Federated Learning Aggregation: New Robust Algorithms with Guarantees [63.96013144017572]
Federated learning has been recently proposed for distributed model training at the edge.
This paper presents a complete general mathematical convergence analysis to evaluate aggregation strategies in a federated learning framework.
We derive novel aggregation algorithms which are able to modify their model architecture by differentiating client contributions according to the value of their losses.
arXiv Detail & Related papers (2022-05-22T16:37:53Z)
- Revisiting the Compositional Generalization Abilities of Neural Sequence Models [23.665350744415004]
We focus on one-shot primitive generalization as introduced by the popular SCAN benchmark.
We demonstrate that modifying the training distribution in simple and intuitive ways enables standard seq-to-seq models to achieve near-perfect generalization performance.
arXiv Detail & Related papers (2022-03-14T18:03:21Z)
- Improving Compositional Generalization with Self-Training for Data-to-Text Generation [36.973617793800315]
We study the compositional generalization of current generation models in data-to-text tasks.
By simulating structural shifts in the compositional Weather dataset, we show that T5 models fail to generalize to unseen structures.
We propose an approach based on self-training using finetuned BLEURT for pseudo-response selection.
arXiv Detail & Related papers (2021-10-16T04:26:56Z)
- Improving Label Quality by Jointly Modeling Items and Annotators [68.8204255655161]
We propose a fully Bayesian framework for learning ground truth labels from noisy annotators.
Our framework ensures scalability by factoring a generative, Bayesian soft clustering model over label distributions into the classic Dawid and Skene joint annotator-data model.
arXiv Detail & Related papers (2021-06-20T02:15:20Z)
- Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z)
- A Systematic Assessment of Syntactic Generalization in Neural Language Models [20.589737524626745]
We present a systematic evaluation of the syntactic knowledge of neural language models.
We find substantial differences in syntactic generalization performance by model architecture.
Our results also reveal a dissociation between perplexity and syntactic generalization performance.
arXiv Detail & Related papers (2020-05-07T18:35:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy or quality of the information presented and is not responsible for any consequences of its use.