SPEC: Summary Preference Decomposition for Low-Resource Abstractive
Summarization
- URL: http://arxiv.org/abs/2303.14011v1
- Date: Fri, 24 Mar 2023 14:07:03 GMT
- Title: SPEC: Summary Preference Decomposition for Low-Resource Abstractive
Summarization
- Authors: Yi-Syuan Chen, Yun-Zhu Song, Hong-Han Shuai
- Abstract summary: We present a framework to transfer few-shot learning processes from source corpora to the target corpus.
Our methods achieve state-of-the-art performance on six diverse corpora with 30.11%/33.95%/27.51% and 26.74%/31.14%/24.48% average improvements on ROUGE-1/2/L under 10- and 100-example settings.
- Score: 21.037841262371355
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural abstractive summarization has been widely studied and achieved great
success with large-scale corpora. However, the considerable cost of annotating
data motivates the need for learning strategies under low-resource settings. In
this paper, we investigate the problems of learning summarizers with only few
examples and propose corresponding methods for improvements. First, typical
transfer learning methods are prone to be affected by data properties and
learning objectives in the pretext tasks. Therefore, based on pretrained
language models, we further present a meta learning framework to transfer
few-shot learning processes from source corpora to the target corpus. Second,
previous methods learn from training examples without decomposing the content
and preference. The generated summaries could therefore be constrained by the
preference bias in the training set, especially under low-resource settings. As
such, we propose decomposing the contents and preferences during learning
through the parameter modulation, which enables control over preferences during
inference. Third, given a target application, specifying required preferences
could be non-trivial because the preferences may be difficult to derive through
observations. Therefore, we propose a novel decoding method to automatically
estimate suitable preferences and generate corresponding summary candidates
from the few training examples. Extensive experiments demonstrate that our
methods achieve state-of-the-art performance on six diverse corpora with
30.11%/33.95%/27.51% and 26.74%/31.14%/24.48% average improvements on
ROUGE-1/2/L under 10- and 100-example settings.
Related papers
- Clear Preferences Leave Traces: Reference Model-Guided Sampling for Preference Learning [59.11519451499754]
Direct Preference Optimization (DPO) has emerged as a de-facto approach for aligning language models with human preferences.
Recent work has shown DPO's effectiveness relies on training data quality.
We discover that reference model probability space naturally detects high-quality training samples.
arXiv Detail & Related papers (2025-01-25T07:21:50Z) - A Systematic Examination of Preference Learning through the Lens of Instruction-Following [83.71180850955679]
We use a novel synthetic data generation pipeline to generate 48,000 instruction unique-following prompts.
With our synthetic prompts, we use two preference dataset curation methods - rejection sampling (RS) and Monte Carlo Tree Search (MCTS)
Experiments reveal that shared prefixes in preference pairs, as generated by MCTS, provide marginal but consistent improvements.
High-contrast preference pairs generally outperform low-contrast pairs; however, combining both often yields the best performance.
arXiv Detail & Related papers (2024-12-18T15:38:39Z) - Integrated Image-Text Based on Semi-supervised Learning for Small Sample Instance Segmentation [1.3157419797035321]
The article proposes a novel small sample instance segmentation solution from the perspective of maximizing the utilization of existing information.
First, it helps the model fully utilize unlabeled data by learning to generate pseudo labels, increasing the number of available samples.
Second, by integrating the features of text and image, more accurate classification results can be obtained.
arXiv Detail & Related papers (2024-10-21T14:44:08Z) - Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification [34.37262622415682]
We propose a new adaptation framework called Data Adaptive Traceback.
Specifically, we utilize a zero-shot-based method to extract the most downstream task-related subset of the pre-training data.
We adopt a pseudo-label-based semi-supervised technique to reuse the pre-training images and a vision-language contrastive learning method to address the confirmation bias issue in semi-supervised learning.
arXiv Detail & Related papers (2024-07-11T18:01:58Z) - Take the Bull by the Horns: Hard Sample-Reweighted Continual Training
Improves LLM Generalization [165.98557106089777]
A key challenge is to enhance the capabilities of large language models (LLMs) amid a looming shortage of high-quality training data.
Our study starts from an empirical strategy for the light continual training of LLMs using their original pre-training data sets.
We then formalize this strategy into a principled framework of Instance-Reweighted Distributionally Robust Optimization.
arXiv Detail & Related papers (2024-02-22T04:10:57Z) - An Analysis of Initial Training Strategies for Exemplar-Free
Class-Incremental Learning [36.619804184427245]
Class-Incremental Learning (CIL) aims to build classification models from data streams.
Due to catastrophic forgetting, CIL is particularly challenging when examples from past classes cannot be stored.
Use of models pre-trained in a self-supervised way on large amounts of data has recently gained momentum.
arXiv Detail & Related papers (2023-08-22T14:06:40Z) - An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z) - CMW-Net: Learning a Class-Aware Sample Weighting Mapping for Robust Deep
Learning [55.733193075728096]
Modern deep neural networks can easily overfit to biased training data containing corrupted labels or class imbalance.
Sample re-weighting methods are popularly used to alleviate this data bias issue.
We propose a meta-model capable of adaptively learning an explicit weighting scheme directly from data.
arXiv Detail & Related papers (2022-02-11T13:49:51Z) - Optimizing Active Learning for Low Annotation Budgets [6.753808772846254]
In deep learning, active learning is usually implemented as an iterative process in which successive deep models are updated via fine tuning.
We tackle this issue by using an approach inspired by transfer learning.
We introduce a novel acquisition function which exploits the iterative nature of AL process to select samples in a more robust fashion.
arXiv Detail & Related papers (2022-01-18T18:53:10Z) - Exploiting All Samples in Low-Resource Sentence Classification: Early Stopping and Initialization Parameters [6.368871731116769]
In this study, we discuss how to exploit labeled samples without additional data or model redesigns.
We propose an integrated method, which is to initialize the model with a weight averaging method and use a non-validation stop method to train all samples.
Our results highlight the importance of the training strategy and suggest that the integrated method can be the first step in the low-resource setting.
arXiv Detail & Related papers (2021-11-12T22:31:47Z) - Few-shot Classification via Adaptive Attention [93.06105498633492]
We propose a novel few-shot learning method via optimizing and fast adapting the query sample representation based on very few reference samples.
As demonstrated experimentally, the proposed model achieves state-of-the-art classification results on various benchmark few-shot classification and fine-grained recognition datasets.
arXiv Detail & Related papers (2020-08-06T05:52:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.