Unified Low-Resource Sequence Labeling by Sample-Aware Dynamic Sparse
Finetuning
- URL: http://arxiv.org/abs/2311.03748v1
- Date: Tue, 7 Nov 2023 06:19:37 GMT
- Title: Unified Low-Resource Sequence Labeling by Sample-Aware Dynamic Sparse
Finetuning
- Authors: Sarkar Snigdha Sarathi Das, Ranran Haoran Zhang, Peng Shi, Wenpeng
Yin, Rui Zhang
- Abstract summary: FISH-DIP is a sample-aware dynamic sparse finetuning strategy that selectively focuses on a fraction of parameters.
We demonstrate that FISH-DIP can smoothly optimize the model in low-resource settings, offering up to 40% performance improvements.
- Score: 24.765911297156855
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Unified Sequence Labeling, which articulates different sequence labeling
problems such as Named Entity Recognition, Relation Extraction, and Semantic Role
Labeling in a generalized sequence-to-sequence format, opens up the opportunity
to make maximum use of large language model knowledge for structured prediction.
Unfortunately, this requires formatting the tasks into a specialized augmented
format unknown to the base pretrained language models (PLMs), necessitating
finetuning on the target format. This significantly limits its usefulness in
data-limited settings, where finetuned large models cannot properly generalize
to the target format. To address this challenge and leverage PLM knowledge
effectively, we propose FISH-DIP, a sample-aware dynamic sparse finetuning
strategy that selectively focuses on a fraction of the parameters, informed by
feedback from highly regressing examples, during the fine-tuning process. By
leveraging the dynamism of sparsity, our approach mitigates the impact of
well-learned samples and prioritizes underperforming instances for improvement
in generalization. Across five sequence labeling tasks, we demonstrate that
FISH-DIP can smoothly optimize the model in low-resource settings, offering up
to 40% performance improvements over full fine-tuning depending on the target
evaluation settings. Also, compared to in-context learning and other
parameter-efficient fine-tuning approaches, FISH-DIP performs comparably or
better, notably in extreme low-resource settings.
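The mechanism the abstract describes (updating only a small fraction of parameters, chosen from gradient feedback on the currently worst-performing samples, and refreshing that choice during training) can be illustrated with a short PyTorch sketch. This is a minimal sketch of the general idea under stated assumptions, not the authors' released implementation: the function names, the squared-gradient (Fisher-style) importance score, the `keep_ratio` value, the plain SGD update, and the assumption that `loss_fn(model, batch)` returns a scalar loss are all illustrative choices.

```python
import torch

def select_sparse_mask(model, loss_fn, hard_batch, keep_ratio=0.01):
    """Score each parameter by its squared gradient (a Fisher-style proxy)
    on the currently worst-performing samples, and mark only the top
    `keep_ratio` fraction of parameters as trainable."""
    model.zero_grad()
    loss_fn(model, hard_batch).backward()
    grads = [p.grad.detach().pow(2).flatten()
             for p in model.parameters() if p.grad is not None]
    scores = torch.cat(grads)
    k = max(1, int(keep_ratio * scores.numel()))
    threshold = torch.topk(scores, k).values.min()
    masks, offset = [], 0
    for p in model.parameters():
        if p.grad is None:
            masks.append(torch.zeros_like(p, dtype=torch.bool))
            continue
        n = p.numel()
        masks.append((scores[offset:offset + n] >= threshold).view_as(p))
        offset += n
    model.zero_grad()
    return masks

def sparse_sgd_step(model, loss_fn, batch, masks, lr=1e-5):
    """One training step in which gradients outside the mask are discarded,
    so only the selected fraction of parameters is updated."""
    model.zero_grad()
    loss = loss_fn(model, batch)
    loss.backward()
    with torch.no_grad():
        for p, m in zip(model.parameters(), masks):
            if p.grad is not None:
                p.sub_(lr * p.grad * m)
    return loss.item()
```

In a training loop, the mask would be recomputed periodically from the examples with the highest current loss (the "highly regressing" ones), so the sparse set of trainable parameters shifts as some samples become well learned.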
Related papers
- Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods [69.36397993451742]
This work introduces Context-aware Prompt Tuning (CPT), a method inspired by ICL, PT, and adversarial attacks.
We modify specific context tokens, considering the unique structure of input and output formats.
Inspired by adversarial attacks, we adjust the input based on the labels present in the context, focusing on minimizing, rather than maximizing, the loss.
arXiv Detail & Related papers (2024-10-22T17:45:47Z) - Auto-GDA: Automatic Domain Adaptation for Efficient Grounding Verification in Retrieval Augmented Generation [13.120801609024147]
Retrieval augmented generation (RAG) has been shown to enhance the factuality of large language model (LLM) outputs.
RAG inputs are more complex than most datasets used for training NLI models.
We introduce Automatic Generative Domain Adaptation (Auto-GDA) to enable unsupervised domain adaptation.
arXiv Detail & Related papers (2024-10-04T14:21:27Z) - Functional Graphical Models: Structure Enables Offline Data-Driven Optimization [111.28605744661638]
We show how structure can enable sample-efficient data-driven optimization.
We also present a data-driven optimization algorithm that infers the FGM structure itself.
arXiv Detail & Related papers (2024-01-08T22:33:14Z) - A Closer Look at the Few-Shot Adaptation of Large Vision-Language Models [19.17722702457403]
We show that state-of-the-art ETL approaches exhibit strong performance only in narrowly-defined experimental setups.
We propose a CLass-Adaptive linear Probe (CLAP) objective, whose balancing term is optimized via an adaptation of the general Augmented Lagrangian method.
arXiv Detail & Related papers (2023-12-20T02:58:25Z) - Uncertainty-aware Parameter-Efficient Self-training for Semi-supervised
Language Understanding [38.11411155621616]
We study self-training as one of the predominant semi-supervised learning approaches.
We present UPET, a novel Uncertainty-aware self-Training framework.
We show that UPET achieves a substantial improvement in terms of performance and efficiency.
arXiv Detail & Related papers (2023-10-19T02:18:29Z) - Prototypical Fine-tuning: Towards Robust Performance Under Varying Data
Sizes [47.880781811936345]
We propose a novel framework for fine-tuning pretrained language models (LMs).
Our prototypical fine-tuning approach can automatically adjust the model capacity according to the number of data points and the model's inherent attributes.
arXiv Detail & Related papers (2022-11-24T14:38:08Z) - Partial sequence labeling with structured Gaussian Processes [8.239028141030621]
We propose structured Gaussian Processes for partial sequence labeling.
It encodes uncertainty in the prediction and does not need extra effort for model selection and hyperparameter learning.
It is evaluated on several sequence labeling tasks and the experimental results show the effectiveness of the proposed model.
arXiv Detail & Related papers (2022-09-20T00:56:49Z) - Fine-grained Retrieval Prompt Tuning [149.9071858259279]
Fine-grained Retrieval Prompt Tuning steers a frozen pre-trained model to perform the fine-grained retrieval task from the perspectives of sample prompt and feature adaptation.
Our FRPT with fewer learnable parameters achieves state-of-the-art performance on three widely-used fine-grained datasets.
arXiv Detail & Related papers (2022-07-29T04:10:04Z) - SDA: Improving Text Generation with Self Data Augmentation [88.24594090105899]
We propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation.
Unlike most existing sentence-level augmentation strategies, our method is more general and could be easily adapted to any MLE-based training procedure.
arXiv Detail & Related papers (2021-01-02T01:15:57Z) - Feature Transformation Ensemble Model with Batch Spectral Regularization
for Cross-Domain Few-Shot Classification [66.91839845347604]
We propose an ensemble prediction model by performing diverse feature transformations after a feature extraction network.
We use a batch spectral regularization term to suppress the singular values of the feature matrix during pre-training to improve the generalization ability of the model (a sketch of this term follows the list).
The proposed model can then be fine-tuned in the target domain to address few-shot classification.
arXiv Detail & Related papers (2020-05-18T05:31:04Z)
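The batch spectral regularization term mentioned in the last entry is concrete enough to sketch: compute the singular values of the batch feature matrix and add a penalty that suppresses them. The snippet below is a minimal illustration, not the authors' code; penalizing the full spectrum with squared singular values and the penalty weight are assumptions.

```python
import torch

def batch_spectral_penalty(features: torch.Tensor, weight: float = 1e-3) -> torch.Tensor:
    """Suppress the singular values of a (batch_size x feature_dim) feature
    matrix. Penalizing the whole spectrum with squared singular values and
    the weight are illustrative assumptions, not the paper's exact form."""
    singular_values = torch.linalg.svdvals(features)  # differentiable
    return weight * singular_values.pow(2).sum()

# Sketch of use inside a pre-training step:
#   feats = backbone(inputs)                                   # (B, D)
#   loss = task_loss(classifier(feats), labels) + batch_spectral_penalty(feats)
#   loss.backward()
```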
This list is automatically generated from the titles and abstracts of the papers on this site.