Related papers: Teacher-Student Training for Debiasing: General Permutation Debiasing for Large Language Models

URL: http://arxiv.org/abs/2403.13590v1
Date: Wed, 20 Mar 2024 13:38:07 GMT
Title: Teacher-Student Training for Debiasing: General Permutation Debiasing for Large Language Models
Authors: Adian Liusie, Yassir Fathullah, Mark J. F. Gales
Abstract summary: Large Language Models (LLMs) have demonstrated impressive zero-shot capabilities and versatility in NLP tasks, but they sometimes fail to maintain crucial invariances for specific tasks. Debiasing techniques can mitigate this, though often at a high computational cost at inference. This paper addresses that inference-time inefficiency.
Score: 39.82130327284791
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) have demonstrated impressive zero-shot capabilities and versatility in NLP tasks; however, they sometimes fail to maintain crucial invariances for specific tasks. One example is permutation sensitivity, where LLMs' outputs may significantly vary depending on the order of the input options. While debiasing techniques can mitigate these issues, and yield better performance and reliability, they often come with a high computational cost at inference. This paper addresses this inefficiency at inference time. The aim is to distill the capabilities of a computationally intensive, debiased, teacher model into a more compact student model. We explore two variants of student models: one based on pure distillation, and the other on an error-correction approach for more complex tasks, where the student corrects a single biased decision from the teacher to achieve a debiased output. Our approach is general and can be applied to both black-box and white-box LLMs. Furthermore, we demonstrate that our compact, encoder-only student models can outperform their larger, biased teacher counterparts, achieving better results with significantly fewer parameters.
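
To make the cost described in the abstract concrete, here is a minimal Python sketch of one simple form of permutation debiasing: the option probabilities of a position-biased scorer are averaged over every ordering of the options. It is an illustration under stated assumptions, not the paper's implementation. The names `permutation_debias` and `toy_biased_scorer`, the toy content scores, and the position-bias weights are all hypothetical, and averaging over all orderings is assumed here rather than stated in the abstract.

```python
from itertools import permutations
from typing import Callable, Dict, List, Sequence

# Hypothetical toy data: how much the "model" truly prefers each option,
# and a multiplicative bias that favors whichever option is listed first.
CONTENT_SCORE = {"Paris": 0.5, "Rome": 0.3, "Oslo": 0.2}
POSITION_BIAS = [2.0, 1.0, 1.0]

def toy_biased_scorer(ordered: Sequence[str]) -> List[float]:
    """Stand-in for a biased teacher: returns one probability per position."""
    weights = [CONTENT_SCORE[opt] * POSITION_BIAS[pos]
               for pos, opt in enumerate(ordered)]
    total = sum(weights)
    return [w / total for w in weights]

def permutation_debias(
    options: Sequence[str],
    scorer: Callable[[Sequence[str]], List[float]],
) -> Dict[str, float]:
    """Average the scorer's probabilities over all orderings of the options.

    The scorer is called n! times for n options, which is the
    inference-time cost the abstract refers to.
    """
    totals = {opt: 0.0 for opt in options}
    orderings = list(permutations(options))
    for ordered in orderings:
        probs = scorer(ordered)
        # Map each position's probability back to the option placed there.
        for opt, p in zip(ordered, probs):
            totals[opt] += p
    return {opt: t / len(orderings) for opt, t in totals.items()}

# A single biased pass can be swayed by the order of the options...
order = ["Rome", "Paris", "Oslo"]
single_pass = dict(zip(order, toy_biased_scorer(order)))
# ...while the averaged distribution does not depend on the input order.
debiased = permutation_debias(order, toy_biased_scorer)
```

In this toy setup the single pass prefers whichever strong option is listed first, whereas the averaged distribution is the same for any input order. A distilled student, as described in the abstract, would be trained to approximate the averaged output in a single forward pass; that training step is not shown here.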

Related papers

Bias Similarity Across Large Language Models [32.0365189539138]
We analyze bias through output distribution across multiple dimensions using two datasets (4K and 1M questions). Our results show that fine-tuning has minimal impact on output distributions, and that proprietary models tend to answer "unknown" too often in order to minimize bias, compromising accuracy and utility. Open-source models like Llama3-Chat and Gemma2-it demonstrate fairness comparable to proprietary models like GPT-4, challenging the assumption that larger, closed-source models are inherently less biased.
arXiv Detail & Related papers (2024-10-15T19:21:14Z)
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights [89.56181323849512]
We propose SuperCorrect, a framework that uses a large teacher model to supervise and correct both the reasoning and reflection processes of a smaller student model. In the first stage, we extract hierarchical high-level and detailed thought templates from the teacher model to guide the student model in eliciting more fine-grained reasoning thoughts. In the second stage, we introduce cross-model collaborative direct preference optimization (DPO) to enhance the self-correction abilities of the student model.
arXiv Detail & Related papers (2024-10-11T17:25:52Z)
Subtle Errors Matter: Preference Learning via Error-injected Self-editing [59.405145971637204]
We propose a novel preference learning framework called eRror-Injected Self-Editing (RISE). RISE injects predefined subtle errors into partial tokens of correct solutions to construct hard pairs for error mitigation. Experiments validate the effectiveness of RISE, with preference learning on Qwen2-7B-Instruct yielding notable improvements of 3.0% on GSM8K and 7.9% on MATH.
arXiv Detail & Related papers (2024-10-09T07:43:38Z)
BiasDPO: Mitigating Bias in Language Models through Direct Preference Optimization [0.0]
Large Language Models (LLMs) have become pivotal in advancing natural language processing, yet their potential to perpetuate biases poses significant concerns. This paper introduces a new framework employing Direct Preference Optimization (DPO) to mitigate gender, racial, and religious biases in English text. By developing a loss function that favors less biased over biased completions, our approach cultivates a preference for respectful and non-discriminatory language.
arXiv Detail & Related papers (2024-07-18T22:32:20Z)
Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve the model alignment of different task scenarios. We implement UAL in a simple fashion -- adaptively setting the label smoothing value of training according to the uncertainty of individual samples. Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z)
COBias and Debias: Minimizing Language Model Pairwise Accuracy Bias via Nonlinear Integer Programming [12.287692969438169]
We tackle language models' imbalance in per-class prediction accuracy by reconceptualizing it as the Contextuality Bias (COBias). We are the first to engage nonlinear integer programming (NIP) to debias it. DNIP simultaneously achieves significant COBias reduction and accuracy improvement over the conventional ICL approach.
arXiv Detail & Related papers (2024-05-13T10:30:33Z)
Improving In-context Learning via Bidirectional Alignment [41.214003703218914]
Large language models (LLMs) have shown impressive few-shot generalization on many tasks via in-context learning (ICL). We propose Bidirectional Alignment (BiAlign) to fully leverage the models' preferences for ICL examples to improve the ICL abilities of student models. Specifically, we introduce the alignment of input preferences between student and teacher models by incorporating a novel ranking loss.
arXiv Detail & Related papers (2023-12-28T15:02:03Z)
Multitask Learning Can Improve Worst-Group Outcomes [76.92646345152788]
Multitask learning (MTL) is a widely used technique. We propose to modify standard MTL by regularizing the joint multitask representation space. We find that our regularized MTL approach consistently outperforms JTT on both average and worst-group outcomes.
arXiv Detail & Related papers (2023-12-05T21:38:24Z)
Less is More -- Towards parsimonious multi-task models using structured sparsity [4.874780144224057]
This work focuses on creating sparse models optimized for multiple tasks with fewer parameters. We introduce channel-wise l1/l2 group sparsity in the parameters (or weights) of the shared convolutional layers of the multi-task learning model. We analyzed the results of group sparsity in both single-task and multi-task settings on two widely-used Multi-Task Learning (MTL) datasets.
arXiv Detail & Related papers (2023-08-23T13:09:03Z)
When in Doubt, Summon the Titans: Efficient Inference with Large Models [80.2673230098021]
We propose a two-stage framework based on distillation that realizes the modelling benefits of large models. We use the large teacher models to guide the lightweight student models to only make correct predictions on a subset of "easy" examples. Our proposed use of distillation to only handle easy instances allows for a more aggressive trade-off in the student size, thereby reducing the amortized cost of inference.
arXiv Detail & Related papers (2021-10-19T22:56:49Z)
When Ensembling Smaller Models is More Efficient than Single Large Models [52.38997176317532]
We show that ensembles can outperform single models, achieving higher accuracy while requiring fewer total FLOPs to compute. This presents an interesting observation that output diversity in ensembling can often be more efficient than training larger models.
arXiv Detail & Related papers (2020-05-01T18:56:18Z)
