Few Shot Rationale Generation using Self-Training with Dual Teachers
- URL: http://arxiv.org/abs/2306.03315v1
- Date: Mon, 5 Jun 2023 23:57:52 GMT
- Title: Few Shot Rationale Generation using Self-Training with Dual Teachers
- Authors: Aditya Srikanth Veerubhotla, Lahari Poddar, Jun Yin, György Szarvas,
Sharanya Eswaran
- Abstract summary: Self-rationalizing models that also generate a free-text explanation for their predicted labels are an important tool to build trustworthy AI applications.
We introduce a novel dual-teacher learning framework, which learns two specialized teacher models for task prediction and rationalization.
We formulate a new loss function, Masked Label Regularization (MLR), which promotes explanations to be strongly conditioned on predicted labels.
- Score: 4.91890875296663
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-rationalizing models that also generate a free-text explanation for
their predicted labels are an important tool to build trustworthy AI
applications. Since generating explanations for annotated labels is a laborious
and costly process, recent models rely on large pretrained language models
(PLMs) as their backbone and few-shot learning. In this work we explore a
self-training approach leveraging both labeled and unlabeled data to further
improve few-shot models, under the assumption that neither human written
rationales nor annotated task labels are available at scale. We introduce a
novel dual-teacher learning framework, which learns two specialized teacher
models for task prediction and rationalization using self-training and distills
their knowledge into a multi-tasking student model that can jointly generate
the task label and rationale. Furthermore, we formulate a new loss function,
Masked Label Regularization (MLR) which promotes explanations to be strongly
conditioned on predicted labels. Evaluation on three public datasets
demonstrates that the proposed methods are effective in modeling task labels and
generating faithful rationales.
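The abstract describes the pipeline (two specialized teachers, pseudo-labeling of unlabeled data, distillation into a joint student, and the MLR term) without implementation details. Below is a minimal, hedged sketch of how such a setup could look with Hugging Face T5 models; the prompt formats, function names (pseudo_label, student_loss), the mlr_weight hyper-parameter, and the concrete form of the MLR term are illustrative assumptions, not the authors' method.
```python
# Minimal sketch of dual-teacher self-training with an MLR-style regularizer.
# Assumptions (not from the paper): T5 backbones, prompt formats, and the
# concrete form of the regularizer are illustrative only.
import torch
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tok = T5TokenizerFast.from_pretrained("t5-small")

# Two specialized teachers: one for task prediction, one for rationalization.
label_teacher = T5ForConditionalGeneration.from_pretrained("t5-small")
rationale_teacher = T5ForConditionalGeneration.from_pretrained("t5-small")
# Multi-tasking student that generates the label and rationale jointly.
student = T5ForConditionalGeneration.from_pretrained("t5-small")


def pseudo_label(unlabeled_texts):
    """Self-training step: teachers annotate unlabeled inputs with a label
    and a rationale conditioned on that label."""
    records = []
    for text in unlabeled_texts:
        ids = tok("label: " + text, return_tensors="pt").input_ids
        label = tok.decode(label_teacher.generate(ids, max_new_tokens=8)[0],
                           skip_special_tokens=True)
        ids = tok(f"explain: {text} label: {label}", return_tensors="pt").input_ids
        rationale = tok.decode(rationale_teacher.generate(ids, max_new_tokens=64)[0],
                               skip_special_tokens=True)
        records.append((text, label, rationale))
    return records


def seq2seq_loss(model, source, target):
    """Standard teacher-forced seq2seq loss on a single (source, target) pair."""
    enc = tok(source, return_tensors="pt")
    labels = tok(target, return_tensors="pt").input_ids
    return model(**enc, labels=labels).loss


def student_loss(text, label, rationale, mlr_weight=0.5):
    """Distillation loss for the student plus an MLR-style term.

    Assumption: the abstract only says MLR promotes explanations that are
    strongly conditioned on predicted labels; here that is approximated by
    contrasting the rationale loss with the label visible vs. masked out."""
    joint = seq2seq_loss(student, f"explain: {text}",
                         f"label: {label} rationale: {rationale}")
    with_label = seq2seq_loss(student, f"explain: {text} label: {label}", rationale)
    masked = seq2seq_loss(student, f"explain: {text} label: <extra_id_0>", rationale)
    mlr = torch.relu(with_label - masked)  # label-conditioned loss should be lower
    return joint + mlr_weight * mlr
```
In a full run one would first fine-tune the two teachers on the few labeled examples, use pseudo_label to annotate the unlabeled pool, and then minimize student_loss over both the gold and pseudo-labeled data.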
Related papers
- Pre-Trained Vision-Language Models as Partial Annotators [40.89255396643592]
Pre-trained vision-language models learn from massive data to model unified representations of images and natural language.
In this paper, we investigate a novel "pre-trained annotating - weakly-supervised learning" paradigm for applying pre-trained models, and experiment on image classification tasks.
arXiv Detail & Related papers (2024-05-23T17:17:27Z) - Automated Labeling of German Chest X-Ray Radiology Reports using Deep
Learning [50.591267188664666]
We propose a deep learning-based CheXpert label prediction model, pre-trained on reports labeled by a rule-based German CheXpert model.
Our results demonstrate the effectiveness of our approach, which significantly outperformed the rule-based model on all three tasks.
arXiv Detail & Related papers (2023-06-09T16:08:35Z) - Data-Centric Learning from Unlabeled Graphs with Diffusion Model [21.417410006246147]
We propose to extract the knowledge underlying the large set of unlabeled graphs as a specific set of useful data points.
We use a diffusion model to fully utilize the unlabeled graphs and design two new objectives to guide the model's denoising process.
Experiments demonstrate that our data-centric approach performs significantly better than fifteen existing methods on fifteen tasks.
arXiv Detail & Related papers (2023-03-17T16:39:21Z) - Distilling Knowledge from Self-Supervised Teacher by Embedding Graph
Alignment [52.704331909850026]
We formulate a new knowledge distillation framework to transfer the knowledge from self-supervised pre-trained models to any other student network.
Inspired by the spirit of instance discrimination in self-supervised learning, we model the instance-instance relations by a graph formulation in the feature embedding space.
Our distillation scheme can be flexibly applied to transfer the self-supervised knowledge to enhance representation learning on various student networks.
arXiv Detail & Related papers (2022-11-23T19:27:48Z) - Label Matching Semi-Supervised Object Detection [85.99282969977541]
Semi-supervised object detection has made significant progress with the development of mean teacher driven self-training.
The label mismatch problem is not yet fully explored in previous works, leading to severe confirmation bias during self-training.
We propose a simple yet effective LabelMatch framework from two different yet complementary perspectives.
arXiv Detail & Related papers (2022-06-14T05:59:41Z) - Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information and are proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
arXiv Detail & Related papers (2021-10-04T08:51:36Z) - Self-training with Few-shot Rationalization: Teacher Explanations Aid
Student in Few-shot NLU [88.8401599172922]
We develop a framework based on self-training language models with limited task-specific labels and rationales.
We show that the neural model performance can be significantly improved by making it aware of its rationalized predictions.
arXiv Detail & Related papers (2021-09-17T00:36:46Z) - Noisy Self-Knowledge Distillation for Text Summarization [83.49809205891496]
We apply self-knowledge distillation to text summarization which we argue can alleviate problems with maximum-likelihood training.
Our student summarization model is trained with guidance from a teacher which generates smoothed labels to help regularize training.
We demonstrate experimentally on three benchmarks that our framework boosts the performance of both pretrained and non-pretrained summarizers.
arXiv Detail & Related papers (2020-09-15T12:53:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.