Self-training with Few-shot Rationalization: Teacher Explanations Aid
Student in Few-shot NLU
- URL: http://arxiv.org/abs/2109.08259v1
- Date: Fri, 17 Sep 2021 00:36:46 GMT
- Title: Self-training with Few-shot Rationalization: Teacher Explanations Aid
Student in Few-shot NLU
- Authors: Meghana Moorthy Bhat, Alessandro Sordoni, Subhabrata Mukherjee
- Abstract summary: We develop a framework based on self-training language models with limited task-specific labels and rationales.
We show that the neural model performance can be significantly improved by making it aware of its rationalized predictions.
- Score: 88.8401599172922
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While pre-trained language models have obtained state-of-the-art performance
for several natural language understanding tasks, they are quite opaque in
terms of their decision-making process. While some recent works focus on
rationalizing neural predictions by highlighting salient concepts in the text
as justifications or rationales, they rely on thousands of labeled training
examples for both task labels as well as an-notated rationales for every
instance. Such extensive large-scale annotations are infeasible to obtain for
many tasks. To this end, we develop a multi-task teacher-student framework
based on self-training language models with limited task-specific labels and
rationales, and judicious sample selection to learn from informative
pseudo-labeled examples1. We study several characteristics of what constitutes
a good rationale and demonstrate that the neural model performance can be
significantly improved by making it aware of its rationalized predictions,
particularly in low-resource settings. Extensive experiments in several
bench-mark datasets demonstrate the effectiveness of our approach.
Related papers
- Few Shot Rationale Generation using Self-Training with Dual Teachers [4.91890875296663]
Self-rationalizing models that also generate a free-text explanation for their predicted labels are an important tool to build trustworthy AI applications.
We introduce a novel dual-teacher learning framework, which learns two specialized teacher models for task prediction and rationalization.
We formulate a new loss function, Masked Label Regularization (MLR) which promotes explanations to be strongly conditioned on predicted labels.
arXiv Detail & Related papers (2023-06-05T23:57:52Z) - Active Learning Principles for In-Context Learning with Large Language
Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
arXiv Detail & Related papers (2023-05-23T17:16:04Z) - Skill-Based Few-Shot Selection for In-Context Learning [123.26522773708683]
Skill-KNN is a skill-based few-shot selection method for in-context learning.
It does not require training or fine-tuning of any models, making it suitable for frequently expanding or changing example banks.
Experimental results across five cross-domain semantic parsing datasets and six backbone models show that Skill-KNN significantly outperforms existing methods.
arXiv Detail & Related papers (2023-05-23T16:28:29Z) - An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z) - Analyzing the Limits of Self-Supervision in Handling Bias in Language [52.26068057260399]
We evaluate how well language models capture the semantics of four tasks for bias: diagnosis, identification, extraction and rephrasing.
Our analyses indicate that language models are capable of performing these tasks to widely varying degrees across different bias dimensions, such as gender and political affiliation.
arXiv Detail & Related papers (2021-12-16T05:36:08Z) - Reordering Examples Helps during Priming-based Few-Shot Learning [6.579039107070663]
We show that PERO can learn to generalize efficiently using as few as 10 examples.
We demonstrate the effectiveness of the proposed method on the tasks of sentiment classification, natural language inference and fact retrieval.
arXiv Detail & Related papers (2021-06-03T11:02:36Z) - Representation Learning from Limited Educational Data with Crowdsourced
Labels [45.44620098891902]
We propose a novel framework which aims to learn effective representations from limited data with crowdsourced labels.
Specifically, we design a grouping based deep neural network to learn embeddings from a limited number of training samples.
We develop a hard example selection procedure to adaptively pick up training examples that are misclassified by the model.
arXiv Detail & Related papers (2020-09-23T15:34:40Z) - Learning What Makes a Difference from Counterfactual Examples and
Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.