UniREx: A Unified Learning Framework for Language Model Rationale
Extraction
- URL: http://arxiv.org/abs/2112.08802v1
- Date: Thu, 16 Dec 2021 11:39:21 GMT
- Title: UniREx: A Unified Learning Framework for Language Model Rationale
Extraction
- Authors: Aaron Chan, Maziar Sanjabi, Lambert Mathias, Liang Tan, Shaoliang Nie,
Xiaochang Peng, Xiang Ren, Hamed Firooz
- Abstract summary: We propose UniREx, a unified and highly flexible learning framework for rationale extraction.
UniREx enables end-to-end customization of the rationale extractor training process.
Our best UniREx configurations achieve a superior balance of the five desiderata.
- Score: 30.39545674859148
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An extractive rationale explains a language model's (LM's) prediction on a
given task instance by highlighting the text inputs that most influenced the
output. Ideally, rationale extraction should be faithful (reflects LM's
behavior), plausible (makes sense to humans), data-efficient, and fast, without
sacrificing the LM's task performance. Prior rationale extraction works consist
of specialized approaches for addressing various subsets of these desiderata --
but never all five. Narrowly focusing on certain desiderata typically comes at
the expense of ignored ones, so existing rationale extractors are often
impractical in real-world applications. To tackle this challenge, we propose
UniREx, a unified and highly flexible learning framework for rationale
extraction, which allows users to easily account for all five factors. UniREx
enables end-to-end customization of the rationale extractor training process,
supporting arbitrary: (1) heuristic/learned rationale extractors, (2)
combinations of faithfulness and/or plausibility objectives, and (3) amounts of
gold rationale supervision. Across three text classification datasets, our best
UniREx configurations achieve a superior balance of the five desiderata, when
compared to strong baselines. Furthermore, UniREx-trained rationale extractors
can even generalize to unseen datasets and tasks.
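As a rough illustration of the three customization points named in the abstract, the sketch below shows a heuristic top-k extractor, a plausibility term that applies only when a gold rationale is available, and a weighted sum of objectives. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the weights `w_faith` and `w_plaus`, and the choice of binary cross-entropy for plausibility are all hypothetical, and this page does not specify how UniREx actually defines its losses.

```python
import math
from typing import List, Optional


def top_k_rationale(scores: List[float], k: int) -> List[int]:
    """Heuristic extractor (assumed): mark the k highest-scoring tokens with 1."""
    if k <= 0:
        return [0] * len(scores)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = set(ranked[:k])
    return [1 if i in chosen else 0 for i in range(len(scores))]


def plausibility_loss(probs: List[float], gold: List[int]) -> float:
    """Mean binary cross-entropy between token probabilities and a gold mask.

    Using BCE here is an assumption made for illustration only.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, g in zip(probs, gold):
        total -= g * math.log(p + eps) + (1 - g) * math.log(1 - p + eps)
    return total / len(probs)


def combined_loss(
    task_loss: float,
    faith_loss: float,
    probs: List[float],
    gold: Optional[List[int]],
    w_faith: float = 1.0,   # hypothetical weight on the faithfulness term
    w_plaus: float = 1.0,   # hypothetical weight on the plausibility term
) -> float:
    """Weighted sum of objectives; plausibility is skipped without gold labels."""
    total = task_loss + w_faith * faith_loss
    if gold is not None:
        total += w_plaus * plausibility_loss(probs, gold)
    return total


# Usage: four tokens with importance scores, keep the two most important.
mask = top_k_rationale([0.1, 0.7, 0.2, 0.9], k=2)
unsupervised = combined_loss(0.5, 0.25, [0.9, 0.1], gold=None)
supervised = combined_loss(0.5, 0.25, [0.9, 0.1], gold=[1, 0])
```

Passing `gold=None` for some instances and a mask for others is one simple way to model varying amounts of gold rationale supervision; setting a weight to zero drops the corresponding objective.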
Related papers
- Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs [70.15262704746378]
We propose a systematically created human-annotated dataset consisting of coherent summaries for five publicly available datasets and natural language user feedback.
Preliminary experiments with Falcon-40B and Llama-2-13B show significant performance improvements (10% Rouge-L) in terms of producing coherent summaries.
arXiv Detail & Related papers (2024-07-05T20:25:04Z)
- AutoRE: Document-Level Relation Extraction with Large Language Models [27.426703757501507]
We introduce AutoRE, an end-to-end DocRE model that adopts a novel RE extraction paradigm named RHF (Relation-Head-Facts).
Unlike existing approaches, AutoRE does not rely on the assumption of known relation options, making it more reflective of real-world scenarios.
Our experiments on the RE-DocRED dataset showcase AutoRE's best performance, achieving state-of-the-art results.
arXiv Detail & Related papers (2024-03-21T23:48:21Z)
- TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale [66.01943465390548]
We introduce TriSum, a framework for distilling large language models' text summarization abilities into a compact, local model.
Our method enhances local model performance on various benchmarks.
It also improves interpretability by providing insights into the summarization rationale.
arXiv Detail & Related papers (2024-03-15T14:36:38Z)
- GIELLM: Japanese General Information Extraction Large Language Model Utilizing Mutual Reinforcement Effect [0.0]
We introduce the General Information Extraction Large Language Model (GIELLM).
It integrates Text Classification, Sentiment Analysis, Named Entity Recognition, Relation Extraction, and Event Extraction using a uniform input-output schema.
This innovation marks the first instance of a model simultaneously handling such a diverse array of IE subtasks.
arXiv Detail & Related papers (2023-11-12T13:30:38Z)
- Rationale-Augmented Ensembles in Language Models [53.45015291520658]
We reconsider rationale-augmented prompting for few-shot in-context learning.
We identify rationale sampling in the output space as the key component to robustly improve performance.
We demonstrate that rationale-augmented ensembles achieve more accurate and interpretable results than existing prompting approaches.
arXiv Detail & Related papers (2022-07-02T06:20:57Z)
- TAGPRIME: A Unified Framework for Relational Structure Extraction [71.88926365652034]
TAGPRIME is a sequence tagging model that appends priming words about the information of the given condition to the input text.
With the self-attention mechanism in pre-trained language models, the priming words make the output contextualized representations contain more information about the given condition.
Extensive experiments and analyses on three different tasks that cover ten datasets across five different languages demonstrate the generality and effectiveness of TAGPRIME.
arXiv Detail & Related papers (2022-05-25T08:57:46Z)
- Improving Multi-Document Summarization through Referenced Flexible Extraction with Credit-Awareness [21.037841262371355]
A notable challenge in Multi-Document Summarization (MDS) is the extreme length of the input.
We present an extract-then-abstract Transformer framework to overcome the problem.
We propose a loss weighting mechanism that makes the model aware of the unequal importance of the sentences not in the pseudo extraction oracle.
arXiv Detail & Related papers (2022-05-04T04:40:39Z)
- Abstract, Rationale, Stance: A Joint Model for Scientific Claim Verification [18.330265729989843]
We propose an approach, named ARSJoint, that jointly learns the modules for the three tasks with a machine reading comprehension framework.
Experimental results on the benchmark dataset SciFact show that our approach outperforms existing work.
arXiv Detail & Related papers (2021-09-13T10:07:26Z)
- Measuring Association Between Labels and Free-Text Rationales [60.58672852655487]
In interpretable NLP, we require faithful rationales that reflect the model's decision-making process for an explained instance.
We demonstrate that pipelines, existing models for faithful extractive rationalization on information-extraction style tasks, do not extend as reliably to "reasoning" tasks requiring free-text rationales.
We turn to models that jointly predict and rationalize, a class of widely used high-performance models for free-text rationalization whose faithfulness is not yet established.
arXiv Detail & Related papers (2020-10-24T03:40:56Z)
- An Information Bottleneck Approach for Controlling Conciseness in Rationale Extraction [84.49035467829819]
We show that it is possible to better manage the trade-off between rationale conciseness and end-task performance by optimizing a bound on the Information Bottleneck (IB) objective.
Our fully unsupervised approach jointly learns an explainer that predicts sparse binary masks over sentences, and an end-task predictor that considers only the extracted rationale.
arXiv Detail & Related papers (2020-05-01T23:26:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.