Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning
- URL: http://arxiv.org/abs/2311.07099v1
- Date: Mon, 13 Nov 2023 06:13:38 GMT
- Title: Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning
- Authors: Yue Yu, Jiaming Shen, Tianqi Liu, Zhen Qin, Jing Nathan Yan, Jialu Liu, Chao Zhang, Michael Bendersky
- Abstract summary: Large language models (LLMs) have shown remarkable capabilities in various natural language understanding tasks.
We propose EASE, an Explanation-Aware Soft Ensemble framework to empower in-context learning with LLMs.
- Score: 50.00090601424348
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have shown remarkable capabilities in various
natural language understanding tasks. With only a few demonstration examples,
these LLMs can quickly adapt to target tasks without expensive gradient
updates. Common strategies to boost such 'in-context' learning ability are to
ensemble multiple model-decoded results and require the model to generate an
explanation along with the prediction. However, these models often treat
different class predictions equally and neglect the potential discrepancy
between the explanations and predictions. To fully unleash the power of
explanations, we propose EASE, an Explanation-Aware Soft Ensemble framework to
empower in-context learning with LLMs. We design two techniques,
explanation-guided ensemble and soft probability aggregation, to mitigate the
effect of unreliable explanations and improve the consistency between
explanations and final predictions. Experiments on seven natural language
understanding tasks and four varying-size LLMs demonstrate the effectiveness of
our proposed framework.
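Note (illustrative): the abstract describes two techniques, explanation-guided ensemble and soft probability aggregation, without giving implementation details. The sketch below shows one plausible way to aggregate soft class probabilities from several sampled (explanation, prediction) outputs while down-weighting samples whose explanations look unreliable; the label set, sample fields, and reliability scores are assumptions made for illustration, not the authors' released code.

    # Illustrative sketch only: soft probability aggregation over several
    # sampled (explanation, prediction) outputs from an LLM prompt.
    # Label set, sample fields, and reliability scores are assumptions.
    import math
    from typing import Dict, List

    LABELS = ["entailment", "neutral", "contradiction"]  # hypothetical label set

    def softmax(logits: List[float]) -> List[float]:
        m = max(logits)
        exps = [math.exp(x - m) for x in logits]
        z = sum(exps)
        return [e / z for e in exps]

    def soft_ensemble(samples: List[Dict]) -> Dict[str, float]:
        """Average per-sample class distributions, weighting each sample by a
        reliability score for its generated explanation, instead of taking a
        hard majority vote over predicted labels."""
        agg = [0.0] * len(LABELS)
        total = 0.0
        for s in samples:
            probs = softmax(s["label_logprobs"])   # soft distribution over labels
            w = s.get("reliability", 1.0)          # down-weight unreliable explanations
            agg = [a + w * p for a, p in zip(agg, probs)]
            total += w
        if total == 0.0:
            return {lab: 1.0 / len(LABELS) for lab in LABELS}
        return {lab: a / total for lab, a in zip(LABELS, agg)}

    if __name__ == "__main__":
        # Three decoded samples: LLM log-probabilities for each label plus a
        # (hypothetically scored) reliability of the accompanying explanation.
        samples = [
            {"label_logprobs": [-0.2, -1.8, -3.0], "reliability": 0.9},
            {"label_logprobs": [-1.1, -0.6, -2.5], "reliability": 0.4},
            {"label_logprobs": [-0.3, -1.5, -2.9], "reliability": 0.8},
        ]
        print(soft_ensemble(samples))

Compared with majority voting, this soft aggregation keeps the model's confidence information, and the reliability weights mark the point where an explanation-quality check could plug in.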
Related papers
- PromptExp: Multi-granularity Prompt Explanation of Large Language Models [16.259208045898415]
We introduce PromptExp, a framework for multi-granularity prompt explanations by aggregating token-level insights.
PromptExp supports both white-box and black-box explanations and extends explanations to higher granularity levels.
We evaluate PromptExp in case studies such as sentiment analysis, showing the perturbation-based approach performs best.
arXiv Detail & Related papers (2024-10-16T22:25:15Z)
- Uncertainty Quantification for In-Context Learning of Large Language Models [52.891205009620364]
In-context learning has emerged as a groundbreaking ability of Large Language Models (LLMs)
We propose a novel formulation and corresponding estimation method to quantify both types of uncertainties.
The proposed method offers an unsupervised way to understand the prediction of in-context learning in a plug-and-play fashion.
arXiv Detail & Related papers (2024-02-15T18:46:24Z)
- Learning to Generate Explainable Stock Predictions using Self-Reflective Large Language Models [54.21695754082441]
We propose a framework to teach Large Language Models (LLMs) to generate explainable stock predictions.
A reflective agent learns how to explain past stock movements through self-reasoning, while the PPO trainer trains the model to generate the most likely explanations.
Our framework can outperform both traditional deep-learning and LLM methods in prediction accuracy and Matthews correlation coefficient.
arXiv Detail & Related papers (2024-02-06T03:18:58Z)
- Can Large Language Models Understand Context? [17.196362853457412]
This paper introduces a context understanding benchmark by adapting existing datasets to suit the evaluation of generative models.
Experimental results indicate that pre-trained dense models struggle with understanding more nuanced contextual features when compared to state-of-the-art fine-tuned models.
As LLM compression holds growing significance in both research and real-world applications, we assess the context understanding of quantized models under in-context-learning settings.
arXiv Detail & Related papers (2024-02-01T18:55:29Z)
- In-Context Explainers: Harnessing LLMs for Explaining Black Box Models [28.396104334980492]
Large Language Models (LLMs) have demonstrated exceptional capabilities in complex tasks like machine translation, commonsense reasoning, and language understanding.
One of the primary reasons for the adaptability of LLMs in such diverse tasks is their in-context learning (ICL) capability, which allows them to perform well on new tasks by simply using a few task samples in the prompt.
We propose a novel framework, In-Context Explainers, comprising three novel approaches that exploit the ICL capabilities of LLMs to explain the predictions made by other predictive models.
arXiv Detail & Related papers (2023-10-09T15:31:03Z)
- Explainability for Large Language Models: A Survey [59.67574757137078]
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing.
This paper introduces a taxonomy of explainability techniques and provides a structured overview of methods for explaining Transformer-based language models.
arXiv Detail & Related papers (2023-09-02T22:14:26Z)
- Complementary Explanations for Effective In-Context Learning [77.83124315634386]
Large language models (LLMs) have exhibited remarkable capabilities in learning from explanations in prompts.
This work aims to better understand the mechanisms by which explanations are used for in-context learning.
arXiv Detail & Related papers (2022-11-25T04:40:47Z)
- Explanations from Large Language Models Make Small Reasoners Better [61.991772773700006]
We show that our method can consistently and significantly outperform finetuning baselines across different settings.
As a side benefit, human evaluation shows that our method can generate high-quality explanations to justify its predictions.
arXiv Detail & Related papers (2022-10-13T04:50:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.