Retrieved In-Context Principles from Previous Mistakes
- URL: http://arxiv.org/abs/2407.05682v1
- Date: Mon, 8 Jul 2024 07:32:26 GMT
- Title: Retrieved In-Context Principles from Previous Mistakes
- Authors: Hao Sun, Yong Jiang, Bo Wang, Yingyan Hou, Yan Zhang, Pengjun Xie, Fei Huang,
- Abstract summary: In-context learning (ICL) has been instrumental in adapting Large Language Models (LLMs) to downstream tasks using correct input-output examples.
Recent advances have attempted to improve model performance through principles derived from mistakes.
We propose Retrieved In-Context Principles (RICP), a novel teacher-student framework.
- Score: 55.109234526031884
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In-context learning (ICL) has been instrumental in adapting Large Language Models (LLMs) to downstream tasks using correct input-output examples. Recent advances have attempted to improve model performance through principles derived from mistakes, yet these approaches suffer from lack of customization and inadequate error coverage. To address these limitations, we propose Retrieved In-Context Principles (RICP), a novel teacher-student framework. In RICP, the teacher model analyzes mistakes from the student model to generate reasons and insights for preventing similar mistakes. These mistakes are clustered based on their underlying reasons for developing task-level principles, enhancing the error coverage of principles. During inference, the most relevant mistakes for each question are retrieved to create question-level principles, improving the customization of the provided guidance. RICP is orthogonal to existing prompting methods and does not require intervention from the teacher model during inference. Experimental results across seven reasoning benchmarks reveal that RICP effectively enhances performance when applied to various prompting strategies.
Related papers
- IHEval: Evaluating Language Models on Following the Instruction Hierarchy [67.33509094445104]
The instruction hierarchy establishes a priority order from system messages to user messages, conversation history, and tool outputs.
Despite its importance, this topic receives limited attention, and there is a lack of comprehensive benchmarks for evaluating models' ability to follow the instruction hierarchy.
We bridge this gap by introducing IHEval, a novel benchmark covering cases where instructions in different priorities either align or conflict.
arXiv Detail & Related papers (2025-02-12T19:35:28Z) - The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities [51.594836904623534]
We investigate whether instruction-tuned models possess fundamentally different capabilities from base models that are prompted using in-context examples.
We show that the performance of instruction-tuned models is significantly correlated with the in-context performance of their base counterparts.
Specifically, we extend this understanding to instruction-tuned models, suggesting that their pretraining data similarly sets a limiting boundary on the tasks they can solve.
arXiv Detail & Related papers (2025-01-15T10:57:55Z) - Eliciting Causal Abilities in Large Language Models for Reasoning Tasks [14.512834333917414]
We introduce the Self-Causal Instruction Enhancement (SCIE) method, which enables LLMs to generate high-quality, low-quantity observational data.
In SCIE, the instructions are treated as the treatment, and textual features are used to process natural language.
Our method effectively generates instructions that enhance reasoning performance with reduced training cost of prompts.
arXiv Detail & Related papers (2024-12-19T17:03:02Z) - Subtle Errors Matter: Preference Learning via Error-injected Self-editing [59.405145971637204]
We propose a novel preference learning framework called eRror-Injected Self-Editing (RISE)
RISE injects predefined subtle errors into partial tokens of correct solutions to construct hard pairs for error mitigation.
Experiments validate the effectiveness of RISE, with preference learning on Qwen2-7B-Instruct yielding notable improvements of 3.0% on GSM8K and 7.9% on MATH.
arXiv Detail & Related papers (2024-10-09T07:43:38Z) - Recursive Introspection: Teaching Language Model Agents How to Self-Improve [30.086494067593268]
We develop RISE: Recursive IntroSpEction, an approach for fine-tuning large language models.
Our experiments show that RISE enables Llama2, Llama3, and Mistral models to improve themselves with more turns on math reasoning tasks.
arXiv Detail & Related papers (2024-07-25T17:35:59Z) - LLMs-as-Instructors: Learning from Errors Toward Automating Model Improvement [93.38736019287224]
"LLMs-as-Instructors" framework autonomously enhances the training of smaller target models.
Inspired by the theory of "Learning from Errors", this framework employs an instructor LLM to meticulously analyze the specific errors within a target model.
Within this framework, we implement two strategies: "Learning from Error," which focuses solely on incorrect responses to tailor training data, and "Learning from Error by Contrast", which uses contrastive learning to analyze both correct and incorrect responses for a deeper understanding of errors.
arXiv Detail & Related papers (2024-06-29T17:16:04Z) - Tag and correct: high precision post-editing approach to correction of speech recognition errors [0.0]
It consists of using a neural sequence tagger that learns how to correct an ASR (Automatic Speech Recognition) hypothesis word by word and a corrector module that applies corrections returned by the tagger.
The proposed solution is applicable to any ASR system, regardless of its architecture, and provides high-precision control over errors being corrected.
arXiv Detail & Related papers (2024-06-11T09:52:33Z) - TPD: Enhancing Student Language Model Reasoning via Principle Discovery
and Guidance [0.0]
We introduce a principle-based teacher-student framework called Teaching via Principle Discovery'' (TPD)
Inspired by human learning mechanisms, TPD mimics the interaction between a teacher and a student using a principle-based approach.
TPD significantly improves the student model's performance, achieving $6.2%$ improvement on average.
arXiv Detail & Related papers (2024-01-24T23:11:33Z) - Remembering for the Right Reasons: Explanations Reduce Catastrophic
Forgetting [100.75479161884935]
We propose a novel training paradigm called Remembering for the Right Reasons (RRR)
RRR stores visual model explanations for each example in the buffer and ensures the model has "the right reasons" for its predictions.
We demonstrate how RRR can be easily added to any memory or regularization-based approach and results in reduced forgetting.
arXiv Detail & Related papers (2020-10-04T10:05:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.