Failures Pave the Way: Enhancing Large Language Models through
Tuning-free Rule Accumulation
- URL: http://arxiv.org/abs/2310.15746v1
- Date: Tue, 24 Oct 2023 11:40:34 GMT
- Title: Failures Pave the Way: Enhancing Large Language Models through
Tuning-free Rule Accumulation
- Authors: Zeyuan Yang, Peng Li, Yang Liu
- Abstract summary: Large Language Models (LLMs) have showcased impressive performance.
Due to their inability to capture relationships among samples, these frozen LLMs inevitably keep repeating similar mistakes.
We propose our Tuning-free Rule Accumulation (TRAN) framework, which guides LLMs in improving their performance by learning from previous mistakes.
- Score: 11.366334433990588
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have showcased impressive performance. However,
due to their inability to capture relationships among samples, these frozen
LLMs inevitably keep repeating similar mistakes. In this work, we propose our
Tuning-free Rule Accumulation (TRAN) framework, which guides LLMs in improving
their performance by learning from previous mistakes. As data arrives
sequentially, LLMs gradually accumulate rules from incorrect cases, forming a
rule collection. These rules are then utilized by the LLMs to avoid making
similar mistakes when processing subsequent inputs. Moreover, the rules remain
independent of the primary prompts, seamlessly complementing prompt design
strategies. Experimentally, we show that TRAN improves over recent baselines by
a large margin.
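
The abstract describes a concrete loop: answer with the current rule collection, and distill a new rule whenever a prediction turns out wrong. The sketch below is a minimal, hypothetical rendering of that idea, not the authors' code; `query_llm`, the prompts, and the data stream are placeholders for whatever chat-completion interface and task one actually uses.

```python
# Minimal sketch of tuning-free rule accumulation (illustrative, not the TRAN codebase).

def query_llm(prompt: str) -> str:
    """Placeholder for any chat-completion call returning a string."""
    raise NotImplementedError("plug in an LLM client here")

rules: list[str] = []  # the accumulated rule collection

def answer(question: str) -> str:
    # The rules stay separate from the primary task prompt and are simply
    # prepended to it, so they complement any prompt-design strategy.
    rule_block = "\n".join(f"- {r}" for r in rules)
    prompt = (
        f"Follow these rules learned from earlier mistakes:\n{rule_block}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return query_llm(prompt)

def update_rules(question: str, prediction: str, label: str) -> None:
    # On an incorrect case, ask the model to distill one reusable rule
    # and add it to the collection for subsequent inputs.
    if prediction.strip() != label.strip():
        rule = query_llm(
            f"The answer '{prediction}' to '{question}' was wrong; the correct "
            f"answer is '{label}'. State one short, general rule that would "
            "have avoided this mistake."
        )
        rules.append(rule.strip())

# Data arrives sequentially: predict, then learn a rule from each failure.
stream_of_labelled_examples: list[tuple[str, str]] = []  # (question, label) pairs
for question, label in stream_of_labelled_examples:
    update_rules(question, answer(question), label)
```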
Related papers
- Training Large Language Models to be Better Rule Followers [23.958458849973248]
Large language models (LLMs) have shown impressive performance across a wide range of tasks.
Current training methods, however, fail to leverage the explicit rules that govern many tasks effectively.
We propose Meta Rule-Following Fine-Tuning (Meta-RFFT) to enhance the cross-task transferability of rule-following abilities.
arXiv Detail & Related papers (2025-02-17T07:54:50Z)
- LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization [59.75242204923353]
We introduce LLM-Lasso, a framework that leverages large language models (LLMs) to guide feature selection in Lasso regression.
LLMs generate penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model.
Features identified as more relevant by the LLM receive lower penalties, increasing their likelihood of being retained in the final model.
arXiv Detail & Related papers (2025-02-15T02:55:22Z)
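
The LLM-Lasso summary above describes a concrete weighting scheme: LLM-derived penalty factors per feature, with smaller penalties for features the LLM deems relevant. The snippet below is an illustrative sketch of that weighted-Lasso step only, using made-up penalty factors in place of LLM output; the column-rescaling trick is a standard way to emulate per-feature penalties with scikit-learn's plain `Lasso` and is not claimed to be the paper's exact pipeline.

```python
# Sketch: per-feature Lasso penalties via column rescaling (illustrative only).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.5, 0.0, -2.0, 0.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Stand-in for LLM output: a lower factor means the feature was judged more relevant.
penalty_factors = np.array([0.2, 1.0, 0.2, 1.0, 0.5])

# min ||y - Xb||^2 + alpha * sum_j w_j |b_j| is equivalent to a plain Lasso
# fit on X_j / w_j, followed by rescaling the coefficients back by 1 / w_j.
model = Lasso(alpha=0.1).fit(X / penalty_factors, y)
coef = model.coef_ / penalty_factors

print(coef)  # heavily penalized features are more likely to be zeroed out
```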
- Real-time Verification and Refinement of Language Model Text Generation [60.04718679054704]
Large language models (LLMs) have shown remarkable performance across a wide range of natural language tasks.
A critical challenge remains: they sometimes generate factually incorrect answers.
We propose Streaming-VR, a novel approach designed to enhance the efficiency of verification and refinement of LLM outputs.
arXiv Detail & Related papers (2025-01-14T03:59:48Z)
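
The Streaming-VR summary above hinges on checking output while it is still being generated instead of only after the full answer is complete. Below is a purely illustrative sketch of such a streaming verify-and-refine loop; `verify` and `refine` are hypothetical callables, and this is not the paper's implementation.

```python
# Illustrative streaming verification/refinement loop (not the Streaming-VR code).
from typing import Callable, Iterable

def stream_verify_refine(
    sentences: Iterable[str],          # chunks arriving as the model generates
    verify: Callable[[str], bool],     # True if a chunk passes the fact check
    refine: Callable[[str], str],      # rewrite a chunk that fails the check
) -> str:
    checked = []
    for sentence in sentences:
        # Verify each chunk as soon as it arrives, overlapping verification
        # with generation rather than waiting for the complete response.
        checked.append(sentence if verify(sentence) else refine(sentence))
    return " ".join(checked)
```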
- CorrectionLM: Self-Corrections with SLM for Dialogue State Tracking [16.057622631156164]
Large language models (LLMs) have demonstrated self-improvement capabilities via feedback and refinement, but current small language models (SLMs) have had limited success in this area.
We introduce CORRECTIONLM, a novel correction framework that enables SLMs to self-correct using in-context exemplars without LLM involvement.
arXiv Detail & Related papers (2024-10-23T18:27:16Z)
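
The in-context self-correction step summarized above can be pictured as a prompt that shows the small model a few (dialogue, initial prediction, corrected prediction) exemplars before its own case. The snippet below is a hypothetical illustration of that prompt construction, not the CORRECTIONLM implementation.

```python
# Hypothetical prompt construction for in-context self-correction (illustrative only).

def build_correction_prompt(exemplars, dialogue, first_pass_state):
    # exemplars: list of (dialogue, initial_state, corrected_state) tuples
    parts = []
    for ex_dialogue, initial, corrected in exemplars:
        parts.append(
            f"Dialogue: {ex_dialogue}\n"
            f"Initial state: {initial}\n"
            f"Corrected state: {corrected}\n"
        )
    # The small model corrects its own first-pass prediction, guided only by
    # the exemplars; no larger LLM is involved at inference time.
    parts.append(
        f"Dialogue: {dialogue}\n"
        f"Initial state: {first_pass_state}\n"
        "Corrected state:"
    )
    return "\n".join(parts)
```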
- From Yes-Men to Truth-Tellers: Addressing Sycophancy in Large Language Models with Pinpoint Tuning [91.79567270986901]
Large Language Models (LLMs) tend to prioritize adherence to user prompts over providing veracious responses.
Recent works propose to employ supervised fine-tuning (SFT) to mitigate the sycophancy issue.
We propose a novel supervised pinpoint tuning (SPT), where the region-of-interest modules are tuned for a given objective.
arXiv Detail & Related papers (2024-09-03T07:01:37Z)
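
Pinpoint tuning, as summarized above, updates only selected "region-of-interest" modules and leaves the rest of the network untouched. A generic way to express that in PyTorch is sketched below; the module-name substrings are placeholders, since the paper selects its target components through its own analysis.

```python
# Generic sketch of tuning only selected modules of a PyTorch model (illustrative).
import torch.nn as nn

def pinpoint_tune(model: nn.Module, target_substrings: list[str]) -> list[str]:
    """Freeze every parameter, then unfreeze those whose names match a target."""
    tuned = []
    for name, param in model.named_parameters():
        param.requires_grad = any(s in name for s in target_substrings)
        if param.requires_grad:
            tuned.append(name)
    return tuned

# Example with placeholder names; only the unfrozen parameters are then optimized:
# tuned = pinpoint_tune(model, ["layers.10.self_attn", "layers.11.self_attn"])
```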
- Order-Independence Without Fine Tuning [18.020492646988746]
We present Set-Based Prompting, a technique that guarantees the output of an LLM will not have order dependence on a specified set of sub-sequences.
Despite our inputs being out of distribution, the impact on expected accuracy is small, where the expectation is taken over uniformly random orderings of the candidate responses.
arXiv Detail & Related papers (2024-06-04T16:09:13Z)
- One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models [67.49462724595445]
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs).
We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
arXiv Detail & Related papers (2024-05-30T03:44:54Z)
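
The "pluggable virtual tokens" in the entry above can be pictured as a small set of trainable embeddings prepended to the input while the LLM's own weights stay frozen. The PyTorch sketch below is a generic prefix-embedding module written for illustration and does not reproduce the paper's training recipe.

```python
# Generic sketch of trainable virtual-token embeddings for a frozen LM (illustrative).
import torch
import torch.nn as nn

class VirtualTokens(nn.Module):
    def __init__(self, num_tokens: int, hidden_size: int):
        super().__init__()
        # Only these embeddings are trained; the backbone LM stays frozen.
        self.embeddings = nn.Parameter(torch.randn(num_tokens, hidden_size) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, hidden) token embeddings from the frozen LM.
        prefix = self.embeddings.unsqueeze(0).expand(input_embeds.size(0), -1, -1)
        return torch.cat([prefix, input_embeds], dim=1)

# Usage sketch: feed the concatenated embeddings to the frozen model through its
# `inputs_embeds` argument (supported by most Hugging Face causal LMs) and
# optimize only the parameters of VirtualTokens.
```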
- Causal Prompting: Debiasing Large Language Model Prompting based on Front-Door Adjustment [32.12998469814097]
A novel causal prompting method based on front-door adjustment is proposed to effectively mitigate the biases of Large Language Models (LLMs).
Experimental results show that the proposed causal prompting approach achieves excellent performance across seven natural language processing datasets.
arXiv Detail & Related papers (2024-03-05T07:47:34Z)
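
For reference, the front-door adjustment mentioned in the entry above is the standard identity from causal inference, stated here generically (with treatment X, mediator M, and outcome Y) rather than in the paper's prompting-specific notation:

```latex
P\bigl(Y = y \mid \mathrm{do}(X = x)\bigr)
  = \sum_{m} P(M = m \mid X = x) \sum_{x'} P\bigl(Y = y \mid X = x', M = m\bigr)\, P(X = x')
```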
- Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning [79.32236399694077]
Low-quality data in the training set are usually detrimental to instruction tuning.
We propose a novel method, termed "reflection-tuning".
This approach utilizes an oracle LLM to recycle the original training data by introspecting and enhancing the quality of instructions and responses in the data.
arXiv Detail & Related papers (2023-10-18T05:13:47Z)
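
The data-recycling step summarized above can be pictured as looping over the original instruction-response pairs and asking an oracle model to critique and rewrite each one. The sketch below is a hypothetical rendering of that loop; the prompts and the `oracle_llm` callable are placeholders, not the paper's implementation.

```python
# Hypothetical instruction-data recycling with an oracle LLM (illustrative only).
from typing import Callable

def recycle_dataset(
    dataset: list[dict],               # items with "instruction" and "response" keys
    oracle_llm: Callable[[str], str],  # any chat-completion callable
) -> list[dict]:
    recycled = []
    for item in dataset:
        better_instruction = oracle_llm(
            "Critique the following instruction and rewrite it to be clearer and "
            f"more specific. Return only the rewritten instruction.\n\n{item['instruction']}"
        )
        better_response = oracle_llm(
            f"Instruction: {better_instruction}\n"
            "Write a high-quality response to this instruction."
        )
        recycled.append({"instruction": better_instruction, "response": better_response})
    return recycled
```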
- RCOT: Detecting and Rectifying Factual Inconsistency in Reasoning by Reversing Chain-of-Thought [56.558892336235914]
Reversing Chain-of-Thought (RCoT) is a novel method to improve large language models' reasoning abilities.
RCoT automatically detects and rectifies factual inconsistencies in generated solutions.
We show that manually written fine-grained feedback can dramatically improve LLMs' reasoning abilities.
arXiv Detail & Related papers (2023-05-19T08:02:52Z)
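
The detect-and-rectify idea in the RCoT entry above amounts to reconstructing the problem from the model's own solution, comparing it with the original problem to surface inconsistencies, and feeding that fine-grained feedback back for revision. The sketch below is an illustrative rendering with placeholder prompts and an injected `llm` callable, not the authors' code.

```python
# Illustrative reverse-then-compare revision loop (not the RCoT codebase).
from typing import Callable

def reverse_and_revise(problem: str, llm: Callable[[str], str]) -> str:
    solution = llm(f"Solve step by step:\n{problem}")

    # Reverse step: reconstruct the problem implied by the generated solution.
    reconstructed = llm(
        "Given only this solution, write the problem it solves:\n" + solution
    )

    # Compare the original and reconstructed problems to surface factual
    # inconsistencies, then use that feedback to revise the solution.
    feedback = llm(
        "List any factual inconsistencies between these two problem statements, "
        f"or reply 'none'.\nOriginal: {problem}\nReconstructed: {reconstructed}"
    )
    if feedback.strip().lower().startswith("none"):
        return solution
    return llm(
        f"Problem: {problem}\nDraft solution: {solution}\n"
        f"Feedback on inconsistencies: {feedback}\nRevise the solution accordingly."
    )
```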
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.