HEFT: A Coarse-to-Fine Hierarchy for Enhancing the Efficiency and Accuracy of Language Model Reasoning
- URL: http://arxiv.org/abs/2509.09801v1
- Date: Thu, 11 Sep 2025 19:06:46 GMT
- Title: HEFT: A Coarse-to-Fine Hierarchy for Enhancing the Efficiency and Accuracy of Language Model Reasoning
- Authors: Brennen Hill
- Abstract summary: HEFT is a novel hierarchical adaptation strategy that composes two distinct PEFT methods in a coarse-to-fine manner. A model fine-tuned for only three epochs with our HEFT strategy achieves an accuracy of 85.17%, exceeding the performance of models trained for 20 epochs.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The adaptation of large language models (LLMs) to specialized reasoning tasks is fundamentally constrained by computational resources. Parameter-Efficient Fine-Tuning (PEFT) methods have emerged as a powerful solution, yet the landscape of these techniques is diverse, with distinct methods operating in either the model's weight space or its representation space. This paper investigates the hypothesis that a synergistic combination of these paradigms can unlock superior performance and efficiency. We introduce HEFT (Hierarchical Efficient Fine-Tuning), a novel hierarchical adaptation strategy that composes two distinct PEFT methods in a coarse-to-fine manner: first, a broad, foundational adaptation in the weight space using Low-Rank Adaptation (LoRA), followed by a precise, surgical refinement of internal activations using Representation Fine-Tuning (ReFT). We evaluate this approach by fine-tuning a Llama-2-7B model on the BoolQ benchmark, a challenging dataset for inferential reasoning. Our results reveal a profound synergistic effect. A model fine-tuned for only three epochs with our HEFT strategy achieves an accuracy of 85.17%, exceeding the performance of models trained for 20 epochs with either LoRA-only (85.05%) or ReFT-only (83.36%) methodologies. This work demonstrates that the thoughtful composition of PEFT methods is a potent algorithmic innovation, offering a more efficient and effective path toward advancing the reasoning capabilities of language models. By achieving superior results with a fraction of the computational budget, our findings present a principled approach to overcoming the obstacles inherent in adapting large-scale models for complex cognitive tasks.
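The abstract describes the pipeline only at this level of detail, so the following is a minimal sketch of how the coarse-to-fine composition might be wired up, assuming the Hugging Face peft library for the LoRA stage and the pyreft library for the ReFT stage. The rank, target modules, intervened layer, and training loop are illustrative placeholders, not values or code from the paper.

```python
# Hedged sketch of HEFT's two-stage, coarse-to-fine adaptation.
# Assumptions: transformers + peft for the LoRA stage, pyreft for the ReFT
# stage; all hyperparameters (rank, target modules, layer index) are
# illustrative placeholders, not reported by the paper.
import torch
import pyreft
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.bfloat16, device_map="cuda")

# Stage 1 (coarse): broad weight-space adaptation with LoRA.
lora_cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"],
                      task_type="CAUSAL_LM")
lora_model = get_peft_model(base, lora_cfg)
# ... train lora_model on BoolQ for a few epochs (trainer omitted) ...

# Fold the learned low-rank deltas into the base weights so the second
# stage operates on a single adapted model rather than adapter wrappers.
merged = lora_model.merge_and_unload()

# Stage 2 (fine): surgical representation-space refinement with ReFT,
# intervening on the residual-stream output of one mid-depth layer.
reft_cfg = pyreft.ReftConfig(representations={
    "layer": 15, "component": "block_output", "low_rank_dimension": 4,
    "intervention": pyreft.LoreftIntervention(
        embed_dim=merged.config.hidden_size, low_rank_dimension=4)})
reft_model = pyreft.get_reft_model(merged, reft_cfg)
reft_model.print_trainable_parameters()
# ... train only the intervention parameters on BoolQ, then evaluate ...
```

Merging the LoRA weights before the ReFT stage is one plausible reading of "coarse-to-fine": the broad weight-space update is frozen into the backbone first, and the representation-level intervention then refines the activations of the already-adapted model.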
Related papers
- Performance and Complexity Trade-off Optimization of Speech Models During Training [5.335528687192602]
In speech machine learning, neural network models are typically designed by choosing an architecture with fixed layer sizes and structure. While the overall architecture is usually guided by prior knowledge of the task, the sizes of individual layers are often chosen. Unlike pruning methods, our approach allows the model size to be dynamically optimized for a target performance-complexity trade-off.
arXiv Detail & Related papers (2026-01-20T08:00:05Z) - TuckA: Hierarchical Compact Tensor Experts for Efficient Fine-Tuning [83.93651411533533]
We introduce Tucker Adaptation (TuckA), a method with four key properties. We develop an efficient batch-level routing mechanism, which reduces the router's parameter size by a factor of $L$. Experiments on benchmarks in natural language understanding, image classification, and mathematical reasoning speak to the efficacy of TuckA.
arXiv Detail & Related papers (2025-11-10T09:03:16Z) - TCPO: Thought-Centric Preference Optimization for Effective Embodied Decision-making [75.29820290660065]
This paper proposes Thought-Centric Preference Optimization (TCPO) for effective embodied decision-making. It emphasizes the alignment of the model's intermediate reasoning process, mitigating the problem of model degradation. Experiments in the ALFWorld environment demonstrate an average success rate of 26.67%, achieving a 6% improvement over RL4VLM.
arXiv Detail & Related papers (2025-09-10T11:16:21Z) - LAPO: Internalizing Reasoning Efficiency via Length-Adaptive Policy Optimization [48.91511514636768]
Length-Adaptive Policy Optimization transforms reasoning length control from an external constraint into an intrinsic model capability. LAPO enables models to internalize an understanding of appropriate reasoning depth through a two-stage reinforcement learning process. Experiments on mathematical reasoning benchmarks demonstrate that LAPO reduces token usage by up to 40.9% while improving accuracy by 2.3%.
arXiv Detail & Related papers (2025-07-21T16:14:41Z) - Dual Decomposition of Weights and Singular Value Low Rank Adaptation [9.048461365342204]
We propose DuDe, a novel approach that decomposes weight matrices into magnitude and direction components. Our evaluation demonstrates DuDe's superior performance and robustness, achieving up to 48.35% accuracy on MMLU and 62.53% ($\pm$ 1.59) accuracy on GSM8K.
arXiv Detail & Related papers (2025-05-20T13:49:15Z) - Towards Fair Class-wise Robustness: Class Optimal Distribution Adversarial Training [1.5565181723989001]
Adversarial training has proven to be a highly effective method for improving the robustness of deep neural networks against adversarial attacks. It has been observed to exhibit a limitation in terms of robust fairness, characterized by a significant disparity in robustness across different classes. Recent efforts to mitigate this problem have turned to class-wise weighted methods. This paper proposes a novel min-max training framework, Class Optimal Distribution Adversarial Training.
arXiv Detail & Related papers (2025-01-08T14:19:03Z) - Feature Alignment-Based Knowledge Distillation for Efficient Compression of Large Language Models [4.737806982257592]
This study proposes a knowledge distillation algorithm based on large language models and feature alignment. The proposed model performs very close to the state-of-the-art GPT-4 model in terms of evaluation indicators such as perplexity, BLEU, ROUGE, and CER.
arXiv Detail & Related papers (2024-12-27T04:37:06Z) - Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration [74.09687562334682]
We introduce a novel training data attribution method called Debias and Denoise Attribution (DDA).
Our method significantly outperforms existing approaches, achieving an averaged AUC of 91.64%.
DDA exhibits strong generality and scalability across various sources and different-scale models like LLaMA2, QWEN2, and Mistral.
arXiv Detail & Related papers (2024-10-02T07:14:26Z) - See Further for Parameter Efficient Fine-tuning by Standing on the Shoulders of Decomposition [56.87609859444084]
Parameter-efficient fine-tuning (PEFT) focuses on optimizing a select subset of parameters while keeping the rest fixed, significantly lowering computational and storage overheads. We take the first step to unify all approaches by dissecting them from a decomposition perspective. We introduce two novel PEFT methods alongside a simple yet effective framework designed to enhance the performance of PEFT techniques across various applications.
arXiv Detail & Related papers (2024-07-07T15:44:42Z) - Efficiency optimization of large-scale language models based on deep learning in natural language processing tasks [6.596361762662328]
The internal structure and operating mechanisms of large-scale language models are analyzed theoretically.
We evaluate the contribution of adaptive optimization algorithms (such as AdamW), massively parallel computing techniques, and mixed precision training strategies.
arXiv Detail & Related papers (2024-05-20T00:10:00Z) - Boosting Inference Efficiency: Unleashing the Power of Parameter-Shared Pre-trained Language Models [109.06052781040916]
We introduce a technique to enhance the inference efficiency of parameter-shared language models.
We also propose a simple pre-training technique that leads to fully or partially shared models.
Results demonstrate the effectiveness of our methods on both autoregressive and autoencoding PLMs.
arXiv Detail & Related papers (2023-10-19T15:13:58Z)