Inference-Time Intervention in Large Language Models for Reliable Requirement Verification
- URL: http://arxiv.org/abs/2503.14130v1
- Date: Tue, 18 Mar 2025 10:49:36 GMT
- Title: Inference-Time Intervention in Large Language Models for Reliable Requirement Verification
- Authors: Paul Darm, James Xie, Annalisa Riccardi
- Abstract summary: Inference-time intervention techniques provide a promising alternative to fine-tuning. We demonstrate how interventions enable fine-grained control for automating the usually time-intensive requirement verification process. Our method achieves robust and reliable outputs, significantly improving over both a baseline model and a fine-tuning approach.
- Score: 2.3759432635713895
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Steering the behavior of Large Language Models (LLMs) remains a challenge, particularly in engineering applications where precision and reliability are critical. While fine-tuning and prompting methods can modify model behavior, they lack the dynamic and exact control necessary for engineering applications. Inference-time intervention techniques provide a promising alternative, allowing targeted adjustments to LLM outputs. In this work, we demonstrate how interventions enable fine-grained control for automating the usually time-intensive requirement verification process in Model-Based Systems Engineering (MBSE). Using two early-stage Capella SysML models of space missions with associated requirements, we apply the intervened LLMs to reason over a graph representation of the model to determine whether a requirement is fulfilled. Our method achieves robust and reliable outputs, significantly improving over both a baseline model and a fine-tuning approach. By identifying and modifying as few as one to three specialised attention heads, we can significantly change the model's behavior. When combined with self-consistency, this allows us to achieve perfect precision on our holdout test set.
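To make the intervention mechanism concrete, the sketch below illustrates the general idea of steering a model by shifting the outputs of a few selected attention heads at inference time, combined with a simple self-consistency vote. This is a minimal illustration under stated assumptions, not the authors' implementation: the GPT-2 backbone, the (layer, head) indices, the random steering directions, the strength `alpha`, and the toy requirement prompt are all placeholders, whereas the paper identifies one to three specialised heads from the model itself.

```python
# Minimal sketch of inference-time intervention on selected attention heads,
# plus a self-consistency vote. Hypothetical choices throughout: GPT-2 as the
# backbone, the (layer, head) indices, the random steering directions, and
# the strength alpha all stand in for values a real pipeline would learn.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

n_head = model.config.n_head
d_head = model.config.n_embd // n_head

# (layer, head) pairs to intervene on, each with a steering direction.
targets = {(5, 3): torch.randn(d_head), (7, 0): torch.randn(d_head)}
alpha = 5.0  # intervention strength (placeholder)

def make_hook(layer_idx):
    def hook(module, inputs):
        # GPT-2's c_proj receives the concatenated head outputs, so shifting
        # the slice for head h intervenes on that head before the output
        # projection mixes the heads together.
        (hidden,) = inputs
        hidden = hidden.clone()
        for (l, h), direction in targets.items():
            if l == layer_idx:
                d = direction / direction.norm()
                hidden[..., h * d_head:(h + 1) * d_head] += alpha * d
        return (hidden,)
    return hook

hooks = [
    model.transformer.h[l].attn.c_proj.register_forward_pre_hook(make_hook(l))
    for l in {l for (l, _) in targets}
]

# Self-consistency: sample several completions and take a majority vote.
prompt = "Requirement R-01 is traced to component C-2. Is R-01 fulfilled? Answer yes or no:"
ids = tok(prompt, return_tensors="pt").input_ids
votes = []
for _ in range(5):
    out = model.generate(ids, do_sample=True, temperature=0.7,
                         max_new_tokens=3, pad_token_id=tok.eos_token_id)
    votes.append(tok.decode(out[0, ids.shape[1]:]).strip().lower())
answer = max(set(votes), key=votes.count)  # majority-voted verdict

for h in hooks:
    h.remove()
```

The majority vote over sampled generations mirrors the self-consistency step that the abstract credits with reaching perfect precision on the holdout test set.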
Related papers
- Self-Steering Language Models [113.96916935955842]
DisCIPL is a method for "self-steering" language models.
DisCIPL uses a Planner model to generate a task-specific inference program.
Our work opens up a design space of highly-parallelized Monte Carlo inference strategies.
arXiv Detail & Related papers (2025-04-09T17:54:22Z) - DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models [50.32663816994459]
Diffusion-styled Preference Optimization (DiffPO) provides an efficient and policy-agnostic solution for aligning LLMs with humans. DiffPO avoids the time latency associated with token-level generation. Experiments on AlpacaEval 2, MT-bench, and HH-RLHF demonstrate that DiffPO achieves superior alignment performance across various settings.
arXiv Detail & Related papers (2025-03-06T09:21:54Z) - Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging [75.93960998357812]
Deep model merging represents an emerging research direction that combines multiple fine-tuned models to harness their capabilities across different tasks and domains. Current model merging techniques focus on merging all available models simultaneously, with weight matrices-based methods being the predominant approaches. We propose a training-free projection-based continual merging method that processes models sequentially.
arXiv Detail & Related papers (2025-01-16T13:17:24Z) - Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for adapting LLMs to downstream tasks.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z) - Aligning Large Language Models with Representation Editing: A Control Perspective [38.71496554018039]
Fine-tuning large language models (LLMs) to align with human objectives is crucial for real-world applications.
Test-time alignment techniques, such as prompting and guided decoding, do not modify the underlying model.
We propose aligning LLMs through representation editing.
arXiv Detail & Related papers (2024-06-10T01:21:31Z) - Calibrating Large Language Models with Sample Consistency [76.23956851098598]
We explore the potential of deriving confidence from the distribution of multiple randomly sampled model generations, via three measures of consistency.
Results show that consistency-based calibration methods outperform existing post-hoc approaches.
We offer practical guidance on choosing suitable consistency metrics for calibration, tailored to the characteristics of various LMs.
arXiv Detail & Related papers (2024-02-21T16:15:20Z) - InferAligner: Inference-Time Alignment for Harmlessness through Cross-Model Guidance [56.184255657175335]
We develop InferAligner, a novel inference-time alignment method that utilizes cross-model guidance for harmlessness alignment.
Experimental results show that our method can be very effectively applied to domain-specific models in finance, medicine, and mathematics.
It significantly diminishes the Attack Success Rate (ASR) of both harmful instructions and jailbreak attacks, while maintaining almost unchanged performance in downstream tasks.
arXiv Detail & Related papers (2024-01-20T10:41:03Z) - Correct-by-Construction Control for Stochastic and Uncertain Dynamical Models via Formal Abstractions [44.99833362998488]
We develop an abstraction framework that can be used to solve this control problem under various modeling assumptions.
We use state-of-the-art verification techniques to compute an optimal policy on the interval Markov decision process (iMDP) abstraction, with guarantees for satisfying the given specification.
We then show that, by construction, we can refine this policy into a feedback controller for which these guarantees carry over to the dynamical model.
arXiv Detail & Related papers (2023-11-16T11:03:54Z) - MOSEL: Inference Serving Using Dynamic Modality Selection [4.849058875921672]
We introduce a form of dynamism, modality selection, where we adaptively choose modalities from inference inputs while maintaining model quality.
We introduce MOSEL, an automated inference serving system for multi-modal ML models that carefully picks input modalities per request based on user-defined performance and accuracy requirements.
arXiv Detail & Related papers (2023-10-27T20:50:56Z) - Variable Importance Matching for Causal Inference [73.25504313552516]
We describe a general framework called Model-to-Match that achieves these goals.
Model-to-Match uses variable importance measurements to construct a distance metric.
We operationalize the Model-to-Match framework with LASSO; a minimal sketch of this variant appears after this list.
arXiv Detail & Related papers (2023-02-23T00:43:03Z)
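To make the Model-to-Match entry above concrete, here is a minimal sketch of its LASSO variant: coefficient magnitudes from a LASSO fit serve as variable-importance weights in the distance metric used to match treated and control units. The synthetic data, the use of scikit-learn, the regularisation strength, the one-nearest-neighbour matching, and the simple treated-minus-matched-control effect estimate are all assumptions for illustration, not the paper's full framework.

```python
# Minimal sketch of variable-importance matching with LASSO.
# Assumptions for illustration: synthetic data, scikit-learn, and a simple
# 1-nearest-neighbour match; the actual framework is more general.
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.normal(size=(n, p))               # covariates
t = rng.integers(0, 2, size=n)            # treatment indicator
y = X[:, 0] + 2 * t + rng.normal(size=n)  # outcome; only X[:, 0] matters

# 1) Learn variable importance: LASSO coefficient magnitudes on the outcome.
Xs = StandardScaler().fit_transform(X)
w = np.abs(Lasso(alpha=0.1).fit(Xs, y).coef_)  # importance weights

# 2) Importance-weighted distance between units i and j.
def dist(i, j):
    return np.sqrt(np.sum(w * (Xs[i] - Xs[j]) ** 2))

# 3) Match each treated unit to its nearest control and average the gaps.
treated, control = np.where(t == 1)[0], np.where(t == 0)[0]
effects = [y[i] - y[min(control, key=lambda c: dist(i, c))] for i in treated]
print("Effect estimate on the treated:", np.mean(effects))
```

Weighting the distance by learned importance means matches are close on the covariates that actually drive the outcome, which is the core of the Model-to-Match idea.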