A Dialectic Pipeline for Improving LLM Robustness
- URL: http://arxiv.org/abs/2601.20659v1
- Date: Wed, 28 Jan 2026 14:42:49 GMT
- Title: A Dialectic Pipeline for Improving LLM Robustness
- Authors: Sara Candussio,
- Abstract summary: Methods such as fine-tuning on domain-specific data or the training of a separate \textit{ad hoc} verifier require demanding computational resources. We propose a dialectic pipeline that preserves LLMs' generalization abilities while improving the quality of their answers via self-dialogue. We find that our proposed dialectic pipeline outperforms standard model answers by significant margins.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Assessing ways in which Language Models can reduce their hallucinations and improve the quality of their outputs is crucial to ensure their large-scale use. However, methods such as fine-tuning on domain-specific data or the training of a separate \textit{ad hoc} verifier require demanding computational resources (not feasible for many user applications) and constrain the models to specific fields of knowledge. In this thesis, we propose a dialectic pipeline that preserves LLMs' generalization abilities while improving the quality of their answers via self-dialogue, enabling them to reflect upon and correct tentative wrong answers. We experimented with different pipeline settings, testing our proposed method on different datasets and on different families of models. All the pipeline stages are enriched with the relevant context (in an oracle-RAG setting), and a study on the impact of its summarization or filtering is conducted. We find that our proposed dialectic pipeline outperforms standard model answers by significant margins and that it consistently achieves higher performance than Chain-of-Thought-only prompting.
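To make the self-dialogue idea concrete, below is a minimal sketch of one dialectic round in Python. The generic `generate` callable, the prompt wording, and the single answer-critique-revision pass are illustrative assumptions, not the thesis's exact stages or stopping criteria:

```python
from typing import Callable

def dialectic_round(generate: Callable[[str], str], question: str, context: str) -> str:
    """One dialectic round: tentative answer -> self-critique -> revised answer.
    `generate` is any text-in/text-out call to the underlying LLM."""
    # Stage 1 (thesis): tentative answer grounded in the retrieved (oracle-RAG) context.
    answer = generate(f"Context:\n{context}\n\nQuestion: {question}\nAnswer concisely.")
    # Stage 2 (antithesis): the same model critiques its own tentative answer.
    critique = generate(
        f"Context:\n{context}\n\nQuestion: {question}\n"
        f"Tentative answer: {answer}\n"
        "Point out any factual errors or unsupported claims in the tentative answer."
    )
    # Stage 3 (synthesis): revise the answer in light of the critique.
    return generate(
        f"Context:\n{context}\n\nQuestion: {question}\n"
        f"Tentative answer: {answer}\nCritique: {critique}\n"
        "Write a corrected final answer."
    )
```

In the oracle-RAG setting described above, `context` would hold the retrieved (and optionally summarized or filtered) passages supplied to every stage.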
Related papers
- SICL-AT: Another way to adapt Auditory LLM to low-resource task [34.82834349882226]
Auditory Large Language Models (LLMs) have demonstrated strong performance across a wide range of speech and audio understanding tasks. They often struggle when applied to low-resource or unfamiliar tasks. In-Context Learning (ICL) provides a training-free, inference-time solution.
arXiv Detail & Related papers (2026-01-26T19:15:16Z)
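SICL-AT's specifics aside, the training-free in-context learning mentioned above boils down to prepending labeled demonstrations to the test input. A generic sketch follows; the prompt format and the toy task are assumptions, not the paper's recipe:

```python
# Generic few-shot (in-context learning) prompt construction: prepend labeled
# demonstrations to the test input so the model adapts without any training.

def build_icl_prompt(demos: list[tuple[str, str]], test_input: str, instruction: str) -> str:
    parts = [instruction]
    for x, y in demos:  # each demonstration is an (input, label) pair
        parts.append(f"Input: {x}\nOutput: {y}")
    parts.append(f"Input: {test_input}\nOutput:")  # the model completes the last slot
    return "\n\n".join(parts)

prompt = build_icl_prompt(
    demos=[("dog barking", "animal"), ("car horn", "vehicle")],
    test_input="engine revving",
    instruction="Classify the sound described by each input.",
)
```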
- Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs [49.66344956133349]
Reasoning capacity shapes both inference-time performance and reinforcement learning (RL) training for large (vision-) language models. This paper proposes Reasoning Palette, a novel latent-modulation framework that endows the model with a latent variable for strategic contextualization.
arXiv Detail & Related papers (2025-12-19T03:32:53Z)
- Decomposition-Enhanced Training for Post-Hoc Attributions In Language Models [64.49342399229529]
We argue that post-hoc attribution can be reframed as a reasoning problem, where answers are decomposed into constituent units, each tied to specific context. We introduce DecompTune, a post-training method that teaches models to produce answer decompositions as intermediate reasoning steps. Across extensive experiments and ablations, DecompTune substantially improves attribution quality, outperforming prior methods and matching or exceeding state-of-the-art frontier models.
arXiv Detail & Related papers (2025-10-29T17:58:59Z)
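As a rough illustration of the decomposition idea above, the training target could look like the structure below. The field names and schema are hypothetical, not DecompTune's actual format:

```python
# Illustrative target format for decomposition-based attribution training data:
# the answer is split into atomic units, each pointing at its supporting passage.
from dataclasses import dataclass

@dataclass
class AttributedUnit:
    claim: str        # one atomic statement extracted from the answer
    source_span: str  # the context passage that supports it

@dataclass
class DecomposedAnswer:
    answer: str
    units: list[AttributedUnit]  # the intermediate "reasoning steps" to learn

example = DecomposedAnswer(
    answer="Mount Erebus is an active volcano in Antarctica.",
    units=[
        AttributedUnit("Mount Erebus is a volcano.", "doc3: ...Erebus, a stratovolcano..."),
        AttributedUnit("It is located in Antarctica.", "doc1: ...on Ross Island, Antarctica..."),
    ],
)
```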
- Efficient Response Generation Strategy Selection for Fine-Tuning Large Language Models Through Self-Aligned Perplexity [29.665161650753742]
Fine-tuning large language models (LLMs) typically relies on producing large sets of input-output pairs. Recent findings show that how these training outputs are generated can significantly affect the performance of the fine-tuned model. This paper proposes a scalable approximate method that assesses a small subset of generated data to estimate its suitability for a specific target LLM.
arXiv Detail & Related papers (2025-02-17T13:14:11Z)
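A plain-perplexity version of the selection signal above can be sketched with Hugging Face transformers. The model name is a stand-in, and the paper's self-aligned variant refines this raw score:

```python
# Score candidate (input, output) pairs by the target model's perplexity on the
# output tokens; lower perplexity ~ more natural for the target model to emit.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"  # stand-in; swap in the actual target LLM
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name).eval()

@torch.no_grad()
def output_perplexity(prompt: str, output: str) -> float:
    enc = tok(prompt + output, return_tensors="pt")
    n_prompt = len(tok(prompt).input_ids)
    labels = enc.input_ids.clone()
    labels[:, :n_prompt] = -100  # score only the output tokens, not the prompt
    loss = model(**enc, labels=labels).loss  # mean NLL over unmasked tokens
    return float(torch.exp(loss))

# Rank candidate training outputs; keep the ones the target model finds natural.
candidates = [("Q: 2+2? A:", " 4"), ("Q: 2+2? A:", " The answer, good sir, is four.")]
ranked = sorted(candidates, key=lambda pair: output_perplexity(*pair))
```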
- Scalable Influence and Fact Tracing for Large Language Model Pretraining [14.598556308631018]
Training data attribution (TDA) methods aim to attribute model outputs back to specific training examples. We refine existing gradient-based methods to work effectively at scale. We release our prompt set and model outputs, along with a web-based visualization tool to explore influential examples.
arXiv Detail & Related papers (2024-10-22T20:39:21Z)
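The simplest gradient-based TDA score is a TracIn-style dot product between training-example and query gradients. The sketch below is that generic baseline, not the paper's refined, scaled-up estimator:

```python
# TracIn-style influence: score(train, query) = grad_loss(train) . grad_loss(query).
# Real systems add random projections / normalization to make this tractable at scale.
import torch

def flat_grad(model: torch.nn.Module, loss: torch.Tensor) -> torch.Tensor:
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])

def influence(model, loss_fn, train_batch, query_batch) -> float:
    g_train = flat_grad(model, loss_fn(model, train_batch))
    g_query = flat_grad(model, loss_fn(model, query_batch))
    # Large positive score => the training example pushes the model toward the query output.
    return float(g_train @ g_query)
```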
- In-context Demonstration Matters: On Prompt Optimization for Pseudo-Supervision Refinement [71.60563181678323]
Large language models (LLMs) have achieved great success across diverse tasks, and fine-tuning is sometimes needed to further enhance generation quality. To handle these challenges, a direct solution is to generate "high-confidence" data from unsupervised downstream tasks. We propose a novel approach, the pseudo-supervised demonstrations aligned prompt optimization (PAPO) algorithm, which jointly refines both the prompt and the overall pseudo-supervision.
arXiv Detail & Related papers (2024-10-04T03:39:28Z)
- Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities. In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for adapting LLMs to downstream tasks. We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
- Fine-Tuning or Fine-Failing? Debunking Performance Myths in Large Language Models [0.8399688944263842]
Large Language Models (LLMs) have the capability to understand and generate human-like text from input queries.
This study extends this concept to the integration of LLMs within Retrieval-Augmented Generation (RAG) pipelines.
We evaluate the impact of fine-tuning on the LLMs' capacity for data extraction and contextual understanding.
arXiv Detail & Related papers (2024-06-17T04:35:17Z)
- Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
We propose uncertainty-aware learning (UAL) to improve model alignment across different task scenarios.
We implement UAL in a simple fashion: adaptively setting the label smoothing value during training according to the uncertainty of individual samples.
Experiments on widely used benchmarks demonstrate that our UAL significantly and consistently outperforms standard supervised fine-tuning.
arXiv Detail & Related papers (2024-06-07T11:37:45Z)
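The UAL recipe above maps directly to code: scale the label-smoothing strength per sample by an uncertainty estimate. The linear uncertainty-to-smoothing mapping below is an assumption; the paper defines its own:

```python
# Uncertainty-aware label smoothing: smooth the target distribution more for
# samples the model is uncertain about. The linear mapping here is illustrative.
import torch
import torch.nn.functional as F

def ual_loss(logits: torch.Tensor, targets: torch.Tensor,
             uncertainty: torch.Tensor, max_eps: float = 0.2) -> torch.Tensor:
    # logits: (B, V); targets: (B,) class indices; uncertainty: (B,) in [0, 1].
    vocab = logits.size(-1)
    eps = max_eps * uncertainty                      # per-sample smoothing value
    one_hot = F.one_hot(targets, vocab).float()
    # Smoothed targets: (1 - eps) on the gold class, eps spread uniformly.
    soft = (1 - eps).unsqueeze(1) * one_hot + (eps / vocab).unsqueeze(1)
    return -(soft * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()
```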
- Using Set Covering to Generate Databases for Holistic Steganalysis [2.089615335919449]
We explore a grid of processing pipelines to study the origins of Cover Source Mismatch (CSM).
A set-covering greedy algorithm is used to select representative pipelines minimizing the maximum regret between the representative and the pipelines within the set.
Our analysis also shows that parameters such as denoising, sharpening, and downsampling are very important for fostering diversity.
arXiv Detail & Related papers (2022-11-07T10:53:02Z)
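The greedy selection described above can be written generically: repeatedly add the pipeline that most reduces the worst-case regret of the current representative set. The regret matrix would come from steganalysis error rates; here it is simply an input:

```python
# Greedy minimax-regret selection: given regret[i][j] = regret of representing
# pipeline j by representative i, pick k representatives minimizing the max regret.

def greedy_representatives(regret: list[list[float]], k: int) -> list[int]:
    n = len(regret)  # assumes a square n x n matrix and k <= n
    chosen: list[int] = []
    for _ in range(k):
        def worst_case(cand: int) -> float:
            picked = chosen + [cand]
            # Each pipeline j is represented by its best (lowest-regret) representative.
            return max(min(regret[i][j] for i in picked) for j in range(n))
        best = min((c for c in range(n) if c not in chosen), key=worst_case)
        chosen.append(best)
    return chosen
```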
- Bridging the Gap Between Clean Data Training and Real-World Inference for Spoken Language Understanding [76.89426311082927]
Existing models are trained on clean data, which causes a gap between clean data training and real-world inference.
We propose a method from the perspective of domain adaptation, by which both high- and low-quality samples are embedded into a similar vector space.
Experiments on the widely used Snips dataset and a large-scale in-house dataset (10 million training examples) demonstrate that this method not only outperforms the baseline models on a real-world (noisy) corpus but also enhances robustness; that is, it produces high-quality results in a noisy environment.
arXiv Detail & Related papers (2021-04-13T17:54:33Z)
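One common way to embed high- and low-quality samples into a similar vector space, as described above, is an auxiliary alignment loss between clean and noisy encodings. The MSE form and weighting below are assumptions rather than the paper's exact objective:

```python
# Auxiliary alignment loss: encode a clean utterance and its noisy counterpart,
# then penalize the distance between the two embeddings alongside the task loss.
import torch
import torch.nn.functional as F

def alignment_loss(encoder: torch.nn.Module,
                   clean_batch: torch.Tensor,
                   noisy_batch: torch.Tensor,
                   weight: float = 0.1) -> torch.Tensor:
    z_clean = encoder(clean_batch)
    z_noisy = encoder(noisy_batch)
    # Pull noisy embeddings toward the (detached) clean ones.
    return weight * F.mse_loss(z_noisy, z_clean.detach())
```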