Related papers: CorrectionLM: Self-Corrections with SLM for Dialogue State Tracking

Related papers

Evaluating Small Decoder-Only Language Models for Grammar Correction and Text Simplification [0.0]
This paper investigates whether small, decoder-only language models can provide an efficient alternative for the tasks of grammar correction and text simplification. Experiments in this paper focus on testing small language models out of the box, fine-tuned, and run sequentially on the JFLEG and ASSET datasets.
arXiv Detail & Related papers (2026-01-07T12:39:31Z)
LLM4VV: Evaluating Cutting-Edge LLMs for Generation and Evaluation of Directive-Based Parallel Programming Model Compiler Tests [7.6818904666624395]
This paper proposes a dual-LLM system and experiments with the usage of LLMs for the generation of compiler tests. It is evident that LLMs possess the promising potential to generate quality compiler tests and verify them automatically.
arXiv Detail & Related papers (2025-07-29T02:34:28Z)
Reasoning-Aligned Perception Decoupling for Scalable Multi-modal Reasoning [95.44766931218896]
Multi-modal large language models (MLLMs) still lag behind text-based reasoning. We introduce Perception-Reasoning Decoupling, which modularizes the MLLM's reasoning component and makes it easily replaceable. We propose a novel reinforcement learning algorithm called Visual Perception Optimization (VPO) to align the MLLM's perceptual output with the final reasoning task.
arXiv Detail & Related papers (2025-06-05T02:28:07Z)
DRE: An Effective Dual-Refined Method for Integrating Small and Large Language Models in Open-Domain Dialogue Evaluation [21.229539297323488]
Large Language Models (LLMs) excel at many tasks but struggle with ambiguous scenarios where multiple valid responses exist. Small Language Models (SLMs) demonstrate robustness in such scenarios but are susceptible to misleading or adversarial inputs. We introduce SLIDE (Small and Large Integrated for Dialogue Evaluation), a method integrating SLMs and LLMs via adaptive weighting.
arXiv Detail & Related papers (2025-06-04T23:41:31Z)
LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization [59.75242204923353]
We introduce LLM-Lasso, a framework that leverages large language models (LLMs) to guide feature selection in Lasso regression. LLMs generate penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model. Features identified as more relevant by the LLM receive lower penalties, increasing their likelihood of being retained in the final model.
arXiv Detail & Related papers (2025-02-15T02:55:22Z)
A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs [74.35290684163718]
A primary challenge in large language model (LLM) development is their onerous pre-training cost. This paper explores a promising paradigm to improve LLM pre-training efficiency and quality by leveraging a small language model (SLM).
arXiv Detail & Related papers (2024-10-24T14:31:52Z)
Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification [76.14641982122696]
We propose a constraint learning schema for fine-tuning Large Language Models (LLMs) with attribute control. We show that our approach leads to an LLM that produces fewer inappropriate responses while achieving competitive performance on benchmarks and a toxicity detection task.
arXiv Detail & Related papers (2024-10-07T23:38:58Z)
Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013]
Large Language Models (LLMs) have demonstrated impressive capability in many natural language tasks. LLMs are prone to produce errors, hallucinations and inconsistent statements when performing multi-step reasoning. We introduce Q*, a framework for guiding LLMs' decoding process with deliberative planning.
arXiv Detail & Related papers (2024-06-20T13:08:09Z)
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing [56.75702900542643]
We introduce AlphaLLM for the self-improvements of Large Language Models. It integrates Monte Carlo Tree Search (MCTS) with LLMs to establish a self-improving loop. Our experimental results show that AlphaLLM significantly enhances the performance of LLMs without additional annotations.
arXiv Detail & Related papers (2024-04-18T15:21:34Z)
$\forall$uto$\exists$val: Autonomous Assessment of LLMs in Formal Synthesis and Interpretation Tasks [21.12437562185667]
This paper presents a new approach for scaling LLM assessment in translating formal syntax to natural language. We use context-free grammars (CFGs) to generate out-of-distribution datasets on the fly. We also conduct an assessment of several SOTA closed and open-source LLMs to showcase the feasibility and scalability of this paradigm.
arXiv Detail & Related papers (2024-03-27T08:08:00Z)
TEaR: Improving LLM-based Machine Translation with Systematic Self-Refinement [26.26493253161022]
Large Language Models (LLMs) have achieved impressive results in Machine Translation (MT). We introduce a systematic LLM-based self-refinement translation framework, named TEaR.
arXiv Detail & Related papers (2024-02-26T07:58:12Z)
Rethinking the Roles of Large Language Models in Chinese Grammatical Error Correction [62.409807640887834]
Chinese Grammatical Error Correction (CGEC) aims to correct all potential grammatical errors in the input sentences. LLMs' performance as correctors on CGEC remains unsatisfactory due to its challenging task focus. We rethink the roles of LLMs in the CGEC task so that they can be better utilized and explored in CGEC.
arXiv Detail & Related papers (2024-02-18T01:40:34Z)
Mind's Mirror: Distilling Self-Evaluation Capability and Comprehensive Thinking from Large Language Models [20.28989820878285]
Large language models (LLMs) have achieved remarkable advancements in natural language processing. The massive scale and computational demands of these models present formidable challenges when considering their practical deployment in resource-constrained environments.
arXiv Detail & Related papers (2023-11-15T18:56:23Z)
Assessing the Reliability of Large Language Model Knowledge [78.38870272050106]
Large language models (LLMs) have been treated as knowledge bases due to their strong performance in knowledge probing tasks. How do we evaluate the capabilities of LLMs to consistently produce factually correct answers? We propose MOdel kNowledge relIabiliTy scORe (MONITOR), a novel metric designed to directly measure LLMs' factual reliability.
arXiv Detail & Related papers (2023-10-15T12:40:30Z)
Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study [0.0]
This paper explores the integration of Large Language Models (LLMs) into Automatic Speech Recognition (ASR) systems. Our primary focus is to investigate the potential of using an LLM's in-context learning capabilities to enhance the performance of ASR systems.
arXiv Detail & Related papers (2023-07-13T02:31:55Z)
Small Language Models Improve Giants by Rewriting Their Outputs [18.025736098795296]
We tackle the problem of leveraging training data to improve the performance of large language models (LLMs) without fine-tuning. We create a pool of candidates from the LLM through few-shot prompting and we employ a compact model, the LM-corrector (LMCor), specifically trained to merge these candidates to produce an enhanced output. Experiments on four natural language generation tasks demonstrate that even a small LMCor model (250M) substantially improves the few-shot performance of LLMs (62B), matching and even outperforming standard fine-tuning.
arXiv Detail & Related papers (2023-05-22T22:07:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.