Guiding Large Language Models to Generate Computer-Parsable Content
- URL: http://arxiv.org/abs/2404.05499v3
- Date: Sun, 21 Apr 2024 14:45:28 GMT
- Title: Guiding Large Language Models to Generate Computer-Parsable Content
- Authors: Jiaye Wang
- Abstract summary: We propose a method to guide Large Language Models (LLMs) in generating structured content adhering to specific conventions without fine-tuning.
This enhances stability and consistency in generating target data structures, types, or instructions, reducing application development complexity.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a method to guide Large Language Models (LLMs) in generating structured content adhering to specific conventions without fine-tuning. By utilizing coroutine-based content generation constraints through a pre-agreed context-free grammar (CFG), LLMs are directed during decoding to produce formal language compliant outputs. This enhances stability and consistency in generating target data structures, types, or instructions, reducing application development complexity. Experimentally, error rates of GPT-2 and Gemma exceed 95% for DSLs longer than 36 and 282 tokens, respectively. We introduce YieldLang, a coroutine-based DSL generation framework, and evaluate it with LLMs on various tasks including JSON and Mermaid flowchart generation. Compared to benchmarks, our approach improves accuracy by 1.09 to 11.6 times, with LLMs requiring only about 16.5% of the samples to generate JSON effectively. This enhances the usability of LLM-generated content for computer programs.
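The control-flow pattern the abstract describes can be sketched compactly. The following is a minimal illustration of coroutine-based constrained decoding with a toy two-word grammar, not YieldLang's actual API: the grammar coroutine yields the set of tokens it currently allows, and the decoder sends back the token it picked.

```python
def json_bool_grammar():
    """Toy grammar accepting exactly `true` or `false`, one character-token
    at a time (a stand-in for a real CFG)."""
    first = yield {"t", "f"}            # grammar allows 't' or 'f' first
    rest = "rue" if first == "t" else "alse"
    for ch in rest:
        yield {ch}                      # only one legal continuation left

def constrained_decode(grammar, pick):
    """Drive the grammar coroutine, letting `pick` choose among legal tokens.
    In a real decoder, `pick` would mask the LLM's logits to `allowed`."""
    out, gen = [], grammar()
    allowed = next(gen)
    try:
        while True:
            tok = pick(allowed)
            out.append(tok)
            allowed = gen.send(tok)
    except StopIteration:
        return "".join(out)

# A stand-in for an LLM: always pick the alphabetically first allowed token.
print(constrained_decode(json_bool_grammar, lambda a: sorted(a)[0]))  # false
```

Because the coroutine only ever offers grammar-legal tokens, the decoder cannot emit a string outside the formal language, which is the source of the stability gains the abstract reports.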
Related papers
- InstructLR: A Scalable Approach to Create Instruction Dataset for Under-Resourced Languages [5.046479786355341]
In this paper, we introduce InstructLR, a novel framework designed to generate high-quality instruction datasets for low-resource languages (LRLs). Our approach integrates LLM-driven text generation with a dual-layer quality filtering mechanism. InstructLR has facilitated the creation of three multi-domain instruction benchmarks: ZarmaInstruct-50k, BambaraInstruct-50k, and FulfuldeInstruct-50k.
arXiv Detail & Related papers (2025-12-01T21:25:33Z) - SLMFix: Leveraging Small Language Models for Error Fixing with Reinforcement Learning [39.94602104823846]
Large language models (LLMs) often generate programs that contain syntactic errors and fail to complete the given tasks. In this work, we propose SLMFix, a novel code generation pipeline that leverages a small language model (SLM) fine-tuned using reinforcement learning (RL) techniques.
arXiv Detail & Related papers (2025-11-24T18:56:47Z) - DICE: Structured Reasoning in LLMs through SLM-Guided Chain-of-Thought Correction [29.22321321753093]
Large language models (LLMs) often prioritize reasoning over adherence to detailed instructions. Fine-tuning LLMs on supervised datasets to address this is impractical due to high computational costs and limited parameter access. We propose DICE, a lightweight framework that guides small language models (SLMs) to refine LLMs' outputs through chain-of-thought (CoT) correction.
arXiv Detail & Related papers (2025-10-10T09:45:35Z) - IFEvalCode: Controlled Code Generation [69.28317223249358]
The paper introduces forward and backward constraint generation to improve the instruction-following capabilities of Code LLMs. The authors present IFEvalCode, a multilingual benchmark comprising 1.6K test samples across seven programming languages.
arXiv Detail & Related papers (2025-07-30T08:08:48Z) - Beyond In-Context Learning: Aligning Long-form Generation of Large Language Models via Task-Inherent Attribute Guidelines [71.14354526117958]
In-context learning (ICL) is an important yet not fully understood ability of pre-trained large language models (LLMs). We present LongGuide, which efficiently generates two parallel streams of guidelines capturing task language and format properties. LongGuide automatically selects the best combination of guidelines, improving both strong open- and closed-source LLMs by over 5% in both zero- and few-shot settings.
arXiv Detail & Related papers (2025-06-02T02:35:24Z) - StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs [39.108050455592036]
StructEval is a benchmark for evaluating Large Language Models' capabilities in producing structured formats. Our benchmark encompasses 18 formats and 44 task types, with novel metrics for format adherence and structural correctness. Results reveal significant performance gaps: even state-of-the-art models like o1-mini achieve only a 75.58 average score.
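As a hedged illustration of what a format-adherence metric can look like (this is not StructEval's published scoring code), the sketch below gives full credit when output parses as JSON directly and partial credit when valid JSON must first be recovered from surrounding chatter:

```python
import json

def json_adherence(output: str) -> float:
    """Toy adherence score (an illustration, not StructEval's metric):
    1.0 if the output parses as JSON directly, 0.5 if a JSON object can
    be recovered from inside surrounding chatter, else 0.0."""
    try:
        json.loads(output)
        return 1.0
    except json.JSONDecodeError:
        pass
    start, end = output.find("{"), output.rfind("}")
    if start != -1 and end > start:
        try:
            json.loads(output[start:end + 1])
            return 0.5
        except json.JSONDecodeError:
            pass
    return 0.0

print(json_adherence('{"ok": true}'))                       # 1.0
print(json_adherence('Sure! {"ok": true} Hope it helps.'))  # 0.5
print(json_adherence("no structure here"))                  # 0.0
```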
arXiv Detail & Related papers (2025-05-26T15:40:42Z) - SLOT: Structuring the Output of Large Language Models [5.683327173793259]
We present SLOT (Structured LLM Output Transformer), a model-agnostic approach that transforms unstructured LLM outputs into precise structured formats. Our results demonstrate that a fine-tuned Mistral-7B model with constrained decoding achieves near-perfect schema accuracy. Notably, even compact models like Llama-3.2-1B can match or exceed the structured output capabilities of much larger proprietary models.
arXiv Detail & Related papers (2025-05-06T23:29:43Z) - Type-Constrained Code Generation with Language Models [51.03439021895432]
We introduce a type-constrained decoding approach that leverages type systems to guide code generation. For this purpose, we develop novel prefix automata and a search over inhabitable types, forming a sound approach to enforce well-typedness on LLM-generated code. Our approach reduces compilation errors by more than half and significantly increases functional correctness in code synthesis, translation, and repair tasks.
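The underlying idea can be shown with a deliberately tiny stand-in for a type system; this sketch is an assumption, not the paper's prefix automata. A candidate token survives only if appending it to the current prefix can still be extended to a valid string, here a signed-integer literal:

```python
import re

# Every prefix of a valid signed-integer literal matches this pattern.
VIABLE_PREFIX = re.compile(r"-?\d*\Z")

def allowed(prefix: str, vocab: list[str]) -> list[str]:
    """Tokens that keep `prefix` extendable to a valid integer literal."""
    return [t for t in vocab if VIABLE_PREFIX.match(prefix + t)]

vocab = ["-", "1", "2", "x", ";"]
print(allowed("", vocab))    # ['-', '1', '2']
print(allowed("-", vocab))   # ['1', '2']  (a second '-' is ruled out)
```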
arXiv Detail & Related papers (2025-04-12T15:03:00Z) - Idiosyncrasies in Large Language Models [54.26923012617675]
We unveil and study idiosyncrasies in Large Language Models (LLMs). We find that fine-tuning text embedding models on LLM-generated texts yields excellent classification accuracy. We leverage LLMs as judges to generate detailed, open-ended descriptions of each model's idiosyncrasies.
arXiv Detail & Related papers (2025-02-17T18:59:02Z) - Learning to Keep a Promise: Scaling Language Model Decoding Parallelism with Learned Asynchronous Decoding [26.571743941748238]
PASTA is a learning-based system that teaches large language models to identify semantic independence and express parallel decoding opportunities in their own responses. It builds on PASTA-Lang, an annotation language that lets LLMs mark semantically independent spans as they generate. Our results demonstrate geometric-mean speedups ranging from 1.21x to 1.93x, with corresponding quality changes of +2.2% to -7.1% measured by length-controlled win rates against a sequential decoding baseline.
arXiv Detail & Related papers (2025-02-17T07:39:16Z) - Chunk-Distilled Language Modeling [25.238256586953487]
Chunk-Distilled Language Modeling (CD-LM) is an approach to text generation that addresses two challenges in current large language models (LLMs).
Our method combines deep network-based LLMs with a straightforward retrieval module, which allows the generation of multi-token text chunks at a single decoding step.
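A hedged sketch of that retrieval step, with a trivial stand-in for the LLM and a hand-built chunk table rather than CD-LM's actual datastore, looks like this:

```python
CHUNKS = {"New": ["York", "City"], "large": ["language", "model"]}

def fake_model_step(tokens: list[str]) -> str:
    """Stand-in for an LLM forward pass returning one next token."""
    return "large" if tokens[-1] == "a" else "."

def decode(prompt: list[str], steps: int) -> list[str]:
    tokens = list(prompt)
    for _ in range(steps):
        chunk = CHUNKS.get(tokens[-1])
        if chunk:                       # retrieval hit: multi-token step
            tokens.extend(chunk)
        else:                           # miss: ordinary single-token step
            tokens.append(fake_model_step(tokens))
    return tokens

print(decode(["GPT", "is", "a"], steps=3))
# ['GPT', 'is', 'a', 'large', 'language', 'model', '.']
```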
arXiv Detail & Related papers (2024-12-31T08:32:15Z) - Training LLMs for Generating IEC 61131-3 Structured Text with Online Feedback [0.0]
This paper proposes a novel approach to training large language models (LLMs) that emphasizes improving the quality of learning data.
The framework proves highly suitable for industrial automation applications and outperforms state-of-the-art models.
arXiv Detail & Related papers (2024-10-29T15:54:09Z) - Large Language Models as Code Executors: An Exploratory Study [29.545321608864295]
This paper pioneers the exploration of Large Language Models (LLMs) as code executors.
We are the first to examine this feasibility across various LLMs, including OpenAI's o1, GPT-4o, GPT-3.5, DeepSeek, and Qwen-Coder.
We introduce an Iterative Instruction Prompting (IIP) technique that processes code snippets line by line, enhancing the accuracy of weaker models by an average of 7.22%.
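The paper's exact prompts are not reproduced here, but the line-by-line pattern can be sketched with a hypothetical `ask_llm` helper standing in for any chat-completion call:

```python
def ask_llm(prompt: str) -> str:
    """Hypothetical helper; wire this to any chat-completion API."""
    raise NotImplementedError

def execute_line_by_line(code: str) -> str:
    """Feed the program to the model one line at a time, carrying a
    natural-language description of program state between calls."""
    state = "no variables defined yet"
    for line in code.splitlines():
        state = ask_llm(
            f"Current program state: {state}\n"
            f"Execute this line and describe the new state:\n{line}"
        )
    return ask_llm(f"Final program state: {state}\nWhat does the program print?")
```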
arXiv Detail & Related papers (2024-10-09T08:23:22Z) - DecorateLM: Data Engineering through Corpus Rating, Tagging, and Editing with Language Models [78.51470038301436]
We introduce DecorateLM, a data engineering method designed to refine the pretraining corpus through data rating, tagging, and editing.
We then apply DecorateLM to enhance 100 billion tokens of the training corpus, selecting 45 billion tokens that exemplify high quality and diversity for the further training of another 1.2 billion parameter LLM.
Our results demonstrate that employing such high-quality data can significantly boost model performance, showcasing a powerful approach to enhance the quality of the pretraining corpus.
arXiv Detail & Related papers (2024-10-08T02:42:56Z) - SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z) - SpecTra: Enhancing the Code Translation Ability of Language Models by Generating Multi-Modal Specifications [17.60108067953814]
Large language models (LLMs) are increasingly being used for the task of automated code translation.
We propose SpecTra, a multi-stage approach that uses a novel self-consistency filter to first generate high-quality specifications.
arXiv Detail & Related papers (2024-05-28T20:48:30Z) - CodecLM: Aligning Language Models with Tailored Synthetic Data [51.59223474427153]
We introduce CodecLM, a framework for adaptively generating high-quality synthetic data to strengthen instruction-following abilities.
We first encode seed instructions into metadata, which are concise keywords generated on-the-fly to capture the target instruction distribution.
We also introduce Self-Rubrics and Contrastive Filtering during decoding to tailor data-efficient samples.
arXiv Detail & Related papers (2024-04-08T21:15:36Z) - Exploring the Impact of the Output Format on the Evaluation of Large Language Models for Code Translation [8.81447711370817]
We empirically analyze the generated outputs of eleven popular instruct-tuned large language models (LLMs). Our results demonstrate that a strategic combination of prompt engineering and regular expressions can effectively extract the source code from the model generation output.
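One plausible instance of that combination (the authors' exact pattern is not given here) is a regular expression that pulls the body of the first fenced code block and falls back to the raw output:

```python
import re

FENCE = re.compile(r"```(?:\w+)?\n(.*?)```", re.DOTALL)

def extract_code(model_output: str) -> str:
    """Return the first fenced code block's body, or the raw output."""
    match = FENCE.search(model_output)
    return match.group(1).strip() if match else model_output.strip()

reply = "Here is the translation:\n```python\nprint('hi')\n```\nHope it helps!"
print(extract_code(reply))  # print('hi')
```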
arXiv Detail & Related papers (2024-03-25T21:41:31Z) - Chain-of-Thought in Neural Code Generation: From and For Lightweight Language Models [22.392809555644646]
Large Language Models (LLMs) have demonstrated remarkable potential in code generation.
In this study, we investigate lightweight Language Models (lLMs), defined as having fewer than 10 billion parameters. Based on these findings, we design a novel approach, COTTON, which leverages lLMs to automatically generate Chains of Thought (CoTs).
The results show that the CoTs generated by COTTON outperform the baselines in terms of automated and human evaluation metrics.
arXiv Detail & Related papers (2023-12-09T12:20:50Z) - Large Language Models can Contrastively Refine their Generation for Better Sentence Representation Learning [57.74233319453229]
Large language models (LLMs) have emerged as a groundbreaking technology, and their unparalleled text generation capabilities have sparked interest in applying them to the fundamental task of sentence representation learning.
We propose MultiCSR, a multi-level contrastive sentence representation learning framework that decomposes the process of prompting LLMs to generate a corpus.
Our experiments reveal that MultiCSR enables a less advanced LLM to surpass the performance of ChatGPT, while applying it to ChatGPT achieves better state-of-the-art results.
arXiv Detail & Related papers (2023-10-17T03:21:43Z) - LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
arXiv Detail & Related papers (2023-05-19T12:10:53Z) - Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
arXiv Detail & Related papers (2023-02-22T17:44:15Z)