Learning to Generate Structured Output with Schema Reinforcement Learning
- URL: http://arxiv.org/abs/2502.18878v2
- Date: Thu, 06 Mar 2025 07:06:40 GMT
- Title: Learning to Generate Structured Output with Schema Reinforcement Learning
- Authors: Yaxi Lu, Haolun Li, Xin Cong, Zhong Zhang, Yesai Wu, Yankai Lin, Zhiyuan Liu, Fangming Liu, Maosong Sun
- Abstract summary: This study investigates the structured generation capabilities of large language models (LLMs). We find that the latest LLMs still struggle to generate a valid JSON string. Our models demonstrate significant improvement in both generating JSON outputs and downstream tasks.
- Score: 83.09230124049667
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study investigates the structured generation capabilities of large language models (LLMs), focusing on producing valid JSON outputs against a given schema. Despite the widespread use of JSON in integrating language models with programs, there is a lack of comprehensive analysis and benchmarking of these capabilities. We explore various aspects of JSON generation, such as structure understanding, escaping, and natural language description, to determine how to assess and enable LLMs to generate valid responses. Building upon this, we propose SchemaBench, which features around 40K different JSON schemas, to obtain and assess models' abilities in generating valid JSON. We find that the latest LLMs still struggle to generate a valid JSON string. Moreover, we demonstrate that incorporating reinforcement learning with a Fine-grained Schema Validator can further enhance models' understanding of JSON schema, leading to improved performance. Our models demonstrate significant improvement in both generating JSON outputs and downstream tasks.
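The abstract does not spell out how the Fine-grained Schema Validator scores outputs, so the following is only a minimal sketch of the general idea: turning JSON schema validation into a scalar reward that a reinforcement-learning loop could consume. It assumes the third-party `jsonschema` package; the function name `schema_reward` and the partial-credit scheme are illustrative, not the paper's implementation.

```python
# Minimal sketch (not the paper's implementation) of turning JSON schema
# validation into a scalar reward for reinforcement learning.
# Assumes the third-party `jsonschema` package; `schema_reward` and the
# partial-credit scheme are invented for illustration.
import json

from jsonschema import Draft202012Validator


def schema_reward(model_output: str, schema: dict) -> float:
    """Return 1.0 for schema-valid JSON, partial credit for parseable
    JSON with violations, and 0.0 for strings that are not JSON at all."""
    try:
        instance = json.loads(model_output)
    except json.JSONDecodeError:
        return 0.0  # not even syntactically valid JSON

    validator = Draft202012Validator(schema)
    errors = list(validator.iter_errors(instance))
    if not errors:
        return 1.0  # fully schema-compliant

    # Fine-grained partial credit: fewer violations -> higher reward.
    return max(0.1, 1.0 - 0.2 * len(errors))


if __name__ == "__main__":
    schema = {
        "type": "object",
        "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
        "required": ["name", "age"],
    }
    print(schema_reward('{"name": "Ada", "age": 36}', schema))    # 1.0
    print(schema_reward('{"name": "Ada", "age": "36"}', schema))  # partial credit
    print(schema_reward('not json', schema))                      # 0.0
```

In an actual RL setup such a score would be computed per sampled completion and fed back as the reward signal during policy optimization.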
Related papers
- Ensemble Learning for Large Language Models in Text and Code Generation: A Survey [6.041894045506043]
We focus on four methods and models that show strong performance and potential for broader applications.
Their benefits include better representation of diversity, improved output quality, and greater flexibility in applications.
arXiv Detail & Related papers (2025-03-13T18:50:57Z) - Generating Structured Outputs from Language Models: Benchmark and Studies [24.017253364927086]
Constrained decoding has emerged as the dominant technology across sectors for enforcing structured outputs during generation. We present an evaluation framework to assess constrained decoding approaches across three critical dimensions: efficiency in generating constraint-compliant outputs, coverage of diverse constraint types, and quality of the generated outputs. Our work provides actionable insights for improving constrained decoding frameworks and sets a new standard for evaluating constrained structured generation.
arXiv Detail & Related papers (2025-01-18T20:26:00Z) - Structured Object Language Modeling (SoLM): Native Structured Objects Generation Conforming to Complex Schemas with Self-Supervised Denoising [7.59750288224997]
We frame the problem as a Language Modeling problem (Structured Object Language Modeling). We propose a self-supervised denoising method to train the model from an existing dataset of such objects. Experimental results show that the proposed method matches or outperforms prompt-engineered general-purpose state-of-the-art LLMs.
arXiv Detail & Related papers (2024-11-28T18:16:41Z) - Matchmaker: Self-Improving Large Language Model Programs for Schema Matching [60.23571456538149]
We propose a compositional language model program for schema matching, comprised of candidate generation, refinement and confidence scoring.
Matchmaker self-improves in a zero-shot manner without the need for labeled demonstrations.
Empirically, we demonstrate on real-world medical schema matching benchmarks that Matchmaker outperforms previous ML-based approaches.
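The summary describes Matchmaker as a compositional language model program with candidate generation, refinement, and confidence scoring; the skeleton below is a hedged reconstruction of that three-stage flow, not the authors' code. The `llm` callable, the prompts, and the function name `match_attribute` are all assumptions.

```python
# Skeleton (not Matchmaker's actual code) of a compositional schema-matching
# program: candidate generation -> refinement -> confidence scoring.
# `llm` is a placeholder for any text-completion call; prompts are illustrative.
from typing import Callable, List, Tuple


def match_attribute(
    llm: Callable[[str], str],
    source_attr: str,
    target_schema: List[str],
    top_k: int = 3,
) -> Tuple[str, float]:
    # 1. Candidate generation: ask the model for plausible target attributes.
    cand_prompt = (
        f"Source attribute: {source_attr}\n"
        f"Target schema attributes: {', '.join(target_schema)}\n"
        f"List the {top_k} most likely matches, comma-separated."
    )
    candidates = [c.strip() for c in llm(cand_prompt).split(",")][:top_k]

    # 2. Refinement: keep only candidates that actually exist in the target schema.
    candidates = [c for c in candidates if c in target_schema] or target_schema[:top_k]

    # 3. Confidence scoring: score each surviving candidate independently.
    scored = []
    for cand in candidates:
        score_prompt = (
            f"On a scale of 0 to 1, how well does target attribute '{cand}' "
            f"match source attribute '{source_attr}'? Answer with a number."
        )
        try:
            score = float(llm(score_prompt).strip())
        except ValueError:
            score = 0.0
        scored.append((cand, score))

    return max(scored, key=lambda pair: pair[1])
```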
arXiv Detail & Related papers (2024-10-31T16:34:03Z) - Large Language Models Based JSON Parser Fuzzing for Bug Discovery and Behavioral Analysis [0.0]
This research project focuses on leveraging Large Language Models (LLMs) to enhance testing.
The primary objectives are to generate test cases and mutants using LLMs for the discovery of potential bugs in open-source JSON parsers.
We aim to uncover underlying bugs and to discover (and overcome) behavioral diversities.
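The paper itself uses LLMs to produce test cases and mutants; as a rough, non-LLM illustration of the downstream step, the sketch below mutates a seed JSON document and records how a parser behaves on each mutant, which is the kind of signal used to spot crashes and behavioral divergences between parsers. The mutation operators and helper names are invented for illustration.

```python
# Illustrative sketch (not the paper's LLM-driven approach) of the mutation and
# differential-testing step: mutate a seed JSON document, then record how a
# parser behaves on the mutant. Mutation operators here are simple examples.
import json
import random


def mutate(seed: str, rng: random.Random) -> str:
    """Apply one random character-level mutation to a seed JSON string."""
    if not seed:
        return seed
    i = rng.randrange(len(seed))
    op = rng.choice(["delete", "duplicate", "replace_structural"])
    if op == "delete":
        return seed[:i] + seed[i + 1:]
    if op == "duplicate":
        return seed[:i] + seed[i] + seed[i:]
    return seed[:i] + rng.choice('{}[]",:') + seed[i + 1:]


def behaviour(parser, text: str) -> str:
    """Summarise a parser's behaviour as 'ok' or the exception type raised."""
    try:
        parser(text)
        return "ok"
    except Exception as exc:  # fuzzing deliberately catches everything
        return type(exc).__name__


if __name__ == "__main__":
    rng = random.Random(0)
    seed = '{"items": [1, 2, 3], "name": "seed"}'
    for _ in range(5):
        mutant = mutate(seed, rng)
        # With a second parser (e.g. a C extension), differing behaviours on the
        # same mutant would flag a behavioural divergence worth inspecting.
        print(behaviour(json.loads, mutant), repr(mutant))
```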
arXiv Detail & Related papers (2024-10-29T07:23:43Z) - EPIC: Effective Prompting for Imbalanced-Class Data Synthesis in Tabular Data Classification via Large Language Models [39.347666307218006]
Large language models (LLMs) have demonstrated remarkable in-context learning capabilities across diverse applications. We introduce EPIC, a novel approach that leverages balanced, grouped data samples and consistent formatting with unique variable mapping to guide LLMs in generating accurate synthetic data across all classes, even for imbalanced datasets.
arXiv Detail & Related papers (2024-04-15T17:49:16Z) - Effective Large Language Model Adaptation for Improved Grounding and Citation Generation [48.07830615309543]
This paper focuses on improving large language models (LLMs) by grounding their responses in retrieved passages and by providing citations.
We propose a new framework, AGREE, that improves the grounding from a holistic perspective.
Our framework tunes LLMs to self-ground the claims in their responses and provide accurate citations to retrieved documents.
arXiv Detail & Related papers (2023-11-16T03:22:25Z) - L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models [102.00201523306986]
We present L2CEval, a systematic evaluation of the language-to-code generation capabilities of large language models (LLMs)
We analyze the factors that potentially affect their performance, such as model size, pretraining data, instruction tuning, and different prompting methods.
In addition to assessing model performance, we measure confidence calibration for the models and conduct human evaluations of the output programs.
arXiv Detail & Related papers (2023-09-29T17:57:00Z) - Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data? [49.688233418425995]
Struc-Bench is a comprehensive benchmark featuring prominent Large Language Models (LLMs)
We propose two innovative metrics, P-Score (Prompting Score) and H-Score (Heuristical Score)
Our experiments show that applying our structure-aware fine-tuning to LLaMA-7B leads to substantial performance gains.
arXiv Detail & Related papers (2023-09-16T11:31:58Z) - Evaluating and Explaining Large Language Models for Code Using Syntactic Structures [74.93762031957883]
This paper introduces ASTxplainer, an explainability method specific to Large Language Models for code.
At its core, ASTxplainer provides an automated method for aligning token predictions with AST nodes.
We perform an empirical evaluation on 12 popular LLMs for code using a curated dataset of the most popular GitHub projects.
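ASTxplainer's core idea, per the summary, is aligning token predictions with AST nodes. The snippet below is a rough sketch of one half of that alignment, mapping a character offset in source code to its innermost enclosing AST node using Python's built-in `ast` module; it is not the paper's implementation, and a real system would use model token offsets rather than raw character positions.

```python
# Rough sketch (not ASTxplainer itself) of aligning positions in source code
# with AST nodes via Python's built-in ast module. A real system would map
# model tokens (with their offsets) instead of single character offsets.
import ast


def node_span(node, lines):
    """Return (start, end) character offsets of a node, or None if unknown."""
    if getattr(node, "end_lineno", None) is None:
        return None
    start = sum(len(l) + 1 for l in lines[: node.lineno - 1]) + node.col_offset
    end = sum(len(l) + 1 for l in lines[: node.end_lineno - 1]) + node.end_col_offset
    return start, end


def innermost_node(source: str, offset: int) -> str:
    """Name of the smallest AST node whose span contains `offset`."""
    lines = source.split("\n")
    best, best_len = "Module", len(source)
    for node in ast.walk(ast.parse(source)):
        span = node_span(node, lines)
        if span and span[0] <= offset < span[1] and span[1] - span[0] <= best_len:
            best, best_len = type(node).__name__, span[1] - span[0]
    return best


if __name__ == "__main__":
    code = "def add(a, b):\n    return a + b\n"
    print(innermost_node(code, code.index("a + b")))  # Name (the 'a' in 'a + b')
```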
arXiv Detail & Related papers (2023-08-07T18:50:57Z) - Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z) - Explicitly Modeling Syntax in Language Models with Incremental Parsing and a Dynamic Oracle [88.65264818967489]
We propose a new syntax-aware language model: Syntactic Ordered Memory (SOM)
The model explicitly models the structure with an incremental parser and maintains the conditional probability setting of a standard language model.
Experiments show that SOM can achieve strong results in language modeling, incremental parsing and syntactic generalization tests.
arXiv Detail & Related papers (2020-10-21T17:39:15Z) - A Framework for End-to-End Learning on Semantic Tree-Structured Data [4.241801379755808]
A common form of structured data is what we term "semantic tree-structures"
We propose a novel framework for end-to-end learning on generic semantic tree-structured data.
arXiv Detail & Related papers (2020-02-13T18:49:29Z)