Related papers: Generating consistent PDDL domains with Large Language Models

URL: http://arxiv.org/abs/2404.07751v1
Date: Thu, 11 Apr 2024 13:48:48 GMT
Title: Generating consistent PDDL domains with Large Language Models
Authors: Pavel Smirnov, Frank Joublin, Antonello Ceravola, Michael Gienger
Abstract summary: Large Language Models (LLMs) are capable of transforming natural language domain descriptions into plausible-looking PDDL markup. We present a novel concept to significantly improve the quality of LLM-generated PDDL models by performing automated consistency checking during the generation process. Although the proposed consistency checking strategies still cannot guarantee absolute correctness of generated models, they can serve as a valuable source of feedback, reducing the amount of correction effort expected from a human in the loop.
Score: 4.8551773468225745
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) are capable of transforming natural language domain descriptions into plausible-looking PDDL markup. However, ensuring that actions are consistent within domains remains a challenging task. In this paper, we present a novel concept to significantly improve the quality of LLM-generated PDDL models by performing automated consistency checking during the generation process. Although the proposed consistency checking strategies still cannot guarantee absolute correctness of generated models, they can serve as a valuable source of feedback, reducing the amount of correction effort expected from a human in the loop. We demonstrate the capabilities of our error detection approach on a number of classical and custom planning domains (logistics, gripper, tyreworld, household, pizza).

Related papers

Generating Symbolic World Models via Test-time Scaling of Large Language Models [28.258707611580643]
Planning Domain Definition Language (PDDL) is leveraged as a planning abstraction that enables precise and formal state descriptions. We introduce a simple yet effective algorithm, which first employs a Best-of-N sampling approach to improve the quality of the initial solution and then refines the solution in a fine-grained manner with verbalized machine learning. Our method outperforms o1-mini by a considerable margin in the generation of PDDL domains, achieving over 50% success rate on two tasks.
arXiv Detail & Related papers (2025-02-07T07:52:25Z)
DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization [61.492590008258986]
Large language models (LLMs) deliver impressive results but face challenges from increasing model sizes and computational costs. We propose DRPruning, which incorporates distributionally robust optimization to restore balanced performance across domains.
arXiv Detail & Related papers (2024-11-21T12:02:39Z)
Leveraging Environment Interaction for Automated PDDL Translation and Planning with Large Language Models [7.3238629831871735]
Large Language Models (LLMs) have shown remarkable performance in various natural language tasks. Translating planning problems into the Planning Domain Definition Language (PDDL) has been proposed as a potential solution. We propose a novel approach that leverages LLMs and environment feedback to automatically generate PDDL domain and problem description files.
arXiv Detail & Related papers (2024-07-17T19:50:51Z)
DALD: Improving Logits-based Detector without Logits from Black-box LLMs [56.234109491884126]
Large Language Models (LLMs) have revolutionized text generation, producing outputs that closely mimic human writing. We present Distribution-Aligned LLMs Detection (DALD), an innovative framework that redefines the state-of-the-art performance in black-box text detection. DALD is designed to align the surrogate model's distribution with that of unknown target LLMs, ensuring enhanced detection capability and resilience against rapid model iterations.
arXiv Detail & Related papers (2024-06-07T19:38:05Z)
COPAL: Continual Pruning in Large Language Generative Models [23.747878534962663]
COPAL is an algorithm developed for pruning large language generative models under a continual model adaptation setting. Our empirical evaluation on LLMs of various sizes shows that COPAL outperforms baseline models.
arXiv Detail & Related papers (2024-05-02T18:24:41Z)
Adapting Large Language Models for Content Moderation: Pitfalls in Data Engineering and Supervised Fine-tuning [79.53130089003986]
Large Language Models (LLMs) have become a feasible solution for handling tasks in various domains. In this paper, we introduce how to fine-tune an LLM that can be privately deployed for content moderation.
arXiv Detail & Related papers (2023-10-05T09:09:44Z)
Preserving Pre-trained Features Helps Calibrate Fine-tuned Language Models [23.881825575095945]
Large pre-trained language models (PLMs) have demonstrated strong performance on natural language understanding (NLU) tasks through fine-tuning. However, fine-tuned models still suffer from overconfident predictions, especially in out-of-domain settings. We demonstrate that the PLMs are well-calibrated on the masked language modeling task with robust predictive confidence under domain shift. We show that preserving pre-trained features can improve the calibration of fine-tuned language models.
arXiv Detail & Related papers (2023-05-30T17:35:31Z)
Precision-Recall Divergence Optimization for Generative Modeling with GANs and Normalizing Flows [54.050498411883495]
We develop a novel training method for generative models, such as Generative Adversarial Networks and Normalizing Flows. We show that achieving a specified precision-recall trade-off corresponds to minimizing a unique $f$-divergence from a family we call the PR-divergences. Our approach improves the performance of existing state-of-the-art models like BigGAN in terms of either precision or recall when tested on datasets such as ImageNet.
arXiv Detail & Related papers (2023-05-30T10:07:17Z)
Leveraging Pre-trained Large Language Models to Construct and Utilize World Models for Model-based Task Planning [39.29964085305846]
Methods that use pre-trained large language models directly as planners are currently impractical due to limited correctness of plans. In this work, we introduce a novel alternative paradigm that constructs an explicit world (domain) model in planning domain definition language (PDDL) and then uses it to plan with sound domain-independent planners.
arXiv Detail & Related papers (2023-05-24T08:59:15Z)
Robustness, Evaluation and Adaptation of Machine Learning Models in the Wild [4.304803366354879]
We study causes of impaired robustness to domain shifts and present algorithms for training domain-robust models. A key source of model brittleness is domain overfitting, which our new training algorithms suppress in favor of domain-general hypotheses.
arXiv Detail & Related papers (2023-03-05T21:41:16Z)
Tailoring Language Generation Models under Total Variation Distance [55.89964205594829]
The standard paradigm of neural language generation adopts maximum likelihood estimation (MLE) as the optimizing method. We develop practical bounds to apply total variation distance (TVD) to language generation. We introduce the TaiLr objective that balances the tradeoff of estimating TVD.
arXiv Detail & Related papers (2023-02-26T16:32:52Z)
Model Reprogramming: Resource-Efficient Cross-Domain Machine Learning [65.268245109828]
In data-rich domains such as vision, language, and speech, deep learning prevails in delivering high-performance task-specific models. Deep learning in resource-limited domains still faces multiple challenges including (i) limited data, (ii) constrained model development cost, and (iii) lack of adequate pre-trained models for effective finetuning. Model reprogramming enables resource-efficient cross-domain machine learning by repurposing a well-developed pre-trained model from a source domain to solve tasks in a target domain without model finetuning.
arXiv Detail & Related papers (2022-02-22T02:33:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.