Large Language Models as Planning Domain Generators
- URL: http://arxiv.org/abs/2405.06650v1
- Date: Tue, 2 Apr 2024 19:39:23 GMT
- Title: Large Language Models as Planning Domain Generators
- Authors: James Oswald, Kavitha Srinivas, Harsha Kokel, Junkyu Lee, Michael Katz, Shirin Sohrabi
- Abstract summary: Large language models (LLMs) can be used to generate planning domain models from simple textual descriptions.
LLMs exhibit a moderate level of proficiency in generating correct planning domains from natural language descriptions.
- Score: 21.61328475841776
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Developing domain models is one of the few remaining places that require manual human labor in AI planning. Thus, in order to make planning more accessible, it is desirable to automate the process of domain model generation. To this end, we investigate if large language models (LLMs) can be used to generate planning domain models from simple textual descriptions. Specifically, we introduce a framework for automated evaluation of LLM-generated domains by comparing the sets of plans for domain instances. Finally, we perform an empirical analysis of 7 large language models, including coding and chat models across 9 different planning domains, and under three classes of natural language domain descriptions. Our results indicate that LLMs, particularly those with high parameter counts, exhibit a moderate level of proficiency in generating correct planning domains from natural language descriptions. Our code is available at https://github.com/IBM/NL2PDDL.
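The abstract describes evaluating a generated domain by comparing the sets of plans for domain instances. The toy Python sketch below illustrates that general idea only; it is not the authors' framework (their code is at the repository linked above). It assumes a hypothetical STRIPS-style `Action` tuple with ground facts, and a hypothetical brute-force enumerator `plans_up_to` that stands in for a real planner. A reference domain and a candidate domain are compared on one instance by collecting every action sequence up to a length bound that reaches the goal.

```python
from typing import NamedTuple


class Action(NamedTuple):
    """Hypothetical ground STRIPS-style action (illustration only)."""
    name: str
    pre: frozenset      # facts that must hold before the action
    add: frozenset      # facts made true by the action
    delete: frozenset   # facts made false by the action


def plans_up_to(actions, init, goal, max_len):
    """Return every action-name sequence of length <= max_len that ends in a goal state."""
    found = set()

    def search(state, prefix):
        if goal <= state:
            found.add(prefix)
        if len(prefix) == max_len:
            return
        for a in actions:
            if a.pre <= state:
                search((state - a.delete) | a.add, prefix + (a.name,))

    search(frozenset(init), ())
    return found


def compare_domains(reference, candidate, init, goal, max_len=3):
    """Return (plans only the reference admits, plans only the candidate admits)."""
    ref_plans = plans_up_to(reference, init, goal, max_len)
    cand_plans = plans_up_to(candidate, init, goal, max_len)
    return ref_plans - cand_plans, cand_plans - ref_plans


fs = frozenset
pick = Action("pick", fs({"on-table", "hand-empty"}), fs({"holding"}),
              fs({"on-table", "hand-empty"}))
drop = Action("drop", fs({"holding"}), fs({"on-table", "hand-empty"}),
              fs({"holding"}))
# Candidate domain with a deliberate bug: "drop" lost its precondition.
bad_drop = Action("drop", fs(), fs({"on-table", "hand-empty"}), fs({"holding"}))

REFERENCE = [pick, drop]
CANDIDATE = [pick, bad_drop]
INIT = fs({"on-table", "hand-empty"})
GOAL = fs({"holding"})

missing, extra = compare_domains(REFERENCE, CANDIDATE, INIT, GOAL)
print(sorted(missing))  # []
print(sorted(extra))    # [('drop', 'drop', 'pick'), ('drop', 'pick')]
```

Here the candidate admits plans that the reference rejects, so the two domains would be judged different on this instance. Exhaustive enumeration only scales to tiny examples; a practical evaluation would rely on a planner and a plan validator instead.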
Related papers
- Learning to Generalize Unseen Domains via Multi-Source Meta Learning for Text Classification [71.08024880298613]
We study the multi-source Domain Generalization of text classification.
We propose a framework to use multiple seen domains to train a model that can achieve high accuracy in an unseen domain.
arXiv Detail & Related papers (2024-09-20T07:46:21Z)
- Multi-agent Planning using Visual Language Models [2.2369578015657954]
Large Language Models (LLMs) and Visual Language Models (VLMs) are attracting increasing interest due to their improving performance and applications across various domains and tasks.
LLMs and VLMs can produce erroneous results, especially when a deep understanding of the problem domain is required.
We propose a multi-agent architecture for embodied task planning that operates without the need for specific data structures as input.
arXiv Detail & Related papers (2024-08-10T08:10:17Z)
- BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models [56.89958793648104]
Large Language Models (LLMs) are versatile and capable of addressing a diverse range of tasks.
Previous approaches either conduct continuous pre-training with domain-specific data or employ retrieval augmentation to support general LLMs.
We present a novel framework named BLADE, which enhances Black-box LArge language models with small Domain-spEcific models.
arXiv Detail & Related papers (2024-03-27T08:57:21Z)
- PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset [0.0]
We present PARADISE, an abductive reasoning task using Q&A format on practical procedural text sourced from wikiHow.
It involves warning and tip inference tasks directly associated with goals, excluding intermediary steps, with the aim of testing the ability of the models to infer implicit knowledge of the plan solely from the given goal.
Our experiments, utilizing fine-tuned language models and zero-shot prompting, reveal the effectiveness of task-specific small models over large language models in most scenarios.
arXiv Detail & Related papers (2024-03-05T18:01:59Z)
- ChipNeMo: Domain-Adapted LLMs for Chip Design [19.43613652552849]
ChipNeMo aims to explore the applications of large language models (LLMs) for industrial chip design.
We adopt the following domain adaptation techniques: domain-adaptive tokenization, domain-adaptive continued pretraining, model alignment with domain-specific instructions, and domain-adapted retrieval models.
arXiv Detail & Related papers (2023-10-31T22:35:58Z)
- L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models [102.00201523306986]
We present L2CEval, a systematic evaluation of the language-to-code generation capabilities of large language models (LLMs).
We analyze the factors that potentially affect their performance, such as model size, pretraining data, instruction tuning, and different prompting methods.
In addition to assessing model performance, we measure confidence calibration for the models and conduct human evaluations of the output programs.
arXiv Detail & Related papers (2023-09-29T17:57:00Z)
- Progressive Generation of Long Text with Pretrained Language Models [83.62523163717448]
Large-scale language models (LMs) pretrained on massive corpora of text, such as GPT-2, are powerful open-domain text generators.
It is still challenging for such models to generate coherent long passages of text, especially when the models are fine-tuned to the target domain on a small corpus.
We propose a simple but effective method of generating text in a progressive manner, inspired by generating images from low to high resolution.
arXiv Detail & Related papers (2020-06-28T21:23:05Z)
- DomBERT: Domain-oriented Language Model for Aspect-based Sentiment Analysis [71.40586258509394]
We propose DomBERT, an extension of BERT to learn from both in-domain corpus and relevant domain corpora.
Experiments are conducted on an assortment of tasks in aspect-based sentiment analysis, demonstrating promising results.
arXiv Detail & Related papers (2020-04-28T21:07:32Z)
- Unsupervised Domain Clusters in Pretrained Language Models [61.832234606157286]
We show that massive pre-trained language models implicitly learn sentence representations that cluster by domains without supervision.
We propose domain data selection methods based on such models.
We evaluate our data selection methods for neural machine translation across five diverse domains.
arXiv Detail & Related papers (2020-04-05T06:22:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.