Planning with Logical Graph-based Language Model for Instruction Generation
- URL: http://arxiv.org/abs/2308.13782v2
- Date: Fri, 5 Jul 2024 15:24:03 GMT
- Title: Planning with Logical Graph-based Language Model for Instruction Generation
- Authors: Fan Zhang, Kebing Jin, Hankz Hankui Zhuo,
- Abstract summary: We propose a graph-based language model, Logical-GLM, to infuse logic into language models.
We generate logical skeletons to guide language model training, infusing domain knowledge into language models.
Our approach can generate instructional texts with more correct logic owing to the internalized domain knowledge.
- Score: 9.70880913062245
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the superior performance of large language models to generate natural language texts, it is hard to generate texts with correct logic according to a given task, due to the difficulties for neural models to capture implied rules from free-form texts. In this paper, we propose a novel graph-based language model, Logical-GLM, to infuse logic into language models for more valid text generation and interpretability. Specifically, we first capture information from natural language instructions and construct logical bayes graphs that generally describe domains. Next, we generate logical skeletons to guide language model training, infusing domain knowledge into language models. Finally, we alternately optimize the searching policy of graphs and language models until convergence. The experimental results show that Logical-GLM is both effective and efficient compared with traditional language models, despite using smaller-scale training data and fewer parameters. Our approach can generate instructional texts with more correct logic owing to the internalized domain knowledge. Moreover, the usage of logical graphs reflects the inner mechanism of the language models, which improves the interpretability of black-box models.
Related papers
- Small Language Models Like Small Vocabularies: Probing the Linguistic Abilities of Grapheme- and Phoneme-Based Baby Llamas [7.585433383340306]
We show that small models based on the Llama architecture can achieve strong linguistic performance on standard syntactic and novel lexical/phonetic benchmarks.
Our findings suggest a promising direction for creating more linguistically plausible language models that are better suited for computational studies of language acquisition and processing.
arXiv Detail & Related papers (2024-10-02T12:36:08Z) - CodeGRAG: Bridging the Gap between Natural Language and Programming Language via Graphical Retrieval Augmented Generation [58.84212778960507]
We propose CodeGRAG, a Graphical Retrieval Augmented Code Generation framework to enhance the performance of LLMs.
CodeGRAG builds the graphical view of code blocks based on the control flow and data flow of them to fill the gap between programming languages and natural language.
Various experiments and ablations are done on four datasets including both the C++ and python languages to validate the hard meta-graph prompt, the soft prompting technique, and the effectiveness of the objectives for pretrained GNN expert.
arXiv Detail & Related papers (2024-05-03T02:48:55Z) - Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is the ability to understand instructions written in natural language (prompts)
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z) - Empower Nested Boolean Logic via Self-Supervised Curriculum Learning [67.46052028752327]
We find that any pre-trained language models even including large language models only behave like a random selector in the face of multi-nested logic.
To empower language models with this fundamental capability, this paper proposes a new self-supervised learning method textitCurriculum Logical Reasoning (textscClr)
arXiv Detail & Related papers (2023-10-09T06:54:02Z) - APOLLO: A Simple Approach for Adaptive Pretraining of Language Models
for Logical Reasoning [73.3035118224719]
We propose APOLLO, an adaptively pretrained language model that has improved logical reasoning abilities.
APOLLO performs comparably on ReClor and outperforms baselines on LogiQA.
arXiv Detail & Related papers (2022-12-19T07:40:02Z) - MURMUR: Modular Multi-Step Reasoning for Semi-Structured Data-to-Text
Generation [102.20036684996248]
We propose MURMUR, a neuro-symbolic modular approach to text generation from semi-structured data with multi-step reasoning.
We conduct experiments on two data-to-text generation tasks like WebNLG and LogicNLG.
arXiv Detail & Related papers (2022-12-16T17:36:23Z) - The Whole Truth and Nothing But the Truth: Faithful and Controllable
Dialogue Response Generation with Dataflow Transduction and Constrained
Decoding [65.34601470417967]
We describe a hybrid architecture for dialogue response generation that combines the strengths of neural language modeling and rule-based generation.
Our experiments show that this system outperforms both rule-based and learned approaches in human evaluations of fluency, relevance, and truthfulness.
arXiv Detail & Related papers (2022-09-16T09:00:49Z) - Language Models are not Models of Language [0.0]
Transfer learning has enabled large deep learning neural networks trained on the language modeling task to vastly improve performance.
We argue that the term language model is misleading because deep learning models are not theoretical models of language.
arXiv Detail & Related papers (2021-12-13T22:39:46Z) - Interpreting Language Models Through Knowledge Graph Extraction [42.97929497661778]
We compare BERT-based language models through snapshots of acquired knowledge at sequential stages of the training process.
We present a methodology to unveil a knowledge acquisition timeline by generating knowledge graph extracts from cloze "fill-in-the-blank" statements.
We extend this analysis to a comparison of pretrained variations of BERT models (DistilBERT, BERT-base, RoBERTa)
arXiv Detail & Related papers (2021-11-16T15:18:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.