Related papers: Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments

URL: http://arxiv.org/abs/2212.09736v2
Date: Wed, 3 May 2023 04:32:35 GMT
Title: Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments
Authors: Yu Gu, Xiang Deng, Yu Su
Abstract summary: Pangu is a generic framework for grounded language understanding. It capitalizes on the discriminative ability of LMs instead of their generative ability. Pangu enables, for the first time, effective few-shot in-context learning for KBQA with large LMs such as Codex.
Score: 11.496084599325807
License: http://creativecommons.org/licenses/by/4.0/
Abstract: A key missing capacity of current language models (LMs) is grounding to real-world environments. Most existing work for grounded language understanding uses LMs to directly generate plans that can be executed in the environment to achieve the desired effects. It thereby casts the burden of ensuring grammaticality, faithfulness, and controllability all on the LMs. We propose Pangu, a generic framework for grounded language understanding that capitalizes on the discriminative ability of LMs instead of their generative ability. Pangu consists of a symbolic agent and a neural LM working in a concerted fashion: The agent explores the environment to incrementally construct valid plans, and the LM evaluates the plausibility of the candidate plans to guide the search process. A case study on the challenging problem of knowledge base question answering (KBQA), which features a massive environment, demonstrates the remarkable effectiveness and flexibility of Pangu: A BERT-base LM is sufficient for setting a new record on standard KBQA datasets, and larger LMs further bring substantial gains. Pangu also enables, for the first time, effective few-shot in-context learning for KBQA with large LMs such as Codex.
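
The agent-plus-scorer loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy knowledge base `KB`, the helper names (`execute`, `expand`, `overlap_score`, `guided_search`), the beam size, and the stopping rule are all assumptions made for illustration, and a simple word-overlap function stands in for the LM that would score candidate plans in practice.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy knowledge base: entity -> {relation: neighbouring entity}.
KB: Dict[str, Dict[str, str]] = {
    "Ada_Lovelace": {"born_in": "London", "field": "Mathematics"},
    "London": {"capital_of": "United_Kingdom", "located_in": "England"},
    "Mathematics": {"studied_at": "Cambridge"},
}

# A plan is a start entity followed by a path of relations.
Plan = Tuple[str, ...]

QUESTION = "What country is the city where Ada Lovelace was born in the capital of?"


def execute(plan: Plan) -> str:
    """Follow the plan's relations through the KB and return the end entity."""
    entity = plan[0]
    for relation in plan[1:]:
        entity = KB[entity][relation]
    return entity


def expand(plan: Plan) -> List[Plan]:
    """Agent step: extend a plan only with relations that exist at its end entity,
    so every candidate is executable by construction."""
    return [plan + (relation,) for relation in KB.get(execute(plan), {})]


def overlap_score(question: str, plan: Plan) -> float:
    """Stand-in for the LM scorer (an assumption of this sketch): the fraction
    of relation tokens in the plan that also occur in the question."""
    q_tokens = set(question.lower().replace("?", "").split())
    p_tokens = [tok for rel in plan[1:] for tok in rel.lower().split("_")]
    if not p_tokens:
        return 0.0
    return sum(tok in q_tokens for tok in p_tokens) / len(p_tokens)


def guided_search(
    question: str,
    start: str,
    score: Callable[[str, Plan], float],
    beam_size: int = 2,
    max_steps: int = 2,
) -> Plan:
    """Beam search in which the agent proposes valid plans and the scorer ranks them."""
    beam: List[Plan] = [(start,)]
    best_plan: Plan = (start,)
    best_score = float("-inf")
    for _ in range(max_steps):
        candidates = [cand for plan in beam for cand in expand(plan)]
        if not candidates:
            break
        ranked = sorted(candidates, key=lambda c: score(question, c), reverse=True)
        beam = ranked[:beam_size]
        top_score = score(question, beam[0])
        # Assumed stopping rule: stop once the best candidate gets worse.
        if top_score < best_score:
            break
        best_plan, best_score = beam[0], top_score
    return best_plan


if __name__ == "__main__":
    plan = guided_search(QUESTION, "Ada_Lovelace", overlap_score)
    print(plan, "->", execute(plan))
```

The scorer never writes a plan itself. It only ranks candidates that the agent has already checked against the environment, so an invalid plan cannot be returned. Swapping `overlap_score` for a fine-tuned or prompted LM would leave the search loop unchanged.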

Related papers

TinyHelen's First Curriculum: Training and Evaluating Tiny Language Models in a Simpler Language Environment [30.93798042712827]
Training language models (LMs) and their application agents is increasingly costly due to large datasets and models. We propose a pipeline to refine text data by eliminating noise, minimizing vocabulary, and maintaining genre-specific patterns. Our experiments show that leaner pre-training boosts LM learning efficiency.
arXiv Detail & Related papers (2024-12-31T16:08:15Z)
Language Agents Meet Causality -- Bridging LLMs and Causal World Models [50.79984529172807]
We propose a framework that integrates causal representation learning with large language models. This framework learns a causal world model, with causal variables linked to natural language expressions. We evaluate the framework on causal inference and planning tasks across temporal scales and environmental complexities.
arXiv Detail & Related papers (2024-10-25T18:36:37Z)
Few-Shot Cross-Lingual Transfer for Prompting Large Language Models in Low-Resource Languages [0.0]
"Prompting" is where a user provides a description of a task and some completed examples of the task to a PLM as context before prompting the PLM to perform the task on a new example. We consider three methods: few-shot prompting (prompt), language-adaptive fine-tuning (LAFT), and neural machine translation (translate). We find that the translate and prompt settings are compute-efficient and cost-effective methods of few-shot prompting for the selected low-resource languages.
arXiv Detail & Related papers (2024-03-09T21:36:13Z)
Reliable, Adaptable, and Attributable Language Models with Retrieval [144.26890121729514]
Parametric language models (LMs) are trained on vast amounts of web data. They face practical challenges such as hallucinations, difficulty in adapting to new data distributions, and a lack of verifiability. We advocate for retrieval-augmented LMs to replace parametric LMs as the next generation of LMs.
arXiv Detail & Related papers (2024-03-05T18:22:33Z)
Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models [31.509994889286183]
We introduce Language Agent Tree Search (LATS) -- the first general framework that synergizes the capabilities of language models (LMs) in reasoning, acting, and planning. A key feature of our approach is the incorporation of an environment for external feedback, which offers a more deliberate and adaptive problem-solving mechanism. LATS achieves state-of-the-art pass@1 accuracy (92.7%) for programming on HumanEval with GPT-4 and demonstrates gradient-free performance (average score of 75.9) comparable to gradient-based fine-tuning for web navigation on WebShop with GPT
arXiv Detail & Related papers (2023-10-06T17:55:11Z)
Augmented Language Models: a Survey [55.965967655575454]
This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools. We refer to them as Augmented Language Models (ALMs). The missing token objective allows ALMs to learn to reason, use tools, and even act, while still performing standard natural language tasks.
arXiv Detail & Related papers (2023-02-15T18:25:52Z)
In-Context Retrieval-Augmented Language Models [28.23702459322163]
We show that In-Context RALM builds on off-the-shelf general purpose retrievers to provide surprisingly large LM gains across model sizes and diverse corpora. We conclude that In-Context RALM has considerable potential to increase the prevalence of LM grounding.
arXiv Detail & Related papers (2023-01-31T20:26:16Z)
Prompting as Probing: Using Language Models for Knowledge Base Construction [1.6050172226234583]
We present ProP (Prompting as Probing), which utilizes GPT-3, a large language model originally proposed by OpenAI in 2020. ProP implements a multi-step approach that combines a variety of prompting techniques to achieve this. Our evaluation study indicates that these proposed techniques can substantially enhance the quality of the final predictions.
arXiv Detail & Related papers (2022-08-23T16:03:50Z)
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents [111.33545170562337]
We investigate the possibility of grounding high-level tasks, expressed in natural language, to a chosen set of actionable steps. We find that if pre-trained LMs are large enough and prompted appropriately, they can effectively decompose high-level tasks into low-level plans. We propose a procedure that conditions on existing demonstrations and semantically translates the plans to admissible actions.
arXiv Detail & Related papers (2022-01-18T18:59:45Z)
Language Model Prior for Low-Resource Neural Machine Translation [85.55729693003829]
We propose a novel approach to incorporate an LM as a prior in a neural translation model (TM). We add a regularization term, which pushes the output distributions of the TM to be probable under the LM prior. Results on two low-resource machine translation datasets show clear improvements even with limited monolingual data.
arXiv Detail & Related papers (2020-04-30T16:29:56Z)
oLMpics -- On what Language Model Pre-training Captures [84.60594612120173]
We propose eight reasoning tasks, which require operations such as comparison, conjunction, and composition. A fundamental challenge is to understand whether the performance of an LM on a task should be attributed to the pre-trained representations or to the process of fine-tuning on the task data.
arXiv Detail & Related papers (2019-12-31T12:11:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.