Few-shot Subgoal Planning with Language Models
- URL: http://arxiv.org/abs/2205.14288v1
- Date: Sat, 28 May 2022 01:03:30 GMT
- Title: Few-shot Subgoal Planning with Language Models
- Authors: Lajanugen Logeswaran, Yao Fu, Moontae Lee, Honglak Lee
- Abstract summary: We show that language priors encoded in pre-trained language models allow us to infer fine-grained subgoal sequences.
In contrast to recent methods which make strong assumptions about subgoal supervision, our experiments show that language models can infer detailed subgoal sequences without any fine-tuning.
- Score: 58.11102061150875
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Pre-trained large language models have shown successful progress in many
language understanding benchmarks. This work explores the capability of these
models to predict actionable plans in real-world environments. Given a text
instruction, we show that language priors encoded in pre-trained language
models allow us to infer fine-grained subgoal sequences. In contrast to recent
methods which make strong assumptions about subgoal supervision, our
experiments show that language models can infer detailed subgoal sequences from
a few training sequences without any fine-tuning. We further propose a simple
strategy to re-rank language model predictions based on interaction and
feedback from the environment. Combined with pre-trained navigation and visual
reasoning components, our approach demonstrates competitive performance on
subgoal prediction and task completion in the ALFRED benchmark compared to
prior methods that assume more subgoal supervision.
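The abstract describes a two-stage recipe: prompt a frozen language model with a few instruction-to-subgoal demonstrations, then re-rank the candidate plans using feedback from the environment. Below is a minimal, self-contained Python sketch of that recipe; the prompt format, the `generate_fn` LM wrapper, and the `env_feedback` scorer are hypothetical stand-ins for the authors' actual components (language model, ALFRED simulator, pre-trained navigation and visual reasoning modules), not their implementation.

```python
from typing import Callable, List, Tuple

# Hypothetical in-context examples mapping an instruction to a subgoal sequence.
# The exact prompt format and demonstrations are assumptions, not the paper's.
FEW_SHOT_EXAMPLES: List[Tuple[str, List[str]]] = [
    ("Put a clean sponge on the table.",
     ["find sponge", "pick up sponge", "go to sink", "wash sponge",
      "go to table", "put sponge on table"]),
    ("Put a cold apple in the microwave.",
     ["find apple", "pick up apple", "go to fridge", "cool apple",
      "go to microwave", "put apple in microwave"]),
]

def build_prompt(instruction: str) -> str:
    """Concatenate the demonstrations with the new instruction (no fine-tuning)."""
    parts = []
    for inst, subgoals in FEW_SHOT_EXAMPLES:
        parts.append(f"Instruction: {inst}\nSubgoals: " + "; ".join(subgoals))
    parts.append(f"Instruction: {instruction}\nSubgoals:")
    return "\n\n".join(parts)

def plan_subgoals(
    instruction: str,
    generate_fn: Callable[[str, int], List[List[str]]],  # LM wrapper: (prompt, k) -> k candidate subgoal sequences
    env_feedback: Callable[[List[str]], float],          # environment wrapper: candidate -> feasibility score
    num_candidates: int = 5,
) -> List[str]:
    """Sample candidate subgoal sequences from the LM, then re-rank them by environment feedback."""
    prompt = build_prompt(instruction)
    candidates = generate_fn(prompt, num_candidates)
    scored = [(env_feedback(candidate), candidate) for candidate in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[0][1]

if __name__ == "__main__":
    # Stub LM and environment so the sketch runs end to end.
    def fake_generate(prompt: str, k: int) -> List[List[str]]:
        return [["find mug", "pick up mug", "go to sink", "wash mug",
                 "go to cabinet", "put mug in cabinet"]] * k

    def fake_feedback(subgoals: List[str]) -> float:
        return 1.0  # pretend every subgoal is executable

    print(plan_subgoals("Put a clean mug in the cabinet.", fake_generate, fake_feedback))
```

In the paper's setting, the feedback signal would come from actually attempting the subgoals with the pre-trained navigation and visual reasoning components; the stub scorer above only illustrates where that signal plugs in.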
Related papers
- Cross-Lingual Supervision improves Large Language Models Pre-training [36.932380291416365]
We demonstrate that pre-training Large Language Models on a mixture of a self-supervised Language Modeling objective and the supervised Machine Translation objective yields models with better in-context learning abilities.
Because pre-training is very resource-intensive and a grid search over the best mixing ratio between the two objectives is prohibitively expensive, we propose a simple yet effective strategy to learn it during pre-training.
arXiv Detail & Related papers (2023-05-19T16:14:07Z)
- Towards Linguistically Informed Multi-Objective Pre-Training for Natural Language Inference [0.38233569758620045]
We introduce a linguistically enhanced combination of pre-training methods for transformers.
The pre-training objectives include POS-tagging, synset prediction based on semantic knowledge graphs, and parent prediction based on dependency parse trees.
Our approach achieves competitive results on the Natural Language Inference task compared to the state of the art.
arXiv Detail & Related papers (2022-12-14T10:50:13Z)
- Language Model Pre-Training with Sparse Latent Typing [66.75786739499604]
We propose a new pre-training objective, Sparse Latent Typing, which enables the model to sparsely extract sentence-level keywords with diverse latent types.
Experimental results show that our model is able to learn interpretable latent type categories in a self-supervised manner without using any external knowledge.
arXiv Detail & Related papers (2022-10-23T00:37:08Z)
- Forging Multiple Training Objectives for Pre-trained Language Models via Meta-Learning [97.28779163988833]
Multiple pre-training objectives compensate for the limited understanding capability of single-objective language modeling.
We propose MOMETAS, a novel adaptive sampler based on meta-learning that learns the latent sampling pattern over arbitrary pre-training objectives.
arXiv Detail & Related papers (2022-10-19T04:38:26Z)
- A Generative Language Model for Few-shot Aspect-Based Sentiment Analysis [90.24921443175514]
We focus on aspect-based sentiment analysis, which involves extracting aspect terms and categories and predicting their corresponding polarities.
We propose to reformulate the extraction and prediction tasks into the sequence generation task, using a generative language model with unidirectional attention.
Our approach outperforms the previous state-of-the-art (based on BERT) by a large margin on average in both few-shot and full-shot settings.
arXiv Detail & Related papers (2022-04-11T18:31:53Z)
- Language Models are Few-shot Multilingual Learners [66.11011385895195]
We evaluate the multilingual skills of the GPT and T5 models in conducting multi-class classification on non-English languages.
We show that, given a few English examples as context, pre-trained language models can predict not only English test samples but also non-English ones (a minimal prompt sketch follows this list).
arXiv Detail & Related papers (2021-09-16T03:08:22Z)
- SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this pre-training objective improves the performance of the original BERT by a large margin.
arXiv Detail & Related papers (2020-10-30T13:33:41Z)
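Regarding the few-shot multilingual setup summarized above ("Language Models are Few-shot Multilingual Learners"), the sketch below shows how English demonstrations and a non-English test sample can be packed into a single prompt for a frozen language model. The task, label set, and template are illustrative assumptions, not the paper's actual benchmark or prompt format.

```python
from typing import List, Tuple

# Hypothetical English demonstrations for a sentiment-classification prompt;
# the actual tasks, languages, and template in the paper may differ.
ENGLISH_EXAMPLES: List[Tuple[str, str]] = [
    ("The movie was wonderful.", "positive"),
    ("The service was terribly slow.", "negative"),
    ("It was okay, nothing special.", "neutral"),
]

def build_multilingual_prompt(test_sentence: str) -> str:
    """English labeled examples as context, followed by a non-English test sample."""
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in ENGLISH_EXAMPLES]
    blocks.append(f"Review: {test_sentence}\nSentiment:")
    return "\n\n".join(blocks)

if __name__ == "__main__":
    # Indonesian test sentence ("The food was very delicious."); the LM call itself is
    # omitted -- any text-completion model (e.g., GPT- or T5-style) could consume this prompt.
    print(build_multilingual_prompt("Makanannya enak sekali."))
```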
This list is automatically generated from the titles and abstracts of the papers on this site.