Situated Language Learning via Interactive Narratives
- URL: http://arxiv.org/abs/2103.09977v1
- Date: Thu, 18 Mar 2021 01:55:16 GMT
- Title: Situated Language Learning via Interactive Narratives
- Authors: Prithviraj Ammanabrolu and Mark O. Riedl
- Abstract summary: This paper explores the question of how to imbue learning agents with the ability to understand and generate contextually relevant natural language.
Two key components in creating such agents are interactivity and environment grounding.
We discuss the unique challenges that a text game's puzzle-like structure, combined with natural language state-and-action spaces, provides.
- Score: 16.67845396797253
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper provides a roadmap that explores the question of how to imbue
learning agents with the ability to understand and generate contextually
relevant natural language in service of achieving a goal. We hypothesize that
two key components in creating such agents are interactivity and environment
grounding, shown to be vital parts of language learning in humans, and posit
that interactive narratives should be the environments of choice for training
such agents. These games are simulations in which an agent interacts
with the world through natural language -- "perceiving", "acting upon", and
"talking to" the world using textual descriptions, commands, and dialogue --
and as such exist at the intersection of natural language processing,
storytelling, and sequential decision making. We discuss the unique challenges
a text game's puzzle-like structure combined with natural language
state-and-action spaces provides: knowledge representation, commonsense
reasoning, and exploration. Beyond the challenges described so far, progress in
the realm of interactive narratives can be applied in adjacent problem domains.
These applications provide interesting challenges of their own as well as
extensions to those discussed so far. We describe three of them in detail: (1)
evaluating AI systems' commonsense understanding by automatically creating
interactive narratives; (2) adapting abstract text-based policies to include
other modalities such as vision; and (3) enabling multi-agent and human-AI
collaboration in shared, situated worlds.
Related papers
- Enhancing HOI Detection with Contextual Cues from Large Vision-Language Models [56.257840490146]
ConCue is a novel approach for improving visual feature extraction in HOI detection.
We develop a transformer-based feature extraction module with a multi-tower architecture that integrates contextual cues into both instance and interaction detectors.
arXiv Detail & Related papers (2023-11-26T09:11:32Z) - Learning to Model the World with Language [100.76069091703505]
To interact with humans and act in the world, agents need to understand the range of language that people use and relate it to the visual world.
Our key idea is that agents should interpret such diverse language as a signal that helps them predict the future.
We instantiate this in Dynalang, an agent that learns a multimodal world model to predict future text and image representations.
arXiv Detail & Related papers (2023-07-31T17:57:49Z) - Tachikuma: Understading Complex Interactions with Multi-Character and
Novel Objects by Large Language Models [67.20964015591262]
We introduce a benchmark named Tachikuma, comprising a Multiple character and novel Object based interaction Estimation task and a supporting dataset.
The dataset captures log data from real-time communications during gameplay, providing diverse, grounded, and complex interactions for further explorations.
We present a simple prompting baseline and evaluate its performance, demonstrating its effectiveness in enhancing interaction understanding.
arXiv Detail & Related papers (2023-07-24T07:40:59Z) - Interactive Natural Language Processing [67.87925315773924]
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP.
This paper offers a comprehensive survey of iNLP, starting by proposing a unified definition and framework of the concept.
arXiv Detail & Related papers (2023-05-22T17:18:29Z) - Transforming Human-Centered AI Collaboration: Redefining Embodied Agents
Capabilities through Interactive Grounded Language Instructions [23.318236094953072]
Human intelligence's adaptability is remarkable, allowing us to adjust to new tasks and multi-modal environments swiftly.
The research community is actively pursuing the development of interactive "embodied agents".
These agents must possess the ability to promptly request feedback in case communication breaks down or instructions are unclear.
arXiv Detail & Related papers (2023-05-18T07:51:33Z) - Knowledge-enhanced Agents for Interactive Text Games [16.055119735473017]
We propose a knowledge-injection framework for improved functional grounding of agents in text-based games.
We consider two forms of domain knowledge that we inject into learning-based agents: memory of previous correct actions and affordances of relevant objects in the environment.
Our framework supports two representative model classes: reinforcement learning agents and language model agents.
arXiv Detail & Related papers (2023-05-08T23:31:39Z) - Collecting Interactive Multi-modal Datasets for Grounded Language
Understanding [66.30648042100123]
We formalized the task of a collaborative embodied agent that uses natural language.
We developed a tool for extensive and scalable data collection.
We collected the first dataset for interactive grounded language understanding.
arXiv Detail & Related papers (2022-11-12T02:36:32Z) - Learning Knowledge Graph-based World Models of Textual Environments [16.67845396797253]
This work focuses on the task of building world models of text-based game environments.
Our world model learns to simultaneously: (1) predict changes in the world caused by an agent's actions when representing the world as a knowledge graph; and (2) generate the set of contextually relevant natural language actions required to operate in the world.
arXiv Detail & Related papers (2021-06-17T15:45:54Z) - Modeling Worlds in Text [16.67845396797253]
We provide a dataset that enables the creation of learning agents that can build knowledge graph-based world models of interactive narratives.
Our dataset provides 24198 mappings between rich natural language observations and knowledge graphs.
The training data is collected across 27 games in multiple genres and contains a further 7836 heldout instances over 9 additional games in the test set.
arXiv Detail & Related papers (2021-06-17T15:02:16Z) - Experience Grounds Language [185.73483760454454]
Language understanding research is held back by a failure to relate language to the physical world it describes and to the social interactions it facilitates.
Despite the incredible effectiveness of language processing models at tackling tasks after being trained on text alone, successful linguistic communication relies on a shared experience of the world.
arXiv Detail & Related papers (2020-04-21T16:56:27Z)