Commonsense Knowledge from Scene Graphs for Textual Environments
- URL: http://arxiv.org/abs/2210.14162v1
- Date: Wed, 19 Oct 2022 03:09:17 GMT
- Title: Commonsense Knowledge from Scene Graphs for Textual Environments
- Authors: Tsunehiko Tanaka, Daiki Kimura, Michiaki Tatsubori
- Abstract summary: We investigate the advantage of employing commonsense reasoning obtained from visual datasets such as scene graph datasets.
Our proposed methods achieve performance higher than or competitive with existing state-of-the-art methods.
- Score: 6.617487928813374
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-based games are increasingly used in reinforcement learning as
real-world simulation environments. They are usually imperfect-information
games, and their interactions occur only in the textual modality. To tackle
these games, it is effective to complement the missing information with
knowledge from outside the game, such as human common sense. However, such knowledge
has only been available from textual information in previous works. In this
paper, we investigate the advantage of employing commonsense reasoning obtained
from visual datasets such as scene graph datasets. In general, images convey
more comprehensive information than text does for humans. This property makes
it possible to extract commonsense relationship knowledge that is more useful for acting
effectively in a game. We compare the statistics of spatial relationships
available in Visual Genome (a scene graph dataset) and ConceptNet (a text-based
knowledge base) to analyze the effectiveness of introducing scene graph datasets. We
also conducted experiments on a text-based game task that requires commonsense
reasoning. Our experimental results demonstrate that our proposed methods
achieve performance higher than or competitive with existing state-of-the-art methods.
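The dataset comparison described above can be illustrated with a minimal sketch: counting how often each relation label appears in subject-relation-object triples. The triples, relation labels, and counting scheme below are toy assumptions for illustration, not the paper's actual data or analysis pipeline.

```python
from collections import Counter

# Hypothetical mini scene graph: (subject, relation, object) triples,
# in the style of Visual Genome annotations (illustrative data only).
scene_graph_triples = [
    ("lamp", "on", "table"),
    ("cat", "on", "sofa"),
    ("book", "on", "table"),
    ("key", "in", "drawer"),
    ("painting", "above", "sofa"),
]

# Hypothetical ConceptNet-style edges: (head, relation, tail).
conceptnet_edges = [
    ("lamp", "AtLocation", "table"),
    ("key", "AtLocation", "drawer"),
]

def relation_counts(triples):
    """Count how often each relation label appears in a triple list."""
    return Counter(rel for _, rel, _ in triples)

sg_counts = relation_counts(scene_graph_triples)
cn_counts = relation_counts(conceptnet_edges)

# Scene graphs expose fine-grained spatial relations ("on", "in", "above"),
# whereas text-based resources often collapse them into coarser labels
# such as "AtLocation".
print(sg_counts)  # Counter({'on': 3, 'in': 1, 'above': 1})
print(cn_counts)  # Counter({'AtLocation': 2})
```

Comparing the resulting distributions in this way shows why scene graph datasets can supply spatial commonsense that text-based knowledge bases represent only coarsely.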
Related papers
- VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning [66.23296689828152]
We leverage the capabilities of Vision-and-Large-Language Models to enhance in-context emotion classification.
In the first stage, we propose prompting VLLMs to generate descriptions in natural language of the subject's apparent emotion.
In the second stage, the descriptions are used as contextual information and, along with the image input, are used to train a transformer-based architecture.
arXiv Detail & Related papers (2024-04-10T15:09:15Z) - Comprehensive Event Representations using Event Knowledge Graphs and Natural Language Processing [0.0]
This work seeks to utilise and build on the growing body of work that uses findings from the field of natural language processing (NLP) to extract knowledge from text and build knowledge graphs.
Specifically, sub-event extraction is used as a way of creating sub-event-aware event representations.
These event representations are enriched through fine-grained location extraction and contextualised through the alignment of historically relevant quotes.
arXiv Detail & Related papers (2023-03-08T18:43:39Z) - Story Shaping: Teaching Agents Human-like Behavior with Stories [9.649246837532417]
We introduce Story Shaping, in which a reinforcement learning agent infers tacit knowledge from an exemplar story of how to accomplish a task.
An intrinsic reward is generated based on the similarity between the agent's inferred world state graph and the inferred story world graph.
We conducted experiments in text-based games requiring commonsense reasoning and shaping the behaviors of agents as virtual game characters.
arXiv Detail & Related papers (2023-01-24T16:19:09Z) - Leveraging Visual Knowledge in Language Tasks: An Empirical Study on Intermediate Pre-training for Cross-modal Knowledge Transfer [61.34424171458634]
We study whether integrating visual knowledge into a language model can fill the gap.
Our experiments show that visual knowledge transfer can improve performance in both low-resource and fully supervised settings.
arXiv Detail & Related papers (2022-03-14T22:02:40Z) - One-shot Scene Graph Generation [130.57405850346836]
We propose Multiple Structured Knowledge (Relational Knowledge and Commonsense Knowledge) for the one-shot scene graph generation task.
Our method outperforms existing state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2022-02-22T11:32:59Z) - Seeing the advantage: visually grounding word embeddings to better capture human semantic knowledge [8.208534667678792]
Distributional semantic models capture word-level meaning that is useful in many natural language processing tasks.
We create visually grounded word embeddings by combining English text and images and compare them to popular text-based methods.
Our analysis shows that visually grounded embedding similarities are more predictive of human reaction times than purely text-based embeddings.
arXiv Detail & Related papers (2022-02-21T15:13:48Z) - Deep Reinforcement Learning with Stacked Hierarchical Attention for Text-based Games [64.11746320061965]
We study reinforcement learning for text-based games, which are interactive simulations in the context of natural language.
We aim to conduct explicit reasoning with knowledge graphs for decision making, so that the actions of an agent are generated and supported by an interpretable inference procedure.
We extensively evaluate our method on a number of man-made benchmark games, and the experimental results demonstrate that our method performs better than existing text-based agents.
arXiv Detail & Related papers (2020-10-22T12:40:22Z) - Learning Dynamic Belief Graphs to Generalize on Text-Based Games [55.59741414135887]
Playing text-based games requires skills in processing natural language and sequential decision making.
In this work, we investigate how an agent can plan and generalize in text-based games using graph-structured representations learned end-to-end from raw text.
arXiv Detail & Related papers (2020-02-21T04:38:37Z) - Exploration Based Language Learning for Text-Based Games [72.30525050367216]
This work presents an exploration and imitation-learning-based agent capable of state-of-the-art performance in playing text-based computer games.
Text-based computer games describe their world to the player through natural language and expect the player to interact with the game using text.
These games are of interest as they can be seen as a testbed for language understanding, problem-solving, and language generation by artificial agents.
arXiv Detail & Related papers (2020-01-24T03:03:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.