Knowledge-Aware Procedural Text Understanding with Multi-Stage Training
- URL: http://arxiv.org/abs/2009.13199v2
- Date: Sat, 13 Feb 2021 14:28:25 GMT
- Title: Knowledge-Aware Procedural Text Understanding with Multi-Stage Training
- Authors: Zhihan Zhang, Xiubo Geng, Tao Qin, Yunfang Wu, Daxin Jiang
- Abstract summary: We focus on the task of procedural text understanding, which aims to comprehend such documents and track entities' states and locations during a process.
Two challenges, the difficulty of commonsense reasoning and data insufficiency, remain unsolved.
We propose a novel KnOwledge-Aware proceduraL text understAnding (KOALA) model, which effectively leverages multiple forms of external knowledge.
- Score: 110.93934567725826
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Procedural text describes dynamic state changes during a step-by-step natural
process (e.g., photosynthesis). In this work, we focus on the task of
procedural text understanding, which aims to comprehend such documents and
track entities' states and locations during a process. Although recent
approaches have achieved substantial progress, their results are far behind
human performance. Two challenges, the difficulty of commonsense reasoning and
data insufficiency, remain unsolved; both call for the incorporation of
external knowledge bases. Previous works on external knowledge injection
usually rely on noisy web mining tools and heuristic rules with limited
applicable scenarios. In this paper, we propose a novel KnOwledge-Aware
proceduraL text understAnding (KOALA) model, which effectively leverages
multiple forms of external knowledge in this task. Specifically, we retrieve
informative knowledge triples from ConceptNet and perform knowledge-aware
reasoning while tracking the entities. In addition, we employ a multi-stage
training schema that first fine-tunes the BERT model on unlabeled data collected
from Wikipedia and then fine-tunes it further within the final model. Experimental
results on two procedural text datasets, ProPara and Recipes, verify the
effectiveness of the proposed methods, with our model achieving
state-of-the-art performance compared to various baselines.
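
To make the knowledge-retrieval step concrete, below is a minimal sketch of fetching candidate triples for an entity from ConceptNet's public REST API (api.conceptnet.io). The weight-based ranking is an illustrative assumption; the paper's actual triple-selection and knowledge-aware reasoning components are not shown here.

```python
# Minimal sketch: retrieve (head, relation, tail, weight) triples for an
# entity from ConceptNet's public REST API. The ranking heuristic below is
# an assumption for illustration, not KOALA's selection method.
import requests

def retrieve_triples(entity: str, limit: int = 10):
    """Fetch edges mentioning `entity` and return them as scored triples."""
    url = f"http://api.conceptnet.io/c/en/{entity.lower().replace(' ', '_')}"
    edges = requests.get(url, params={"limit": limit}).json().get("edges", [])
    triples = [(e["start"]["label"], e["rel"]["label"],
                e["end"]["label"], e.get("weight", 1.0)) for e in edges]
    # Rank by edge weight so the most informative triples come first.
    return sorted(triples, key=lambda t: -t[3])

for head, rel, tail, weight in retrieve_triples("photosynthesis", limit=5):
    print(f"({head}) -[{rel}]-> ({tail})  weight={weight:.2f}")
```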
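
The multi-stage training schema can likewise be sketched with off-the-shelf tooling: first continue BERT's masked-LM training on unlabeled in-domain text, then initialize the downstream tracker's encoder from that checkpoint. The data file, hyperparameters, and downstream head below are placeholders, not the authors' released configuration.

```python
# Hedged sketch of a two-stage schema: (1) masked-LM fine-tuning of BERT on
# unlabeled text, (2) reuse of the adapted encoder downstream. File names and
# hyperparameters are hypothetical.
from datasets import load_dataset
from transformers import (BertForMaskedLM, BertModel, BertTokenizerFast,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
mlm_model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Stage 1: continue masked-LM training on unlabeled Wikipedia-style passages.
wiki = load_dataset("text", data_files={"train": "wiki_procedural.txt"})  # hypothetical file
wiki = wiki.map(lambda b: tokenizer(b["text"], truncation=True, max_length=128),
                batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)
Trainer(model=mlm_model,
        args=TrainingArguments(output_dir="bert-mlm-stage1", num_train_epochs=1),
        train_dataset=wiki["train"],
        data_collator=collator).train()
mlm_model.save_pretrained("bert-mlm-stage1")

# Stage 2: load the adapted encoder and fine-tune it inside the final
# entity state/location tracking model on ProPara or Recipes.
encoder = BertModel.from_pretrained("bert-mlm-stage1")
# ... wrap `encoder` with task-specific tracking heads and train on labeled data ...
```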
Related papers
- A Universal Prompting Strategy for Extracting Process Model Information from Natural Language Text using Large Language Models [0.8899670429041453]
We show that generative large language models (LLMs) can solve NLP tasks with very high quality without the need for extensive data.
Based on a novel prompting strategy, we show that LLMs are able to outperform state-of-the-art machine learning approaches.
arXiv Detail & Related papers (2024-07-26T06:39:35Z)
- Why Not Use Your Textbook? Knowledge-Enhanced Procedure Planning of Instructional Videos [16.333295670635557]
We explore the capability of an agent to construct a logical sequence of action steps, thereby assembling a strategic procedural plan.
This plan is crucial for navigating from an initial visual observation to a target visual outcome, as depicted in real-life instructional videos.
We coin our approach KEPP, a novel Knowledge-Enhanced Procedure Planning system, which harnesses a probabilistic procedural knowledge graph extracted from training data.
arXiv Detail & Related papers (2024-03-05T08:55:51Z)
- Can LMs Learn New Entities from Descriptions? Challenges in Propagating Injected Knowledge [72.63368052592004]
We study LMs' abilities to make inferences based on injected facts (or to propagate those facts).
We find that existing methods for updating knowledge show little propagation of injected knowledge.
Yet, prepending entity definitions to an LM's context improves performance across all settings.
arXiv Detail & Related papers (2023-05-02T17:59:46Z)
- Transferring Procedural Knowledge across Commonsense Tasks [17.929737518694616]
We study the ability of AI models to transfer procedural knowledge to novel narrative tasks in a transparent manner.
We design LEAP: a comprehensive framework that integrates state-of-the-art modeling architectures, training regimes, and augmentation strategies.
Our experiments with in- and out-of-domain tasks reveal insights into the interplay of different architectures, training regimes, and augmentation strategies.
arXiv Detail & Related papers (2023-04-26T23:24:50Z)
- Exploring External Knowledge for Accurate modeling of Visual and Language Problems [2.7190267444272056]
This dissertation focuses on visual and language understanding, which involves many challenging tasks.
The state-of-the-art methods for solving these problems usually involve only two parts: source data and target labels.
We developed a methodology that first extracts external knowledge and then integrates it with the original models.
arXiv Detail & Related papers (2023-01-27T02:01:50Z)
- Recitation-Augmented Language Models [85.30591349383849]
We show that RECITE is a powerful paradigm for knowledge-intensive NLP tasks.
Specifically, we show that by utilizing recitation as the intermediate step, a recite-and-answer scheme can achieve new state-of-the-art performance.
arXiv Detail & Related papers (2022-10-04T00:49:20Z)
- Ered: Enhanced Text Representations with Entities and Descriptions [5.977668609935748]
External knowledge, e.g., entities and entity descriptions, can help humans understand texts.
This paper aims to explicitly include both entities and entity descriptions in the fine-tuning stage.
We conducted experiments on four knowledge-oriented tasks and two common tasks, and the results achieved new state-of-the-art performance on several datasets.
arXiv Detail & Related papers (2022-08-18T16:51:16Z)
- TegTok: Augmenting Text Generation via Task-specific and Open-world Knowledge [83.55215993730326]
We propose augmenting TExt Generation via Task-specific and Open-world Knowledge (TegTok) in a unified framework.
Our model selects knowledge entries from two types of knowledge sources through dense retrieval and then injects them into the input encoding and output decoding stages, respectively.
arXiv Detail & Related papers (2022-03-16T10:37:59Z)
- Procedural Reading Comprehension with Attribute-Aware Context Flow [85.34405161075276]
Procedural texts often describe processes that happen over entities.
We introduce an algorithm for procedural reading comprehension by translating the text into a general formalism.
arXiv Detail & Related papers (2020-03-31T00:06:29Z)
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer [64.22926988297685]
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).
In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format.
arXiv Detail & Related papers (2019-10-23T17:37:36Z)