KALM: Knowledge-Aware Integration of Local, Document, and Global
Contexts for Long Document Understanding
- URL: http://arxiv.org/abs/2210.04105v2
- Date: Sun, 14 May 2023 23:50:19 GMT
- Authors: Shangbin Feng, Zhaoxuan Tan, Wenqian Zhang, Zhenyu Lei, Yulia Tsvetkov
- Abstract summary: KALM is a Knowledge-Aware Language Model that jointly leverages knowledge in local, document-level, and global contexts.
It achieves state-of-the-art performance on six long document understanding tasks and datasets.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the advent of pretrained language models (LMs), research
efforts have increasingly focused on infusing commonsense and domain-specific
knowledge to prepare LMs for downstream tasks. These works attempt to leverage
knowledge graphs, the de facto standard of symbolic knowledge representation,
along with pretrained LMs. While existing approaches have leveraged external
knowledge, it remains an open question how to jointly incorporate knowledge
graphs representing varying contexts, from local (e.g., sentence), to
document-level, to global knowledge, to enable knowledge-rich exchange across
these contexts. Such rich contextualization can be especially beneficial for
long document understanding tasks since standard pretrained LMs are typically
bounded by the input sequence length. In light of these challenges, we propose
KALM, a Knowledge-Aware Language Model that jointly leverages knowledge in
local, document-level, and global contexts for long document understanding.
KALM first encodes long documents and knowledge graphs into the three
knowledge-aware context representations. It then processes each context with
context-specific layers, followed by a context fusion layer that facilitates
knowledge exchange to derive an overarching document representation. Extensive
experiments demonstrate that KALM achieves state-of-the-art performance on six
long document understanding tasks and datasets. Further analyses reveal that
the three knowledge-aware contexts are complementary and they all contribute to
model performance, while the importance and information exchange patterns of
different contexts vary with respect to different tasks and datasets.
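The pipeline described in the abstract (three knowledge-aware context representations, context-specific layers, and a context fusion layer producing an overarching document representation) can be summarized with a minimal sketch. The module names, dimensions, transformer-encoder context layers, and attention-based fusion below are illustrative assumptions, not the authors' released implementation.

# Minimal sketch of the three-context architecture described in the abstract.
# All module names, dimensions, and the attention-based fusion are assumptions.
import torch
import torch.nn as nn

class ContextFusion(nn.Module):
    """Exchanges information across the local, document, and global contexts."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, contexts: torch.Tensor) -> torch.Tensor:
        # contexts: (batch, 3, dim) -- one vector per knowledge-aware context
        fused, _ = self.attn(contexts, contexts, contexts)
        return self.norm(contexts + fused)

class KALMSketch(nn.Module):
    def __init__(self, dim: int = 256):
        super().__init__()
        # Context-specific layers: one encoder per context (hypothetical choice).
        self.local_layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.doc_layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.global_layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.fusion = ContextFusion(dim)
        self.classifier = nn.Linear(dim, 2)  # e.g. a binary document-level task

    def forward(self, local_ctx, doc_ctx, global_ctx):
        # Each input: (batch, seq_len, dim) knowledge-aware context representations,
        # e.g. embeddings enriched with sentence-, document-, and KG-level knowledge.
        local = self.local_layer(local_ctx).mean(dim=1)
        doc = self.doc_layer(doc_ctx).mean(dim=1)
        glob = self.global_layer(global_ctx).mean(dim=1)
        fused = self.fusion(torch.stack([local, doc, glob], dim=1))
        doc_repr = fused.mean(dim=1)  # overarching document representation
        return self.classifier(doc_repr)

# Usage with dummy knowledge-aware context representations for 2 documents.
model = KALMSketch()
local_ctx, doc_ctx, global_ctx = (torch.randn(2, 16, 256) for _ in range(3))
logits = model(local_ctx, doc_ctx, global_ctx)
print(logits.shape)  # torch.Size([2, 2])

The key design point reflected here is that each context is first processed independently, and only the fusion layer lets knowledge flow between contexts before the final document representation is formed.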
Related papers
- DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models [66.91204604417912]
  This study aims to enhance the generalizability of small VDU models by distilling knowledge from LLMs.
  We present a new framework (called DocKD) that enriches the data generation process by integrating external document knowledge.
  Experiments show that DocKD produces high-quality document annotations and surpasses the direct knowledge distillation approach.
  arXiv Detail & Related papers (2024-10-04T00:53:32Z)
- How Large Language Models Encode Context Knowledge? A Layer-Wise Probing Study [27.23388511249688]
  This paper investigates the layer-wise capability of large language models to encode knowledge.
  We leverage the powerful generative capability of ChatGPT to construct probing datasets.
  Experiments on conflicting and newly acquired knowledge show that LLMs prefer to encode more context knowledge in the upper layers.
  arXiv Detail & Related papers (2024-02-25T11:15:42Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
  Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
  Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
  arXiv Detail & Related papers (2023-10-31T04:37:57Z)
- Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators [78.63553017938911]
  Large language models (LLMs) outperform information retrieval techniques for downstream knowledge-intensive tasks.
  However, community concerns abound regarding the factuality and potential implications of using this uncensored knowledge.
  We introduce CONNER, designed to evaluate generated knowledge from six important perspectives.
  arXiv Detail & Related papers (2023-10-11T08:22:37Z)
- TegTok: Augmenting Text Generation via Task-specific and Open-world Knowledge [83.55215993730326]
  We propose augmenting TExt Generation via Task-specific and Open-world Knowledge (TegTok) in a unified framework.
  Our model selects knowledge entries from two types of knowledge sources through dense retrieval and then injects them into the input encoding and output decoding stages, respectively.
  arXiv Detail & Related papers (2022-03-16T10:37:59Z)
- Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-based Sentiment Analysis [96.53859361560505]
  We propose a knowledge graph augmented network (KGAN) to incorporate external knowledge with explicitly syntactic and contextual information.
  KGAN captures the sentiment feature representations from multiple perspectives, i.e., context-, syntax-, and knowledge-based.
  Experiments on three popular ABSA benchmarks demonstrate the effectiveness and robustness of our KGAN.
  arXiv Detail & Related papers (2022-01-13T08:25:53Z)
- CoLAKE: Contextualized Language and Knowledge Embedding [81.90416952762803]
  We propose the Contextualized Language and Knowledge Embedding (CoLAKE).
  CoLAKE jointly learns contextualized representation for both language and knowledge with the extended objective.
  We conduct experiments on knowledge-driven tasks, knowledge probing tasks, and language understanding tasks.
  arXiv Detail & Related papers (2020-10-01T11:39:32Z)