Integrating a Heterogeneous Graph with Entity-aware Self-attention using
Relative Position Labels for Reading Comprehension Model
- URL: http://arxiv.org/abs/2307.10443v3
- Date: Sat, 13 Jan 2024 15:24:59 GMT
- Title: Integrating a Heterogeneous Graph with Entity-aware Self-attention using
Relative Position Labels for Reading Comprehension Model
- Authors: Shima Foolad and Kourosh Kiani
- Abstract summary: We introduce a novel attention pattern that integrates reasoning knowledge derived from a heterogeneous graph into the transformer architecture without relying on external knowledge.
The proposed attention pattern comprises three key elements: global-local attention for word tokens, graph attention for entity tokens that exhibit strong attention towards tokens connected in the graph, and the consideration of the type of relationship between each entity token and word token.
Our model outperforms both the cutting-edge LUKE-Graph and the baseline LUKE model across two distinct datasets.
- Score: 14.721615285883429
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Despite the significant progress made by transformer models in machine
reading comprehension tasks, they still fall short in handling complex
reasoning tasks due to the absence of explicit knowledge in the input sequence.
To address this limitation, many recent works have proposed injecting external
knowledge into the model. However, selecting relevant external knowledge,
ensuring its availability, and requiring additional processing steps remain
challenging. In this paper, we introduce a novel attention pattern that
integrates reasoning knowledge derived from a heterogeneous graph into the
transformer architecture without relying on external knowledge. The proposed
attention pattern comprises three key elements: global-local attention for word
tokens, graph attention for entity tokens that exhibit strong attention towards
tokens connected in the graph as opposed to those unconnected, and the
consideration of the type of relationship between each entity token and word
token. This results in optimized attention between the two if a relationship
exists. The pattern is coupled with special relative position labels, allowing
it to integrate with LUKE's entity-aware self-attention mechanism. The
experimental findings corroborate that our model outperforms both the
cutting-edge LUKE-Graph and the baseline LUKE model across two distinct
datasets: ReCoRD, emphasizing commonsense reasoning, and WikiHop, focusing on
multi-hop reasoning challenges.
Related papers
- EnriCo: Enriched Representation and Globally Constrained Inference for Entity and Relation Extraction [3.579132482505273]
Joint entity and relation extraction plays a pivotal role in various applications, notably in the construction of knowledge graphs.
Existing approaches often fall short of two key aspects: richness of representation and coherence in output structure.
In our work, we introduce EnriCo, which mitigates these shortcomings.
arXiv Detail & Related papers (2024-04-18T20:15:48Z) - Relation Rectification in Diffusion Model [64.84686527988809]
We introduce a novel task termed Relation Rectification, aiming to refine the model to accurately represent a given relationship it initially fails to generate.
We propose an innovative solution utilizing a Heterogeneous Graph Convolutional Network (HGCN)
The lightweight HGCN adjusts the text embeddings generated by the text encoder, ensuring the accurate reflection of the textual relation in the embedding space.
arXiv Detail & Related papers (2024-03-29T15:54:36Z) - Message Intercommunication for Inductive Relation Reasoning [49.731293143079455]
We develop a novel inductive relation reasoning model called MINES.
We introduce a Message Intercommunication mechanism on the Neighbor-Enhanced Subgraph.
Our experiments show that MINES outperforms existing state-of-the-art models.
arXiv Detail & Related papers (2023-05-23T13:51:46Z) - LUKE-Graph: A Transformer-based Approach with Gated Relational Graph
Attention for Cloze-style Reading Comprehension [13.173307471333619]
We propose the LUKE-Graph, a model that builds a heterogeneous graph based on the intuitive relationships between entities in a document.
We then use the Attention reading (RGAT) to fuse the graph's reasoning information and the contextual representation encoded by the pre-trained LUKE model.
Experimental results demonstrate that the LUKE-Graph achieves state-of-the-art performance with commonsense reasoning.
arXiv Detail & Related papers (2023-03-12T14:31:44Z) - A Dynamic Graph Interactive Framework with Label-Semantic Injection for
Spoken Language Understanding [43.48113981442722]
We propose a framework termed DGIF, which first leverages the semantic information of labels to give the model additional signals and enriched priors.
We propose a novel approach to construct the interactive graph based on the injection of label semantics, which can automatically update the graph to better alleviate error propagation.
arXiv Detail & Related papers (2022-11-08T05:57:46Z) - Mutual Graph Learning for Camouflaged Object Detection [31.422775969808434]
A major challenge is that intrinsic similarities between foreground objects and background surroundings make the features extracted by deep model indistinguishable.
We design a novel Mutual Graph Learning model, which generalizes the idea of conventional mutual learning from regular grids to the graph domain.
In contrast to most mutual learning approaches that use a shared function to model all between-task interactions, MGL is equipped with typed functions for handling different complementary relations.
arXiv Detail & Related papers (2021-04-03T10:14:39Z) - Understanding Neural Abstractive Summarization Models via Uncertainty [54.37665950633147]
seq2seq abstractive summarization models generate text in a free-form manner.
We study the entropy, or uncertainty, of the model's token-level predictions.
We show that uncertainty is a useful perspective for analyzing summarization and text generation models more broadly.
arXiv Detail & Related papers (2020-10-15T16:57:27Z) - ConsNet: Learning Consistency Graph for Zero-Shot Human-Object
Interaction Detection [101.56529337489417]
We consider the problem of Human-Object Interaction (HOI) Detection, which aims to locate and recognize HOI instances in the form of human, action, object> in images.
We argue that multi-level consistencies among objects, actions and interactions are strong cues for generating semantic representations of rare or previously unseen HOIs.
Our model takes visual features of candidate human-object pairs and word embeddings of HOI labels as inputs, maps them into visual-semantic joint embedding space and obtains detection results by measuring their similarities.
arXiv Detail & Related papers (2020-08-14T09:11:18Z) - Jointly Cross- and Self-Modal Graph Attention Network for Query-Based
Moment Localization [77.21951145754065]
We propose a novel Cross- and Self-Modal Graph Attention Network (CSMGAN) that recasts this task as a process of iterative messages passing over a joint graph.
Our CSMGAN is able to effectively capture high-order interactions between two modalities, thus enabling a further precise localization.
arXiv Detail & Related papers (2020-08-04T08:25:24Z) - Attention improves concentration when learning node embeddings [1.2233362977312945]
Given nodes labelled with search query text, we want to predict links to related queries that share products.
Experiments with a range of deep neural architectures show that simple feedforward networks with an attention mechanism perform best for learning embeddings.
We propose an analytically tractable model of query generation, AttEST, that views both products and the query text as vectors embedded in a latent space.
arXiv Detail & Related papers (2020-06-11T21:21:12Z) - SEEK: Segmented Embedding of Knowledge Graphs [77.5307592941209]
We propose a lightweight modeling framework that can achieve highly competitive relational expressiveness without increasing the model complexity.
Our framework focuses on the design of scoring functions and highlights two critical characteristics: 1) facilitating sufficient feature interactions; 2) preserving both symmetry and antisymmetry properties of relations.
arXiv Detail & Related papers (2020-05-02T15:15:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.