Improving Interpretability via Explicit Word Interaction Graph Layer
- URL: http://arxiv.org/abs/2302.02016v1
- Date: Fri, 3 Feb 2023 21:56:32 GMT
- Title: Improving Interpretability via Explicit Word Interaction Graph Layer
- Authors: Arshdeep Sekhon, Hanjie Chen, Aman Shrivastava, Zhe Wang, Yangfeng Ji,
Yanjun Qi
- Abstract summary: We propose a trainable neural network layer that learns a global interaction graph between words and then selects more informative words.
Our layer, which we call WIGRAPH, can plug into any neural network-based NLP text classifier right after its word embedding layer.
- Score: 28.28660926203816
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent NLP literature has seen growing interest in improving model
interpretability. Along this direction, we propose a trainable neural network
layer that learns a global interaction graph between words and then selects
more informative words using the learned word interactions. Our layer, which we
call WIGRAPH, can plug into any neural network-based NLP text classifier right
after its word embedding layer. Across multiple SOTA NLP models and various NLP
datasets, we demonstrate that adding the WIGRAPH layer substantially improves
NLP models' interpretability and enhances models' prediction performance at the
same time.
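The abstract describes WIGRAPH only at a high level. As an illustration, here is a minimal PyTorch-style sketch of the general idea: a trainable layer that holds a global word-word interaction graph and uses it to softly select informative words right after the embedding layer. The class name, the low-rank factorization, and the sigmoid gating are assumptions for this sketch, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class WordInteractionGraphLayer(nn.Module):
    """Hypothetical sketch of a WIGRAPH-style layer (not the paper's code):
    it keeps a trainable global interaction graph over the vocabulary and
    uses it to reweight word embeddings before the encoder."""

    def __init__(self, vocab_size: int, rank: int = 64):
        super().__init__()
        # Low-rank factors of the global word-word interaction matrix
        # (a full vocab x vocab matrix would be too large to store).
        self.u = nn.Parameter(torch.randn(vocab_size, rank) * 0.01)
        self.v = nn.Parameter(torch.randn(vocab_size, rank) * 0.01)

    def forward(self, token_ids: torch.Tensor, embeddings: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len); embeddings: (batch, seq_len, dim)
        u = self.u[token_ids]                            # (batch, seq_len, rank)
        v = self.v[token_ids]                            # (batch, seq_len, rank)
        interactions = torch.bmm(u, v.transpose(1, 2))   # (batch, seq_len, seq_len)
        # A word's importance is how strongly it interacts with the rest of
        # the sentence under the learned global graph.
        importance = torch.sigmoid(interactions.mean(dim=-1, keepdim=True))
        return embeddings * importance                   # softly mask uninformative words

# Usage: insert the layer between an existing embedding layer and the encoder.
# embed = nn.Embedding(vocab_size, dim); wigraph = WordInteractionGraphLayer(vocab_size)
# x = wigraph(token_ids, embed(token_ids)); logits = classifier(encoder(x))
```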
Related papers
- Graph-Augmented Relation Extraction Model with LLMs-Generated Support Document [7.0421339410165045]
This study introduces a novel approach to sentence-level relation extraction (RE).
It integrates Graph Neural Networks (GNNs) with Large Language Models (LLMs) to generate contextually enriched support documents.
Our experiments, conducted on the CrossRE dataset, demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-30T20:48:34Z)
- Improving Neuron-level Interpretability with White-box Language Models [11.898535906016907]
We introduce a white-box transformer-like architecture named Coding RAte TransformEr (CRATE).
Our comprehensive experiments showcase significant improvements (up to 103% relative improvement) in neuron-level interpretability.
CRATE's increased interpretability comes from its enhanced ability to consistently and distinctively activate on relevant tokens.
arXiv Detail & Related papers (2024-10-21T19:12:33Z)
- Parameter-Efficient Tuning Large Language Models for Graph Representation Learning [62.26278815157628]
We introduce Graph-aware Parameter-Efficient Fine-Tuning (GPEFT), a novel approach for efficient graph representation learning.
We use a graph neural network (GNN) to encode structural information from neighboring nodes into a graph prompt.
We validate our approach through comprehensive experiments conducted on 8 different text-rich graphs, observing an average improvement of 2% in hit@1 and Mean Reciprocal Rank (MRR) in link prediction evaluations.
arXiv Detail & Related papers (2024-04-28T18:36:59Z)
- Disentangled Representation Learning with Large Language Models for Text-Attributed Graphs [57.052160123387104]
We present the Disentangled Graph-Text Learner (DGTL) model, which enhances the reasoning and prediction capabilities of LLMs for text-attributed graphs (TAGs).
Our proposed DGTL model incorporates graph structure information through tailored disentangled graph neural network (GNN) layers.
Experimental evaluations demonstrate the effectiveness of the proposed DGTL model on achieving superior or comparable performance over state-of-the-art baselines.
arXiv Detail & Related papers (2023-10-27T14:00:04Z)
- Graph Neural Networks Provably Benefit from Structural Information: A Feature Learning Perspective [53.999128831324576]
Graph neural networks (GNNs) have pioneered advancements in graph representation learning.
This study investigates the role of graph convolution within the context of feature learning theory.
arXiv Detail & Related papers (2023-06-24T10:21:11Z)
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning in which the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- Leveraging Graph-based Cross-modal Information Fusion for Neural Sign Language Translation [46.825957917649795]
Sign Language (SL), as the mother tongue of the deaf community, is a special visual language that most hearing people cannot understand.
We propose a novel neural SLT model with multi-modal feature fusion based on the dynamic graph.
We are the first to introduce graph neural networks for fusing multi-modal information into neural sign language translation models.
arXiv Detail & Related papers (2022-11-01T15:26:22Z)
- A semantic hierarchical graph neural network for text classification [1.439766998338892]
We propose a new hierarchical graph neural network (HieGNN) which extracts information at the word, sentence, and document levels, respectively.
Experimental results on several benchmark datasets show better or comparable performance relative to several baseline methods.
arXiv Detail & Related papers (2022-09-15T03:59:31Z)
- A Unified Understanding of Deep NLP Models for Text Classification [88.35418976241057]
We have developed a visual analysis tool, DeepNLPVis, to enable a unified understanding of NLP models for text classification.
The key idea is a mutual information-based measure, which provides quantitative explanations on how each layer of a model maintains the information of input words in a sample.
A multi-level visualization, which consists of a corpus-level, a sample-level, and a word-level visualization, supports the analysis from the overall training set to individual samples.
arXiv Detail & Related papers (2022-06-19T08:55:07Z)
- Graph Neural Networks for Natural Language Processing: A Survey [64.36633422999905]
We present a comprehensive overview of Graph Neural Networks (GNNs) for Natural Language Processing.
We propose a new taxonomy of GNNs for NLP, which organizes existing research of GNNs for NLP along three axes: graph construction, graph representation learning, and graph-based encoder-decoder models.
arXiv Detail & Related papers (2021-06-10T23:59:26Z)
- How transfer learning impacts linguistic knowledge in deep NLP models? [22.035813865470956]
Deep NLP models learn a non-trivial amount of linguistic knowledge, captured at different layers of the model.
We investigate how fine-tuning towards downstream NLP tasks impacts the learned linguistic knowledge.
arXiv Detail & Related papers (2021-05-31T17:43:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.