BertGCN: Transductive Text Classification by Combining GCN and BERT
- URL: http://arxiv.org/abs/2105.05727v2
- Date: Thu, 13 May 2021 11:32:18 GMT
- Title: BertGCN: Transductive Text Classification by Combining GCN and BERT
- Authors: Yuxiao Lin, Yuxian Meng, Xiaofei Sun, Qinghong Han, Kun Kuang, Jiwei Li and Fei Wu
- Abstract summary: BertGCN is a model that combines large-scale pretraining and transductive learning for text classification.
BertGCN achieves SOTA performance on a wide range of text classification datasets.
- Score: 33.866453485862124
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this work, we propose BertGCN, a model that combines large-scale
pretraining and transductive learning for text classification. BertGCN
constructs a heterogeneous graph over the dataset and represents documents as
nodes using BERT representations. By jointly training the BERT and GCN modules
within BertGCN, the proposed model is able to leverage the advantages of both
worlds: large-scale pretraining, which takes advantage of the massive amount of
raw data, and transductive learning, which jointly learns representations for
both training and unlabeled test data by propagating label influence through
graph convolution. Experiments show that BertGCN achieves SOTA performance on a
wide range of text classification datasets. Code is available
at https://github.com/ZeroRin/BertGCN.
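The abstract describes the core mechanism: document nodes carry BERT embeddings, a GCN propagates them over the dataset graph, and the two modules are trained jointly with their predictions interpolated. Below is a minimal PyTorch sketch of that idea; the encoder and GCN interfaces, the `graph` object, and the interpolation weight `lam` are illustrative assumptions, not the repository's actual API.

```python
# Minimal sketch of the BertGCN idea, assuming a HuggingFace-style encoder
# and a generic GCN over the heterogeneous document/word graph. The `graph`
# object and the weight `lam` are illustrative, not the paper's exact API.
import torch
import torch.nn as nn

class BertGCNSketch(nn.Module):
    def __init__(self, bert, gcn, hidden_size, num_classes, lam=0.7):
        super().__init__()
        self.bert = bert              # BERT-style encoder
        self.gcn = gcn                # GCN over the dataset graph
        self.classifier = nn.Linear(hidden_size, num_classes)
        self.lam = lam                # GCN vs. BERT prediction trade-off

    def forward(self, input_ids, attention_mask, graph, doc_node_ids):
        # The [CLS] embedding feeds the BERT classifier and doubles as the
        # feature vector of the corresponding document node in the graph.
        cls = self.bert(input_ids, attention_mask=attention_mask)[0][:, 0]
        pred_bert = torch.softmax(self.classifier(cls), dim=-1)

        graph.node_features[doc_node_ids] = cls   # refresh doc-node features
        pred_gcn = torch.softmax(self.gcn(graph)[doc_node_ids], dim=-1)

        # Jointly trained; the final prediction interpolates the two modules.
        return self.lam * pred_gcn + (1 - self.lam) * pred_bert
```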
Related papers
- You do not have to train Graph Neural Networks at all on text-attributed graphs [25.044734252779975]
We introduce TrainlessGNN, a linear GNN model capitalizing on the observation that text encodings from the same class often cluster together in a linear subspace.
Our experiments reveal that these trainless models can match or even surpass their conventionally trained counterparts.
arXiv Detail & Related papers (2024-04-17T02:52:11Z)
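A hedged illustration of the observation the entry above builds on: if text encodings of a class cluster in a linear subspace, a linear classifier can be written down from class statistics with no gradient training. The helper functions below are hypothetical and simplify TrainlessGNN's actual construction.

```python
# Sketch of the trainless idea: one weight vector per class, taken as the
# mean text embedding of that class's labeled nodes; no training loop.
import numpy as np

def trainless_class_means(features, labels, num_classes):
    # features: (N, D) text encodings; labels: (N,) labels of labeled nodes.
    return np.stack([features[labels == c].mean(axis=0)
                     for c in range(num_classes)])

def predict(features, class_means):
    # Score every node against each class mean and pick the best match.
    return (features @ class_means.T).argmax(axis=1)
```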
- Article Classification with Graph Neural Networks and Multigraphs [0.12499537119440243]
We propose a method to enhance the performance of article classification by enriching simple Graph Neural Network (GNN) pipelines with multi-graph representations.
Fully supervised transductive node classification experiments are conducted on the Open Graph Benchmark OGBN-arXiv dataset and the PubMed diabetes dataset.
Results demonstrate that multi-graphs consistently improve the performance of a variety of GNN models compared to the default graphs.
arXiv Detail & Related papers (2023-09-20T14:18:04Z)
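A sketch of the multi-graph enrichment idea in the entry above: combine the default graph with an additional edge set, here a hypothetical kNN graph over text embeddings. The specific graph constructions in the paper may differ.

```python
# Sketch: enrich a default citation graph with a second "view" built from
# text similarity, so the GNN aggregates over the union of edge sets.
# The kNN construction is an assumption for illustration.
from sklearn.neighbors import kneighbors_graph

def build_multigraph(adj_citation, text_embeddings, k=5):
    # adj_citation: scipy sparse adjacency of the default graph.
    adj_knn = kneighbors_graph(text_embeddings, k, mode="connectivity")
    adj_knn = adj_knn.maximum(adj_knn.T)      # symmetrize the kNN edges
    return adj_citation.maximum(adj_knn)      # union of the two edge sets
```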
- A Robust Stacking Framework for Training Deep Graph Models with Multifaceted Node Features [61.92791503017341]
Graph Neural Networks (GNNs) with numerical node features and graph structure as inputs have demonstrated superior performance on various supervised learning tasks with graph data.
However, the best models for such features in standard supervised learning settings with IID (non-graph) data are not easily incorporated into a GNN.
Here we propose a robust stacking framework that fuses graph-aware propagation with arbitrary models intended for IID data.
arXiv Detail & Related papers (2022-06-16T22:46:33Z)
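The stacking entry above fuses IID models with graph-aware propagation; here is a minimal sketch assuming an sklearn-style base model and a row-normalized adjacency matrix. The paper's framework adds stacked layers and cross-validated meta-features that simple averaging does not capture.

```python
# Minimal sketch: fit any tabular model on node features, then smooth its
# class probabilities over the graph.

def propagate(probs, adj_norm, hops=2):
    # adj_norm: row-normalized adjacency (dense array or scipy sparse).
    for _ in range(hops):
        probs = adj_norm @ probs      # average neighboring predictions
    return probs

def graph_aware_stack(base_model, X, adj_norm, hops=2):
    probs = base_model.predict_proba(X)   # any sklearn-style IID model
    return propagate(probs, adj_norm, hops)
```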
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity at modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- BERT4GCN: Using BERT Intermediate Layers to Augment GCN for Aspect-based Sentiment Classification [2.982218441172364]
Graph-based Aspect-based Sentiment Classification (ABSC) approaches have yielded state-of-the-art results, especially when equipped with contextual word embeddings from pre-trained language models (PLMs).
We propose a novel model, BERT4GCN, which integrates the grammatical sequential features from the PLM of BERT and the syntactic knowledge from dependency graphs.
arXiv Detail & Related papers (2021-10-01T02:03:43Z)
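A sketch of the BERT4GCN entry above: intermediate BERT hidden states act as node features for a GCN over the sentence's dependency graph. The layer indices and the single-layer GCN are illustrative assumptions, and a HuggingFace-style encoder is assumed.

```python
# Sketch: pull hidden states from intermediate BERT layers and run a graph
# convolution over the dependency-graph adjacency.
import torch

def bert_intermediate_features(bert, input_ids, attention_mask,
                               layers=(4, 8, 12)):
    out = bert(input_ids, attention_mask=attention_mask,
               output_hidden_states=True)
    # One token-feature matrix per selected intermediate layer.
    return [out.hidden_states[i] for i in layers]

def gcn_layer(h, adj_dep, weight):
    # h: (num_tokens, dim) features; adj_dep: normalized dependency adjacency.
    return torch.relu(adj_dep @ h @ weight)
```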
- On the Equivalence of Decoupled Graph Convolution Network and Label Propagation [60.34028546202372]
Some work shows that coupling is inferior to decoupling, which supports deep graph propagation better.
Despite its effectiveness, the working mechanisms of the decoupled GCN are not well understood.
We propose a new label propagation method named Propagation then Training Adaptively (PTA), which overcomes the flaws of the decoupled GCN.
arXiv Detail & Related papers (2020-10-23T13:57:39Z)
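For context, a minimal sketch of the decoupled design the entry above discusses: feature transformation (a graph-free MLP) is separated from propagation, which is applied afterwards. PTA's adaptive label weighting is intentionally not reproduced here.

```python
# Decoupled GCN sketch: train an MLP on node features alone, then propagate
# its softened predictions over a dense, row-normalized adjacency.
import torch

def decoupled_predict(mlp, features, adj_norm, hops=10):
    probs = torch.softmax(mlp(features), dim=-1)   # graph-free prediction
    for _ in range(hops):
        probs = adj_norm @ probs                   # decoupled propagation
    return probs
```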
- DeeperGCN: All You Need to Train Deeper GCNs [66.64739331859226]
Graph Convolutional Networks (GCNs) have been drawing significant attention with the power of representation learning on graphs.
Unlike Convolutional Neural Networks (CNNs), which are able to take advantage of stacking very deep layers, GCNs suffer from vanishing gradient, over-smoothing and over-fitting issues when going deeper.
This paper proposes DeeperGCN that is capable of successfully and reliably training very deep GCNs.
arXiv Detail & Related papers (2020-06-13T23:00:22Z)
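One ingredient behind training very deep GCNs, sketched under the assumption of pre-activation residual blocks; DeeperGCN also introduces generalized aggregation functions not shown here.

```python
# Sketch: a pre-activation residual GCN block, the kind of skip connection
# that mitigates vanishing gradients and over-smoothing when stacking many
# layers. DeeperGCN's generalized aggregators are omitted for brevity.
import torch
import torch.nn as nn

class ResidualGCNBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.lin = nn.Linear(dim, dim)

    def forward(self, h, adj_norm):
        # Normalize and transform first, aggregate over the graph, then add
        # the skip connection so depth does not wash out node features.
        return h + adj_norm @ self.lin(torch.relu(self.norm(h)))
```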
- VGCN-BERT: Augmenting BERT with Graph Embedding for Text Classification [21.96079052962283]
The VGCN-BERT model combines the capability of BERT with a Vocabulary Graph Convolutional Network (VGCN).
In our experiments on several text classification datasets, our approach outperforms BERT and GCN alone.
arXiv Detail & Related papers (2020-04-12T22:02:33Z)
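A hedged sketch of the vocabulary-graph idea in the entry above: one graph convolution over a word-word graph (e.g., PMI-based) yields graph-aware word vectors that can then be fused into BERT's input embeddings. The adjacency construction and the fusion step are assumptions simplified from the paper's design.

```python
# Sketch: a single graph convolution over a fixed vocabulary graph.
import torch

def vocab_gcn(vocab_adj, embedding_table, weight):
    # vocab_adj: (V, V) word-word graph; embedding_table: (V, D) embeddings.
    # Returns graph-informed word vectors of shape (V, D').
    return torch.relu(vocab_adj @ embedding_table @ weight)
```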
- Distilling Knowledge from Graph Convolutional Networks [146.71503336770886]
Existing knowledge distillation methods focus on convolutional neural networks (CNNs).
We propose the first dedicated approach to distilling knowledge from a pre-trained graph convolutional network (GCN) model.
We show that our method achieves the state-of-the-art knowledge distillation performance for GCN models.
arXiv Detail & Related papers (2020-03-23T18:23:11Z)
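For context, a generic knowledge-distillation objective with temperature-softened logits; the paper's dedicated GCN method additionally preserves local graph structure, which this sketch does not capture.

```python
# Generic distillation sketch: the student matches the teacher GCN's
# temperature-softened output distribution.
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, T=2.0):
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    # Scale by T^2 to keep gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T
```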
- Incorporating BERT into Neural Machine Translation [251.54280200353674]
We propose a new algorithm named BERT-fused model, in which we first use BERT to extract representations for an input sequence.
We conduct experiments on supervised (including sentence-level and document-level translations), semi-supervised and unsupervised machine translation, and achieve state-of-the-art results on seven benchmark datasets.
arXiv Detail & Related papers (2020-02-17T08:13:36Z)
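A minimal sketch of the BERT-fused idea above: each layer runs its usual self-attention plus an extra attention over BERT's representations of the source sentence, and the two outputs are averaged. Layer normalization and the paper's drop-net trick are omitted, and the module shapes are illustrative.

```python
# Sketch of a BERT-fused encoder layer: self-attention and attention over
# frozen BERT states are averaged before the feed-forward block.
import torch.nn as nn

class BertFusedLayer(nn.Module):
    def __init__(self, dim, heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.bert_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, x, bert_states):
        h, _ = self.self_attn(x, x, x)                       # usual attention
        b, _ = self.bert_attn(x, bert_states, bert_states)   # attend to BERT
        x = x + 0.5 * (h + b)        # average the two attention branches
        return x + self.ffn(x)
```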
This list is automatically generated from the titles and abstracts of the papers on this site.