BertGCN: Transductive Text Classification by Combining GCN and BERT
- URL: http://arxiv.org/abs/2105.05727v2
- Date: Thu, 13 May 2021 11:32:18 GMT
- Title: BertGCN: Transductive Text Classification by Combining GCN and BERT
- Authors: Yuxiao Lin, Yuxian Meng, Xiaofei Sun, Qinghong Han, Kun Kuang, Jiwei Li and Fei Wu
- Abstract summary: BertGCN is a model that combines large-scale pretraining and transductive learning for text classification.
BertGCN achieves SOTA performance on a wide range of text classification datasets.
- Score: 33.866453485862124
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In this work, we propose BertGCN, a model that combines large-scale
pretraining and transductive learning for text classification. BertGCN
constructs a heterogeneous graph over the dataset and represents documents as
nodes using BERT representations. By jointly training the BERT and GCN modules
within BertGCN, the proposed model is able to leverage the advantages of both
worlds: large-scale pretraining, which takes advantage of the massive amount of
raw data, and transductive learning, which jointly learns representations for
both training and unlabeled test data by propagating label influence through
graph convolution. Experiments show that BertGCN achieves SOTA performance on a
wide range of text classification datasets. Code is available
at https://github.com/ZeroRin/BertGCN.
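The abstract describes the core mechanism: document nodes carry BERT embeddings, a GCN propagates them over the dataset graph, and the two modules are trained jointly with their predictions interpolated. Below is a minimal PyTorch sketch of that idea; the encoder and GCN interfaces, the `graph` object, and the interpolation weight `lam` are illustrative assumptions, not the repository's actual API.

```python
# Minimal sketch of the BertGCN idea, assuming a HuggingFace-style encoder
# and a generic GCN over the heterogeneous document/word graph. The `graph`
# object and the weight `lam` are illustrative, not the paper's exact API.
import torch
import torch.nn as nn

class BertGCNSketch(nn.Module):
    def __init__(self, bert, gcn, hidden_size, num_classes, lam=0.7):
        super().__init__()
        self.bert = bert              # BERT-style encoder
        self.gcn = gcn                # GCN over the dataset graph
        self.classifier = nn.Linear(hidden_size, num_classes)
        self.lam = lam                # GCN vs. BERT prediction trade-off

    def forward(self, input_ids, attention_mask, graph, doc_node_ids):
        # The [CLS] embedding feeds the BERT classifier and doubles as the
        # feature vector of the corresponding document node in the graph.
        cls = self.bert(input_ids, attention_mask=attention_mask)[0][:, 0]
        pred_bert = torch.softmax(self.classifier(cls), dim=-1)

        graph.node_features[doc_node_ids] = cls   # refresh doc-node features
        pred_gcn = torch.softmax(self.gcn(graph)[doc_node_ids], dim=-1)

        # Jointly trained; the final prediction interpolates the two modules.
        return self.lam * pred_gcn + (1 - self.lam) * pred_bert
```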
Related papers
- You do not have to train Graph Neural Networks at all on text-attributed graphs [25.044734252779975]
We introduce TrainlessGNN, a linear GNN model capitalizing on the observation that text encodings from the same class often cluster together in a linear subspace.
Our experiments reveal that these trainless models can match or even surpass their conventionally trained counterparts.
arXiv Detail & Related papers (2024-04-17T02:52:11Z)
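A hedged illustration of the observation the entry above builds on: if text encodings of a class cluster in a linear subspace, a linear classifier can be written down from class statistics with no gradient training. The helper functions below are hypothetical and simplify TrainlessGNN's actual construction.

```python
# Sketch of the trainless idea: one weight vector per class, taken as the
# mean text embedding of that class's labeled nodes; no training loop.
import numpy as np

def trainless_class_means(features, labels, num_classes):
    # features: (N, D) text encodings; labels: (N,) labels of labeled nodes.
    return np.stack([features[labels == c].mean(axis=0)
                     for c in range(num_classes)])

def predict(features, class_means):
    # Score every node against each class mean and pick the best match.
    return (features @ class_means.T).argmax(axis=1)
```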
- Article Classification with Graph Neural Networks and Multigraphs [0.12499537119440243]
We propose a method to enhance the performance of article classification by enriching simple Graph Neural Network (GNN) pipelines with multi-graph representations.
Fully supervised transductive node classification experiments are conducted on the Open Graph Benchmark OGBN-arXiv dataset and the PubMed diabetes dataset.
Results demonstrate that multi-graphs consistently improve the performance of a variety of GNN models compared to the default graphs.
arXiv Detail & Related papers (2023-09-20T14:18:04Z)
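A sketch of the multi-graph enrichment idea in the entry above: combine the default graph with an additional edge set, here a hypothetical kNN graph over text embeddings. The specific graph constructions in the paper may differ.

```python
# Sketch: enrich a default citation graph with a second "view" built from
# text similarity, so the GNN aggregates over the union of edge sets.
# The kNN construction is an assumption for illustration.
from sklearn.neighbors import kneighbors_graph

def build_multigraph(adj_citation, text_embeddings, k=5):
    # adj_citation: scipy sparse adjacency of the default graph.
    adj_knn = kneighbors_graph(text_embeddings, k, mode="connectivity")
    adj_knn = adj_knn.maximum(adj_knn.T)      # symmetrize the kNN edges
    return adj_citation.maximum(adj_knn)      # union of the two edge sets
```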
- A Robust Stacking Framework for Training Deep Graph Models with Multifaceted Node Features [61.92791503017341]
Graph Neural Networks (GNNs) with numerical node features and graph structure as inputs have demonstrated superior performance on various supervised learning tasks with graph data.
However, the best models for such features in standard supervised learning settings with IID (non-graph) data are not easily incorporated into a GNN.
Here we propose a robust stacking framework that fuses graph-aware propagation with arbitrary models intended for IID data.
arXiv Detail & Related papers (2022-06-16T22:46:33Z)
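The stacking entry above fuses IID models with graph-aware propagation; here is a minimal sketch assuming an sklearn-style base model and a row-normalized adjacency matrix. The paper's framework adds stacked layers and cross-validated meta-features that simple averaging does not capture.

```python
# Minimal sketch: fit any tabular model on node features, then smooth its
# class probabilities over the graph.

def propagate(probs, adj_norm, hops=2):
    # adj_norm: row-normalized adjacency (dense array or scipy sparse).
    for _ in range(hops):
        probs = adj_norm @ probs      # average neighboring predictions
    return probs

def graph_aware_stack(base_model, X, adj_norm, hops=2):
    probs = base_model.predict_proba(X)   # any sklearn-style IID model
    return propagate(probs, adj_norm, hops)
```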
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity at modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- BERT4GCN: Using BERT Intermediate Layers to Augment GCN for Aspect-based Sentiment Classification [2.982218441172364]
Graph-based Aspect-based Sentiment Classification (ABSC) approaches have yielded state-of-the-art results, especially when equipped with contextual word embeddings from pre-trained language models (PLMs).
We propose a novel model, BERT4GCN, which integrates the grammatical sequential features from the PLM of BERT and the syntactic knowledge from dependency graphs.
arXiv Detail & Related papers (2021-10-01T02:03:43Z)
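A sketch of the BERT4GCN entry above: intermediate BERT hidden states act as node features for a GCN over the sentence's dependency graph. The layer indices and the single-layer GCN are illustrative assumptions, and a HuggingFace-style encoder is assumed.

```python
# Sketch: pull hidden states from intermediate BERT layers and run a graph
# convolution over the dependency-graph adjacency.
import torch

def bert_intermediate_features(bert, input_ids, attention_mask,
                               layers=(4, 8, 12)):
    out = bert(input_ids, attention_mask=attention_mask,
               output_hidden_states=True)
    # One token-feature matrix per selected intermediate layer.
    return [out.hidden_states[i] for i in layers]

def gcn_layer(h, adj_dep, weight):
    # h: (num_tokens, dim) features; adj_dep: normalized dependency adjacency.
    return torch.relu(adj_dep @ h @ weight)
```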
- On the Equivalence of Decoupled Graph Convolution Network and Label Propagation [60.34028546202372]
Some work shows that coupling is inferior to decoupling, which supports deep graph propagation better.
Despite its effectiveness, the working mechanisms of the decoupled GCN are not well understood.
We propose a new label propagation method named Propagation then Training Adaptively (PTA), which overcomes the flaws of the decoupled GCN.
arXiv Detail & Related papers (2020-10-23T13:57:39Z)
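For context, a minimal sketch of the decoupled design the entry above discusses: feature transformation (a graph-free MLP) is separated from propagation, which is applied afterwards. PTA's adaptive label weighting is intentionally not reproduced here.

```python
# Decoupled GCN sketch: train an MLP on node features alone, then propagate
# its softened predictions over a dense, row-normalized adjacency.
import torch

def decoupled_predict(mlp, features, adj_norm, hops=10):
    probs = torch.softmax(mlp(features), dim=-1)   # graph-free prediction
    for _ in range(hops):
        probs = adj_norm @ probs                   # decoupled propagation
    return probs
```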
- DeeperGCN: All You Need to Train Deeper GCNs [66.64739331859226]
Graph Convolutional Networks (GCNs) have been drawing significant attention with the power of representation learning on graphs.
Unlike Convolutional Neural Networks (CNNs), which are able to take advantage of stacking very deep layers, GCNs suffer from vanishing gradient, over-smoothing and over-fitting issues when going deeper.
This paper proposes DeeperGCN that is capable of successfully and reliably training very deep GCNs.
arXiv Detail & Related papers (2020-06-13T23:00:22Z)
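One ingredient behind training very deep GCNs, sketched under the assumption of pre-activation residual blocks; DeeperGCN also introduces generalized aggregation functions not shown here.

```python
# Sketch: a pre-activation residual GCN block, the kind of skip connection
# that mitigates vanishing gradients and over-smoothing when stacking many
# layers. DeeperGCN's generalized aggregators are omitted for brevity.
import torch
import torch.nn as nn

class ResidualGCNBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.lin = nn.Linear(dim, dim)

    def forward(self, h, adj_norm):
        # Normalize and transform first, aggregate over the graph, then add
        # the skip connection so depth does not wash out node features.
        return h + adj_norm @ self.lin(torch.relu(self.norm(h)))
```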
- VGCN-BERT: Augmenting BERT with Graph Embedding for Text Classification [21.96079052962283]
The VGCN-BERT model combines the capability of BERT with a Vocabulary Graph Convolutional Network (VGCN).
In our experiments on several text classification datasets, our approach outperforms BERT and GCN alone.
arXiv Detail & Related papers (2020-04-12T22:02:33Z)
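A hedged sketch of the vocabulary-graph idea in the entry above: one graph convolution over a word-word graph (e.g., PMI-based) yields graph-aware word vectors that can then be fused into BERT's input embeddings. The adjacency construction and the fusion step are assumptions simplified from the paper's design.

```python
# Sketch: a single graph convolution over a fixed vocabulary graph.
import torch

def vocab_gcn(vocab_adj, embedding_table, weight):
    # vocab_adj: (V, V) word-word graph; embedding_table: (V, D) embeddings.
    # Returns graph-informed word vectors of shape (V, D').
    return torch.relu(vocab_adj @ embedding_table @ weight)
```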
- Distilling Knowledge from Graph Convolutional Networks [146.71503336770886]
Existing knowledge distillation methods focus on convolutional neural networks (CNNs).
We propose the first dedicated approach to distilling knowledge from a pre-trained graph convolutional network (GCN) model.
We show that our method achieves the state-of-the-art knowledge distillation performance for GCN models.
arXiv Detail & Related papers (2020-03-23T18:23:11Z)
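For context, a generic knowledge-distillation objective with temperature-softened logits; the paper's dedicated GCN method additionally preserves local graph structure, which this sketch does not capture.

```python
# Generic distillation sketch: the student matches the teacher GCN's
# temperature-softened output distribution.
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, T=2.0):
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    # Scale by T^2 to keep gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T
```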
- Incorporating BERT into Neural Machine Translation [251.54280200353674]
We propose a new algorithm named BERT-fused model, in which we first use BERT to extract representations for an input sequence.
We conduct experiments on supervised (including sentence-level and document-level translations), semi-supervised and unsupervised machine translation, and achieve state-of-the-art results on seven benchmark datasets.
arXiv Detail & Related papers (2020-02-17T08:13:36Z)
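A minimal sketch of the BERT-fused idea above: each layer runs its usual self-attention plus an extra attention over BERT's representations of the source sentence, and the two outputs are averaged. Layer normalization and the paper's drop-net trick are omitted, and the module shapes are illustrative.

```python
# Sketch of a BERT-fused encoder layer: self-attention and attention over
# frozen BERT states are averaged before the feed-forward block.
import torch.nn as nn

class BertFusedLayer(nn.Module):
    def __init__(self, dim, heads=8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.bert_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.ReLU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, x, bert_states):
        h, _ = self.self_attn(x, x, x)                       # usual attention
        b, _ = self.bert_attn(x, bert_states, bert_states)   # attend to BERT
        x = x + 0.5 * (h + b)        # average the two attention branches
        return x + self.ffn(x)
```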
This list is automatically generated from the titles and abstracts of the papers on this site.