Token-Level Graphs for Short Text Classification
- URL: http://arxiv.org/abs/2412.12754v1
- Date: Tue, 17 Dec 2024 10:19:44 GMT
- Title: Token-Level Graphs for Short Text Classification
- Authors: Gregor Donabauer, Udo Kruschwitz,
- Abstract summary: We propose an approach which constructs text graphs entirely based on tokens obtained through pre-trained language models (PLMs)
Our method captures contextual and semantic information, overcomes vocabulary constraints, and allows for context-dependent word meanings.
Experimental results demonstrate how our method consistently achieves higher scores or on-par performance with existing methods.
- Score: 1.6819960041696331
- License:
- Abstract: The classification of short texts is a common subtask in Information Retrieval (IR). Recent advances in graph machine learning have led to interest in graph-based approaches for low resource scenarios, showing promise in such settings. However, existing methods face limitations such as not accounting for different meanings of the same words or constraints from transductive approaches. We propose an approach which constructs text graphs entirely based on tokens obtained through pre-trained language models (PLMs). By applying a PLM to tokenize and embed the texts when creating the graph(-nodes), our method captures contextual and semantic information, overcomes vocabulary constraints, and allows for context-dependent word meanings. Our approach also makes classification more efficient with reduced parameters compared to classical PLM fine-tuning, resulting in more robust training with few samples. Experimental results demonstrate how our method consistently achieves higher scores or on-par performance with existing methods, presenting an advancement in graph-based text classification techniques. To support reproducibility of our work we make all implementations publicly available to the community\footnote{\url{https://github.com/doGregor/TokenGraph}}.
Related papers
- Graph-based Retrieval Augmented Generation for Dynamic Few-shot Text Classification [15.0627807767152]
We propose a graph-based online retrieval-augmented generation framework, namely GORAG, for dynamic few-shot text classification.
GORAG constructs and maintains a weighted graph by extracting side information across all target texts.
Empirical evaluations demonstrate that GORAG outperforms existing approaches by providing more comprehensive and precise contextual information.
arXiv Detail & Related papers (2025-01-06T08:43:31Z) - Instance-Aware Graph Prompt Learning [71.26108600288308]
We introduce Instance-Aware Graph Prompt Learning (IA-GPL) in this paper.
The process involves generating intermediate prompts for each instance using a lightweight architecture.
Experiments conducted on multiple datasets and settings showcase the superior performance of IA-GPL compared to state-of-the-art baselines.
arXiv Detail & Related papers (2024-11-26T18:38:38Z) - Node Level Graph Autoencoder: Unified Pretraining for Textual Graph Learning [45.70767623846523]
We propose a novel unified unsupervised learning autoencoder framework, named Node Level Graph AutoEncoder (NodeGAE)
We employ language models as the backbone of the autoencoder, with pretraining on text reconstruction.
Our method maintains simplicity in the training process and demonstrates generalizability across diverse textual graphs and downstream tasks.
arXiv Detail & Related papers (2024-08-09T14:57:53Z) - Pre-Training and Prompting for Few-Shot Node Classification on Text-Attributed Graphs [35.44563283531432]
Text-attributed graph (TAG) is one kind of important real-world graph-structured data with each node associated with raw texts.
For TAGs, traditional few-shot node classification methods directly conduct training on the pre-processed node features.
We propose P2TAG, a framework designed for few-shot node classification on TAGs with graph pre-training and prompting.
arXiv Detail & Related papers (2024-07-22T07:24:21Z) - Parameter-Efficient Tuning Large Language Models for Graph Representation Learning [62.26278815157628]
We introduce Graph-aware.
Efficient Fine-Tuning - GPEFT, a novel approach for efficient graph representation learning.
We use a graph neural network (GNN) to encode structural information from neighboring nodes into a graph prompt.
We validate our approach through comprehensive experiments conducted on 8 different text-rich graphs, observing an average improvement of 2% in hit@1 and Mean Reciprocal Rank (MRR) in link prediction evaluations.
arXiv Detail & Related papers (2024-04-28T18:36:59Z) - ConGraT: Self-Supervised Contrastive Pretraining for Joint Graph and Text Embeddings [20.25180279903009]
We propose Contrastive Graph-Text pretraining (ConGraT) for jointly learning separate representations of texts and nodes in a text-attributed graph (TAG)
Our method trains a language model (LM) and a graph neural network (GNN) to align their representations in a common latent space using a batch-wise contrastive learning objective inspired by CLIP.
Experiments demonstrate that ConGraT outperforms baselines on various downstream tasks, including node and text category classification, link prediction, and language modeling.
arXiv Detail & Related papers (2023-05-23T17:53:30Z) - PRODIGY: Enabling In-context Learning Over Graphs [112.19056551153454]
In-context learning is the ability of a pretrained model to adapt to novel and diverse downstream tasks.
We develop PRODIGY, the first pretraining framework that enables in-context learning over graphs.
arXiv Detail & Related papers (2023-05-21T23:16:30Z) - ChatGraph: Interpretable Text Classification by Converting ChatGPT
Knowledge to Graphs [54.48467003509595]
ChatGPT has shown superior performance in various natural language processing (NLP) tasks.
We propose a novel framework that leverages the power of ChatGPT for specific tasks, such as text classification.
Our method provides a more transparent decision-making process compared with previous text classification methods.
arXiv Detail & Related papers (2023-05-03T19:57:43Z) - Fine-Grained Visual Entailment [51.66881737644983]
We propose an extension of this task, where the goal is to predict the logical relationship of fine-grained knowledge elements within a piece of text to an image.
Unlike prior work, our method is inherently explainable and makes logical predictions at different levels of granularity.
We evaluate our method on a new dataset of manually annotated knowledge elements and show that our method achieves 68.18% accuracy at this challenging task.
arXiv Detail & Related papers (2022-03-29T16:09:38Z) - Hierarchical Heterogeneous Graph Representation Learning for Short Text
Classification [60.233529926965836]
We propose a new method called SHINE, which is based on graph neural network (GNN) for short text classification.
First, we model the short text dataset as a hierarchical heterogeneous graph consisting of word-level component graphs.
Then, we dynamically learn a short document graph that facilitates effective label propagation among similar short texts.
arXiv Detail & Related papers (2021-10-30T05:33:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.