Multi-Scale Feature Fusion and Graph Neural Network Integration for Text Classification with Large Language Models
- URL: http://arxiv.org/abs/2511.05752v1
- Date: Fri, 07 Nov 2025 22:54:26 GMT
- Title: Multi-Scale Feature Fusion and Graph Neural Network Integration for Text Classification with Large Language Models
- Authors: Xiangchen Song, Yulin Huang, Jinxu Guo, Yuchen Liu, Yaxuan Luan,
- Abstract summary: This study investigates a hybrid method for text classification that integrates deep feature extraction from large language models, multi-scale fusion through feature pyramids, and structured modeling with graph neural networks to enhance performance in complex semantic contexts.<n>The proposed method demonstrates significant advantages in robustness alignment experiments, outperforming existing models on ACC, F1-Score, AUC, and Precision, which verifies the effectiveness and stability of the framework.
- Score: 11.071281023081582
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study investigates a hybrid method for text classification that integrates deep feature extraction from large language models, multi-scale fusion through feature pyramids, and structured modeling with graph neural networks to enhance performance in complex semantic contexts. First, the large language model captures contextual dependencies and deep semantic representations of the input text, providing a rich feature foundation for subsequent modeling. Then, based on multi-level feature representations, the feature pyramid mechanism effectively integrates semantic features of different scales, balancing global information and local details to construct hierarchical semantic expressions. Furthermore, the fused features are transformed into graph representations, and graph neural networks are employed to capture latent semantic relations and logical dependencies in the text, enabling comprehensive modeling of complex interactions among semantic units. On this basis, the readout and classification modules generate the final category predictions. The proposed method demonstrates significant advantages in robustness alignment experiments, outperforming existing models on ACC, F1-Score, AUC, and Precision, which verifies the effectiveness and stability of the framework. This study not only constructs an integrated framework that balances global and local information as well as semantics and structure, but also provides a new perspective for multi-scale feature fusion and structured semantic modeling in text classification tasks.
Related papers
- Explicit Grammar Semantic Feature Fusion for Robust Text Classification [0.0]
Natural Language Processing enables computers to understand human language by analysing and classifying text efficiently.<n>Existing models capture features by learning from large corpora with transformer models, which are computationally intensive and unsuitable for resource-constrained environments.<n>Our proposed study incorporates comprehensive grammatical rules alongside semantic information to build a robust, lightweight classification model.
arXiv Detail & Related papers (2026-02-24T10:25:29Z) - Structure-Aware Decoding Mechanisms for Complex Entity Extraction with Large-Scale Language Models [8.15127799301814]
This paper proposes a structure-aware decoding method based on large language models.<n>It addresses the difficulty of maintaining both semantic integrity and structural consistency in nested and overlapping entity extraction tasks.<n> Experiments conducted on the ACE 2005 dataset demonstrate significant improvements in Accuracy, Precision, Recall, and F1-Score.
arXiv Detail & Related papers (2025-12-16T00:40:06Z) - Advancing Text Classification with Large Language Models and Neural Attention Mechanisms [11.31737492247233]
The framework includes text encoding, contextual representation modeling, attention-based enhancement, and classification prediction.<n>Results show that the proposed method outperforms existing models on all metrics.
arXiv Detail & Related papers (2025-12-10T09:18:41Z) - Multi-Modal Interpretability for Enhanced Localization in Vision-Language Models [2.984679075401059]
This paper presents the Multi-Modal Explainable Learning framework, designed to enhance the interpretability of vision-language models.<n>Our approach processes features at multiple semantic levels to capture relationships between image regions at different granularities.<n>We show that by incorporating semantic relationship information into gradient-based attribution maps, MMEL produces more focused and contextually aware visualizations.
arXiv Detail & Related papers (2025-09-17T18:18:59Z) - How Compositional Generalization and Creativity Improve as Diffusion Models are Trained [82.08869888944324]
How many samples do generative models need in order to learn composition rules?<n>What signal in the data is exploited to learn those rules?<n>We discuss connections between the hierarchical clustering mechanism we introduce here and the renormalization group in physics.
arXiv Detail & Related papers (2025-02-17T18:06:33Z) - UniDiff: Advancing Vision-Language Models with Generative and
Discriminative Learning [86.91893533388628]
This paper presents UniDiff, a unified multi-modal model that integrates image-text contrastive learning (ITC), text-conditioned image synthesis learning (IS), and reciprocal semantic consistency modeling (RSC)
UniDiff demonstrates versatility in both multi-modal understanding and generative tasks.
arXiv Detail & Related papers (2023-06-01T15:39:38Z) - Constructing Word-Context-Coupled Space Aligned with Associative
Knowledge Relations for Interpretable Language Modeling [0.0]
The black-box structure of the deep neural network in pre-trained language models seriously limits the interpretability of the language modeling process.
A Word-Context-Coupled Space (W2CSpace) is proposed by introducing the alignment processing between uninterpretable neural representation and interpretable statistical logic.
Our language model can achieve better performance and highly credible interpretable ability compared to related state-of-the-art methods.
arXiv Detail & Related papers (2023-05-19T09:26:02Z) - Topics as Entity Clusters: Entity-based Topics from Large Language Models and Graph Neural Networks [0.6486052012623045]
We propose a novel topic clustering approach using bimodal vector representations of entities.
Our approach is better suited to working with entities in comparison to state-of-the-art models.
arXiv Detail & Related papers (2023-01-06T10:54:54Z) - Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z) - TAGPRIME: A Unified Framework for Relational Structure Extraction [71.88926365652034]
TAGPRIME is a sequence tagging model that appends priming words about the information of the given condition to the input text.
With the self-attention mechanism in pre-trained language models, the priming words make the output contextualized representations contain more information about the given condition.
Extensive experiments and analyses on three different tasks that cover ten datasets across five different languages demonstrate the generality and effectiveness of TAGPRIME.
arXiv Detail & Related papers (2022-05-25T08:57:46Z) - Neural Function Modules with Sparse Arguments: A Dynamic Approach to
Integrating Information across Layers [84.57980167400513]
Neural Function Modules (NFM) aims to introduce the same structural capability into deep learning.
Most of the work in the context of feed-forward networks combining top-down and bottom-up feedback is limited to classification problems.
The key contribution of our work is to combine attention, sparsity, top-down and bottom-up feedback, in a flexible algorithm.
arXiv Detail & Related papers (2020-10-15T20:43:17Z) - S2RMs: Spatially Structured Recurrent Modules [105.0377129434636]
We take a step towards exploiting dynamic structure that are capable of simultaneously exploiting both modular andtemporal structures.
We find our models to be robust to the number of available views and better capable of generalization to novel tasks without additional training.
arXiv Detail & Related papers (2020-07-13T17:44:30Z) - Global Context-Aware Progressive Aggregation Network for Salient Object
Detection [117.943116761278]
We propose a novel network named GCPANet to integrate low-level appearance features, high-level semantic features, and global context features.
We show that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
arXiv Detail & Related papers (2020-03-02T04:26:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.