Graph Neural Network-Driven Hierarchical Mining for Complex Imbalanced Data
- URL: http://arxiv.org/abs/2502.03803v1
- Date: Thu, 06 Feb 2025 06:26:41 GMT
- Title: Graph Neural Network-Driven Hierarchical Mining for Complex Imbalanced Data
- Authors: Yijiashun Qi, Quanchao Lu, Shiyu Dou, Xiaoxuan Sun, Muqing Li, Yankaiqi Li,
- Abstract summary: This study presents a hierarchical mining framework for high-dimensional imbalanced data.
By constructing a structured graph representation of the dataset and integrating graph neural network embeddings, the proposed method effectively captures global interdependencies among samples.
Empirical evaluations across multiple experimental scenarios validate the efficacy of the proposed approach.
- Score: 0.8246494848934447
- License:
- Abstract: This study presents a hierarchical mining framework for high-dimensional imbalanced data, leveraging a deep graph model to address the inherent performance limitations of conventional approaches in handling complex, high-dimensional data distributions with imbalanced sample representations. By constructing a structured graph representation of the dataset and integrating graph neural network (GNN) embeddings, the proposed method effectively captures global interdependencies among samples. Furthermore, a hierarchical strategy is employed to enhance the characterization and extraction of minority class feature patterns, thereby facilitating precise and robust imbalanced data mining. Empirical evaluations across multiple experimental scenarios validate the efficacy of the proposed approach, demonstrating substantial improvements over traditional methods in key performance metrics, including pattern discovery count, average support, and minority class coverage. Notably, the method exhibits superior capabilities in minority-class feature extraction and pattern correlation analysis. These findings underscore the potential of deep graph models, in conjunction with hierarchical mining strategies, to significantly enhance the efficiency and accuracy of imbalanced data analysis. This research contributes a novel computational framework for high-dimensional complex data processing and lays the foundation for future extensions to dynamically evolving imbalanced data and multi-modal data applications, thereby expanding the applicability of advanced data mining methodologies to more intricate analytical domains.
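A minimal sketch of the pipeline described in the abstract (not the authors' code), assuming a k-NN sample graph for the structured graph representation, a single normalized-adjacency propagation step in place of the trained GNN embedding, and agglomerative clustering of minority-class embeddings as a stand-in for the hierarchical pattern extraction:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import kneighbors_graph
from sklearn.cluster import AgglomerativeClustering

# Imbalanced toy data: roughly 5% minority class.
X, y = make_classification(n_samples=1000, n_features=30, weights=[0.95, 0.05],
                           random_state=0)

# 1) Structured graph over samples: symmetric k-NN adjacency with self-loops.
A = kneighbors_graph(X, n_neighbors=10, mode="connectivity").toarray()
A = np.maximum(A, A.T) + np.eye(len(X))

# 2) GNN-style embedding: one propagation step H = D^-1/2 A D^-1/2 X,
#    so each sample aggregates features from its graph neighborhood.
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
H = (d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]) @ X

# 3) Hierarchical extraction of minority-class patterns: cluster the minority
#    embeddings and report cluster centroids as candidate feature patterns.
H_min = H[y == 1]
labels = AgglomerativeClustering(n_clusters=3).fit_predict(H_min)
patterns = np.vstack([H_min[labels == c].mean(axis=0) for c in range(3)])
print("minority samples:", len(H_min), "| pattern prototypes:", patterns.shape)
```

Here the cluster centroids serve only as crude proxies for the minority-class feature patterns whose count, support, and coverage the abstract evaluates.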
Related papers
- ACTGNN: Assessment of Clustering Tendency with Synthetically-Trained Graph Neural Networks [4.668678950572517]
ACTGNN is a graph-based framework designed to assess clustering tendency by leveraging graph representations of data.
A Graph Neural Network (GNN) is trained exclusively on synthetic datasets, enabling robust learning of clustering structures.
Our results highlight the generalizability and effectiveness of the proposed approach, making it a promising tool for robust clustering tendency assessment.
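A rough sketch of the train-on-synthetic-data idea summarized above; a logistic regression over simple k-NN-graph statistics is used as a deliberately simplified stand-in for the GNN, and all names and parameters are illustrative assumptions:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neighbors import kneighbors_graph
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def graph_features(X, k=10):
    """Summary statistics of the k-NN distance graph, used as a cheap graph signature."""
    D = kneighbors_graph(X, n_neighbors=k, mode="distance").toarray()
    d = D[D > 0]
    return [d.mean(), d.std(), np.median(d)]

def synthetic_example(clustered):
    # Clustered vs. structure-free synthetic datasets form the training labels.
    if clustered:
        X, _ = make_blobs(n_samples=200, centers=4, cluster_std=1.0,
                          random_state=int(rng.integers(10**6)))
    else:
        X = rng.uniform(-10, 10, size=(200, 2))
    return graph_features(X), int(clustered)

# Train exclusively on synthetic graphs, echoing the summarized approach.
data = [synthetic_example(bool(i % 2)) for i in range(200)]
F, y = np.array([f for f, _ in data]), np.array([l for _, l in data])
clf = LogisticRegression().fit(F, y)

# Apply to a new dataset to assess its clustering tendency.
X_new, _ = make_blobs(n_samples=200, centers=3, random_state=7)
print("clustering-tendency score:", clf.predict_proba([graph_features(X_new)])[0, 1])
```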
arXiv Detail & Related papers (2025-01-30T03:31:26Z)
- Enhancing Few-Shot Learning with Integrated Data and GAN Model Approaches [35.431340001608476]
This paper presents an innovative approach to enhancing few-shot learning by integrating data augmentation with model fine-tuning.
It aims to tackle the challenges posed by small-sample data in fields such as drug discovery, target recognition, and malicious traffic detection.
Results confirm that the MhERGAN algorithm developed in this research is highly effective for few-shot learning.
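A heavily simplified sketch of the augment-then-fine-tune recipe summarized above; a class-conditional Gaussian sampler is used as a stand-in for the MhERGAN generator, so this illustrates only the workflow, not the model:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def gaussian_augment(X, y, per_class=50):
    """Fit a diagonal Gaussian per class and sample extra training points."""
    Xs, ys = [X], [y]
    for c in np.unique(y):
        Xc = X[y == c]
        samples = rng.normal(Xc.mean(axis=0), Xc.std(axis=0) + 1e-6,
                             size=(per_class, X.shape[1]))
        Xs.append(samples)
        ys.append(np.full(per_class, c))
    return np.vstack(Xs), np.concatenate(ys)

# Tiny "few-shot" training set: 5 examples per class.
X_few = np.vstack([rng.normal(0, 1, (5, 10)), rng.normal(2, 1, (5, 10))])
y_few = np.array([0] * 5 + [1] * 5)
X_aug, y_aug = gaussian_augment(X_few, y_few)
clf = LogisticRegression().fit(X_aug, y_aug)   # fine-tune on the augmented set
print("train size after augmentation:", len(X_aug))
```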
arXiv Detail & Related papers (2024-11-25T16:51:11Z)
- Tackling Diverse Minorities in Imbalanced Classification [80.78227787608714]
Imbalanced datasets are commonly observed in various real-world applications, presenting significant challenges in training classifiers.
We propose generating synthetic samples iteratively by mixing data samples from both minority and majority classes.
We demonstrate the effectiveness of our proposed framework through extensive experiments conducted on seven publicly available benchmark datasets.
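A minimal sketch of the minority-majority mixing idea as summarized; the sampling scheme and mixing weights here are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def mix_oversample(X, y, minority_label=1, n_new=100, alpha=0.8, seed=0):
    """Iteratively create synthetic minority samples by convexly mixing a
    minority sample with a majority sample, biased toward the minority point."""
    rng = np.random.default_rng(seed)
    X_min, X_maj = X[y == minority_label], X[y != minority_label]
    synthetic = []
    for _ in range(n_new):
        a = X_min[rng.integers(len(X_min))]
        b = X_maj[rng.integers(len(X_maj))]
        lam = rng.uniform(alpha, 1.0)          # weight toward the minority sample
        synthetic.append(lam * a + (1 - lam) * b)
    X_new = np.vstack([X, synthetic])
    y_new = np.concatenate([y, np.full(n_new, minority_label)])
    return X_new, y_new

# Example on a tiny random stand-in for an imbalanced dataset.
X = np.random.default_rng(1).normal(size=(200, 5))
y = np.array([1] * 20 + [0] * 180)
X_bal, y_bal = mix_oversample(X, y, n_new=160)
print("class counts after mixing:", np.bincount(y_bal))
```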
arXiv Detail & Related papers (2023-08-28T18:48:34Z)
- Augmentation Invariant Manifold Learning [0.5827521884806071]
We introduce a new representation learning method called augmentation invariant manifold learning.
Compared with existing self-supervised methods, the new method simultaneously exploits the manifold's geometric structure and invariant property of augmented data.
Our theoretical investigation characterizes the role of data augmentation in the proposed method and reveals why and how the data representation learned from augmented data can improve the $k$-nearest neighbor in the downstream analysis.
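A toy sketch of the intuition as summarized: average embeddings of augmented views to approximate an augmentation-invariant representation, then run k-nearest neighbors downstream. The Gaussian-jitter augmentation and averaging step are assumptions, not the paper's estimator:

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

def invariant_representation(X, n_views=8, noise=0.1, seed=0):
    """Average several randomly augmented views so the representation is
    (approximately) invariant to the augmentation."""
    rng = np.random.default_rng(seed)
    views = [X + noise * rng.standard_normal(X.shape) for _ in range(n_views)]
    return np.mean(views, axis=0)

X, y = make_moons(n_samples=600, noise=0.2, random_state=0)
Z = invariant_representation(X)
Z_tr, Z_te, y_tr, y_te = train_test_split(Z, y, random_state=0)
print("downstream k-NN accuracy:",
      KNeighborsClassifier(5).fit(Z_tr, y_tr).score(Z_te, y_te))
```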
arXiv Detail & Related papers (2022-11-01T13:42:44Z)
- Data-heterogeneity-aware Mixing for Decentralized Learning [63.83913592085953]
We characterize the dependence of convergence on the relationship between the mixing weights of the graph and the data heterogeneity across nodes.
We propose a metric that quantifies the ability of a graph to mix the current gradients.
Motivated by our analysis, we propose an approach that periodically and efficiently optimizes this metric.
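A minimal sketch of one decentralized mixing round; Metropolis-Hastings weights stand in for the heterogeneity-aware mixing weights the paper optimizes:

```python
import numpy as np

def metropolis_weights(adjacency):
    """Doubly-stochastic mixing matrix W from an undirected adjacency matrix."""
    n = len(adjacency)
    deg = adjacency.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adjacency[i, j]:
                W[i, j] = 1.0 / (1 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()
    return W

# Ring of 5 nodes; each node holds a heterogeneous local parameter vector.
A = np.roll(np.eye(5), 1, axis=1) + np.roll(np.eye(5), -1, axis=1)
W = metropolis_weights(A)
params = np.random.default_rng(0).normal(size=(5, 3))
mixed = W @ params    # one mixing step: each node averages over its neighbors
print("spread before:", params.std(axis=0).mean(), "after:", mixed.std(axis=0).mean())
```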
arXiv Detail & Related papers (2022-04-13T15:54:35Z)
- Deep Equilibrium Assisted Block Sparse Coding of Inter-dependent Signals: Application to Hyperspectral Imaging [71.57324258813675]
A dataset of inter-dependent signals is defined as a matrix whose columns demonstrate strong dependencies.
A neural network is employed to act as a structure prior and reveal the underlying signal interdependencies.
Deep unrolling and Deep equilibrium based algorithms are developed, forming highly interpretable and concise deep-learning-based architectures.
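A minimal sketch of the unrolling idea: a few ISTA iterations for sparse coding, the kind of update that deep-unrolling methods turn into trainable layers. The dictionary and the plain (non-block) sparsity penalty are assumptions; the deep-equilibrium variant is omitted:

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def unrolled_ista(Y, D, n_layers=50, lam=0.1):
    """Each 'layer' is one ISTA step: Z <- soft_threshold(Z - D^T (D Z - Y)/L, lam/L)."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    Z = np.zeros((D.shape[1], Y.shape[1]))
    for _ in range(n_layers):
        Z = soft_threshold(Z - (D.T @ (D @ Z - Y)) / L, lam / L)
    return Z

rng = np.random.default_rng(0)
D = rng.normal(size=(20, 50))              # assumed overcomplete dictionary
Z_true = np.where(rng.random((50, 8)) < 0.1, rng.normal(size=(50, 8)), 0.0)
Y = D @ Z_true                             # columns of Y are inter-dependent signals
Z_hat = unrolled_ista(Y, D)
print("nonzeros recovered:", int((np.abs(Z_hat) > 1e-3).sum()),
      "| true nonzeros:", int((Z_true != 0).sum()))
```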
arXiv Detail & Related papers (2022-03-29T21:00:39Z)
- CAFE: Learning to Condense Dataset by Aligning Features [72.99394941348757]
We propose a novel scheme to Condense dataset by Aligning FEatures (CAFE).
At the heart of our approach is an effective strategy to align features from the real and synthetic data across various scales.
We validate the proposed CAFE across various datasets, and demonstrate that it generally outperforms the state of the art.
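A loose, assumption-laden sketch of feature alignment for dataset condensation: a small synthetic set is optimized so its feature statistics match the real data's, using raw per-feature means instead of CAFE's multi-scale network features:

```python
import numpy as np

rng = np.random.default_rng(0)
X_real = rng.normal(loc=3.0, scale=2.0, size=(1000, 16))   # assumed "real" data
X_syn = rng.normal(size=(10, 16))                          # condensed set to learn
lr = 0.5

for step in range(200):
    # Alignment loss: squared distance between real and synthetic feature means;
    # its gradient w.r.t. each synthetic sample is identical, so it broadcasts.
    grad = 2 * (X_syn.mean(axis=0) - X_real.mean(axis=0)) / len(X_syn)
    X_syn -= lr * grad
print("mean gap after alignment:", np.abs(X_syn.mean(0) - X_real.mean(0)).max())
```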
arXiv Detail & Related papers (2022-03-03T05:58:49Z)
- Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space, in contrast to existing techniques that embed each node as a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
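A minimal sketch of the distributional-embedding idea as summarized: each node gets a Gaussian (mean and log-variance) in latent space and embeddings are drawn by reparameterization. The linear encoder is a placeholder, not the paper's GNN:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 32))                  # assumed node features
W_mu = rng.normal(size=(32, 16))                # placeholder "encoder" weights
W_logvar = rng.normal(size=(32, 16)) * 0.01

mu = X @ W_mu                                   # per-node mean
logvar = X @ W_logvar                           # per-node log-variance
eps = rng.standard_normal(mu.shape)
z = mu + np.exp(0.5 * logvar) * eps             # one sampled embedding per node
print("sampled node embeddings:", z.shape)      # resampling eps gives new draws
```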
arXiv Detail & Related papers (2021-12-15T01:45:32Z)
- A graph representation based on fluid diffusion model for multimodal data analysis: theoretical aspects and enhanced community detection [14.601444144225875]
We introduce a novel model for graph definition based on fluid diffusion.
Our method is able to strongly outperform state-of-the-art schemes for community detection in multimodal data analysis.
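A loosely inspired sketch (not the paper's fluid-diffusion model): build a diffusion-based affinity by running a few random-walk steps on a k-NN graph, then detect communities with spectral clustering:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.neighbors import kneighbors_graph
from sklearn.cluster import SpectralClustering

X, y = make_blobs(n_samples=300, centers=3, random_state=0)
A = kneighbors_graph(X, n_neighbors=10, mode="connectivity").toarray()
A = np.maximum(A, A.T)
P = A / A.sum(axis=1, keepdims=True)            # random-walk transition matrix
diffused = np.linalg.matrix_power(P, 4)         # four diffusion steps
affinity = 0.5 * (diffused + diffused.T)        # symmetrize for spectral clustering
labels = SpectralClustering(n_clusters=3, affinity="precomputed",
                            random_state=0).fit_predict(affinity)
print("community sizes:", np.bincount(labels))
```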
arXiv Detail & Related papers (2021-12-07T16:30:03Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embedding from high-dimensional attributes and local structure.
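A simplified sketch of the contrastive-pair intuition: score each node by how poorly its attributes agree with a sampled local neighborhood. The distance-based score is a crude stand-in for the paper's GNN-based contrastive model:

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
X[:5] += 6.0                                    # inject a few attribute anomalies
A = kneighbors_graph(X, n_neighbors=8, mode="connectivity").toarray()

scores = []
for i in range(len(X)):
    nbrs = np.flatnonzero(A[i])
    sampled = rng.choice(nbrs, size=4, replace=False)     # sampled local subgraph
    scores.append(np.linalg.norm(X[i] - X[sampled].mean(axis=0)))
scores = np.array(scores)
print("top-5 anomaly candidates:", np.argsort(-scores)[:5])   # should include 0..4
```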
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- Hierarchical regularization networks for sparsification based learning on noisy datasets [0.0]
The hierarchy follows from approximation spaces identified at successively finer scales.
For promoting model generalization at each scale, we also introduce a novel, projection-based penalty operator across multiple dimensions.
Results show the performance of the approach as a data reduction and modeling strategy on both synthetic and real datasets.
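A generic stand-in sketch of the coarse-to-fine idea: each successively finer scale fits the residual left by the coarser one. The radial-basis bases and scale schedule are assumptions, not the paper's approximation spaces:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 400)
y = np.sin(2 * np.pi * x) + 0.3 * np.sin(20 * np.pi * x) + 0.05 * rng.standard_normal(400)

def rbf_fit(x, target, n_centers, width):
    """Least-squares fit with Gaussian bumps at equispaced centers."""
    centers = np.linspace(0, 1, n_centers)
    Phi = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))
    coef, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    return Phi @ coef

residual, approx = y.copy(), np.zeros_like(y)
for n_centers, width in [(5, 0.2), (20, 0.05), (80, 0.0125)]:   # coarse -> fine
    level = rbf_fit(x, residual, n_centers, width)
    approx += level
    residual -= level
print("final RMSE:", np.sqrt(np.mean((approx - y) ** 2)))
```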
arXiv Detail & Related papers (2020-06-09T18:32:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.