Skew Probabilistic Neural Networks for Learning from Imbalanced Data
- URL: http://arxiv.org/abs/2312.05878v1
- Date: Sun, 10 Dec 2023 13:12:55 GMT
- Title: Skew Probabilistic Neural Networks for Learning from Imbalanced Data
- Authors: Shraddha M. Naik, Tanujit Chakraborty, Abdenour Hadid, Bibhas
Chakraborty
- Abstract summary: This paper introduces an imbalanced data-oriented approach using probabilistic neural networks (PNNs) with a skew normal probability kernel.
We show that SkewPNNs substantially outperform state-of-the-art machine learning methods for both balanced and imbalanced datasets in most experimental settings.
- Score: 3.7892198600060945
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Real-world datasets often exhibit imbalanced data distribution, where certain
class labels are severely underrepresented. In such cases, traditional pattern
classifiers have shown a bias towards the majority class, impeding accurate
predictions for the minority class. This paper introduces an imbalanced
data-oriented approach using probabilistic neural networks (PNNs) with a skew
normal probability kernel to address this major challenge. PNNs are known for
providing probabilistic outputs, enabling quantification of prediction
confidence and uncertainty handling. By leveraging the skew normal
distribution, which offers increased flexibility, particularly for imbalanced
and non-symmetric data, our proposed Skew Probabilistic Neural Networks
(SkewPNNs) can better represent underlying class densities. To optimize the
performance of the proposed approach on imbalanced datasets, hyperparameter
fine-tuning is imperative. To this end, we employ a population-based heuristic,
the Bat optimization algorithm, to effectively explore the
hyperparameter space. We also prove the statistical consistency of the density
estimates, which implies that the estimated densities approach the true
distribution smoothly as the sample size increases. Experimental simulations
have been conducted on different synthetic datasets, comparing against various
benchmark imbalanced-data learners.
Our real-data analysis shows that SkewPNNs substantially outperform
state-of-the-art machine learning methods for both balanced and imbalanced
datasets in most experimental settings.
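The abstract's two main ingredients, a Parzen-style PNN with a skew-normal kernel and a bat-based hyperparameter search, can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the names `SkewPNN` and `bat_search` are assumptions, the product kernel over features is a common simplification, and the bat update omits the loudness and pulse-rate schedules of the full algorithm.

```python
import math
import random

def skew_normal_pdf(x, loc=0.0, scale=1.0, alpha=0.0):
    """Skew-normal density (2/scale)*phi(z)*Phi(alpha*z) with z=(x-loc)/scale;
    alpha=0 recovers the Gaussian kernel of a classical PNN."""
    z = (x - loc) / scale
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    Phi = 0.5 * (1.0 + math.erf(alpha * z / math.sqrt(2.0)))
    return 2.0 / scale * phi * Phi

class SkewPNN:
    """Parzen-style classifier: each class density is the average of
    skew-normal kernels centred on that class's samples (product kernel
    across features); prediction is the class with the largest density."""
    def __init__(self, scale=1.0, alpha=0.0):
        self.scale, self.alpha, self.by_class = scale, alpha, {}

    def fit(self, X, y):
        self.by_class = {}
        for xi, yi in zip(X, y):
            self.by_class.setdefault(yi, []).append(xi)
        return self

    def _density(self, x, samples):
        total = 0.0
        for s in samples:
            k = 1.0
            for xj, sj in zip(x, s):
                k *= skew_normal_pdf(xj, loc=sj, scale=self.scale, alpha=self.alpha)
            total += k
        return total / len(samples)

    def predict_proba(self, x):
        dens = {c: self._density(x, s) for c, s in self.by_class.items()}
        z = sum(dens.values()) or 1.0
        return {c: d / z for c, d in dens.items()}

    def predict(self, x):
        proba = self.predict_proba(x)
        return max(proba, key=proba.get)

def bat_search(objective, bounds, n_bats=8, iters=30, seed=0):
    """Greatly simplified bat-style search (maximisation): frequency-scaled
    pulls toward the best bat plus occasional local random walks."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_bats)]
    vel = [[0.0] * dim for _ in range(n_bats)]
    fit = [objective(p) for p in pos]
    best_f = max(fit)
    best = list(pos[fit.index(best_f)])
    loudness, pulse_rate = 0.9, 0.5
    for _ in range(iters):
        for i in range(n_bats):
            freq = rng.random()  # random frequency in [0, 1]
            cand = []
            for d in range(dim):
                vel[i][d] += (best[d] - pos[i][d]) * freq
                x = pos[i][d] + vel[i][d]
                if rng.random() > pulse_rate:  # local walk around the best bat
                    x = best[d] + 0.1 * rng.gauss(0.0, 1.0)
                lo, hi = bounds[d]
                cand.append(min(max(x, lo), hi))
            f_cand = objective(cand)
            if f_cand > fit[i] and rng.random() < loudness:
                pos[i], fit[i] = cand, f_cand
            if f_cand > best_f:
                best, best_f = list(cand), f_cand
    return best, best_f
```

In this sketch, the kernel parameters (scale, alpha) would be tuned by passing a cross-validated balanced-accuracy objective to `bat_search`, standing in for the Bat optimization step described above.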
Related papers
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Probabilistic Neural Networks (PNNs) for Modeling Aleatoric Uncertainty in Scientific Machine Learning [2.348041867134616]
This paper investigates the use of probabilistic neural networks (PNNs) to model aleatoric uncertainty.
PNNs generate probability distributions for the target variable, allowing the determination of both predicted means and intervals in regression scenarios.
In a real-world scientific machine learning context, PNNs yield remarkably accurate output mean estimates with R-squared scores approaching 0.97, and their predicted intervals exhibit a high correlation coefficient of nearly 0.80.
arXiv Detail & Related papers (2024-02-21T17:15:47Z)
- Rethinking Semi-Supervised Imbalanced Node Classification from Bias-Variance Decomposition [18.3055496602884]
This paper introduces a new approach to address the issue of class imbalance in graph neural networks (GNNs) for learning on graph-structured data.
Our approach integrates imbalanced node classification and Bias-Variance Decomposition, establishing a theoretical framework that closely relates data imbalance to model variance.
arXiv Detail & Related papers (2023-10-28T17:28:07Z)
- Amortised Inference in Bayesian Neural Networks [0.0]
We introduce the Amortised Pseudo-Observation Variational Inference Bayesian Neural Network (APOVI-BNN).
We show that the amortised inference is of similar or better quality than that obtained through traditional variational inference.
We then discuss how the APOVI-BNN may be viewed as a new member of the neural process family.
arXiv Detail & Related papers (2023-09-06T14:02:33Z)
- Effective Class-Imbalance learning based on SMOTE and Convolutional Neural Networks [0.1074267520911262]
Imbalanced Data (ID) is a problem that prevents Machine Learning (ML) models from achieving satisfactory results.
In this paper, we investigate the effectiveness of methods based on Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs).
In order to achieve reliable results, we conducted our experiments 100 times with randomly shuffled data distributions.
arXiv Detail & Related papers (2022-09-01T07:42:16Z)
- coVariance Neural Networks [119.45320143101381]
Graph neural networks (GNNs) are an effective framework that exploits inter-relationships within graph-structured data for learning.
We propose a GNN architecture, called coVariance neural network (VNN), that operates on sample covariance matrices as graphs.
We show that VNN performance is indeed more stable than PCA-based statistical approaches.
arXiv Detail & Related papers (2022-05-31T15:04:43Z)
- Discovering Invariant Rationales for Graph Neural Networks [104.61908788639052]
Intrinsic interpretability of graph neural networks (GNNs) means finding a small subset of the input graph's features that guides the model prediction.
We propose a new strategy of discovering invariant rationale (DIR) to construct intrinsically interpretable GNNs.
arXiv Detail & Related papers (2022-01-30T16:43:40Z)
- Rank-R FNN: A Tensor-Based Learning Model for High-Order Data Classification [69.26747803963907]
Rank-R Feedforward Neural Network (FNN) is a tensor-based nonlinear learning model that imposes Canonical/Polyadic decomposition on its parameters.
First, it handles inputs as multilinear arrays, bypassing the need for vectorization, and can thus fully exploit the structural information along every data dimension.
We establish the universal approximation and learnability properties of Rank-R FNN, and we validate its performance on real-world hyperspectral datasets.
arXiv Detail & Related papers (2021-04-11T16:37:32Z)
- Statistical model-based evaluation of neural networks [74.10854783437351]
We develop an experimental setup for the evaluation of neural networks (NNs).
The setup helps to benchmark a set of NNs vis-a-vis minimum-mean-square-error (MMSE) performance bounds.
This allows us to test the effects of training data size, data dimension, data geometry, noise, and mismatch between training and testing conditions.
arXiv Detail & Related papers (2020-11-18T00:33:24Z)
- Balance-Subsampled Stable Prediction [55.13512328954456]
We propose a novel balance-subsampled stable prediction (BSSP) algorithm based on the theory of fractional factorial design.
A design-theoretic analysis shows that the proposed method can reduce the confounding effects among predictors induced by the distribution shift.
Numerical experiments on both synthetic and real-world data sets demonstrate that our BSSP algorithm significantly outperforms the baseline methods for stable prediction across unknown test data.
arXiv Detail & Related papers (2020-06-08T07:01:38Z)
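One of the related papers above builds on SMOTE. As background, its core interpolation step can be sketched as follows; this is an illustrative sketch with an assumed `smote` helper, not the code from that paper:

```python
import random

def smote(minority, n_new, k=5, seed=0):
    """Minimal SMOTE sketch: synthesize points by interpolating each
    minority sample toward one of its k nearest minority neighbours."""
    rng = random.Random(seed)

    def dist2(a, b):
        # Squared Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen base sample.
        neigh = sorted((s for s in minority if s is not base),
                       key=lambda s: dist2(base, s))[:k]
        nb = rng.choice(neigh)
        lam = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + lam * (n - b) for b, n in zip(base, nb)])
    return synthetic
```

Each synthetic point lies on a segment between two real minority samples, so the oversampled class stays inside the convex hull of the observed minority data.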
This list is automatically generated from the titles and abstracts of the papers on this site.