Towards Improved Illicit Node Detection with Positive-Unlabelled
Learning
- URL: http://arxiv.org/abs/2303.02462v1
- Date: Sat, 4 Mar 2023 17:36:09 GMT
- Title: Towards Improved Illicit Node Detection with Positive-Unlabelled
Learning
- Authors: Junliang Luo, Farimah Poursafaei, Xue Liu
- Abstract summary: We discuss the labelling mechanism assumed for the hidden
positive labels and its effect on the evaluation metrics.
We show that PU classifiers that account for potential hidden positive labels
can achieve improved performance compared to regular machine learning models.
- Score: 5.879542875341689
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting illicit nodes on blockchain networks is a valuable task
for strengthening future regulation. Recent machine learning-based methods
tackle this task using blockchain transaction datasets in which a small
portion of samples is labelled positive and the rest is unlabelled (PU).
Although some works assume that a random sample of the unlabelled nodes
consists of normal nodes, we argue that the labelling mechanism assumed for
the hidden positive labels, and its effect on the evaluation metrics, is
worth considering. We further show that PU classifiers that account for
potential hidden positive labels can achieve improved performance compared
to regular machine learning models. To obtain more reliable results, we test
the PU classifiers with a list of graph representation learning methods,
which yield different feature distributions for the same data.
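As a concrete illustration of the PU setting, here is a minimal sketch of one
classical PU approach, the Elkan-Noto probability adjustment; the synthetic
features, labelling rate, and classifier choice are toy assumptions, not the
paper's exact pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy data: X are node features (e.g., from a graph representation learning
# method); s = 1 for nodes labelled illicit, s = 0 for unlabelled nodes.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 16))
y_true = (X[:, 0] + X[:, 1] > 0).astype(int)       # hidden ground truth
s = y_true * (rng.random(1000) < 0.3).astype(int)  # only ~30% of positives labelled

# Step 1: train a "non-traditional" classifier to predict s (labelled vs not).
X_tr, X_val, s_tr, s_val = train_test_split(X, s, test_size=0.2, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, s_tr)

# Step 2: estimate c = P(s=1 | y=1) as the mean score over held-out labelled positives.
c = clf.predict_proba(X_val[s_val == 1])[:, 1].mean()

# Step 3: adjust scores -- P(y=1 | x) = P(s=1 | x) / c, clipped to [0, 1].
p_illicit = np.clip(clf.predict_proba(X)[:, 1] / c, 0.0, 1.0)
print(f"estimated label frequency c = {c:.2f}")
```

The adjustment rests on the "selected completely at random" labelling
assumption, which is exactly the kind of label mechanism assumption the
abstract says is worth examining.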
Related papers
- All Points Matter: Entropy-Regularized Distribution Alignment for
Weakly-supervised 3D Segmentation [67.30502812804271]
Pseudo-labels are widely employed in weakly supervised 3D segmentation tasks where only sparse ground-truth labels are available for learning.
We propose a novel learning strategy to regularize the generated pseudo-labels and effectively narrow the gaps between pseudo-labels and model predictions.
arXiv Detail & Related papers (2023-05-25T08:19:31Z)
- Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]
We propose a label distribution perspective for PU learning.
Motivated by this perspective, we pursue consistency between the predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
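As a rough illustration of the label-distribution idea (not Dist-PU's actual
objective), a penalty can pull the mean predicted positive rate toward an
assumed class prior; the function name and prior value below are hypothetical.

```python
import numpy as np

def label_distribution_penalty(pred_probs, class_prior):
    """Penalize the squared gap between the predicted positive rate
    and an assumed ground-truth class prior."""
    return (pred_probs.mean() - class_prior) ** 2

# Predictions whose positive rate ignores the prior incur a larger penalty.
print(label_distribution_penalty(np.array([0.9, 0.8, 0.7, 0.6, 0.9]), 0.2))
print(label_distribution_penalty(np.array([0.9, 0.1, 0.1, 0.1, 0.1]), 0.2))
```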
arXiv Detail & Related papers (2022-12-06T07:38:29Z)
- Learning From Positive and Unlabeled Data Using Observer-GAN [0.0]
The problem of learning from positive and unlabeled data (a.k.a. PU learning) has been studied in a binary (i.e., positive versus negative) classification setting.
Generative Adversarial Networks (GANs) have been used to reduce the problem to the supervised setting with the advantage that supervised learning has state-of-the-art accuracy in classification tasks.
arXiv Detail & Related papers (2022-08-26T07:35:28Z)
- Binary Classification with Positive Labeling Sources [71.37692084951355]
We propose WEAPO, a simple yet competitive weak supervision (WS) method for producing training labels without negative labeling sources.
We show WEAPO achieves the highest average performance on 10 benchmark datasets.
arXiv Detail & Related papers (2022-08-02T19:32:08Z)
- Positive Unlabeled Contrastive Learning [14.975173394072053]
We extend the self-supervised pretraining paradigm to the classical positive unlabeled (PU) setting.
We develop a simple methodology to pseudo-label the unlabeled samples using a new PU-specific clustering scheme.
Our method handily outperforms state-of-the-art PU methods over several standard PU benchmark datasets.
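The paper's PU-specific clustering scheme is not reproduced here; the sketch
below is a generic stand-in that pseudo-labels unlabelled samples by whether
their cluster is dominated by known positives (all names are illustrative).

```python
import numpy as np
from sklearn.cluster import KMeans

def pseudo_label_by_clustering(emb, is_labelled_pos, n_clusters=2, seed=0):
    """Cluster embeddings and pseudo-label each sample as positive if its
    cluster holds an above-average share of the known positives."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(emb)
    pos_rate = np.array([is_labelled_pos[km.labels_ == k].mean()
                         for k in range(n_clusters)])
    pos_clusters = pos_rate > pos_rate.mean()
    return pos_clusters[km.labels_].astype(int)   # 1 = pseudo-positive

rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0, 1, (50, 8)), rng.normal(3, 1, (50, 8))])
labelled_pos = np.zeros(100, dtype=bool)
labelled_pos[50:60] = True                        # known positives in one cluster
print(pseudo_label_by_clustering(emb, labelled_pos).sum())  # ~50 pseudo-positives
```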
arXiv Detail & Related papers (2022-06-01T20:16:32Z)
- Adaptive Positive-Unlabelled Learning via Markov Diffusion [0.0]
Positive-Unlabelled (PU) learning is the machine learning setting in which only a set of positive instances are labelled.
The principal aim of the algorithm is to identify a set of instances that are likely to be hidden positives among the originally unlabelled data.
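The paper's exact diffusion procedure differs, but the general idea of scoring
unlabelled nodes by diffusing mass from the labelled positives can be sketched
as a random walk with restart; the toy graph and parameters are assumptions.

```python
import numpy as np

def diffusion_scores(adj, positive_idx, steps=10, restart=0.15):
    """Random walk with restart from the labelled positives; high scores
    flag unlabelled nodes likely to be hidden positives."""
    P = adj / adj.sum(axis=1, keepdims=True)   # row-stochastic transition matrix
    seed = np.zeros(adj.shape[0])
    seed[positive_idx] = 1.0 / len(positive_idx)
    score = seed.copy()
    for _ in range(steps):
        score = (1 - restart) * score @ P + restart * seed
    return score

# Toy graph: nodes 0-2 are tightly connected to the seed positive, node 3 is peripheral.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
print(diffusion_scores(adj, positive_idx=[0]))
```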
arXiv Detail & Related papers (2021-08-13T10:25:47Z)
- Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces [64.23172847182109]
We show that different negative sampling schemes implicitly trade-off performance on dominant versus rare labels.
We provide a unified means to explicitly tackle both sampling bias, arising from working with a subset of all labels, and labeling bias, which is inherent to the data due to label imbalance.
arXiv Detail & Related papers (2021-05-12T15:40:13Z)
- A Novel Perspective for Positive-Unlabeled Learning via Noisy Labels [49.990938653249415]
This research presents a methodology that assigns initial pseudo-labels to the unlabeled data, treats them as noisy labels, and trains a deep neural network on the resulting noisy-labeled data.
Experimental results demonstrate that the proposed method significantly outperforms the state-of-the-art methods on several benchmark datasets.
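A minimal two-stage stand-in for this idea (not the paper's method): a
preliminary model assigns noisy pseudo-labels to the unlabelled pool, and a
network is then trained on them; the data and model choices are toy assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y_true = (X[:, 0] > 0).astype(int)                 # hidden ground truth
s = y_true * (rng.random(500) < 0.25).astype(int)  # PU labels

# Stage 1: a preliminary model assigns initial (noisy) pseudo-labels.
prelim = LogisticRegression(max_iter=1000).fit(X, s)
pseudo = (prelim.predict_proba(X)[:, 1] > 0.5).astype(int)
pseudo[s == 1] = 1                                 # keep the known positives

# Stage 2: train the final network on the noisy-labelled data.
model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                      random_state=0).fit(X, pseudo)
print((model.predict(X) == y_true).mean())         # toy accuracy check
```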
arXiv Detail & Related papers (2021-03-08T11:46:02Z)
- Active Learning for Node Classification: The Additional Learning Ability from Unlabelled Nodes [33.97571297149204]
Given a limited labelling budget, active learning aims to improve performance by carefully choosing which nodes to label.
Our empirical study shows that existing active learning methods for node classification are considerably outperformed by a simple method.
We propose LSCALE, a novel latent space clustering-based active learning method for node classification.
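A simplified sketch of a latent-space-clustering query step in this spirit
(not LSCALE itself): cluster unsupervised node embeddings and spend the
labelling budget on the nodes nearest each cluster centre; all names and
parameters below are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import pairwise_distances_argmin

def select_queries(latent, budget, seed=0):
    """Cluster the latent node embeddings and query the node closest to
    each cluster centre, spreading the budget across distinct regions."""
    km = KMeans(n_clusters=budget, n_init=10, random_state=seed).fit(latent)
    return pairwise_distances_argmin(km.cluster_centers_, latent)

rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 16))   # e.g., unsupervised node embeddings
print(select_queries(latent, budget=5))
```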
arXiv Detail & Related papers (2020-12-13T13:59:48Z)
- Learning with Out-of-Distribution Data for Audio Classification [60.48251022280506]
We show that detecting and relabelling certain OOD instances, rather than discarding them, can have a positive effect on learning.
The proposed method is shown to improve the performance of convolutional neural networks by a significant margin.
arXiv Detail & Related papers (2020-02-11T21:08:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.