Automated Machine Learning for Positive-Unlabelled Learning
- URL: http://arxiv.org/abs/2401.06452v1
- Date: Fri, 12 Jan 2024 08:54:34 GMT
- Title: Automated Machine Learning for Positive-Unlabelled Learning
- Authors: Jack D. Saunders and Alex A. Freitas
- Abstract summary: Positive-Unlabelled (PU) learning is a growing field of machine learning.
We propose two new Auto-ML systems for PU learning: BO-Auto-PU, based on a Bayesian optimisation approach, and EBO-Auto-PU, based on a novel evolutionary/Bayesian optimisation approach.
We also present an extensive evaluation of all three Auto-ML systems (GA-Auto-PU, BO-Auto-PU, and EBO-Auto-PU), comparing them to each other and to well-established PU learning methods across 60 datasets.
- Score: 1.450405446885067
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Positive-Unlabelled (PU) learning is a growing field of machine learning that
aims to learn classifiers from data consisting of labelled positive and
unlabelled instances, which may in reality be positive or negative but whose
label is unknown. An extensive number of methods have been proposed to address
PU learning over the last two decades, so many that selecting an optimal
method for a given PU learning task presents a challenge. Our previous work has
addressed this by proposing GA-Auto-PU, the first Automated Machine Learning
(Auto-ML) system for PU learning. In this work, we propose two new Auto-ML
systems for PU learning: BO-Auto-PU, based on a Bayesian Optimisation approach,
and EBO-Auto-PU, based on a novel evolutionary/Bayesian optimisation approach.
We also present an extensive evaluation of the three Auto-ML systems, comparing
them to each other and to well-established PU learning methods across 60
datasets (20 real-world datasets, each with 3 versions in terms of PU learning
characteristics).
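As a rough illustration of the PU setting described in the abstract (not of the Auto-PU systems themselves), here is a minimal sketch of the classic two-step heuristic used by many PU methods: mine "reliable negatives" from the unlabelled set, then train an ordinary classifier on positives vs. those reliable negatives. All data, the distance-based negative mining, and the nearest-centroid classifier below are toy assumptions chosen for brevity.

```python
# Two-step PU learning sketch on 1-D toy data (illustrative only).

def mean(xs):
    return sum(xs) / len(xs)

def two_step_pu(positives, unlabelled, frac_reliable=0.3):
    """Build a decision function from PU data.

    Step 1: rank unlabelled points by distance from the positive centroid
            and take the farthest fraction as reliable negatives.
    Step 2: classify new points by nearest centroid (positive vs. reliable-negative).
    """
    pos_centroid = mean(positives)
    ranked = sorted(unlabelled, key=lambda x: abs(x - pos_centroid), reverse=True)
    k = max(1, int(frac_reliable * len(ranked)))
    reliable_negatives = ranked[:k]          # step 1: mined "negatives"
    neg_centroid = mean(reliable_negatives)  # step 2: nearest-centroid rule
    return lambda x: abs(x - pos_centroid) < abs(x - neg_centroid)

# Toy example: positives cluster near 1.0, negatives near 5.0, and the
# unlabelled pool mixes both (their true labels are hidden from the learner).
labelled_pos = [0.9, 1.0, 1.1, 1.2]
unlabelled = [1.05, 0.95, 4.8, 5.0, 5.2, 1.1, 4.9]
classify = two_step_pu(labelled_pos, unlabelled)
print(classify(1.0), classify(5.0))  # True False
```

The Auto-PU systems in the paper search over design choices like these (how negatives are mined, which base classifier is used); this sketch fixes one arbitrary configuration to make the problem setting concrete.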
Related papers
- Meta-learning for Positive-unlabeled Classification [40.11462237689747]
The proposed method minimizes the test classification risk after the model is adapted to PU data.
The method embeds each instance into a task-specific space using neural networks.
We empirically show that the proposed method outperforms existing methods with one synthetic and three real-world datasets.
arXiv Detail & Related papers (2024-06-06T01:50:01Z) - Contrastive Approach to Prior Free Positive Unlabeled Learning [15.269090018352875]
We propose a novel PU learning framework, that starts by learning a feature space through pretext-invariant representation learning.
Our proposed approach handily outperforms state-of-the-art PU learning methods across several standard PU benchmark datasets.
arXiv Detail & Related papers (2024-02-08T20:20:54Z) - PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z) - ALBench: A Framework for Evaluating Active Learning in Object Detection [102.81795062493536]
This paper contributes an active learning benchmark framework named as ALBench for evaluating active learning in object detection.
Developed on an automatic deep-model training system, the ALBench framework is easy to use, compatible with different active learning algorithms, and enforces the same training and testing protocols.
arXiv Detail & Related papers (2022-07-27T07:46:23Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - An Extensive Experimental Evaluation of Automated Machine Learning Methods for Recommending Classification Algorithms (Extended Version) [4.400989370979334]
The work compares four AutoML methods: three based on Evolutionary Algorithms (EAs), and Auto-WEKA, a well-known AutoML method.
We performed controlled experiments in which all four AutoML methods were given the same runtime limit, repeated for several values of that limit.
In general, the difference in predictive accuracy of the three best AutoML methods was not statistically significant.
arXiv Detail & Related papers (2020-09-16T02:36:43Z) - A Survey on Large-scale Machine Learning [67.6997613600942]
Machine learning can provide deep insights into data, allowing machines to make high-quality predictions.
Most sophisticated machine learning approaches suffer from huge time costs when operating on large-scale data.
Large-scale Machine Learning aims to learn patterns from big data efficiently while maintaining comparable predictive performance.
arXiv Detail & Related papers (2020-08-10T06:07:52Z) - Self-PU: Self Boosted and Calibrated Positive-Unlabeled Training [118.10946662410639]
We propose a novel Self-PU learning framework, which seamlessly integrates PU learning and self-training.
Self-PU highlights three "self"-oriented building blocks, including a self-paced training algorithm that adaptively discovers and augments confident examples as training proceeds.
We study a real-world application of PU learning, i.e., classifying brain images of Alzheimer's Disease.
arXiv Detail & Related papers (2020-06-22T17:53:59Z) - PONE: A Novel Automatic Evaluation Metric for Open-Domain Generative Dialogue Systems [48.99561874529323]
There are three kinds of automatic methods for evaluating open-domain generative dialogue systems.
Due to the lack of systematic comparison, it is not clear which kind of metric is more effective.
We propose a novel and feasible learning-based metric that can significantly improve the correlation with human judgments.
arXiv Detail & Related papers (2020-04-06T04:36:33Z) - ARDA: Automatic Relational Data Augmentation for Machine Learning [23.570173866941612]
We present ARDA, an end-to-end system that takes a dataset and a data repository as input, and outputs an augmented dataset.
Our system has two distinct components: (1) a framework to search and join data with the input data, based on various attributes of the input, and (2) an efficient feature selection algorithm that prunes out noisy or irrelevant features from the resulting join.
arXiv Detail & Related papers (2020-03-21T21:55:22Z)
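The ARDA summary above describes a two-component pipeline: join in candidate features, then prune noisy or irrelevant ones. A minimal sketch of the second component, using a simple correlation filter (this is an assumption for illustration, not ARDA's actual selection algorithm; all data and the threshold are toy values):

```python
# Correlation-based feature pruning sketch (illustrative only).

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

def select_features(features, target, threshold=0.5):
    """Keep only features whose |correlation| with the target exceeds threshold."""
    return {name: col for name, col in features.items()
            if abs(pearson(col, target)) > threshold}

# Toy joined data: one feature tracks the target, one is pure noise.
target = [1.0, 2.0, 3.0, 4.0]
features = {
    "useful": [1.1, 2.0, 2.9, 4.2],  # strongly correlated with the target
    "noise":  [3.0, 1.0, 4.0, 2.0],  # uncorrelated with the target
}
kept = select_features(features, target)
print(sorted(kept))  # ['useful']
```

A real system would use stronger selection criteria (e.g. model-based feature importance), but the filter-then-train shape of the pipeline is the same.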
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.