Uncovering Coresets for Classification With Multi-Objective Evolutionary
Algorithms
- URL: http://arxiv.org/abs/2002.08645v1
- Date: Thu, 20 Feb 2020 09:59:56 GMT
- Title: Uncovering Coresets for Classification With Multi-Objective Evolutionary
Algorithms
- Authors: Pietro Barbiero, Giovanni Squillero, Alberto Tonda
- Abstract summary: A coreset is a subset of the training set, using which a machine learning algorithm obtains performances similar to what it would deliver if trained over the whole original data.
A novel approach is presented: candidate coresets are iteratively optimized, adding and removing samples.
A multi-objective evolutionary algorithm is used to minimize simultaneously the number of points in the set and the classification error.
- Score: 0.8057006406834467
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A coreset is a subset of the training set, using which a machine learning
algorithm obtains performances similar to what it would deliver if trained over
the whole original data. Coreset discovery is an active and open line of
research, as it can improve training speed for the algorithms and may help
humans understand the results. Building on previous work, a novel approach
is presented: candidate coresets are iteratively optimized, adding and removing
samples. As there is an obvious trade-off between limiting training size and
quality of the results, a multi-objective evolutionary algorithm is used to
minimize simultaneously the number of points in the set and the classification
error. Experimental results on non-trivial benchmarks show that the proposed
approach is able to deliver results that allow a classifier to obtain lower
error and better generalization to unseen data than state-of-the-art
coreset discovery techniques.
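The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a toy 2-D dataset, a 1-nearest-neighbour classifier, a bit-flip mutation that adds or removes one sample, and a plain Pareto archive in place of a full multi-objective evolutionary algorithm. All function names (`nn_error`, `evaluate`, `dominates`, `pareto_front`, `evolve`) are hypothetical.

```python
import random

def nn_error(idx, X, y):
    """Error of a 1-NN classifier that uses only the points in idx, measured on all of X."""
    if not idx:
        return 1.0  # an empty coreset cannot classify anything
    wrong = 0
    for xi, yi in zip(X, y):
        nearest = min(idx, key=lambda j: (X[j][0] - xi[0]) ** 2 + (X[j][1] - xi[1]) ** 2)
        wrong += y[nearest] != yi
    return wrong / len(X)

def evaluate(mask, X, y):
    """Two objectives to minimize: coreset size and classification error."""
    idx = [i for i, bit in enumerate(mask) if bit]
    return (len(idx), nn_error(idx, X, y))

def dominates(a, b):
    """a dominates b if it is no worse in every objective and differs in at least one."""
    return all(p <= q for p, q in zip(a, b)) and a != b

def pareto_front(pop):
    """Keep non-dominated candidates, one mask per distinct objective vector."""
    front = {}
    for mask, f in pop:
        if not any(dominates(g, f) for _, g in pop):
            front.setdefault(f, mask)
    return [(m, f) for f, m in front.items()]

def mutate(mask, rng):
    """Flip one bit: add a sample to the candidate coreset or remove one from it."""
    child = list(mask)
    i = rng.randrange(len(child))
    child[i] = 1 - child[i]
    return tuple(child)

def evolve(X, y, generations=100, offspring=10, seed=0):
    rng = random.Random(seed)
    pop = []
    for _ in range(offspring):
        mask = tuple(rng.randint(0, 1) for _ in range(len(X)))
        pop.append((mask, evaluate(mask, X, y)))
    archive = pareto_front(pop)
    for _ in range(generations):
        children = []
        for _ in range(offspring):
            parent, _f = rng.choice(archive)
            child = mutate(parent, rng)
            children.append((child, evaluate(child, X, y)))
        archive = pareto_front(archive + children)
    return sorted(archive, key=lambda entry: entry[1])  # ordered by coreset size

# Toy data (an assumption for illustration): two Gaussian blobs, 20 points each.
data_rng = random.Random(1)
X = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(20)]
X += [(data_rng.gauss(3, 1), data_rng.gauss(3, 1)) for _ in range(20)]
y = [0] * 20 + [1] * 20

front = evolve(X, y)
for mask, (size, error) in front:
    print(f"coreset size={size:2d}  training error={error:.3f}")
```

Each entry of `front` is one compromise between the two objectives; moving along the front towards larger coresets can only lower the error, which is the trade-off the abstract refers to. Here the error is measured on the training set itself; evaluating generalization would need a held-out split.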
Related papers
- Refined Coreset Selection: Towards Minimal Coreset Size under Model
Performance Constraints [69.27190330994635]
Coreset selection is powerful in reducing computational costs and accelerating data processing for deep learning algorithms.
We propose an innovative method, which maintains optimization priority order over the model performance and coreset size.
Empirically, extensive experiments confirm its superiority, often yielding better model performance with smaller coreset sizes.
arXiv Detail & Related papers (2023-11-15T03:43:04Z)
- Convolutional autoencoder-based multimodal one-class classification [80.52334952912808]
One-class classification refers to approaches of learning using data from a single class only.
We propose a deep learning one-class classification method suitable for multimodal data.
arXiv Detail & Related papers (2023-09-25T12:31:18Z)
- Composable Core-sets for Diversity Approximation on Multi-Dataset
Streams [4.765131728094872]
Composable core-sets are core-sets with the property that subsets of the core set can be unioned together to obtain an approximation for the original data.
We introduce a core-set construction algorithm for constructing composable core-sets to summarize streamed data for use in active learning environments.
arXiv Detail & Related papers (2023-08-10T23:24:51Z)
- Probabilistic Bilevel Coreset Selection [24.874967723659022]
We propose a continuous probabilistic bilevel formulation of coreset selection by learning a probabilistic weight for each training sample.
We develop an efficient solver for the bilevel optimization problem via unbiased policy gradient, without the trouble of implicit differentiation.
arXiv Detail & Related papers (2023-01-24T09:37:00Z)
- A Boosting Approach to Constructing an Ensemble Stack [1.0775419935941009]
An approach to evolutionary ensemble learning for classification is proposed in which boosting is used to construct a stack of programs.
Training against a residual dataset actively reduces the cost of training.
Benchmarking studies are conducted to illustrate competitiveness with the prediction accuracy of current state-of-the-art evolutionary ensemble learning algorithms.
arXiv Detail & Related papers (2022-11-28T18:21:36Z)
- Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics by minimizing the population loss that are more suitable in active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
- Adaptive Second Order Coresets for Data-efficient Machine Learning [5.362258158646462]
Training machine learning models on datasets incurs substantial computational costs.
We propose AdaCore to extract subsets of the training examples for efficient machine learning.
arXiv Detail & Related papers (2022-07-28T05:43:09Z)
- Towards Diverse Evaluation of Class Incremental Learning: A Representation Learning Perspective [67.45111837188685]
Class incremental learning (CIL) algorithms aim to continually learn new object classes from incrementally arriving data.
We experimentally analyze neural network models trained by CIL algorithms using various evaluation protocols in representation learning.
arXiv Detail & Related papers (2022-06-16T11:44:11Z)
- Meta-learning One-class Classifiers with Eigenvalue Solvers for
Supervised Anomaly Detection [55.888835686183995]
We propose a neural network-based meta-learning method for supervised anomaly detection.
We experimentally demonstrate that the proposed method achieves better performance than existing anomaly detection and few-shot learning methods.
arXiv Detail & Related papers (2021-03-01T01:43:04Z)
- Ensemble Wrapper Subsampling for Deep Modulation Classification [70.91089216571035]
Subsampling of received wireless signals is important for relaxing hardware requirements as well as the computational cost of signal processing algorithms.
We propose a subsampling technique to facilitate the use of deep learning for automatic modulation classification in wireless communication systems.
arXiv Detail & Related papers (2020-05-10T06:11:13Z)