Related papers: Uncovering Coresets for Classification With Multi-Objective Evolutionary Algorithms

URL: http://arxiv.org/abs/2002.08645v1
Date: Thu, 20 Feb 2020 09:59:56 GMT
Title: Uncovering Coresets for Classification With Multi-Objective Evolutionary Algorithms
Authors: Pietro Barbiero, Giovanni Squillero, Alberto Tonda
Abstract summary: A coreset is a subset of the training set on which a machine learning algorithm performs about as well as it would if trained on the whole original data. A novel approach is presented: candidate coresets are iteratively optimized by adding and removing samples. A multi-objective evolutionary algorithm is used to simultaneously minimize the number of points in the set and the classification error.
Score: 0.8057006406834467
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A coreset is a subset of the training set on which a machine learning algorithm performs about as well as it would if trained on the whole original data. Coreset discovery is an active and open line of research, as it can speed up training and may help humans understand the results. Building on previous work, a novel approach is presented: candidate coresets are iteratively optimized by adding and removing samples. Because there is an obvious trade-off between limiting the training-set size and the quality of the results, a multi-objective evolutionary algorithm is used to simultaneously minimize the number of points in the set and the classification error. Experimental results on non-trivial benchmarks show that the proposed approach allows a classifier to obtain lower error and to generalize better to unseen data than state-of-the-art coreset discovery techniques.
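Illustrative sketch (not the authors' implementation): the trade-off described in the abstract can be shown with a small Pareto-based evolutionary loop. Everything below is an assumption made for illustration: the 1-nearest-neighbour classifier, the single bit-flip mutation (which adds or removes one sample), measuring error on the training data itself, the toy two-cluster dataset, and all function names. The paper's actual operators, classifier, and benchmarks may differ.

```python
import random


def toy_data(n_per_class=10, seed=1):
    """Two well-separated 2D Gaussian clusters (hypothetical toy data)."""
    rng = random.Random(seed)
    data = []
    for label, centre_x in ((0, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(centre_x, 0.5), rng.gauss(0.0, 0.5))
            data.append((point, label))
    return data


def nearest_label(point, coreset):
    """1-nearest-neighbour prediction using only the coreset samples."""
    closest = min(
        coreset,
        key=lambda s: (s[0][0] - point[0]) ** 2 + (s[0][1] - point[1]) ** 2,
    )
    return closest[1]


def evaluate(mask, data):
    """Return the two objectives to minimize: (coreset size, error rate)."""
    coreset = [sample for sample, keep in zip(data, mask) if keep]
    if not coreset:
        return (0, 1.0)  # assumption: an empty candidate gets the worst error
    errors = sum(nearest_label(x, coreset) != y for x, y in data)
    return (len(coreset), errors / len(data))


def dominates(a, b):
    """True if objective vector a is no worse than b everywhere and differs."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def pareto_front(scored):
    """Keep one candidate per objective vector, then drop dominated ones."""
    best = {}
    for mask, fitness in scored:
        best.setdefault(fitness, mask)
    return [
        (mask, fitness)
        for fitness, mask in best.items()
        if not any(dominates(other, fitness) for other in best)
    ]


def mutate(mask, rng):
    """Flip one position: adds a sample to the candidate or removes one."""
    child = list(mask)
    i = rng.randrange(len(child))
    child[i] = not child[i]
    return tuple(child)


def evolve(data, generations=100, offspring=10, seed=0):
    """Evolve candidate coresets and return the final non-dominated set."""
    rng = random.Random(seed)
    population = {
        tuple(rng.random() < 0.5 for _ in data) for _ in range(offspring)
    }
    for _ in range(generations):
        parents = sorted(population)
        children = {mutate(rng.choice(parents), rng) for _ in range(offspring)}
        scored = [(m, evaluate(m, data)) for m in population | children]
        population = {m for m, _ in pareto_front(scored)}
    return pareto_front([(m, evaluate(m, data)) for m in population])


if __name__ == "__main__":
    data = toy_data()
    for mask, (size, error) in sorted(evolve(data), key=lambda item: item[1]):
        print(f"coreset size={size:2d}  training error={error:.2f}")
```

Each entry in the returned front is one size/error compromise; a user would pick a point on the front according to how much error they are willing to accept for a smaller training set.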

Related papers

Improving Model Classification by Optimizing the Training Dataset [3.987352341101438]
Coresets offer a principled approach to data reduction, enabling efficient learning on large datasets. We present a systematic framework for tuning the coreset generation process to enhance downstream classification quality.
arXiv Detail & Related papers (2025-07-22T16:10:11Z)
Refined Coreset Selection: Towards Minimal Coreset Size under Model Performance Constraints [69.27190330994635]
Coreset selection is powerful in reducing computational costs and accelerating data processing for deep learning algorithms. We propose an innovative method, which maintains optimization priority order over the model performance and coreset size. Empirically, extensive experiments confirm its superiority, often yielding better model performance with smaller coreset sizes.
arXiv Detail & Related papers (2023-11-15T03:43:04Z)
Convolutional autoencoder-based multimodal one-class classification [80.52334952912808]
One-class classification refers to approaches of learning using data from a single class only. We propose a deep learning one-class classification method suitable for multimodal data.
arXiv Detail & Related papers (2023-09-25T12:31:18Z)
Composable Core-sets for Diversity Approximation on Multi-Dataset Streams [4.765131728094872]
Composable core-sets are core-sets with the property that subsets of the core set can be unioned together to obtain an approximation for the original data. We introduce a core-set construction algorithm for constructing composable core-sets to summarize streamed data for use in active learning environments.
arXiv Detail & Related papers (2023-08-10T23:24:51Z)
Probabilistic Bilevel Coreset Selection [24.874967723659022]
We propose a continuous probabilistic bilevel formulation of coreset selection by learning a probabilistic weight for each training sample. We develop an efficient solver for the bilevel optimization problem via unbiased policy gradient, without the trouble of implicit differentiation.
arXiv Detail & Related papers (2023-01-24T09:37:00Z)
A Boosting Approach to Constructing an Ensemble Stack [1.0775419935941009]
An approach to evolutionary ensemble learning for classification is proposed in which boosting is used to construct a stack of programs. Training against a residual dataset actively reduces the cost of training. Benchmarking studies are conducted to illustrate competitiveness with the prediction accuracy of current state-of-the-art evolutionary ensemble learning algorithms.
arXiv Detail & Related papers (2022-11-28T18:21:36Z)
Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting. We introduce two regret metrics by minimizing the population loss that are more suitable in active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z)
Adaptive Second Order Coresets for Data-efficient Machine Learning [5.362258158646462]
Training machine learning models on datasets incurs substantial computational costs. We propose AdaCore to extract subsets of the training examples for efficient machine learning.
arXiv Detail & Related papers (2022-07-28T05:43:09Z)
Towards Diverse Evaluation of Class Incremental Learning: A Representation Learning Perspective [67.45111837188685]
Class incremental learning (CIL) algorithms aim to continually learn new object classes from incrementally arriving data. We experimentally analyze neural network models trained by CIL algorithms using various evaluation protocols in representation learning.
arXiv Detail & Related papers (2022-06-16T11:44:11Z)
Meta-learning One-class Classifiers with Eigenvalue Solvers for Supervised Anomaly Detection [55.888835686183995]
We propose a neural network-based meta-learning method for supervised anomaly detection. We experimentally demonstrate that the proposed method achieves better performance than existing anomaly detection and few-shot learning methods.
arXiv Detail & Related papers (2021-03-01T01:43:04Z)
Ensemble Wrapper Subsampling for Deep Modulation Classification [70.91089216571035]
Subsampling of received wireless signals is important for relaxing hardware requirements as well as the computational cost of signal processing algorithms. We propose a subsampling technique to facilitate the use of deep learning for automatic modulation classification in wireless communication systems.
arXiv Detail & Related papers (2020-05-10T06:11:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.