Finding the Homology of Decision Boundaries with Active Learning
- URL: http://arxiv.org/abs/2011.09645v1
- Date: Thu, 19 Nov 2020 04:22:06 GMT
- Title: Finding the Homology of Decision Boundaries with Active Learning
- Authors: Weizhi Li, Gautam Dasarathy, Karthikeyan Natesan Ramamurthy, and Visar Berisha
- Abstract summary: We propose an active learning algorithm to recover the homology of decision boundaries.
Our algorithm sequentially and adaptively selects the samples whose labels it requires.
Experiments on several datasets show the sample complexity improvement in recovering the homology.
- Score: 26.31885403636642
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurately and efficiently characterizing the decision boundary of
classifiers is important for problems related to model selection and
meta-learning. Inspired by topological data analysis, the characterization of
decision boundaries using their homology has recently emerged as a general and
powerful tool. In this paper, we propose an active learning algorithm to
recover the homology of decision boundaries. Our algorithm sequentially and
adaptively selects the samples whose labels it requires. We theoretically
analyze the proposed framework and show that the query complexity of our active
learning algorithm depends naturally on the intrinsic complexity of the
underlying manifold. We demonstrate the effectiveness of our framework in
selecting best-performing machine learning models for datasets just using their
respective homological summaries. Experiments on several standard datasets show
the sample complexity improvement in recovering the homology and demonstrate
the practical utility of the framework for model selection. Source code for our
algorithms and experimental results is available at
https://github.com/wayne0908/Active-Learning-Homology.
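To make the idea concrete, here is a toy sketch of the two ingredients the abstract describes: adaptively querying labels near the decision boundary, and summarizing the labeled boundary region topologically. This is not the authors' algorithm (see their repository for that); the function names, the midpoint-of-closest-opposite-pair query heuristic, and the restriction to Betti-0 (connected components) are all simplifying assumptions made for illustration, using only NumPy.

```python
import numpy as np

def query_boundary_points(X, oracle, n_queries, n_init=10, seed=0):
    """Toy active learner: label a few random seed points, then repeatedly
    query the unlabeled point closest to the midpoint of the nearest
    oppositely-labeled pair (a crude proxy for the decision boundary)."""
    rng = np.random.default_rng(seed)
    labeled = list(rng.choice(len(X), size=n_init, replace=False))
    labels = {int(i): oracle(X[i]) for i in labeled}
    for _ in range(n_queries):
        pos = [i for i in labels if labels[i] == 1]
        neg = [i for i in labels if labels[i] == 0]
        unl = [i for i in range(len(X)) if i not in labels]
        if not pos or not neg:
            # No boundary evidence yet: fall back to a random query.
            i = int(rng.choice(unl))
        else:
            # Midpoint of the closest opposite-label pair.
            P, N = X[pos], X[neg]
            d = np.linalg.norm(P[:, None, :] - N[None, :, :], axis=-1)
            a, b = np.unravel_index(np.argmin(d), d.shape)
            mid = (P[a] + N[b]) / 2
            i = unl[int(np.argmin(np.linalg.norm(X[unl] - mid, axis=1)))]
        labels[i] = oracle(X[i])
    return labels

def betti0(points, eps):
    """Betti-0 (number of connected components) of the eps-neighborhood
    graph on `points`, computed with union-find."""
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if np.linalg.norm(points[i] - points[j]) <= eps:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(points))})
```

In the paper's setting, the queried labels would feed a richer topological summary (higher-order homology of a complex built near the boundary); here `betti0` stands in as the simplest such summary, and the query rule stands in for the paper's adaptive selection strategy.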
Related papers
- OOD-Chameleon: Is Algorithm Selection for OOD Generalization Learnable? [18.801143204410913]
We formalize the task of algorithm selection for OOD generalization and investigate whether it could be approached by learning.
We propose a solution, dubbed OOD-Chameleon that treats the task as a supervised classification over candidate algorithms.
We train the model to predict the relative performance of algorithms given a dataset's characteristics.
arXiv Detail & Related papers (2024-10-03T17:52:42Z)
- Querying Easily Flip-flopped Samples for Deep Active Learning [63.62397322172216]
Active learning is a machine learning paradigm that aims to improve the performance of a model by strategically selecting and querying unlabeled data.
One effective selection strategy is to base it on the model's predictive uncertainty, which can be interpreted as a measure of how informative a sample is.
This paper proposes the least disagree metric (LDM), the smallest probability of disagreement of the predicted label.
arXiv Detail & Related papers (2024-01-18T08:12:23Z)
- Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning [57.163525407022966]
Imbalanced learning is a fundamental challenge in data mining, where there is a disproportionate ratio of training samples in each class.
Over-sampling is an effective technique to tackle imbalanced learning through generating synthetic samples for the minority class.
We propose AutoSMOTE, an automated over-sampling algorithm that can jointly optimize different levels of decisions.
arXiv Detail & Related papers (2022-08-26T04:28:01Z)
- HyperImpute: Generalized Iterative Imputation with Automatic Model Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Low-Regret Active Learning [64.36270166907788]
We develop an online learning algorithm for identifying unlabeled data points that are most informative for training.
At the core of our work is an efficient algorithm for sleeping experts that is tailored to achieve low regret on predictable (easy) instances.
arXiv Detail & Related papers (2021-04-06T22:53:45Z)
- Meta-learning One-class Classifiers with Eigenvalue Solvers for Supervised Anomaly Detection [55.888835686183995]
We propose a neural network-based meta-learning method for supervised anomaly detection.
We experimentally demonstrate that the proposed method achieves better performance than existing anomaly detection and few-shot learning methods.
arXiv Detail & Related papers (2021-03-01T01:43:04Z)
- Sampling Approach Matters: Active Learning for Robotic Language Acquisition [42.69529080098759]
We present an exploration of active learning approaches applied to three grounded language problems of varying complexity.
We report on how characteristics of the underlying task, along with design decisions such as feature selection and classification model, drive the results.
arXiv Detail & Related papers (2020-11-16T15:18:10Z)
- Model-Agnostic Explanations using Minimal Forcing Subsets [11.420687735660097]
We propose a new model-agnostic algorithm to identify a minimal set of training samples that are indispensable for a given model's decision.
Our algorithm identifies such a set of "indispensable" samples iteratively by solving a constrained optimization problem.
Results show that our algorithm is an effective and easy-to-comprehend tool that helps to better understand local model behavior.
arXiv Detail & Related papers (2020-11-01T22:45:16Z)
- On the Robustness of Active Learning [0.7340017786387767]
Active Learning is concerned with identifying the most useful samples with which to train a Machine Learning algorithm.
We find that it is often applied with not enough care and domain knowledge.
We propose the new "Sum of Squared Logits" method based on the Simpson diversity index and investigate the effect of using the confusion matrix for balancing in sample selection.
arXiv Detail & Related papers (2020-06-18T09:07:23Z)
- Fase-AL -- Adaptation of Fast Adaptive Stacking of Ensembles for Supporting Active Learning [0.0]
This work presents the FASE-AL algorithm, which induces classification models from unlabeled instances using Active Learning.
The algorithm achieves promising results in terms of the percentage of correctly classified instances.
arXiv Detail & Related papers (2020-01-30T17:25:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.