Informative Bayesian model selection for RR Lyrae star classifiers
- URL: http://arxiv.org/abs/2105.11531v1
- Date: Mon, 24 May 2021 20:55:04 GMT
- Title: Informative Bayesian model selection for RR Lyrae star classifiers
- Authors: F. Pérez-Galarce, K. Pichara, P. Huijse, M. Catelan, D. Mery
- Abstract summary: We develop a method based on an informative marginal likelihood to evaluate variable star classifiers.
We perform experiments with a set of Bayesian Logistic Regressions, which are trained to classify RR Lyraes.
Our methodology provides a more rigorous alternative to assess machine learning models using astronomical knowledge.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning has achieved an important role in the automatic
classification of variable stars, and several classifiers have been proposed
over the last decade. These classifiers have achieved impressive performance in
several astronomical catalogues. However, some scientific articles have also
shown that the training data therein contain multiple sources of bias. Hence,
the performance of those classifiers on objects not belonging to the training
data is uncertain, potentially resulting in the selection of incorrect models.
Besides, it gives rise to the deployment of misleading classifiers. An example
of the latter is the creation of open-source labelled catalogues with biased
predictions. In this paper, we develop a method based on an informative
marginal likelihood to evaluate variable star classifiers. We collect
deterministic rules that are based on physical descriptors of RR Lyrae stars,
and then, to mitigate the biases, we introduce those rules into the marginal
likelihood estimation. We perform experiments with a set of Bayesian Logistic
Regressions, which are trained to classify RR Lyraes, and we find that our
method outperforms traditional non-informative cross-validation strategies,
even when penalized models are assessed. Our methodology provides a more
rigorous alternative to assess machine learning models using astronomical
knowledge. From this approach, applications to other classes of variable stars
and algorithmic improvements can be developed.
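
As a rough illustration of why a marginal likelihood can rank models differently depending on the prior, the sketch below estimates log p(y | X) for a Bayesian logistic regression by averaging the likelihood over weights sampled from the prior. It is not the paper's estimator: the synthetic features, the Gaussian priors, the function name `log_marginal_likelihood`, and the plain prior-sampling Monte Carlo scheme are all assumptions made for this example, and the "informative" prior merely stands in for knowledge that the paper derives from physical rules about RR Lyrae stars.

```python
import numpy as np


def log_marginal_likelihood(X, y, prior_mean, prior_std, n_samples=20000, seed=0):
    """Estimate log p(y | X) = log E_{w ~ prior}[p(y | X, w)] by prior sampling."""
    rng = np.random.default_rng(seed)
    # Draw weight vectors from an independent Gaussian prior (an assumption).
    W = rng.normal(prior_mean, prior_std, size=(n_samples, X.shape[1]))
    logits = W @ X.T  # shape: (n_samples, n_objects)
    # Numerically stable log sigmoid(z) and log(1 - sigmoid(z)).
    log_p1 = -np.logaddexp(0.0, -logits)
    log_p0 = -np.logaddexp(0.0, logits)
    log_lik = np.sum(y * log_p1 + (1.0 - y) * log_p0, axis=1)
    # Log-mean-exp over the prior samples.
    peak = log_lik.max()
    return peak + np.log(np.mean(np.exp(log_lik - peak)))


# Synthetic stand-in data: two hypothetical features, binary labels.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 2))
true_w = np.array([2.0, -1.0])
y = (data_rng.random(200) < 1.0 / (1.0 + np.exp(-X @ true_w))).astype(float)

# "Informative" prior: centred near plausible weights (hypothetical expert knowledge).
log_ml_informative = log_marginal_likelihood(X, y, prior_mean=true_w, prior_std=0.5)
# Vague prior: zero-centred and wide.
log_ml_vague = log_marginal_likelihood(X, y, prior_mean=np.zeros(2), prior_std=5.0)

print("informative prior:", log_ml_informative)
print("vague prior:      ", log_ml_vague)
```

Prior sampling is used here only because it is short and easy to read; it becomes inefficient as the number of weights grows, where more careful estimators are normally preferred.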
Related papers
- XAL: EXplainable Active Learning Makes Classifiers Better Low-resource Learners [71.8257151788923]
We propose a novel Explainable Active Learning framework (XAL) for low-resource text classification.
XAL encourages classifiers to justify their inferences and delve into unlabeled data for which they cannot provide reasonable explanations.
Experiments on six datasets show that XAL achieves consistent improvement over 9 strong baselines.
arXiv Detail & Related papers (2023-10-09T08:07:04Z)
- D-CALM: A Dynamic Clustering-based Active Learning Approach for Mitigating Bias [13.008323851750442]
In this paper, we propose a novel adaptive clustering-based active learning algorithm, D-CALM, that dynamically adjusts clustering and annotation efforts.
Experiments on eight datasets for a diverse set of text classification tasks, including emotion, hate speech, dialog act, and book type detection, demonstrate that our proposed algorithm significantly outperforms baseline AL approaches.
arXiv Detail & Related papers (2023-05-26T15:17:43Z)
- Leveraging Angular Information Between Feature and Classifier for Long-tailed Learning: A Prediction Reformulation Approach [90.77858044524544]
We reformulate the recognition probabilities through included angles without re-balancing the classifier weights.
Inspired by the performance improvement of the predictive form reformulation, we explore the different properties of this angular prediction.
Our method is able to obtain the best performance among peer methods without pretraining on CIFAR10/100-LT and ImageNet-LT.
arXiv Detail & Related papers (2022-12-03T07:52:48Z)
- An Exploration of How Training Set Composition Bias in Machine Learning Affects Identifying Rare Objects [0.0]
It is common to up-weight the examples of the rare class to ensure it isn't ignored.
It is also a frequent practice to train on restricted data where the balance of source types is closer to equal.
Here we show that these practices can bias the model toward over-assigning sources to the rare class.
arXiv Detail & Related papers (2022-07-07T10:26:55Z)
- Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on the CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements over the state of the art.
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
- Evaluating Fairness of Machine Learning Models Under Uncertain and Incomplete Information [25.739240011015923]
We show that the test accuracy of the attribute classifier is not always correlated with its effectiveness in bias estimation for a downstream model.
Our analysis has surprising and counter-intuitive implications where in certain regimes one might want to distribute the error of the attribute classifier as unevenly as possible.
arXiv Detail & Related papers (2021-02-16T19:02:55Z)
- Learning and Evaluating Representations for Deep One-class Classification [59.095144932794646]
We present a two-stage framework for deep one-class classification.
We first learn self-supervised representations from one-class data, and then build one-class classifiers on learned representations.
In experiments, we demonstrate state-of-the-art performance on visual domain one-class classification benchmarks.
arXiv Detail & Related papers (2020-11-04T23:33:41Z)
- Predicting Classification Accuracy When Adding New Unobserved Classes [8.325327265120283]
We study how a classifier's performance can be used to extrapolate its expected accuracy on a larger, unobserved set of classes.
We formulate a robust neural-network-based algorithm, "CleaneX", which learns to estimate the accuracy of such classifiers on arbitrarily large sets of classes.
arXiv Detail & Related papers (2020-10-28T14:37:25Z)
- LOGAN: Local Group Bias Detection by Clustering [86.38331353310114]
We argue that evaluating bias at the corpus level is not enough for understanding how biases are embedded in a model.
We propose LOGAN, a new bias detection technique based on clustering.
Experiments on toxicity classification and object classification tasks show that LOGAN identifies bias in a local region.
arXiv Detail & Related papers (2020-10-06T16:42:51Z)
- Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z)
- ALEX: Active Learning based Enhancement of a Model's Explainability [34.26945469627691]
An active learning (AL) algorithm seeks to construct an effective classifier with a minimal number of labeled examples in a bootstrapping manner.
In the era of data-driven learning, this is an important research direction to pursue.
This paper describes our work-in-progress towards developing an AL selection function that in addition to model effectiveness also seeks to improve on the interpretability of a model during the bootstrapping steps.
arXiv Detail & Related papers (2020-09-02T07:15:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.