Imbalance Learning for Variable Star Classification
- URL: http://arxiv.org/abs/2002.12386v1
- Date: Thu, 27 Feb 2020 19:01:05 GMT
- Title: Imbalance Learning for Variable Star Classification
- Authors: Zafiirah Hosenie, Robert Lyon, Benjamin Stappers, Arrykrishna
Mootoovaloo and Vanessa McBride
- Abstract summary: We develop a hierarchical machine learning classification scheme to overcome imbalanced learning problems.
We use 'data-level' approaches to directly augment the training data so that they better describe under-represented classes.
We find that a higher classification rate is obtained when using $\texttt{GpFit}$ in the hierarchical model.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The accurate automated classification of variable stars into their respective
sub-types is difficult. Machine learning based solutions often fall foul of the
imbalanced learning problem, which causes poor generalisation performance in
practice, especially on rare variable star sub-types. In previous work, we
attempted to overcome such deficiencies via the development of a hierarchical
machine learning classifier. This 'algorithm-level' approach to tackling
imbalance yielded promising results on Catalina Real-Time Survey (CRTS) data,
outperforming the binary and multi-class classification schemes previously
applied in this area. In this work, we attempt to further improve hierarchical
classification performance by applying 'data-level' approaches to directly
augment the training data so that they better describe under-represented
classes. We apply and report results for three data augmentation methods in
particular: $\textit{R}$andomly $\textit{A}$ugmented $\textit{S}$ampled
$\textit{L}$ight curves from magnitude $\textit{E}$rror ($\texttt{RASLE}$),
augmenting light curves with Gaussian Process modelling ($\texttt{GpFit}$) and
the Synthetic Minority Over-sampling Technique ($\texttt{SMOTE}$). When
combining the 'algorithm-level' (i.e. the hierarchical scheme) together with
the 'data-level' approach, we further improve variable star classification
accuracy by 1-4$\%$. We found that a higher classification rate is obtained
when using $\texttt{GpFit}$ in the hierarchical model. Further improvement of
the metric scores requires a better standard set of correctly identified
variable stars and, perhaps, enhanced features.
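The three 'data-level' methods are simple enough to sketch. The following Python snippet is illustrative only, not the authors' implementation: the toy light curve, variable names, RBF kernel and feature vectors are all assumptions. $\texttt{RASLE}$ perturbs each magnitude within its reported error, $\texttt{GpFit}$ fits a Gaussian process and draws synthetic curves from the posterior, and $\texttt{SMOTE}$ (here via the imbalanced-learn package) interpolates new minority-class samples in feature space.

```python
# Illustrative sketches of the three augmentation methods; data, variable
# names and kernel choice are assumptions, not the authors' implementation.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF
from imblearn.over_sampling import SMOTE  # pip install imbalanced-learn

rng = np.random.default_rng(42)

# Toy light curve: times, magnitudes and per-point magnitude errors.
t = np.sort(rng.uniform(0.0, 100.0, 60))
mag = 15.0 + 0.3 * np.sin(2 * np.pi * t / 17.0) + rng.normal(0.0, 0.05, t.size)
mag_err = np.full_like(mag, 0.05)

# RASLE: draw a new light curve by perturbing each magnitude within its error.
def rasle(mag, mag_err, rng):
    return rng.normal(loc=mag, scale=mag_err)

augmented_rasle = rasle(mag, mag_err, rng)

# GpFit: model the light curve with a Gaussian process, then sample the
# posterior to generate smooth synthetic light curves.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=10.0), alpha=mag_err**2)
gp.fit(t[:, None], mag)
augmented_gp = gp.sample_y(t[:, None], n_samples=3, random_state=0)

# SMOTE: oversample minority classes in feature space by interpolating
# between a minority example and one of its nearest minority neighbours.
X = rng.normal(size=(200, 8))        # stand-in light-curve feature vectors
y = np.array([0] * 180 + [1] * 20)   # heavily imbalanced labels
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print(np.bincount(y_res))            # both classes now have 180 examples
```

In each case the synthetic light curves or feature vectors are added to the training data for the under-represented class before the hierarchical classifier is trained.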
Related papers
- Boosting Commit Classification with Contrastive Learning [0.8655526882770742]
Commit Classification (CC) is an important task in software maintenance.
We propose a contrastive learning-based commit classification framework.
Our framework solves the CC problem simply yet effectively in few-shot scenarios.
arXiv Detail & Related papers (2023-08-16T10:02:36Z)
- ProTeCt: Prompt Tuning for Taxonomic Open Set Classification [59.59442518849203]
Few-shot adaptation methods do not fare well in the taxonomic open set (TOS) setting.
We propose Prompt Tuning for Hierarchical Consistency (ProTeCt), a technique that calibrates the hierarchical consistency of model predictions across label-set granularities.
arXiv Detail & Related papers (2023-06-04T02:55:25Z)
- Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels [61.97359362447732]
Learning from noisy labels is an important and long-standing problem in machine learning for real applications.
In this paper, we reformulate the label-noise problem from a generative-model perspective.
Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
arXiv Detail & Related papers (2023-05-31T03:01:36Z)
- Resource saving taxonomy classification with k-mer distributions and machine learning [2.0196229393131726]
We propose to use $k$-mer distributions obtained from DNA as features to classify its taxonomic origin.
We show that our approach improves classification at the genus level and achieves comparable results at the superkingdom and phylum levels (a toy sketch of the k-mer featurisation follows this entry).
arXiv Detail & Related papers (2023-03-10T08:01:08Z)
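As referenced above, a toy sketch of k-mer distribution features; the function name and toy sequence are illustrative assumptions, not the paper's code:

```python
# Count all length-k substrings of a DNA sequence and normalise the counts
# into a frequency vector that a standard classifier can consume.
from itertools import product
import numpy as np

def kmer_distribution(seq: str, k: int = 3) -> np.ndarray:
    """Return the normalised k-mer frequency vector of a DNA sequence."""
    alphabet = "ACGT"
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    counts = np.zeros(len(index))
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in index:                # skip k-mers containing N, gaps, etc.
            counts[index[kmer]] += 1
    total = counts.sum()
    return counts / total if total else counts

# 4**3 = 64-dimensional feature vector for a toy sequence.
features = kmer_distribution("ACGTACGTGGCCATTN", k=3)
print(features.shape, features.sum())
```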
- Enhancing Classification with Hierarchical Scalable Query on Fusion Transformer [0.4129225533930965]
This paper proposes a method to boost fine-grained classification through a hierarchical approach via learnable independent query embeddings.
We exploit the idea of hierarchy to learn query embeddings that are scalable across all levels.
Our method outperforms existing methods, with an improvement of 11% on fine-grained classification.
arXiv Detail & Related papers (2023-02-28T11:00:55Z)
- Learning Hierarchy Aware Features for Reducing Mistake Severity [3.704832909610283]
We propose a novel approach for learning hierarchy-aware features (HAF).
HAF is a training-time approach that reduces the severity of mistakes while maintaining top-1 error, thereby addressing the limitation of cross-entropy loss, which treats all mistakes as equal.
We evaluate HAF on three hierarchical datasets and achieve state-of-the-art results on the iNaturalist-19 and CIFAR-100 datasets.
arXiv Detail & Related papers (2022-07-26T04:24:47Z)
- Few-Shot Non-Parametric Learning with Deep Latent Variable Model [50.746273235463754]
We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV), a learning framework for any dataset with abundant unlabeled data but very few labeled examples.
We show that NPC-LV outperforms supervised methods on image classification across all three datasets in the low-data regime.
arXiv Detail & Related papers (2022-06-23T09:35:03Z)
- Self-Supervised Class Incremental Learning [51.62542103481908]
Existing Class Incremental Learning (CIL) methods are based on a supervised classification framework sensitive to data labels.
When updated on new class data, they suffer from catastrophic forgetting: the model can no longer clearly distinguish old-class data from the new.
In this paper, we explore the performance of Self-Supervised representation learning in Class Incremental Learning (SSCIL) for the first time.
arXiv Detail & Related papers (2021-11-18T06:58:19Z)
- Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose $\textit{Prototypical}$, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced (a minimal sketch follows this entry).
We test our method on CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements over the state of the art.
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
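A minimal sketch of the prototype idea referenced above, assuming class prototypes are mean embeddings and prediction is nearest-prototype; names and toy data are illustrative, not the paper's code:

```python
# Each class is represented by the mean of its embeddings; a sample is
# assigned to the nearest prototype, so no extra parameters are fit
# beyond the embedding network itself.
import numpy as np

def fit_prototypes(embeddings: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Return one prototype (mean embedding) per class, ordered by label."""
    classes = np.unique(labels)
    return np.stack([embeddings[labels == c].mean(axis=0) for c in classes])

def predict(embeddings: np.ndarray, prototypes: np.ndarray) -> np.ndarray:
    """Assign each embedding to its nearest prototype (Euclidean distance)."""
    dists = np.linalg.norm(embeddings[:, None, :] - prototypes[None, :, :], axis=-1)
    return dists.argmin(axis=1)

# Imbalanced toy data: 90 points of class 0, 10 of class 1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (90, 16)), rng.normal(3, 1, (10, 16))])
y = np.array([0] * 90 + [1] * 10)
protos = fit_prototypes(X, y)
print(predict(X, protos)[:5], predict(X, protos)[-5:])
```

Because each prototype is a plain class mean, the decision rule treats every class identically regardless of how many training examples it has, which is what keeps the predictions comparable under class imbalance.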
- Improving Calibration for Long-Tailed Recognition [68.32848696795519]
We propose two methods to improve calibration and performance in such scenarios.
For dataset bias due to different samplers, we propose shifted batch normalization.
Our proposed methods set new records on multiple popular long-tailed recognition benchmark datasets.
arXiv Detail & Related papers (2021-04-01T13:55:21Z)
- Scalable End-to-end Recurrent Neural Network for Variable star classification [1.2722697496405464]
We propose an end-to-end algorithm that automatically learns the representation of light curves that allows an accurate automatic classification.
Our method uses minimal data preprocessing, can be updated at low computational cost for new observations and light curves, and can scale up to massive datasets (a minimal architecture sketch follows this entry).
arXiv Detail & Related papers (2020-02-03T19:56:42Z)
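As referenced above, a minimal sketch of an end-to-end recurrent light-curve classifier; the (time-gap, magnitude) input encoding, GRU backbone and use of PyTorch are illustrative assumptions, not the paper's model:

```python
# A GRU reads each light curve as a sequence of (time gap, magnitude)
# pairs, so the irregular sampling cadence is fed to the network directly
# with minimal preprocessing; the final hidden state yields class logits.
import torch
import torch.nn as nn

class LightCurveRNN(nn.Module):
    def __init__(self, hidden: int = 64, n_classes: int = 8):
        super().__init__()
        self.gru = nn.GRU(input_size=2, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, 2) -> logits: (batch, n_classes)
        _, h = self.gru(x)            # h: (num_layers, batch, hidden)
        return self.head(h[-1])

# Toy batch: 4 light curves, 100 observations each.
model = LightCurveRNN()
t = torch.cumsum(torch.rand(4, 100, 1), dim=1)  # irregular observation times
dt = torch.diff(t, dim=1, prepend=t[:, :1])     # gaps between observations
mag = torch.randn(4, 100, 1)                    # observed magnitudes
logits = model(torch.cat([dt, mag], dim=-1))
print(logits.shape)                             # torch.Size([4, 8])
```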