Extension of Transformational Machine Learning: Classification Problems
- URL: http://arxiv.org/abs/2309.16693v1
- Date: Mon, 7 Aug 2023 07:34:18 GMT
- Title: Extension of Transformational Machine Learning: Classification Problems
- Authors: Adnan Mahmud, Oghenejokpeme Orhobor, Ross D. King
- Abstract summary: This study explores the application and performance of Transformational Machine Learning (TML) in drug discovery.
TML, a meta learning algorithm, excels in exploiting common attributes across various domains.
The drug discovery process, which is complex and time-consuming, can benefit greatly from the enhanced prediction accuracy.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study explores the application and performance of Transformational
Machine Learning (TML) in drug discovery. TML, a meta learning algorithm,
excels in exploiting common attributes across various domains, thus developing
composite models that outperform conventional models. The drug discovery
process, which is complex and time-consuming, can benefit greatly from the
enhanced prediction accuracy, improved interpretability and greater
generalizability provided by TML. We explore the efficacy of different machine
learning classifiers, where no individual classifier exhibits distinct
superiority, leading to the consideration of ensemble classifiers such as the
Random Forest.
Our findings show that TML outperforms base Machine Learning (ML) as the
number of training datasets increases, due to its capacity to better
approximate the correct hypothesis, overcome local optima, and expand the space
of representable functions by combining separate classifiers capabilities.
However, this superiority is relative to the resampling methods applied, with
Near Miss demonstrating poorer performance due to noisy data, overlapping
classes, and nonlinear class boundaries. Conversely, Random Over Sampling (ROS)
provides a more robust performance given its resistance to noise and outliers,
improved class overlap management, and suitability for nonlinear class
boundaries.
Related papers
- FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models [50.331708897857574]
We introduce FactorLLM, a novel approach that decomposes well-trained dense FFNs into sparse sub-networks without requiring any further modifications.
FactorLLM achieves comparable performance to the source model securing up to 85% model performance while obtaining over a 30% increase in inference speed.
arXiv Detail & Related papers (2024-08-15T16:45:16Z) - Amortizing intractable inference in large language models [56.92471123778389]
We use amortized Bayesian inference to sample from intractable posterior distributions.
We empirically demonstrate that this distribution-matching paradigm of LLM fine-tuning can serve as an effective alternative to maximum-likelihood training.
As an important application, we interpret chain-of-thought reasoning as a latent variable modeling problem.
arXiv Detail & Related papers (2023-10-06T16:36:08Z) - Learning Efficient Coding of Natural Images with Maximum Manifold
Capacity Representations [4.666056064419346]
The efficient coding hypothesis proposes that the response properties of sensory systems are adapted to the statistics of their inputs.
While elegant, information theoretic properties are notoriously difficult to measure in practical settings or to employ as objective functions in optimization.
Here we outline the assumptions that allow manifold capacity to be optimized directly, yielding Maximum Manifold Capacity Representations (MMCR)
arXiv Detail & Related papers (2023-03-06T17:26:30Z) - HyperImpute: Generalized Iterative Imputation with Automatic Model
Selection [77.86861638371926]
We propose a generalized iterative imputation framework for adaptively and automatically configuring column-wise models.
We provide a concrete implementation with out-of-the-box learners, simulators, and interfaces.
arXiv Detail & Related papers (2022-06-15T19:10:35Z) - Using Explainable Boosting Machine to Compare Idiographic and Nomothetic
Approaches for Ecological Momentary Assessment Data [2.0824228840987447]
This paper explores the use of non-linear interpretable machine learning (ML) models in classification problems.
Various ensembles of trees are compared to linear models using imbalanced synthetic and real-world datasets.
In one of the two real-world datasets, knowledge distillation method achieves improved AUC scores.
arXiv Detail & Related papers (2022-04-04T17:56:37Z) - Soft-margin classification of object manifolds [0.0]
A neural population responding to multiple appearances of a single object defines a manifold in the neural response space.
The ability to classify such manifold is of interest, as object recognition and other computational tasks require a response that is insensitive to variability within a manifold.
Soft-margin classifiers are a larger class of algorithms and provide an additional regularization parameter used in applications to optimize performance outside the training set.
arXiv Detail & Related papers (2022-03-14T12:23:36Z) - IB-GAN: A Unified Approach for Multivariate Time Series Classification
under Class Imbalance [1.854931308524932]
Non-parametric data augmentation with Generative Adversarial Networks (GANs) offers a promising solution.
We propose Imputation Balanced GAN (IB-GAN), a novel method that joins data augmentation and classification in a one-step process via an imputation-balancing approach.
arXiv Detail & Related papers (2021-10-14T15:31:16Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical representation of tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Learning Gaussian Mixtures with Generalised Linear Models: Precise
Asymptotics in High-dimensions [79.35722941720734]
Generalised linear models for multi-class classification problems are one of the fundamental building blocks of modern machine learning tasks.
We prove exacts characterising the estimator in high-dimensions via empirical risk minimisation.
We discuss how our theory can be applied beyond the scope of synthetic data.
arXiv Detail & Related papers (2021-06-07T16:53:56Z) - Continual Learning with Fully Probabilistic Models [70.3497683558609]
We present an approach for continual learning based on fully probabilistic (or generative) models of machine learning.
We propose a pseudo-rehearsal approach using a Gaussian Mixture Model (GMM) instance for both generator and classifier functionalities.
We show that GMR achieves state-of-the-art performance on common class-incremental learning problems at very competitive time and memory complexity.
arXiv Detail & Related papers (2021-04-19T12:26:26Z) - Robustifying Binary Classification to Adversarial Perturbation [45.347651499585055]
In this paper we consider the problem of binary classification with adversarial perturbations.
We introduce a generalization to the max-margin classifier which takes into account the power of the adversary in manipulating the data.
Under some mild assumptions on the loss function, we theoretically show that the gradient descents converge to the RM classifier in its direction.
arXiv Detail & Related papers (2020-10-29T07:20:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.