Related papers: Auto-ADMET: An Effective and Interpretable AutoML Method for Chemical ADMET Property Prediction

Auto-ADMET: An Effective and Interpretable AutoML Method for Chemical ADMET Property Prediction

URL: http://arxiv.org/abs/2502.16378v1
Date: Sat, 22 Feb 2025 22:54:08 GMT
Title: Auto-ADMET: An Effective and Interpretable AutoML Method for Chemical ADMET Property Prediction
Authors: Alex G. C. de Sá, David B. Ascher,
Abstract summary: This work introduces Auto-ADMET, an interpretable evolutionary-based AutoML method for chemical ADMET property prediction.<n>It achieves comparable or better predictive performance against three alternative methods.<n>The use of a Bayesian Network model on Auto-ADMET's evolutionary process assisted in both shaping the search procedure and interpreting the causes of its AutoML performance.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Machine learning (ML) has been playing important roles in drug discovery in the past years by providing (pre-)screening tools for prioritising chemical compounds to pass through wet lab experiments. One of the main ML tasks in drug discovery is to build quantitative structure-activity relationship (QSAR) models, associating the molecular structure of chemical compounds with an activity or property. These properties -- including absorption, distribution, metabolism, excretion and toxicity (ADMET) -- are essential to model compound behaviour, activity and interactions in the organism. Although several methods exist, the majority of them do not provide an appropriate model's personalisation, yielding to bias and lack of generalisation to new data since the chemical space usually shifts from application to application. This fact leads to low predictive performance when completely new data is being tested by the model. The area of Automated Machine Learning (AutoML) emerged aiming to solve this issue, outputting tailored ML algorithms to the data at hand. Although an important task, AutoML has not been practically used to assist cheminformatics and computational chemistry researchers often, with just a few works related to the field. To address these challenges, this work introduces Auto-ADMET, an interpretable evolutionary-based AutoML method for chemical ADMET property prediction. Auto-ADMET employs a Grammar-based Genetic Programming (GGP) method with a Bayesian Network Model to achieve comparable or better predictive performance against three alternative methods -- standard GGP method, pkCSM and XGBOOST model -- on 12 benchmark chemical ADMET property prediction datasets. The use of a Bayesian Network model on Auto-ADMET's evolutionary process assisted in both shaping the search procedure and interpreting the causes of its AutoML performance.

Related papers

EDCA - An Evolutionary Data-Centric AutoML Framework for Efficient Pipelines [0.276240219662896]
This work presents EDCA, an Evolutionary Data Centric AutoML framework. Data quality is usually an overlooked part of AutoML and continues to be a manual and time-consuming task. EDCA was compared to FLAML and TPOT, two frameworks at the top of the AutoML benchmarks.
arXiv Detail & Related papers (2025-03-06T11:46:07Z)
AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline. Recent works have started exploiting large language models (LLM) to lessen such burden. This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z)
Towards Evolutionary-based Automated Machine Learning for Small Molecule Pharmacokinetic Prediction [0.0]
Small molecule properties are crucial in the early stages of drug development. Existing methods lack personalisation and rely on manually crafted ML algorithms or pipelines. We propose a novel evolutionary-based automated ML method (AutoML) specifically designed for predicting small molecule properties.
arXiv Detail & Related papers (2024-08-01T09:46:06Z)
AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning [54.47116888545878]
AutoAct is an automatic agent learning framework for QA. It does not rely on large-scale annotated data and synthetic planning trajectories from closed-source models.
arXiv Detail & Related papers (2024-01-10T16:57:24Z)
Chemist-X: Large Language Model-empowered Agent for Reaction Condition Recommendation in Chemical Synthesis [55.30328162764292]
Chemist-X is a comprehensive AI agent that automates the reaction condition optimization (RCO) task in chemical synthesis. The agent uses retrieval-augmented generation (RAG) technology and AI-controlled wet-lab experiment executions. Results of our automatic wet-lab experiments, achieved by fully LLM-supervised end-to-end operation with no human in the lope, prove Chemist-X's ability in self-driving laboratories.
arXiv Detail & Related papers (2023-11-16T01:21:33Z)
Benchmarking Automated Machine Learning Methods for Price Forecasting Applications [58.720142291102135]
We show the possibility of substituting manually created ML pipelines with automated machine learning (AutoML) solutions. Based on the CRISP-DM process, we split the manual ML pipeline into a machine learning and non-machine learning part. We show in a case study for the industrial use case of price forecasting, that domain knowledge combined with AutoML can weaken the dependence on ML experts.
arXiv Detail & Related papers (2023-04-28T10:27:38Z)
AutoEn: An AutoML method based on ensembles of predefined Machine Learning pipelines for supervised Traffic Forecasting [1.6242924916178283]
Traffic Forecasting (TF) is gaining relevance due to its ability to mitigate traffic congestion by forecasting future traffic states. TF poses one big challenge to the Machine Learning paradigm, known as the Model Selection Problem (MSP) We introduce AutoEn, which is a simple and efficient method for automatically generating multi-classifier ensembles from a predefined set of ML pipelines.
arXiv Detail & Related papers (2023-03-19T18:37:18Z)
OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System [85.8338446357469]
We introduce OmniForce, a human-centered AutoML system that yields both human-assisted ML and ML-assisted human techniques. We show how OmniForce can put an AutoML system into practice and build adaptive AI in open-environment scenarios.
arXiv Detail & Related papers (2023-03-01T13:35:22Z)
Improving Molecular Representation Learning with Metric Learning-enhanced Optimal Transport [49.237577649802034]
We develop a novel optimal transport-based algorithm termed MROT to enhance their generalization capability for molecular regression problems. MROT significantly outperforms state-of-the-art models, showing promising potential in accelerating the discovery of new substances.
arXiv Detail & Related papers (2022-02-13T04:56:18Z)
CASTELO: Clustered Atom Subtypes aidEd Lead Optimization -- a combined machine learning and molecular modeling method [2.8381402107366034]
We propose a combined machine learning and molecular modeling approach that automates lead optimization workflow. Our method provides new hints for drug modification hotspots which can be used to improve drug efficacy.
arXiv Detail & Related papers (2020-11-27T15:41:00Z)
Evolution of Scikit-Learn Pipelines with Dynamic Structured Grammatical Evolution [1.5224436211478214]
This paper describes a novel grammar-based framework that adapts Dynamic Structured Grammatical Evolution (DSGE) to the evolution of Scikit-Learn classification pipelines. The experimental results include comparing AutoML-DSGE to another grammar-based AutoML framework, Resilient ClassificationPipeline Evolution (RECIPE)
arXiv Detail & Related papers (2020-04-01T09:31:34Z)
Predicting drug properties with parameter-free machine learning: Pareto-Optimal Embedded Modeling (POEM) [0.13854111346209866]
We describe a similarity-based method for predicting molecular properties. POEM is a non-parametric, supervised ML algorithm developed to generate reliable predictive models without need for optimization. We benchmark POEM relative to industry-standard ML algorithms and published results across 17 classifications tasks. POEM performs well in all cases and reduces the risk of overfitting.
arXiv Detail & Related papers (2020-02-11T17:20:28Z)

This list is automatically generated from the titles and abstracts of the papers in this site.