Related papers: Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification?

Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification?

URL: http://arxiv.org/abs/2506.10912v2
Date: Wed, 18 Jun 2025 14:00:37 GMT
Title: Breaking Bad Molecules: Are MLLMs Ready for Structure-Level Molecular Detoxification?
Authors: Fei Lin, Ziyang Gong, Cong Wang, Yonglin Tian, Tengchao Zhang, Xue Yang, Gen Luo, Fei-Yue Wang,
Abstract summary: ToxiMol is the first benchmark task for general-purpose Multimodal Large Language Models (MLLMs) focused on molecular toxicity repair.<n>We construct a standardized dataset covering 11 primary tasks and 560 representative toxic molecules spanning diverse mechanisms and granularities.
Score: 19.700175505235876
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Toxicity remains a leading cause of early-stage drug development failure. Despite advances in molecular design and property prediction, the task of molecular toxicity repair - generating structurally valid molecular alternatives with reduced toxicity - has not yet been systematically defined or benchmarked. To fill this gap, we introduce ToxiMol, the first benchmark task for general-purpose Multimodal Large Language Models (MLLMs) focused on molecular toxicity repair. We construct a standardized dataset covering 11 primary tasks and 560 representative toxic molecules spanning diverse mechanisms and granularities. We design a prompt annotation pipeline with mechanism-aware and task-adaptive capabilities, informed by expert toxicological knowledge. In parallel, we propose an automated evaluation framework, ToxiEval, which integrates toxicity endpoint prediction, synthetic accessibility, drug-likeness, and structural similarity into a high-throughput evaluation chain for repair success. We systematically assess nearly 30 mainstream general-purpose MLLMs and design multiple ablation studies to analyze key factors such as evaluation criteria, candidate diversity, and failure attribution. Experimental results show that although current MLLMs still face significant challenges on this task, they begin to demonstrate promising capabilities in toxicity understanding, semantic constraint adherence, and structure-aware molecule editing.

Related papers

CoTox: Chain-of-Thought-Based Molecular Toxicity Reasoning and Prediction [18.693662550601147]
Large language models (LLMs) offer a promising alternative through step-by-step reasoning and integration of textual data.<n>We propose CoTox, a novel framework that integrates LLM with chain-of-thought (CoT) reasoning for multi-toxicity prediction.<n>Using GPT-4o, we show that CoTox outperforms both traditional machine learning and deep learning model.
arXiv Detail & Related papers (2025-08-05T07:04:44Z)
A Multi-Agent System Enables Versatile Information Extraction from the Chemical Literature [8.306442315850878]
We develop a multimodal large language model (MLLM)-based multi-agent system for robust and automated chemical information extraction.<n>Our system achieved an F1 score of 80.8% on a benchmark dataset of sophisticated multimodal chemical reaction graphics from the literature.
arXiv Detail & Related papers (2025-07-27T11:16:57Z)
Conditional Chemical Language Models are Versatile Tools in Drug Discovery [0.0]
We present SAFE-T, a chemical modeling framework that conditions on biological context to prioritize molecules.<n>It supports principled scoring of molecules across tasks such as virtual screening, drug-target interaction prediction, and activity cliff detection.<n>It consistently achieves performance comparable to or better than existing approaches.
arXiv Detail & Related papers (2025-07-14T13:42:39Z)
ChemActor: Enhancing Automated Extraction of Chemical Synthesis Actions with LLM-Generated Data [53.78763789036172]
We present ChemActor, a fully fine-tuned large language model (LLM) as a chemical executor to convert between unstructured experimental procedures and structured action sequences.<n>This framework integrates a data selection module that selects data based on distribution divergence, with a general-purpose LLM, to generate machine-executable actions from a single molecule input.<n>Experiments on reaction-to-description (R2D) and description-to-action (D2A) tasks demonstrate that ChemActor achieves state-of-the-art performance, outperforming the baseline model by 10%.
arXiv Detail & Related papers (2025-06-30T05:11:19Z)
Beyond Chemical QA: Evaluating LLM's Chemical Reasoning with Modular Chemical Operations [43.623140005091535]
We introduce ChemCoTBench, a reasoning framework that bridges molecular structure understanding with arithmetic-inspired operations.<n>ChemCoTBench formalizes chemical problem-solving into transparent, step-by-step reasoning.<n>We evaluate models on two high-impact tasks: Molecular Property Optimization and Chemical Reaction Prediction.
arXiv Detail & Related papers (2025-05-27T15:15:44Z)
Aligned Probing: Relating Toxic Behavior and Model Internals [66.49887503194101]
We introduce aligned probing, a novel interpretability framework that aligns the behavior of language models (LMs)<n>Using this framework, we examine over 20 OLMo, Llama, and Mistral models, bridging behavioral and internal perspectives for toxicity for the first time.<n>Our results show that LMs strongly encode information about the toxicity level of inputs and subsequent outputs, particularly in lower layers.
arXiv Detail & Related papers (2025-03-17T17:23:50Z)
Intelligent System for Automated Molecular Patent Infringement Assessment [38.48937966447085]
PatentFinder is a novel multi-agent and tool-enhanced intelligence system that can accurately and comprehensively evaluate small molecules for patent infringement.<n>PatentFinder features five specialized agents that collaboratively analyze patent claims and molecular structures.<n>PatentFinder autonomously generates detailed and interpretable patent infringement reports, showcasing enhanced accuracy and improved interpretability.
arXiv Detail & Related papers (2024-12-10T12:14:38Z)
Pre-trained Molecular Language Models with Random Functional Group Masking [54.900360309677794]
We propose a SMILES-based underlineem Molecular underlineem Language underlineem Model, which randomly masking SMILES subsequences corresponding to specific molecular atoms. This technique aims to compel the model to better infer molecular structures and properties, thus enhancing its predictive capabilities.
arXiv Detail & Related papers (2024-11-03T01:56:15Z)
Smirk: An Atomically Complete Tokenizer for Molecular Foundation Models [0.0]
Existing models are constrained by closed-vocabulary tokenizers which capture only a fraction of molecular space.<n>We introduce n-gram language models as a low-cost proxy and validate their effectiveness by training and fine-tuning 18 RoBERTa-style encoders for molecular property prediction.<n>We propose two new tokenizers -- Smirk and Smirk-GPE -- with full coverage of the OpenSMILES specification.
arXiv Detail & Related papers (2024-09-19T02:36:04Z)
Uncovering the Mechanism of Hepatotoxiciy of PFAS Targeting L-FABP Using GCN and Computational Modeling [1.6249398255272316]
Per- and polyfluoroalkyl substances (PFAS) are persistent environmental pollutants with known toxicity and bioaccumulation issues. This study advances the predictive modeling of PFAS toxicity by combining semi-supervised graph convolutional networks (GCNs) with molecular descriptors and fingerprints.
arXiv Detail & Related papers (2024-09-16T15:13:39Z)
Detoxifying Large Language Models via Knowledge Editing [57.0669577257301]
This paper investigates using knowledge editing techniques to detoxify Large Language Models (LLMs) We construct a benchmark, SafeEdit, which covers nine unsafe categories with various powerful attack prompts. We conduct experiments with several knowledge editing approaches, indicating that knowledge editing has the potential to detoxify LLMs with a limited impact on general performance efficiently.
arXiv Detail & Related papers (2024-03-21T15:18:30Z)
Toxicity Detection in Drug Candidates using Simplified Molecular-Input Line-Entry System [0.0]
The need for analysis of toxicity in new drug candidates and the requirement of doing it fast have asked the consideration of scientists. This paper goes for the study of simplified Molecular Input Line-Entry System (SMILES) as a parameter to develop Long short term memory (LSTM) based models.
arXiv Detail & Related papers (2021-01-21T07:02:21Z)
Optimizing Molecules using Efficient Queries from Property Evaluations [66.66290256377376]
We propose QMO, a generic query-based molecule optimization framework. QMO improves the desired properties of an input molecule based on efficient queries. We show that QMO outperforms existing methods in the benchmark tasks of optimizing small organic molecules.
arXiv Detail & Related papers (2020-11-03T18:51:18Z)
Deep Learning for Virtual Screening: Five Reasons to Use ROC Cost Functions [80.12620331438052]
deep learning has become an important tool for rapid screening of billions of molecules in silico for potential hits containing desired chemical features. Despite its importance, substantial challenges persist in training these models, such as severe class imbalance, high decision thresholds, and lack of ground truth labels in some datasets. We argue in favor of directly optimizing the receiver operating characteristic (ROC) in such cases, due to its robustness to class imbalance.
arXiv Detail & Related papers (2020-06-25T08:46:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.