Chem42: a Family of chemical Language Models for Target-aware Ligand Generation
- URL: http://arxiv.org/abs/2503.16563v1
- Date: Thu, 20 Mar 2025 07:07:30 GMT
- Authors: Aahan Singh, Engin Tekin, Maryam Nadeem, Nancy A. ElNaker, Mohammad Amaan Sayeed, Natalia Vassilieva, Boulbaba Ben Amor
- Abstract summary: Chem42 is a cutting-edge family of generative chemical Language Models. It achieves a sophisticated cross-modal representation of molecular structures, interactions, and binding patterns. By reducing the search space of viable drug candidates, Chem42 could accelerate the drug discovery pipeline.
- Score: 3.2039076408339353
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Revolutionizing drug discovery demands more than just understanding molecular interactions - it requires generative models that can design novel ligands tailored to specific biological targets. While chemical Language Models (cLMs) have made strides in learning molecular properties, most fail to incorporate target-specific insights, restricting their ability to drive de-novo ligand generation. Chem42, a cutting-edge family of generative chemical Language Models, is designed to bridge this gap. By integrating atomic-level interactions with multimodal inputs from Prot42, a complementary protein Language Model, Chem42 achieves a sophisticated cross-modal representation of molecular structures, interactions, and binding patterns. This innovative framework enables the creation of structurally valid, synthetically accessible ligands with enhanced target specificity. Evaluations across diverse protein targets confirm that Chem42 surpasses existing approaches in chemical validity, target-aware design, and predicted binding affinity. By reducing the search space of viable drug candidates, Chem42 could accelerate the drug discovery pipeline, offering a powerful generative AI tool for precision medicine. Our Chem42 models set a new benchmark in molecule property prediction, conditional molecule generation, and target-aware ligand design. The models are publicly available at huggingface.co/inceptionai.
Related papers
- Molecule Generation for Target Protein Binding with Hierarchical Consistency Diffusion Model [17.885767456439215]
Atom-Motif Consistency Diffusion Model (AMDiff) is a hierarchical diffusion architecture that integrates both atom- and motif-level views of molecules. Compared to existing approaches, AMDiff exhibits superior validity and novelty in generating molecules tailored to fit various protein pockets.
arXiv Detail & Related papers (2025-03-02T17:54:30Z)
- GraphXForm: Graph transformer for computer-aided molecular design [73.1842164721868]
We present GraphXForm, a decoder-only graph transformer architecture, which is pretrained on existing compounds.
We evaluate it on various drug design tasks, demonstrating superior objective scores compared to state-of-the-art molecular design approaches.
arXiv Detail & Related papers (2024-11-03T19:45:15Z)
- Aligning Target-Aware Molecule Diffusion Models with Exact Energy Optimization [147.7899503829411]
AliDiff is a novel framework to align pretrained target diffusion models with preferred functional properties.
It can generate molecules with state-of-the-art binding energies, reaching an average Vina score of up to -7.07.
arXiv Detail & Related papers (2024-07-01T06:10:29Z)
- An Equivariant Generative Framework for Molecular Graph-Structure Co-Design [54.92529253182004]
We present MolCode, a machine learning-based generative framework for Molecular graph-structure Co-design.
In MolCode, 3D geometric information empowers the molecular 2D graph generation, which in turn helps guide the prediction of molecular 3D structure.
Our investigation reveals that the 2D topology and 3D geometry contain intrinsically complementary information in molecule design.
arXiv Detail & Related papers (2023-04-12T13:34:22Z)
- PrefixMol: Target- and Chemistry-aware Molecule Design via Prefix Embedding [34.27649279751879]
We develop a novel generative model that considers both the targeted pocket's circumstances and a variety of chemical properties.
Experiments show that our model exhibits good controllability in both single and multi-conditional molecular generation.
arXiv Detail & Related papers (2023-02-14T15:27:47Z)
- Domain-Agnostic Molecular Generation with Chemical Feedback [44.063584808910896]
MolGen is a pre-trained molecular language model tailored specifically for molecule generation.
It internalizes structural and grammatical insights through the reconstruction of over 100 million molecular SELFIES.
Our chemical feedback paradigm steers the model away from molecular hallucinations, ensuring alignment between the model's estimated probabilities and real-world chemical preferences.
arXiv Detail & Related papers (2023-01-26T17:52:56Z)
- Retrieval-based Controllable Molecule Generation [63.44583084888342]
We propose a new retrieval-based framework for controllable molecule generation.
We use a small set of molecules to steer the pre-trained generative model towards synthesizing molecules that satisfy the given design criteria.
Our approach is agnostic to the choice of generative models and requires no task-specific fine-tuning.
arXiv Detail & Related papers (2022-08-23T17:01:16Z)
- Chemistry42: An AI-based platform for de novo molecular design [48.40662244096031]
Chemistry42 is a software platform for de novo small molecule design.
It integrates Artificial Intelligence (AI) techniques with computational and medicinal chemistry methods.
arXiv Detail & Related papers (2021-01-22T10:49:26Z)
- ChemoVerse: Manifold traversal of latent spaces for novel molecule discovery [0.7742297876120561]
It is essential to identify molecular structures with the desired chemical properties.
Recent advances in generative models using neural networks and machine learning are being widely used to design virtual libraries of drug-like compounds.
arXiv Detail & Related papers (2020-09-29T12:11:40Z)
- CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models [74.58583689523999]
We propose an end-to-end framework, named CogMol, for designing new drug-like small molecules targeting novel viral proteins.
CogMol combines adaptive pre-training of a molecular SMILES Variational Autoencoder (VAE) and an efficient multi-attribute controlled sampling scheme.
CogMol handles multi-constraint design of synthesizable, low-toxic, drug-like molecules with high target specificity and selectivity.
arXiv Detail & Related papers (2020-04-02T18:17:20Z)