Matcha: Multi-Stage Riemannian Flow Matching for Accurate and Physically Valid Molecular Docking
- URL: http://arxiv.org/abs/2510.14586v2
- Date: Tue, 21 Oct 2025 12:13:54 GMT
- Title: Matcha: Multi-Stage Riemannian Flow Matching for Accurate and Physically Valid Molecular Docking
- Authors: Daria Frolova, Talgat Daulbaev, Egor Sevriugov, Sergei A. Nikolenko, Dmitry N. Ivankov, Ivan Oseledets, Marina A. Pak,
- Abstract summary: We introduce Matcha, a novel molecular docking pipeline that combines multi-stage flow matching with learned scoring and physical validity filtering.<n>Compared to various approaches, Matcha demonstrates superior performance on Astex and PDBbind test sets in terms of docking success rate and physical plausibility.
- Score: 2.073256312593672
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Accurate prediction of protein-ligand binding poses is crucial for structure-based drug design, yet existing methods struggle to balance speed, accuracy, and physical plausibility. We introduce Matcha, a novel molecular docking pipeline that combines multi-stage flow matching with learned scoring and physical validity filtering. Our approach consists of three sequential stages applied consecutively to refine docking predictions, each implemented as a flow matching model operating on appropriate geometric spaces ($\mathbb{R}^3$, $\mathrm{SO}(3)$, and $\mathrm{SO}(2)$). We enhance the prediction quality through a dedicated scoring model and apply unsupervised physical validity filters to eliminate unrealistic poses. Compared to various approaches, Matcha demonstrates superior performance on Astex and PDBbind test sets in terms of docking success rate and physical plausibility. Moreover, our method works approximately 25 times faster than modern large-scale co-folding models. The model weights and inference code to reproduce our results are available at https://github.com/LigandPro/Matcha.
Related papers
- Test-Time Scaling with Diffusion Language Models via Reward-Guided Stitching [66.39914384073145]
We propose a self-consistency framework that turns cheap diffusion-sampled reasoning into a reusable pool of step-level candidates.<n>We find that step-level recombination is most beneficial on harder problems.<n>Our training-free framework improves average accuracy by up to 2 across six math and coding tasks.
arXiv Detail & Related papers (2026-02-26T11:08:39Z) - RoMa v2: Harder Better Faster Denser Feature Matching [56.71494120301684]
Dense feature matching aims to estimate all correspondences between two images of a 3D scene.<n>Existing dense matchers fail or perform poorly for many hard real-world scenarios.<n>In this paper, we attack these weaknesses on a wide front through a series of systematic improvements.
arXiv Detail & Related papers (2025-11-19T18:59:38Z) - Test-Time Consistency in Vision Language Models [26.475993408532304]
Vision-Language Models (VLMs) have achieved impressive performance across a wide range of multimodal tasks.<n>Recent benchmarks, such as MM-R3, highlight that even state-of-the-art VLMs can produce divergent predictions across semantically equivalent inputs.<n>We propose a simple and effective test-time consistency framework that enhances semantic consistency without supervised re-training.
arXiv Detail & Related papers (2025-06-27T17:09:44Z) - Contrastive Flow Matching [61.60002028726023]
We introduce Contrastive Flow Matching, an extension to the flow matching objective that explicitly enforces uniqueness across all conditional flows.<n>Our approach adds a contrastive objective that maximizes dissimilarities between predicted flows from arbitrary sample pairs.<n>We find that training models with Contrastive Flow Matching (1) improves training speed by a factor of up to 9x, (2) requires up to 5x fewer de-noising steps and (3) lowers FID by up to 8.9 compared to training the same models with flow matching.
arXiv Detail & Related papers (2025-06-05T17:59:58Z) - FABind: Fast and Accurate Protein-Ligand Binding [127.7790493202716]
$mathbfFABind$ is an end-to-end model that combines pocket prediction and docking to achieve accurate and fast protein-ligand binding.
Our proposed model demonstrates strong advantages in terms of effectiveness and efficiency compared to existing methods.
arXiv Detail & Related papers (2023-10-10T16:39:47Z) - DiffComplete: Diffusion-based Generative 3D Shape Completion [114.43353365917015]
We introduce a new diffusion-based approach for shape completion on 3D range scans.
We strike a balance between realism, multi-modality, and high fidelity.
DiffComplete sets a new SOTA performance on two large-scale 3D shape completion benchmarks.
arXiv Detail & Related papers (2023-06-28T16:07:36Z) - Warp Consistency for Unsupervised Learning of Dense Correspondences [116.56251250853488]
Key challenge in learning dense correspondences is lack of ground-truth matches for real image pairs.
We propose Warp Consistency, an unsupervised learning objective for dense correspondence regression.
Our approach sets a new state-of-the-art on several challenging benchmarks, including MegaDepth, RobotCar and TSS.
arXiv Detail & Related papers (2021-04-07T17:58:22Z) - Modeling Score Distributions and Continuous Covariates: A Bayesian
Approach [8.772459063453285]
We develop a generative model of the match and non-match score distributions over continuous covariates.
We use mixture models to capture arbitrary distributions and local basis functions.
Three experiments demonstrate the accuracy and effectiveness of our approach.
arXiv Detail & Related papers (2020-09-21T02:41:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.