InstructPLM-mu: 1-Hour Fine-Tuning of ESM2 Beats ESM3 in Protein Mutation Predictions
- URL: http://arxiv.org/abs/2510.03370v2
- Date: Fri, 10 Oct 2025 02:14:50 GMT
- Title: InstructPLM-mu: 1-Hour Fine-Tuning of ESM2 Beats ESM3 in Protein Mutation Predictions
- Authors: Junde Xu, Yapin Shi, Lijun Lang, Taoyong Cui, Zhiming Zhang, Guangyong Chen, Jiezhong Qiu, Pheng-Ann Heng,
- Abstract summary: We propose a fine-tuning framework called InstructPLM-mu.<n>We show that fine-tuning ESM2 with structural inputs can reach performance comparable to ESM3.
- Score: 54.19843038028329
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Multimodal protein language models deliver strong performance on mutation-effect prediction, but training such models from scratch demands substantial computational resources. In this paper, we propose a fine-tuning framework called InstructPLM-mu and try to answer a question: \textit{Can multimodal fine-tuning of a pretrained, sequence-only protein language model match the performance of models trained end-to-end? } Surprisingly, our experiments show that fine-tuning ESM2 with structural inputs can reach performance comparable to ESM3. To understand how this is achieved, we systematically compare three different feature-fusion designs and fine-tuning recipes. Our results reveal that both the fusion method and the tuning strategy strongly affect final accuracy, indicating that the fine-tuning process is not trivial. We hope this work offers practical guidance for injecting structure into pretrained protein language models and motivates further research on better fusion mechanisms and fine-tuning protocols.
Related papers
- Understanding protein function with a multimodal retrieval-augmented foundation model [4.281723404774888]
PoET-2 is a retrieval-augmented protein foundation model that incorporates in-context learning of family-specific evolutionary constraints.<n>PoET-2 achieves state-of-the-art performance on zero-shot variant effect prediction.
arXiv Detail & Related papers (2025-08-05T15:11:25Z) - SeqProFT: Applying LoRA Finetuning for Sequence-only Protein Property Predictions [8.112057136324431]
This study employs the LoRA method to perform end-to-end fine-tuning of the ESM-2 model.
A multi-head attention mechanism is integrated into the downstream network to combine sequence features with contact map information.
arXiv Detail & Related papers (2024-11-18T12:40:39Z) - Structure Language Models for Protein Conformation Generation [66.42864253026053]
Traditional physics-based simulation methods often struggle with sampling equilibrium conformations.<n>Deep generative models have shown promise in generating protein conformations as a more efficient alternative.<n>We introduce Structure Language Modeling as a novel framework for efficient protein conformation generation.
arXiv Detail & Related papers (2024-10-24T03:38:51Z) - Large Language Model Agent as a Mechanical Designer [7.136205674624813]
We propose a framework that leverages a pretrained Large Language Model (LLM) in conjunction with an FEM module to autonomously generate, evaluate, and refine structural designs.<n>LLM operates without domain-specific fine-tuning, using general reasoning to propose design candidates, interpret FEM-derived performance metrics, and apply structurally sound modifications.<n>Compared to Non- Sorting Genetic Algorithm II (NSGA-II), our method achieves faster convergence and fewer FEM evaluations.
arXiv Detail & Related papers (2024-04-26T16:41:24Z) - xTrimoPGLM: Unified 100B-Scale Pre-trained Transformer for Deciphering the Language of Protein [74.64101864289572]
We propose a unified protein language model, xTrimoPGLM, to address protein understanding and generation tasks simultaneously.<n>xTrimoPGLM significantly outperforms other advanced baselines in 18 protein understanding benchmarks across four categories.<n>It can also generate de novo protein sequences following the principles of natural ones, and can perform programmable generation after supervised fine-tuning.
arXiv Detail & Related papers (2024-01-11T15:03:17Z) - Mixture-of-Linguistic-Experts Adapters for Improving and Interpreting
Pre-trained Language Models [22.977852629450346]
We propose a method that combines two popular research areas by injecting linguistic structures into pre-trained language models.
In our approach, parallel adapter modules encoding different linguistic structures are combined using a novel Mixture-of-Linguistic-Experts architecture.
Our experiment results show that our approach can outperform state-of-the-art PEFT methods with a comparable number of parameters.
arXiv Detail & Related papers (2023-10-24T23:29:06Z) - Mixture-of-Experts Meets Instruction Tuning:A Winning Combination for
Large Language Models [125.91897197446379]
We find that MoE models benefit more from instruction tuning than dense models.
Our most powerful model, FLAN-MOE-32B, surpasses the performance of FLAN-PALM-62B on four benchmark tasks.
arXiv Detail & Related papers (2023-05-24T04:22:26Z) - Adapted Multimodal BERT with Layer-wise Fusion for Sentiment Analysis [84.12658971655253]
We propose Adapted Multimodal BERT, a BERT-based architecture for multimodal tasks.
adapter adjusts the pretrained language model for the task at hand, while the fusion layers perform task-specific, layer-wise fusion of audio-visual information with textual BERT representations.
In our ablations we see that this approach leads to efficient models, that can outperform their fine-tuned counterparts and are robust to input noise.
arXiv Detail & Related papers (2022-12-01T17:31:42Z) - AlphaFold Distillation for Protein Design [25.190210443632825]
Inverse protein folding is crucial in bio-engineering and drug discovery.
Forward folding models like AlphaFold offer a potential solution by accurately predicting structures from sequences.
We propose using knowledge distillation on folding model confidence metrics to create a faster and end-to-end differentiable distilled model.
arXiv Detail & Related papers (2022-10-05T19:43:06Z) - EBM-Fold: Fully-Differentiable Protein Folding Powered by Energy-based
Models [53.17320541056843]
We propose a fully-differentiable approach for protein structure optimization, guided by a data-driven generative network.
Our EBM-Fold approach can efficiently produce high-quality decoys, compared against traditional Rosetta-based structure optimization routines.
arXiv Detail & Related papers (2021-05-11T03:40:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.