InterPLM: Discovering Interpretable Features in Protein Language Models via Sparse Autoencoders
- URL: http://arxiv.org/abs/2412.12101v1
- Date: Wed, 13 Nov 2024 18:51:21 GMT
- Title: InterPLM: Discovering Interpretable Features in Protein Language Models via Sparse Autoencoders
- Authors: Elana Simon, James Zou
- Abstract summary: Protein language models (PLMs) have demonstrated remarkable success in protein modeling and design. We present a systematic approach to extract and analyze interpretable features from PLMs using sparse autoencoders. As practical applications, we demonstrate how these latent features can fill in missing annotations in protein databases.
- Score: 24.150250149027883
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Protein language models (PLMs) have demonstrated remarkable success in protein modeling and design, yet their internal mechanisms for predicting structure and function remain poorly understood. Here we present a systematic approach to extract and analyze interpretable features from PLMs using sparse autoencoders (SAEs). By training SAEs on embeddings from the PLM ESM-2, we identify up to 2,548 human-interpretable latent features per layer that strongly correlate with up to 143 known biological concepts such as binding sites, structural motifs, and functional domains. In contrast, examining individual neurons in ESM-2 reveals up to 46 neurons per layer with clear conceptual alignment across 15 known concepts, suggesting that PLMs represent most concepts in superposition. Beyond capturing known annotations, we show that ESM-2 learns coherent concepts that do not map onto existing annotations and propose a pipeline using language models to automatically interpret novel latent features learned by the SAEs. As practical applications, we demonstrate how these latent features can fill in missing annotations in protein databases and enable targeted steering of protein sequence generation. Our results demonstrate that PLMs encode rich, interpretable representations of protein biology and we propose a systematic framework to extract and analyze these latent features. In the process, we recover both known biology and potentially new protein motifs. As community resources, we introduce InterPLM (interPLM.ai), an interactive visualization platform for exploring and analyzing learned PLM features, and release code for training and analysis at github.com/ElanaPearl/interPLM.
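To make the approach concrete, here is a minimal PyTorch sketch of training a sparse autoencoder on per-residue PLM embeddings. It is not the authors' released implementation (see github.com/ElanaPearl/interPLM for that); the hidden width, L1 coefficient, batch size, and step count below are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    """Overcomplete autoencoder with a ReLU bottleneck; an L1 penalty on
    the latent activations encourages sparse, interpretable features."""
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        z = F.relu(self.encoder(x))   # sparse latent feature activations
        return self.decoder(z), z

def train_sae(embeddings, d_hidden=10240, l1_coef=1e-3, steps=1000):
    """Fit an SAE on per-residue embeddings of shape [N, d_model],
    e.g. activations collected from one ESM-2 layer."""
    sae = SparseAutoencoder(embeddings.shape[-1], d_hidden)
    opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
    for _ in range(steps):
        idx = torch.randint(len(embeddings), (256,))
        recon, z = sae(embeddings[idx])
        loss = F.mse_loss(recon, embeddings[idx]) + l1_coef * z.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return sae
```

After training, each latent's activation pattern across residues can be compared with known annotations (binding sites, motifs, domains) to test for the kind of conceptual alignment the paper reports.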
Related papers
- PLM-eXplain: Divide and Conquer the Protein Embedding Space [0.0]
We present an explainable adapter layer, PLM-eXplain (PLM-X), which factors PLM embeddings into two components: an interpretable subspace based on established biochemical features, and a residual subspace that preserves the model's predictive power.
We demonstrate the effectiveness of our approach across three protein-level classification tasks.
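The factoring idea can be illustrated with a short sketch. This shows subspace splitting in general, not the PLM-X procedure itself; `feature_basis` is a hypothetical matrix of directions fit to established biochemical features.

```python
import torch

def factor_embedding(emb, feature_basis):
    """Split an embedding into an interpretable component (its projection
    onto known biochemical feature directions) and a residual component
    (everything orthogonal to them). `feature_basis` is a hypothetical
    [k, d] matrix of feature directions."""
    Q, _ = torch.linalg.qr(feature_basis.T)  # orthonormal basis, [d, k]
    interpretable = emb @ Q @ Q.T            # projection onto the subspace
    residual = emb - interpretable           # preserves remaining signal
    return interpretable, residual
```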
arXiv Detail & Related papers (2025-04-09T10:46:24Z) - Towards Interpretable Protein Structure Prediction with Sparse Autoencoders [0.0]
Matryoshka SAEs learn hierarchically organized features by forcing nested groups of latents to reconstruct inputs independently.
We scale SAEs to ESM2-3B, the base model for ESMFold, enabling mechanistic interpretability of protein structure prediction for the first time.
We show that SAEs trained on ESM2-3B significantly outperform those trained on smaller models for both biological concept discovery and contact map prediction.
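A hedged sketch of the nested-reconstruction objective, assuming an SAE whose decoder maps latents back to embeddings; the prefix sizes and uniform weighting are illustrative choices, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def matryoshka_loss(x, z, decoder, prefix_sizes=(64, 256, 1024)):
    """Illustrative Matryoshka objective: each nested prefix of the
    latent vector must reconstruct the input on its own, pushing
    coarse, broadly useful features into the earliest latents."""
    loss = 0.0
    for k in prefix_sizes:
        z_k = torch.zeros_like(z)
        z_k[..., :k] = z[..., :k]             # keep only the first k latents
        loss = loss + F.mse_loss(decoder(z_k), x)
    return loss / len(prefix_sizes)
```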
arXiv Detail & Related papers (2025-03-11T17:57:29Z) - SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories [52.57696897619189]
We introduce the Human-Like Mask Annotation Task (HLMAT), a new paradigm where MLLMs mimic human annotators using interactive segmentation tools.
HLMAT enables MLLMs to iteratively generate text-based click points, achieving high-quality masks without architectural changes or implicit tokens.
HLMAT provides a protocol for assessing fine-grained pixel understanding in MLLMs and introduces a vision-centric, multi-step decision-making task.
arXiv Detail & Related papers (2025-03-11T17:08:54Z) - Interpreting and Steering Protein Language Models through Sparse Autoencoders [0.9208007322096533]
This paper explores the application of sparse autoencoders to interpret the internal representations of protein language models.
By performing a statistical analysis on each latent component's relevance to distinct protein annotations, we identify potential interpretations linked to various protein characteristics.
We then leverage these insights to guide sequence generation, shortlisting the relevant latent components that can steer the model towards desired targets.
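Both steps can be sketched as follows, reusing the `SparseAutoencoder` shape from the sketch above; the F1-style scoring and the steering coefficient `alpha` are illustrative stand-ins for the paper's actual statistics and procedure.

```python
import torch

def latent_annotation_f1(z, mask, threshold=0.0):
    """Score one latent (activations z, shape [N]) against a binary
    annotation mask (shape [N]) with a simple F1; real analyses may
    use other association statistics."""
    pred = (z > threshold).float()
    tp = (pred * mask).sum()
    precision = tp / pred.sum().clamp(min=1)
    recall = tp / mask.sum().clamp(min=1)
    return 2 * precision * recall / (precision + recall).clamp(min=1e-8)

def steer(hidden, sae, latent_idx, alpha=5.0):
    """Boost one shortlisted SAE latent in the hidden states and decode
    back, nudging generation toward the concept that latent encodes."""
    z = torch.relu(sae.encoder(hidden))
    z[..., latent_idx] += alpha               # amplify the chosen feature
    return sae.decoder(z)
```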
arXiv Detail & Related papers (2025-02-13T10:11:36Z) - Prot2Chat: Protein LLM with Early Fusion of Sequence and Structure [7.9473027178525975]
Prot2Chat is a novel framework that integrates multimodal protein representations with natural language through a unified module.
Our model incorporates a modified ProteinMPNN encoder, which encodes protein sequence and structural information in a unified manner, and a protein-text adapter with cross-attention mechanisms.
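A cross-attention adapter of this general shape might look like the sketch below; the widths, head count, and residual fusion are hypothetical choices, not Prot2Chat's actual module.

```python
import torch.nn as nn

class ProteinTextAdapter(nn.Module):
    """Illustrative protein-text adapter: text hidden states (queries)
    attend over encoded protein residues (keys/values), and the result
    is added back residually. All dimensions are hypothetical."""
    def __init__(self, d_text=4096, d_prot=128, n_heads=8):
        super().__init__()
        self.proj = nn.Linear(d_prot, d_text)  # align protein width to text
        self.attn = nn.MultiheadAttention(d_text, n_heads, batch_first=True)

    def forward(self, text_h, prot_h):
        prot = self.proj(prot_h)                  # [B, L_prot, d_text]
        fused, _ = self.attn(text_h, prot, prot)  # queries come from text
        return text_h + fused                     # residual fusion
```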
arXiv Detail & Related papers (2025-02-07T05:23:16Z) - Long-context Protein Language Model [76.95505296417866]
Self-supervised training of language models (LMs) on protein sequences has seen great success in learning meaningful representations and in generative drug design.
Most protein LMs are based on the Transformer architecture and are trained on individual proteins with short context lengths.
We propose LC-PLM, based on an alternative protein LM architecture, BiMamba-S, built on selective structured state-space models.
We also introduce its graph-contextual variant, LC-PLM-G, which contextualizes protein-protein interaction graphs for a second stage of training.
arXiv Detail & Related papers (2024-10-29T16:43:28Z) - Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding [43.811432723460534]
We introduce the Structure-Enhanced Protein Instruction Tuning (SEPIT) framework to bridge this gap.
Our approach integrates a novel structure-aware module into pLMs to endow them with structural knowledge, then connects these enhanced pLMs to large language models (LLMs) to generate understanding of proteins.
We construct the largest and most comprehensive protein instruction dataset to date, which allows us to train and evaluate a general-purpose protein understanding model.
arXiv Detail & Related papers (2024-10-04T16:02:50Z) - Recent advances in interpretable machine learning using structure-based protein representations [30.907048279915312]
Recent advancements in machine learning (ML) are transforming the field of structural biology.
We present various methods for representing protein 3D structures at low to high resolution.
We show how interpretable ML methods can support tasks such as predicting protein structures, protein function, and protein-protein interactions.
arXiv Detail & Related papers (2024-09-26T10:56:27Z) - Diffusion Language Models Are Versatile Protein Learners [75.98083311705182]
This paper introduces the diffusion protein language model (DPLM), a versatile protein language model that demonstrates strong generative and predictive capabilities for protein sequences.
We first pre-train scalable DPLMs from evolutionary-scale protein sequences within a generative self-supervised discrete diffusion probabilistic framework.
After pre-training, DPLM exhibits the ability to generate structurally plausible, novel, and diverse protein sequences for unconditional generation.
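As a toy illustration of the discrete diffusion framework (masking-based corruption is one common parameterization; DPLM's actual noise schedule and objective differ), the forward process might look like this:

```python
import torch

def corrupt(tokens, t, mask_id):
    """Toy forward process for masked discrete diffusion: independently
    replace each token with a mask symbol with probability t (the noise
    level); a denoising network is trained to recover the originals."""
    drop = torch.rand(tokens.shape, device=tokens.device) < t
    noisy = tokens.clone()
    noisy[drop] = mask_id
    return noisy, drop
```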
arXiv Detail & Related papers (2024-02-28T18:57:56Z) - xTrimoPGLM: Unified 100B-Scale Pre-trained Transformer for Deciphering the Language of Protein [74.64101864289572]
We propose a unified protein language model, xTrimoPGLM, to address protein understanding and generation tasks simultaneously. xTrimoPGLM significantly outperforms other advanced baselines in 18 protein understanding benchmarks across four categories. It can also generate de novo protein sequences following the principles of natural ones, and can perform programmable generation after supervised fine-tuning.
arXiv Detail & Related papers (2024-01-11T15:03:17Z) - A Systematic Study of Joint Representation Learning on Protein Sequences and Structures [38.94729758958265]
Learning effective protein representations is critical in a variety of tasks in biology such as predicting protein functions.
Recent sequence representation learning methods based on Protein Language Models (PLMs) excel in sequence-based tasks, but their direct adaptation to tasks involving protein structures remains a challenge.
Our study undertakes a comprehensive exploration of joint protein representation learning by integrating a state-of-the-art PLM with distinct structure encoders.
arXiv Detail & Related papers (2023-03-11T01:24:10Z) - Linguistically inspired roadmap for building biologically reliable protein language models [0.5412332666265471]
We argue that guidance drawn from linguistics can aid in building more interpretable protein LMs.
We provide a linguistics-based roadmap for protein LM pipeline choices with regard to training data, tokenization, token embedding, sequence embedding, and model interpretation.
arXiv Detail & Related papers (2022-07-03T08:42:44Z) - Learning multi-scale functional representations of proteins from single-cell microscopy data [77.34726150561087]
We show that simple convolutional networks trained on localization classification can learn protein representations that encapsulate diverse functional information.
We also propose a robust evaluation strategy to assess quality of protein representations across different scales of biological function.
arXiv Detail & Related papers (2022-05-24T00:00:07Z) - A journey in ESN and LSTM visualisations on a language task [77.34726150561087]
We trained ESNs and LSTMs on a Cross-Situational Learning (CSL) task.
The results are of three kinds: performance comparison, analyses of internal dynamics, and visualization of the latent space.
arXiv Detail & Related papers (2020-12-03T08:32:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.