SPARC: Concept-Aligned Sparse Autoencoders for Cross-Model and Cross-Modal Interpretability
- URL: http://arxiv.org/abs/2507.06265v1
- Date: Mon, 07 Jul 2025 22:29:00 GMT
- Title: SPARC: Concept-Aligned Sparse Autoencoders for Cross-Model and Cross-Modal Interpretability
- Authors: Ali Nasiri-Sarvi, Hassan Rivaz, Mahdi S. Hosseini
- Abstract summary: We introduce SPARC (Sparse Autoencoders for Aligned Representation of Concepts), a new framework that learns a single, unified latent space shared across diverse architectures. On Open Images, SPARC dramatically improves concept alignment, achieving a Jaccard similarity of 0.80, more than tripling the alignment compared to previous methods.
- Score: 9.90112908284836
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding how different AI models encode the same high-level concepts, such as objects or attributes, remains challenging because each model typically produces its own isolated representation. Existing interpretability methods like Sparse Autoencoders (SAEs) produce latent concepts individually for each model, resulting in incompatible concept spaces and limiting cross-model interpretability. To address this, we introduce SPARC (Sparse Autoencoders for Aligned Representation of Concepts), a new framework that learns a single, unified latent space shared across diverse architectures and modalities (e.g., vision models like DINO, and multimodal models like CLIP). SPARC's alignment is enforced through two key innovations: (1) a Global TopK sparsity mechanism, ensuring all input streams activate identical latent dimensions for a given concept; and (2) a Cross-Reconstruction Loss, which explicitly encourages semantic consistency between models. On Open Images, SPARC dramatically improves concept alignment, achieving a Jaccard similarity of 0.80, more than tripling the alignment compared to previous methods. SPARC creates a shared sparse latent space where individual dimensions often correspond to similar high-level concepts across models and modalities, enabling direct comparison of how different architectures represent identical concepts without requiring manual alignment or model-specific analysis. As a consequence of this aligned representation, SPARC also enables practical applications such as text-guided spatial localization in vision-only models and cross-model/cross-modal retrieval. Code and models are available at https://github.com/AtlasAnalyticsLab/SPARC.
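For reference, a minimal sketch of the two mechanisms named in the abstract (Global TopK sparsity and the Cross-Reconstruction Loss) is given below. This is an illustrative approximation rather than the released implementation: the linear encoder/decoder pairs, the ReLU non-linearity, the choice of the shared TopK mask from the summed activations of the two streams, and the equal weighting of the self- and cross-reconstruction terms are all assumptions made here; the authors' code at https://github.com/AtlasAnalyticsLab/SPARC is the definitive reference.

    # Illustrative PyTorch sketch of the two SPARC ideas described in the abstract.
    # All architectural details below are assumptions, not the paper's exact setup.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SparcSketch(nn.Module):
        def __init__(self, d_a: int, d_b: int, n_latents: int = 4096, k: int = 32):
            super().__init__()
            self.k = k
            # One encoder/decoder pair per input stream; both index the same
            # n_latents-dimensional concept space.
            self.enc_a, self.dec_a = nn.Linear(d_a, n_latents), nn.Linear(n_latents, d_a)
            self.enc_b, self.dec_b = nn.Linear(d_b, n_latents), nn.Linear(n_latents, d_b)

        def forward(self, x_a, x_b):
            z_a = F.relu(self.enc_a(x_a))   # (batch, n_latents)
            z_b = F.relu(self.enc_b(x_b))
            # Global TopK: choose a single set of k latent dimensions from the
            # combined activations, so both streams fire on identical dimensions.
            idx = (z_a + z_b).topk(self.k, dim=-1).indices
            mask = torch.zeros_like(z_a).scatter_(-1, idx, 1.0)
            z_a, z_b = z_a * mask, z_b * mask
            # Self-reconstruction plus cross-reconstruction: each decoder must also
            # rebuild its own stream from the other stream's sparse code.
            loss = (
                F.mse_loss(self.dec_a(z_a), x_a) + F.mse_loss(self.dec_b(z_b), x_b)
                + F.mse_loss(self.dec_a(z_b), x_a) + F.mse_loss(self.dec_b(z_a), x_b)
            )
            return loss, (z_a, z_b)

    # Toy usage with random stand-ins for DINO (768-d) and CLIP (512-d) features.
    model = SparcSketch(d_a=768, d_b=512)
    loss, _ = model(torch.randn(8, 768), torch.randn(8, 512))
    loss.backward()

The essential property the sketch illustrates is that both streams are sparsified with one shared index set (Global TopK) and each decoder is additionally asked to reconstruct its input from the other stream's sparse code (Cross-Reconstruction Loss), which is what pushes individual latent dimensions toward a common, cross-model concept alignment.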
Related papers
- Escaping Plato's Cave: JAM for Aligning Independently Trained Vision and Language Models [27.091366887354063]
We introduce a framework that trains modality-specific autoencoders on latent representations of single modality models.
By analogy, this framework serves as a method to escape Plato's Cave, enabling the emergence of shared structure from disjoint inputs.
arXiv Detail & Related papers (2025-07-01T21:43:50Z)
- Cross-architecture universal feature coding via distribution alignment [88.73189953617594]
We introduce a new research problem: cross-architecture universal feature coding (CAUFC).
We propose a two-step distribution alignment method. First, we design a format alignment step that unifies CNN and Transformer features into a consistent 2D token format. Second, we propose a feature value alignment step that harmonizes statistical distributions via truncation and normalization.
As a first attempt to study CAUFC, we evaluate our method on the image classification task. Experimental results demonstrate that our method achieves superior rate-accuracy trade-offs compared to the architecture-specific baseline.
arXiv Detail & Related papers (2025-06-15T06:14:02Z)
- Structural Similarity-Inspired Unfolding for Lightweight Image Super-Resolution [88.20464308588889]
We propose a Structural Similarity-Inspired Unfolding (SSIU) method for efficient image SR.
This method is designed by unfolding an SR optimization function constrained by structural similarity.
Our model outperforms current state-of-the-art models, boasting lower parameter counts and reduced memory consumption.
arXiv Detail & Related papers (2025-06-13T14:29:40Z)
- Interpreting the linear structure of vision-language model embedding spaces [12.846590038965774]
We train and release sparse autoencoders (SAEs) on the embedding spaces of four vision-language models.
We find that SAEs are better at reconstructing the real embeddings while also retaining the most sparsity.
We also show that the key commonly-activating concepts extracted by SAEs are remarkably stable across runs.
arXiv Detail & Related papers (2025-04-16T01:40:06Z)
- Model Assembly Learning with Heterogeneous Layer Weight Merging [57.8462476398611]
We introduce Model Assembly Learning (MAL), a novel paradigm for model merging.
MAL integrates parameters from diverse models in an open-ended model zoo to enhance the base model's capabilities.
arXiv Detail & Related papers (2025-03-27T16:21:53Z)
- Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment [6.614005142754584]
Universal Sparse Autoencoders (USAEs) are a framework for uncovering and aligning interpretable concepts spanning multiple deep neural networks.
USAEs learn a universal concept space that can reconstruct and interpret the internal activations of multiple models at once.
arXiv Detail & Related papers (2025-02-06T02:06:16Z)
- Text-To-Concept (and Back) via Cross-Model Alignment [48.133333356834186]
We show that the mapping from an image's representation in one model to its representation in another can be learned surprisingly well with just a linear layer.
We convert fixed off-the-shelf vision encoders to surprisingly strong zero-shot classifiers for free.
We show other immediate use-cases of text-to-concept, like building concept bottleneck models with no concept supervision.
arXiv Detail & Related papers (2023-05-10T18:01:06Z)
- Universal Information Extraction as Unified Semantic Matching [54.19974454019611]
We decouple information extraction into two abilities, structuring and conceptualizing, which are shared by different tasks and schemas.
Based on this paradigm, we propose to universally model various IE tasks with the Unified Semantic Matching (USM) framework.
In this way, USM can jointly encode schema and input text, uniformly extract substructures in parallel, and controllably decode target structures on demand.
arXiv Detail & Related papers (2023-01-09T11:51:31Z)
- Multimodal hierarchical Variational AutoEncoders with Factor Analysis latent space [45.418113011182186]
This study proposes a novel method that addresses these limitations by combining Variational AutoEncoders (VAEs) with a Factor Analysis latent space (FA-VAE).
The proposed FA-VAE method employs multiple VAEs to learn a private representation for each heterogeneous data view in a continuous latent space.
arXiv Detail & Related papers (2022-07-19T10:46:02Z)
- Complex-Valued Autoencoders for Object Discovery [62.26260974933819]
We propose a distributed approach to object-centric representations: the Complex AutoEncoder.
We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets.
We also show that it achieves competitive unsupervised object discovery performance to a SlotAttention model on two datasets, and manages to disentangle objects in a third dataset where SlotAttention fails - all while being 7-70 times faster to train.
arXiv Detail & Related papers (2022-04-05T09:25:28Z)
- Decoupled Multi-task Learning with Cyclical Self-Regulation for Face Parsing [71.19528222206088]
We propose a novel Decoupled Multi-task Learning with Cyclical Self-Regulation (DML-CSR) method for face parsing.
Specifically, DML-CSR designs a multi-task model which comprises face parsing, binary edge, and category edge detection.
Our method achieves the new state-of-the-art performance on the Helen, CelebA-HQ, and LapaMask datasets.
arXiv Detail & Related papers (2022-03-28T02:12:30Z)