Incorporating Hierarchical Semantics in Sparse Autoencoder Architectures
- URL: http://arxiv.org/abs/2506.01197v1
- Date: Sun, 01 Jun 2025 22:20:07 GMT
- Title: Incorporating Hierarchical Semantics in Sparse Autoencoder Architectures
- Authors: Mark Muchane, Sean Richardson, Kiho Park, Victor Veitch
- Abstract summary: We introduce a modified SAE architecture that explicitly models a semantic hierarchy of concepts. Application of this architecture to the internal representations of large language models shows both that semantic hierarchy can be learned, and that doing so improves both reconstruction and interpretability.
- Score: 10.919461859475268
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sparse dictionary learning (and, in particular, sparse autoencoders) attempts to learn a set of human-understandable concepts that can explain variation on an abstract space. A basic limitation of this approach is that it neither exploits nor represents the semantic relationships between the learned concepts. In this paper, we introduce a modified SAE architecture that explicitly models a semantic hierarchy of concepts. Application of this architecture to the internal representations of large language models shows both that semantic hierarchy can be learned, and that doing so improves both reconstruction and interpretability. Additionally, the architecture leads to significant improvements in computational efficiency.
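To make the idea in the abstract concrete, the following is a minimal NumPy sketch of one possible two-level sparse autoencoder forward pass. It is not the authors' architecture: the class name `ToyHierarchicalSAE`, the top-k sparsity rule, the per-parent child dictionaries, and all shapes and initializations are assumptions chosen for illustration. Each top-level ("parent") feature owns a small child dictionary, and child features are evaluated only for parents that are active on a given input.

```python
import numpy as np


def topk_mask(z, k):
    """Keep the k largest entries in each row of z and zero out the rest."""
    idx = np.argpartition(z, -k, axis=-1)[..., -k:]
    mask = np.zeros_like(z, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=-1)
    return np.where(mask, z, 0.0)


class ToyHierarchicalSAE:
    """Illustrative two-level SAE (hypothetical design, randomly initialized)."""

    def __init__(self, d_model, n_parents, n_children, k, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(d_model)
        self.k = k
        self.n_parents = n_parents
        self.n_children = n_children
        self.b = np.zeros(d_model)
        # Top-level dictionary: one encoder column and one decoder row per parent.
        self.W_enc = rng.normal(0.0, scale, (d_model, n_parents))
        self.W_dec = rng.normal(0.0, scale, (n_parents, d_model))
        # One small child dictionary per parent feature.
        self.C_enc = rng.normal(0.0, scale, (n_parents, d_model, n_children))
        self.C_dec = rng.normal(0.0, scale, (n_parents, n_children, d_model))

    def forward(self, x):
        centered = x - self.b
        # Parent activations: ReLU followed by top-k selection per input.
        parent = topk_mask(np.maximum(centered @ self.W_enc, 0.0), self.k)
        x_hat = parent @ self.W_dec + self.b
        child = np.zeros((x.shape[0], self.n_parents, self.n_children))
        # Child features are computed only for active (nonzero) parents.
        for i, p in zip(*np.nonzero(parent)):
            c = np.maximum(centered[i] @ self.C_enc[p], 0.0)
            child[i, p] = c
            # The child contribution is gated by its parent's activation.
            x_hat[i] += parent[i, p] * (c @ self.C_dec[p])
        return x_hat, parent, child


rng = np.random.default_rng(1)
x = rng.normal(size=(4, 16))
model = ToyHierarchicalSAE(d_model=16, n_parents=32, n_children=4, k=3)
x_hat, parent, child = model.forward(x)
```

In this sketch a child feature can fire only when its parent is among the selected top-k features, which is one simple way to encode a parent-child relationship between concepts; training, loss terms, and the actual hierarchy structure used in the paper are outside its scope.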
Related papers
- Scaling Laws and Representation Learning in Simple Hierarchical Languages: Transformers vs. Convolutional Architectures [49.19753720526998]
We derive theoretical scaling laws for neural network performance on synthetic datasets. We validate that convolutional networks, whose structure aligns with that of the generative process through locality and weight sharing, enjoy a faster scaling of performance. This finding clarifies the architectural biases underlying neural scaling laws and highlights how representation learning is shaped by the interaction between model architecture and the statistical properties of data.
arXiv Detail & Related papers (2025-05-11T17:44:14Z) - Retrieval-Augmented Semantic Parsing: Improving Generalization with Lexical Knowledge [6.948555996661213]
We introduce Retrieval-Augmented Semantic Parsing (RASP), a simple yet effective approach that integrates external symbolic knowledge into the parsing process. Our experiments show that LLMs outperform previous encoder-decoder baselines for semantic parsing. RASP further enhances their ability to predict unseen concepts, nearly doubling the performance of previous models on out-of-distribution concepts.
arXiv Detail & Related papers (2024-12-13T15:30:20Z) - Classification and Reconstruction Processes in Deep Predictive Coding Networks: Antagonists or Allies? [0.0]
Predictive coding-inspired deep networks for visual computing integrate classification and reconstruction processes in shared intermediate layers.
We take a critical look at how classifying and reconstructing interact in deep learning architectures.
Our findings underscore a significant challenge: Classification-driven information diminishes reconstruction-driven information in intermediate layers' shared representations.
arXiv Detail & Related papers (2024-01-17T14:34:32Z) - A Recursive Bateson-Inspired Model for the Generation of Semantic Formal Concepts from Spatial Sensory Data [77.34726150561087]
This paper presents a new symbolic-only method for the generation of hierarchical concept structures from complex sensory data.
The approach is based on Bateson's notion of difference as the key to the genesis of an idea or a concept.
The model is able to produce fairly rich yet human-readable conceptual representations without training.
arXiv Detail & Related papers (2023-07-16T15:59:13Z) - Imitation Learning-based Implicit Semantic-aware Communication Networks: Multi-layer Representation and Collaborative Reasoning [68.63380306259742]
Despite their promising potential, semantic communications and semantic-aware networking are still in their infancy.
We propose a novel reasoning-based implicit semantic-aware communication network architecture that allows multiple tiers of CDC and edge servers to collaborate.
We introduce a new multi-layer representation of semantic information taking into consideration both the hierarchical structure of implicit semantics as well as the personalized inference preference of individual users.
arXiv Detail & Related papers (2022-10-28T13:26:08Z) - Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves a new state of the art on all the structured prediction tasks we looked at.
arXiv Detail & Related papers (2022-10-26T13:27:26Z) - Decoupled Context Processing for Context Augmented Language Modeling [33.89636308731306]
Language models can be augmented with a context retriever to incorporate knowledge from large external databases.
By leveraging retrieved context, the neural network does not have to memorize the massive amount of world knowledge within its internal parameters, leading to better efficiency, interpretability and modularity.
arXiv Detail & Related papers (2022-10-11T20:05:09Z) - A Review of Sparse Expert Models in Deep Learning [23.721204843236006]
Sparse expert models are a thirty-year-old concept re-emerging as a popular architecture in deep learning.
We review the concept of sparse expert models, provide a basic description of the common algorithms, and contextualize the advances in the deep learning era.
arXiv Detail & Related papers (2022-09-04T18:00:29Z) - Learning Interpretable Models Through Multi-Objective Neural Architecture Search [0.9990687944474739]
We propose a framework to optimize for both task performance and "introspectability," a surrogate metric for aspects of interpretability.
We demonstrate that jointly optimizing for task error and introspectability leads to more disentangled and debuggable architectures that perform within error.
arXiv Detail & Related papers (2021-12-16T05:50:55Z) - Unsupervised Distillation of Syntactic Information from Contextualized Word Representations [62.230491683411536]
We tackle the task of unsupervised disentanglement between semantics and structure in neural language representations.
To this end, we automatically generate groups of sentences which are structurally similar but semantically different.
We demonstrate that our transformation clusters vectors in space by structural properties, rather than by lexical semantics.
arXiv Detail & Related papers (2020-10-11T15:13:18Z) - Compositional Generalization in Semantic Parsing: Pre-training vs. Specialized Architectures [1.8434042562191812]
We show that pre-training leads to significant improvements in performance vs. comparable non-pre-trained models.
We establish a new state of the art on the CFQ compositional generalization benchmark using pre-training together with an intermediate representation.
arXiv Detail & Related papers (2020-07-17T13:34:49Z) - Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.