Symbolic Rule Extraction from Attention-Guided Sparse Representations in Vision Transformers
- URL: http://arxiv.org/abs/2505.06745v1
- Date: Sat, 10 May 2025 19:45:15 GMT
- Title: Symbolic Rule Extraction from Attention-Guided Sparse Representations in Vision Transformers
- Authors: Parth Padalkar, Gopal Gupta
- Abstract summary: Recent neuro-symbolic approaches have successfully extracted symbolic rule-sets from CNN-based models to enhance interpretability. We propose a framework for symbolic rule extraction from Vision Transformers (ViTs) by introducing a sparse concept layer inspired by Sparse Autoencoders (SAEs). Our method achieves a 5.14% better classification accuracy than the standard ViT while enabling symbolic reasoning.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent neuro-symbolic approaches have successfully extracted symbolic rule-sets from CNN-based models to enhance interpretability. However, applying similar techniques to Vision Transformers (ViTs) remains challenging due to their lack of modular concept detectors and reliance on global self-attention mechanisms. We propose a framework for symbolic rule extraction from ViTs by introducing a sparse concept layer inspired by Sparse Autoencoders (SAEs). This linear layer operates on attention-weighted patch representations and learns a disentangled, binarized representation in which individual neurons activate for high-level visual concepts. To encourage interpretability, we apply a combination of L1 sparsity, entropy minimization, and supervised contrastive loss. These binarized concept activations are used as input to the FOLD-SE-M algorithm, which generates a rule-set in the form of logic programs. Our method achieves a 5.14% better classification accuracy than the standard ViT while enabling symbolic reasoning. Crucially, the extracted rule-set is not merely post-hoc but acts as a logic-based decision layer that operates directly on the sparse concept representations. The resulting programs are concise and semantically meaningful. This work is the first to extract executable logic programs from ViTs using sparse symbolic representations. It bridges the gap between transformer-based vision models and symbolic logic programming, providing a step forward in interpretable and verifiable neuro-symbolic AI.
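The pipeline described in the abstract — attention-weighted pooling of patch embeddings, a linear sparse concept layer trained with L1 and entropy penalties, and thresholded binarization before rule learning — can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' implementation: all shapes, the 0.5 threshold, and the random weights are assumptions, and the losses are shown as one-off computations rather than inside a training loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 196 ViT patch embeddings of dim 64, 32 concept neurons.
num_patches, d_model, n_concepts = 196, 64, 32

patches = rng.normal(size=(num_patches, d_model))  # ViT patch embeddings
attn = rng.random(num_patches)
attn = attn / attn.sum()                           # attention weights over patches

pooled = attn @ patches                            # attention-weighted representation
W = rng.normal(size=(d_model, n_concepts)) * 0.1   # sparse concept layer (linear)
acts = np.maximum(pooled @ W, 0.0)                 # non-negative concept activations

# Interpretability penalties applied during training (illustrative):
l1_loss = np.abs(acts).sum()                       # L1 sparsity
p = acts / (acts.sum() + 1e-8)
entropy_loss = -(p * np.log(p + 1e-8)).sum()       # entropy minimization

# Binarize for the rule learner: a concept is "on" above a threshold.
binary_concepts = (acts > 0.5).astype(int)
print(binary_concepts)
```

The binary vector would then serve as the truth assignment over concept predicates that FOLD-SE-M generalizes into a logic program acting as the decision layer.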
Related papers
- Neuro-Symbolic Synergy for Interactive World Modeling [20.07686289460334]
We propose Neuro-Symbolic Synergy (NeSyS), a framework that integrates the probabilistic semantic priors of large language models with executable symbolic rules. NeSyS alternates training between the two models using trajectories inadequately explained by the other.
arXiv Detail & Related papers (2026-02-11T03:36:18Z) - Beyond Pixels: Visual Metaphor Transfer via Schema-Driven Agentic Reasoning [56.24016465596292]
A visual metaphor constitutes a high-order form of human creativity, employing cross-domain semantic fusion to transform abstract concepts into impactful visual rhetoric. We introduce the task of Visual Metaphor Transfer (VMT), which challenges models to autonomously decouple the "creative essence" from a reference image and re-materialize that abstract logic onto a user-specified subject. Our method significantly outperforms SOTA baselines in metaphor consistency, analogy appropriateness, and visual creativity, paving the way for automated high-impact creative applications in advertising and media.
arXiv Detail & Related papers (2026-02-01T17:01:36Z) - ResTok: Learning Hierarchical Residuals in 1D Visual Tokenizers for Autoregressive Image Generation [64.84095852784714]
Residual Tokenizer (ResTok) is a 1D visual tokenizer that builds hierarchical residuals for both image tokens and latent tokens. We show that restoring hierarchical residual priors in visual tokenization significantly improves AR image generation, achieving a gFID of 2.34 on ImageNet-256 with only 9 sampling steps.
arXiv Detail & Related papers (2026-01-07T14:09:18Z) - Attention as Binding: A Vector-Symbolic Perspective on Transformer Reasoning [0.0]
Transformer-based language models display impressive reasoning-like behavior, yet remain brittle on tasks that require stable symbolic manipulation. This paper develops a unified perspective on these phenomena by interpreting self-attention and residual streams as implementing an approximate Vector Symbolic Architecture (VSA). In this view, queries and keys define role spaces, values encode fillers, attention weights perform soft unbinding, and residual connections realize superposition of many bound structures.
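The role/filler picture in this summary can be made concrete with a classic VSA toy example. The sketch below is illustrative only: it uses random bipolar vectors with elementwise (Hadamard) binding, one of several VSA schemes, and does not model actual transformer attention.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 4096                                    # high-dimensional vector space

def vec():
    """Random bipolar (+1/-1) vector."""
    return rng.choice([-1.0, 1.0], size=d)

role_subj, role_obj = vec(), vec()          # "role" vectors (keys/queries analogue)
alice, bob = vec(), vec()                   # "filler" vectors (values analogue)

# Bind each role to its filler and superpose the pairs, mirroring how the
# residual stream superposes many bound structures.
memory = role_subj * alice + role_obj * bob

# Unbind: a bipolar role vector is its own inverse (r * r = 1 elementwise).
retrieved = role_subj * memory

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(retrieved, alice))             # high similarity: the bound filler
print(cosine(retrieved, bob))               # near zero: crosstalk noise only
```

Unbinding recovers `alice` plus a noise term (`role_subj * role_obj * bob`) whose similarity to any stored filler shrinks as the dimension grows.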
arXiv Detail & Related papers (2025-12-08T05:38:24Z) - Hierarchical Process Reward Models are Symbolic Vision Learners [56.94353087007494]
Symbolic computer vision represents diagrams through explicit logical rules and structured representations, enabling interpretable understanding in machine vision. This requires fundamentally different learning paradigms from pixel-based visual models. We propose a novel self-supervised auto-encoder that encodes diagrams into primitives and decodes them through our executable engine to reconstruct input diagrams.
arXiv Detail & Related papers (2025-12-02T18:46:40Z) - Concept-RuleNet: Grounded Multi-Agent Neurosymbolic Reasoning in Vision Language Models [41.6338086518055]
Concept-RuleNet is a multi-agent system that reinstates visual grounding while retaining transparent reasoning. Our system augments state-of-the-art neurosymbolic baselines by an average of 5% while also reducing the occurrence of hallucinated symbols in rules by up to 50%.
arXiv Detail & Related papers (2025-11-13T18:13:56Z) - VL-SAE: Interpreting and Enhancing Vision-Language Alignment with a Unified Concept Set [80.50996301430108]
The alignment of vision-language representations endows current Vision-Language Models with strong multi-modal reasoning capabilities. We propose VL-SAE, a sparse autoencoder that encodes vision-language representations into its hidden activations. For interpretation, the alignment between vision and language representations can be understood by comparing their semantics with concepts.
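A sparse autoencoder of this kind can be caricatured in a few lines: representations from both modalities are encoded into a shared non-negative concept space, and alignment is read off as overlap between concept activation patterns. This is a hypothetical sketch with made-up dimensions and an untrained random encoder, not the VL-SAE model.

```python
import numpy as np

rng = np.random.default_rng(3)
d, n_concepts = 32, 128                 # hypothetical shared concept space sizes

W_enc = rng.normal(size=(d, n_concepts)) * 0.2   # untrained encoder weights

def concepts(rep):
    """Encode a representation into non-negative concept activations."""
    return np.maximum(rep @ W_enc, 0.0)

image_rep = rng.normal(size=d)                       # stand-in visual embedding
text_rep = image_rep + 0.1 * rng.normal(size=d)      # aligned caption embedding

c_img, c_txt = concepts(image_rep), concepts(text_rep)

# Alignment is measured as overlap between the two concept activation patterns.
overlap = (c_img @ c_txt) / (np.linalg.norm(c_img) * np.linalg.norm(c_txt))
print(round(float(overlap), 3))
```

Because the two inputs are nearly identical here, their concept activations overlap strongly; unrelated inputs would activate largely disjoint concept sets.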
arXiv Detail & Related papers (2025-10-24T10:29:31Z) - Current Practices for Building LLM-Powered Reasoning Tools Are Ad Hoc -- and We Can Do Better [0.0]
I propose Neurosymbolic Transition Systems as a principled computational model that can underlie infrastructure for building neurosymbolic AR tools. In this model, symbolic state is paired with intuition, and state transitions operate over symbols and intuition in parallel. I argue why this new paradigm can scale logical reasoning beyond current capabilities while retaining the strong guarantees of symbolic algorithms.
arXiv Detail & Related papers (2025-07-08T11:19:09Z) - Pre-Training Meta-Rule Selection Policy for Visual Generative Abductive Learning [24.92602845948049]
We propose a pre-training method for obtaining a meta-rule selection policy for the visual generative learning approach AbdGen. The pre-training process is done on pure symbol data, without involving symbol grounding learning of raw visual inputs. Our method effectively addresses the meta-rule selection problem for visual abduction, boosting the efficiency of visual generative abductive learning.
arXiv Detail & Related papers (2025-03-09T03:41:11Z) - Compositional Generalization Across Distributional Shifts with Sparse Tree Operations [77.5742801509364]
We introduce a unified neurosymbolic architecture called the Differentiable Tree Machine. We significantly increase the model's efficiency through the use of sparse vector representations of symbolic structures. We enable its application beyond the restricted set of tree2tree problems to the more general class of seq2seq problems.
arXiv Detail & Related papers (2024-12-18T17:20:19Z) - Mechanisms of Symbol Processing for In-Context Learning in Transformer Networks [78.54913566111198]
Large Language Models (LLMs) have demonstrated impressive abilities in symbol processing through in-context learning (ICL).
We seek to understand the mechanisms that can enable robust symbol processing in transformer networks.
We develop a high-level language, PSL, that allows us to write symbolic programs to do complex, abstract symbol processing.
arXiv Detail & Related papers (2024-10-23T01:38:10Z) - Interpretable end-to-end Neurosymbolic Reinforcement Learning agents [20.034972354302788]
This work places itself within the neurosymbolic AI paradigm, blending the strengths of neural networks with symbolic AI.
We present the first implementation of an end-to-end trained SCoBot and separately evaluate its components on different Atari games.
arXiv Detail & Related papers (2024-10-18T10:59:13Z) - LOGICSEG: Parsing Visual Semantics with Neural Logic Learning and Reasoning [73.98142349171552]
LOGICSEG is a holistic visual semantic parsing framework that integrates neural inductive learning and logic reasoning with both rich data and symbolic knowledge.
During fuzzy logic-based continuous relaxation, logical formulae are grounded onto data and neural computational graphs, hence enabling logic-induced network training.
These designs together make LOGICSEG a general and compact neural-logic machine that is readily integrated into existing segmentation models.
arXiv Detail & Related papers (2023-09-24T05:43:19Z) - Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search [63.3745291252038]
We propose DiffSES, a novel symbolic learning approach that discovers discrete symbolic policies.
By using object-level abstractions instead of raw pixel-level inputs, DiffSES is able to leverage the simplicity and scalability advantages of symbolic expressions.
Our experiments demonstrate that DiffSES is able to generate symbolic policies that are simpler and more scalable than state-of-the-art symbolic RL methods.
arXiv Detail & Related papers (2022-12-30T17:50:54Z) - Neuro-Symbolic Inductive Logic Programming with Logical Neural Networks [65.23508422635862]
We propose learning rules with the recently proposed Logical Neural Networks (LNNs).
Compared to other approaches, LNNs offer a strong connection to classical Boolean logic.
Our experiments on standard benchmarking tasks confirm that LNN rules are highly interpretable.
arXiv Detail & Related papers (2021-12-06T19:38:30Z) - Rule Extraction from Binary Neural Networks with Convolutional Rules for Model Validation [16.956140135868733]
We introduce the concept of first-order convolutional rules, which are logical rules that can be extracted using a convolutional neural network (CNN).
Our approach is based on rule extraction from binary neural networks with local search.
Our experiments show that the proposed approach is able to model the functionality of the neural network while at the same time producing interpretable logical rules.
arXiv Detail & Related papers (2020-12-15T17:55:53Z) - Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning [134.77207192945053]
Prior methods learn the neural-symbolic models using reinforcement learning approaches.
We introduce the grammar model as a symbolic prior to bridge neural perception and symbolic reasoning.
We propose a novel back-search algorithm which mimics the top-down human-like learning procedure to propagate the error.
arXiv Detail & Related papers (2020-06-11T17:42:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.