Neural Concept Verifier: Scaling Prover-Verifier Games via Concept Encodings
- URL: http://arxiv.org/abs/2507.07532v2
- Date: Fri, 11 Jul 2025 10:29:39 GMT
- Title: Neural Concept Verifier: Scaling Prover-Verifier Games via Concept Encodings
- Authors: Berkant Turan, Suhrab Asadulla, David Steinmann, Wolfgang Stammer, Sebastian Pokutta
- Abstract summary: We introduce the Neural Concept Verifier (NCV), a unified framework combining PVGs with concept encodings for interpretable, nonlinear classification in high-dimensional settings. NCV achieves this by utilizing recent minimally supervised concept discovery models to extract structured concept encodings from raw inputs.
- Score: 20.59727124775316
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While Prover-Verifier Games (PVGs) offer a promising path toward verifiability in nonlinear classification models, they have not yet been applied to complex inputs such as high-dimensional images. Conversely, Concept Bottleneck Models (CBMs) effectively translate such data into interpretable concepts but are limited by their reliance on low-capacity linear predictors. In this work, we introduce the Neural Concept Verifier (NCV), a unified framework combining PVGs with concept encodings for interpretable, nonlinear classification in high-dimensional settings. NCV achieves this by utilizing recent minimally supervised concept discovery models to extract structured concept encodings from raw inputs. A prover then selects a subset of these encodings, which a verifier -- implemented as a nonlinear predictor -- uses exclusively for decision-making. Our evaluations show that NCV outperforms CBM and pixel-based PVG classifier baselines on high-dimensional, logically complex datasets and also helps mitigate shortcut behavior. Overall, we demonstrate that NCV is a promising step toward performant, verifiable AI.
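To make the pipeline described in the abstract concrete, the following is a minimal, hypothetical PyTorch sketch of the prover/verifier split over pre-extracted concept encodings. It is not the authors' implementation: the module names, the hard top-k selection rule, the MLP verifier, and all sizes are illustrative assumptions, and the game-theoretic training objectives of the PVG are omitted.

```python
# Hedged sketch of the NCV idea: a prover scores concept encodings and passes
# only a small subset to a nonlinear verifier, which must classify from that
# subset alone. All names, sizes, and the top-k rule are illustrative assumptions.
import torch
import torch.nn as nn


class Prover(nn.Module):
    """Scores each concept slot and keeps only the k highest-scoring ones."""

    def __init__(self, concept_dim: int, k: int = 4):
        super().__init__()
        self.scorer = nn.Linear(concept_dim, 1)
        self.k = k

    def forward(self, concepts: torch.Tensor) -> torch.Tensor:
        # concepts: (batch, num_concepts, concept_dim), e.g. from a concept discovery model
        scores = self.scorer(concepts).squeeze(-1)           # (batch, num_concepts)
        topk = scores.topk(self.k, dim=-1).indices           # indices of selected concepts
        mask = torch.zeros_like(scores).scatter_(-1, topk, 1.0)
        return concepts * mask.unsqueeze(-1)                 # non-selected slots zeroed out


class Verifier(nn.Module):
    """Nonlinear predictor that sees only the prover-selected concept encodings."""

    def __init__(self, num_concepts: int, concept_dim: int, num_classes: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Flatten(),
            nn.Linear(num_concepts * concept_dim, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes),
        )

    def forward(self, selected_concepts: torch.Tensor) -> torch.Tensor:
        return self.mlp(selected_concepts)


if __name__ == "__main__":
    # Stand-in for structured concept encodings extracted from raw images.
    batch, num_concepts, concept_dim, num_classes = 8, 16, 32, 10
    concepts = torch.randn(batch, num_concepts, concept_dim)

    prover = Prover(concept_dim, k=4)
    verifier = Verifier(num_concepts, concept_dim, num_classes)
    logits = verifier(prover(concepts))
    print(logits.shape)  # torch.Size([8, 10])
```

In an actual Prover-Verifier Game, the two modules would be trained with interacting objectives so that the selected concepts remain both predictive and checkable by the verifier; the sketch above only shows the forward pass and the information bottleneck it creates.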
Related papers
- Cross-Layer Discrete Concept Discovery for Interpreting Language Models [13.842670153893977]
Cross-layer VQ-VAE is a framework that uses vector quantization to map representations across layers.
Our approach uniquely combines top-k temperature-based sampling during quantization with EMA codebook updates.
arXiv Detail & Related papers (2025-06-24T22:43:36Z)
- Enhancing the conformal predictability of context-aware recommendation systems by using Deep Autoencoders [4.3012765978447565]
We introduce a framework that combines neural contextual matrix factorization with autoencoders to predict user ratings for items.
We conduct experiments on various real-world datasets and compare the results against state-of-the-art approaches.
arXiv Detail & Related papers (2024-11-30T18:24:42Z)
- Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery [52.498055901649025]
Concept Bottleneck Models (CBMs) have been proposed to address the 'black-box' problem of deep neural networks.
We propose a novel CBM approach -- called Discover-then-Name-CBM (DN-CBM) -- that inverts the typical paradigm.
Our concept extraction strategy is efficient, since it is agnostic to the downstream task, and uses concepts already known to the model.
arXiv Detail & Related papers (2024-07-19T17:50:11Z)
- Neural Concept Binder [22.074896812195437]
We introduce the Neural Concept Binder (NCB), a framework for deriving both discrete and continuous concept representations.
The structured nature of NCB's concept representations allows for intuitive inspection and the straightforward integration of external knowledge.
We validate the effectiveness of NCB through evaluations on our newly introduced CLEVR-Sudoku dataset.
arXiv Detail & Related papers (2024-06-14T11:52:09Z)
- Local Concept Embeddings for Analysis of Concept Distributions in Vision DNN Feature Spaces [1.0923877073891446]
Insights into the learned latent representations are imperative for verifying deep neural networks (DNNs) in computer vision tasks.
We propose a novel local concept analysis framework to allow exploration of learned concept distributions.
Despite its context sensitivity, our method's concept segmentation performance is competitive with global baselines.
arXiv Detail & Related papers (2023-11-24T12:22:00Z)
- Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks [78.11734286268455]
We study the performance of ConvResNeXts, trained with weight decay, from the perspective of nonparametric classification.
Our analysis allows for infinitely many building blocks in ConvResNeXts, and shows that weight decay implicitly enforces sparsity on these blocks.
arXiv Detail & Related papers (2023-07-04T11:08:03Z)
- Vector Quantized Wasserstein Auto-Encoder [57.29764749855623]
We study learning deep discrete representations from the generative viewpoint.
We place discrete distributions over sequences of codewords and learn a deterministic decoder that transports this distribution to the data distribution.
We develop further theory connecting this formulation with the clustering viewpoint of the Wasserstein (WS) distance, allowing for a better and more controllable clustering solution.
arXiv Detail & Related papers (2023-02-12T13:51:36Z)
- Navigating Neural Space: Revisiting Concept Activation Vectors to Overcome Directional Divergence [13.618809162030486]
Concept Activation Vectors (CAVs) have emerged as a popular tool for modeling human-understandable concepts in the latent space.
In this paper we show that such a separability-oriented approach leads to solutions that may diverge from the actual goal of precisely modeling the concept direction.
We introduce pattern-based CAVs, focusing solely on concept signals, thereby providing more accurate concept directions.
arXiv Detail & Related papers (2022-02-07T19:40:20Z)
- Optimising for Interpretability: Convolutional Dynamic Alignment Networks [108.83345790813445]
We introduce a new family of neural network models called Convolutional Dynamic Alignment Networks (CoDA Nets).
Their core building blocks are Dynamic Alignment Units (DAUs), which are optimised to transform their inputs with dynamically computed weight vectors that align with task-relevant patterns.
CoDA Nets model the classification prediction through a series of input-dependent linear transformations, allowing for linear decomposition of the output into individual input contributions.
arXiv Detail & Related papers (2021-09-27T12:39:46Z)
- Convolutional Dynamic Alignment Networks for Interpretable Classifications [108.83345790813445]
We introduce a new family of neural network models called Convolutional Dynamic Alignment Networks (CoDA-Nets).
Their core building blocks are Dynamic Alignment Units (DAUs), which linearly transform their input with weight vectors that dynamically align with task-relevant patterns.
CoDA-Nets model the classification prediction through a series of input-dependent linear transformations, allowing for linear decomposition of the output into individual input contributions.
arXiv Detail & Related papers (2021-03-31T18:03:53Z)
- Invertible Concept-based Explanations for CNN Models with Non-negative Concept Activation Vectors [24.581839689833572]
Convolutional neural network (CNN) models for computer vision are powerful but lack explainability in their most basic form.
Recent work on explanations through feature importance of approximate linear models has moved from input-level features to features from mid-layer feature maps in the form of concept activation vectors (CAVs).
In this work, we rethink the ACE algorithm of Ghorbani et al., proposing an alternative invertible concept-based explanation (ICE) framework to overcome its shortcomings.
arXiv Detail & Related papers (2020-06-27T17:57:26Z)
- MetaSDF: Meta-learning Signed Distance Functions [85.81290552559817]
Generalizing across shapes with neural implicit representations amounts to learning priors over the respective function space.
We formalize learning of a shape space as a meta-learning problem and leverage gradient-based meta-learning algorithms to solve this task.
arXiv Detail & Related papers (2020-06-17T05:14:53Z)