Provenance Networks: End-to-End Exemplar-Based Explainability
- URL: http://arxiv.org/abs/2510.03361v1
- Date: Fri, 03 Oct 2025 01:48:38 GMT
- Title: Provenance Networks: End-to-End Exemplar-Based Explainability
- Authors: Ali Kayyam, Anusha Madan Gopal, M. Anthony Lewis
- Abstract summary: We introduce provenance networks, a novel class of neural models designed to provide end-to-end, training-data-driven explainability. Provenance networks learn to link each prediction directly to its supporting training examples as part of the model's normal operation. The approach addresses critical challenges in modern deep learning, including model opaqueness, hallucination, and the assignment of credit to data contributors.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce provenance networks, a novel class of neural models designed to provide end-to-end, training-data-driven explainability. Unlike conventional post-hoc methods, provenance networks learn to link each prediction directly to its supporting training examples as part of the model's normal operation, embedding interpretability into the architecture itself. Conceptually, the model operates similarly to a learned KNN, where each output is justified by concrete exemplars weighted by relevance in the feature space. This approach facilitates systematic investigations of the trade-off between memorization and generalization, enables verification of whether a given input was included in the training set, aids in the detection of mislabeled or anomalous data points, enhances resilience to input perturbations, and supports the identification of similar inputs contributing to the generation of a new data point. By jointly optimizing the primary task and the explainability objective, provenance networks offer insights into model behavior that traditional deep networks cannot provide. While the model introduces additional computational cost and currently scales to moderately sized datasets, it provides a complementary approach to existing explainability techniques. In particular, it addresses critical challenges in modern deep learning, including model opaqueness, hallucination, and the assignment of credit to data contributors, thereby improving transparency, robustness, and trustworthiness in neural models.
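The abstract describes the architecture only conceptually, so the following is a minimal PyTorch sketch of one reading of the learned-KNN idea: an encoder maps inputs into a feature space, the prediction is an exemplar-weighted vote over training labels, and every output carries the indices of its top supporting training examples. All names here (ProvenanceNet, the softmax temperature, the top-k size) are our own assumptions, not the paper's.

```python
# A minimal sketch of the learned-KNN reading of provenance networks.
# The paper's actual architecture may differ; all names are ours.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProvenanceNet(nn.Module):
    def __init__(self, in_dim, feat_dim, train_x, train_y, num_classes):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, feat_dim)
        )
        # The training set itself serves as the exemplar bank.
        self.register_buffer("train_x", train_x)
        self.register_buffer("train_y", train_y)
        self.num_classes = num_classes

    def forward(self, x, k=5):
        q = F.normalize(self.encoder(x), dim=-1)               # query features
        keys = F.normalize(self.encoder(self.train_x), dim=-1)  # exemplar features
        sim = q @ keys.T                                        # relevance in feature space
        weights = F.softmax(sim / 0.1, dim=-1)                  # soft nearest-neighbour weights
        onehot = F.one_hot(self.train_y, self.num_classes).float()
        scores = weights @ onehot                               # exemplar-weighted class vote
        provenance = sim.topk(k, dim=-1).indices                # supporting training examples
        return scores, provenance

# Usage: each prediction comes with the indices of the training
# exemplars that justify it.
train_x, train_y = torch.randn(100, 20), torch.randint(0, 3, (100,))
model = ProvenanceNet(20, 32, train_x, train_y, num_classes=3)
scores, support = model(torch.randn(4, 20))
print(support)  # top-5 supporting training exemplars per prediction
```

Jointly training the encoder on the task loss is what makes the neighbour weights "learned" rather than fixed, which is where this sketch departs from a plain KNN.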
Related papers
- A Theory of the Mechanics of Information: Generalization Through Measurement of Uncertainty (Learning is Measuring) [0.0]
We introduce a model-free framework using surprisal (information-theoretic uncertainty) to analyze and perform inferences from raw data. It eliminates distribution modeling, reduces bias, and enables efficient updates, including direct edits and deletion of training data. It emphasizes traceability, interpretability, and data-driven decision making, offering a unified, human-understandable framework for machine learning.
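As a rough illustration of what inference by surprisal with direct data edits could look like, here is a minimal sketch over discrete data with empirical counts; the class name, the add-one smoothing, and the delete method are our assumptions, not the paper's method.

```python
# A hypothetical sketch of surprisal-based, model-free inference over
# discrete data with empirical counts; names and smoothing are ours.
import math
from collections import Counter

class SurprisalModel:
    def __init__(self, data):
        self.counts = Counter(data)
        self.n = len(data)

    def surprisal(self, x):
        # -log2 of the empirical probability; one pseudo-count keeps
        # surprisal finite for unseen items (our assumption).
        p = (self.counts[x] + 1) / (self.n + 1)
        return -math.log2(p)

    def delete(self, x):
        # Direct deletion of a training point is just a count update.
        if self.counts[x] > 0:
            self.counts[x] -= 1
            self.n -= 1

m = SurprisalModel(list("aaabbc"))
print(m.surprisal("a"), m.surprisal("z"))  # frequent -> low, unseen -> high
m.delete("a")                              # "unlearning" without retraining
```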
arXiv Detail & Related papers (2025-10-26T19:45:25Z)
- Concept-Based Mechanistic Interpretability Using Structured Knowledge Graphs [3.429783703166407]
Our framework enables a global dissection of model behavior by analyzing how high-level semantic attributes emerge, interact, and propagate through internal model components. A key innovation is our visualization platform that we named BAGEL, which presents these insights in a structured knowledge graph. Our framework is model-agnostic, scalable, and contributes to a deeper understanding of how deep learning models generalize (or fail to) in the presence of dataset biases.
arXiv Detail & Related papers (2025-07-08T09:30:20Z)
- Neural Network Reprogrammability: A Unified Theme on Model Reprogramming, Prompt Tuning, and Prompt Instruction [55.914891182214475]
We introduce neural network reprogrammability as a unifying framework for model adaptation. We present a taxonomy that categorizes such information manipulation approaches across four key dimensions. We also analyze remaining technical challenges and ethical considerations.
arXiv Detail & Related papers (2025-06-05T05:42:27Z)
- A Comprehensive Survey on Self-Interpretable Neural Networks [36.0575431131253]
Self-interpretable neural networks inherently reveal the prediction rationale through their model structures. We first collect and review existing works on self-interpretable neural networks and provide a structured summary of their methodologies. We also present concrete, visualized examples of model explanations and discuss their applicability across diverse scenarios.
arXiv Detail & Related papers (2025-01-26T18:50:16Z)
- Deep Learning Through A Telescoping Lens: A Simple Model Provides Empirical Insights On Grokking, Gradient Boosting & Beyond [61.18736646013446]
In pursuit of a deeper understanding of deep learning's surprising behaviors, we investigate the utility of a simple yet accurate model of a trained neural network.
Across three case studies, we illustrate how it can be applied to derive new empirical insights on a diverse range of prominent phenomena.
arXiv Detail & Related papers (2024-10-31T22:54:34Z)
- Surprisal Driven $k$-NN for Robust and Interpretable Nonparametric Learning [1.4293924404819704]
We shed new light on the traditional nearest neighbors algorithm from the perspective of information theory.
We propose a robust and interpretable framework for tasks such as classification, regression, density estimation, and anomaly detection using a single model.
Our work showcases the architecture's versatility by achieving state-of-the-art results in classification and anomaly detection.
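The abstract gives no formulation, but one plausible minimal sketch of a surprisal-driven $k$-NN anomaly score is shown below: neighbour distances feed a kernel density estimate, and the anomaly score is the surprisal of that density. The kernel choice, bandwidth, and function name are our assumptions, not the paper's.

```python
# A sketch of a surprisal-driven k-NN anomaly score: Gaussian-kernel
# density from the k nearest distances, then -log2 of that density.
import numpy as np

def knn_surprisal(train, query, k=5, bandwidth=1.0):
    # Pairwise distances from each query point to all training points.
    d = np.linalg.norm(train[None, :, :] - query[:, None, :], axis=-1)
    knn = np.sort(d, axis=1)[:, :k]                 # k nearest distances
    # Kernel density estimate from the k neighbours, then surprisal.
    dens = np.exp(-(knn / bandwidth) ** 2).mean(axis=1)
    return -np.log2(dens + 1e-12)                   # high = anomalous

rng = np.random.default_rng(0)
train = rng.normal(size=(200, 2))
query = np.array([[0.0, 0.0], [6.0, 6.0]])          # inlier vs outlier
print(knn_surprisal(train, query, k=5))             # outlier scores far higher
```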
arXiv Detail & Related papers (2023-11-17T00:35:38Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm based on graph representation learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
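As a toy illustration of the upper model's role, the sketch below replaces the trained graph neural network with a single mean-pooling message-passing step over the feature-data graph: a new feature's embedding is aggregated from the embeddings of the data points that exhibit it. The function name and the pooling choice are our simplifications.

```python
# A toy stand-in for the GNN upper model: one mean-pooling message-
# passing step over the bipartite feature-data graph.
import numpy as np

def extrapolate_feature(new_col, data_emb):
    # new_col[i] != 0 means data point i exhibits the unseen feature;
    # its embedding is the mean of those data-point embeddings.
    mask = new_col != 0
    return data_emb[mask].mean(axis=0)

rng = np.random.default_rng(0)
data_emb = rng.normal(size=(6, 4))      # embeddings of 6 observed data points
new_col = np.array([1, 0, 1, 1, 0, 0])  # which points carry the new feature
print(extrapolate_feature(new_col, data_emb))  # embedding for the new feature
```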
arXiv Detail & Related papers (2021-10-09T09:02:45Z)
- Refining Neural Networks with Compositional Explanations [31.84868477264624]
We propose to refine a learned model by collecting human-provided compositional explanations on the model's failure cases.
We demonstrate the effectiveness of the proposed approach on two text classification tasks.
arXiv Detail & Related papers (2021-03-18T17:48:54Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these activation profiles can quickly pinpoint exploited areas in a model.
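Here is a minimal sketch of the underlying measurement, assuming PyTorch forward hooks and mean absolute activation as the profile statistic (both our choices, not necessarily the paper's): record per-layer profiles for a clean input and a perturbed one, then compare.

```python
# A sketch of per-layer activation profiles via forward hooks; layers
# whose profile shifts most under perturbation are candidate weak spots.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
profiles = {}

def hook(name):
    def fn(module, inp, out):
        profiles[name] = out.detach().abs().mean().item()  # profile statistic
    return fn

for name, layer in model.named_children():
    layer.register_forward_hook(hook(name))

x = torch.randn(1, 10)
model(x)
clean = dict(profiles)
model(x + 0.5 * torch.randn(1, 10))   # crude stand-in for an adversarial input
perturbed = dict(profiles)
# Per-layer profile shift between the clean and perturbed passes.
print({k: abs(perturbed[k] - clean[k]) for k in clean})
```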
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- Anomaly Detection on Attributed Networks via Contrastive Self-Supervised Learning [50.24174211654775]
We present a novel contrastive self-supervised learning framework for anomaly detection on attributed networks.
Our framework fully exploits the local information from network data by sampling a novel type of contrastive instance pair.
A graph neural network-based contrastive learning model is proposed to learn informative embeddings from high-dimensional attributes and local structure.
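To make the contrastive instance pair concrete, here is a toy NumPy sketch: each node is paired with its own neighbourhood (positive) and a random node's neighbourhood (negative), and low agreement with its own context signals an anomaly. A mean aggregator stands in for the trained GNN, and all names are ours.

```python
# A toy sketch of contrastive instance pairs for anomaly detection on
# an attributed network; a mean aggregator replaces the trained GNN.
import numpy as np

def neighbourhood_emb(A, X, i):
    nbrs = np.flatnonzero(A[i])
    return X[nbrs].mean(axis=0) if nbrs.size else X[i]

def anomaly_score(A, X, i, rng):
    pos = X[i] @ neighbourhood_emb(A, X, i)   # agreement with own context
    j = rng.integers(len(X))
    neg = X[i] @ neighbourhood_emb(A, X, j)   # agreement with random context
    return neg - pos                          # high = anomalous

rng = np.random.default_rng(0)
A = (rng.random((8, 8)) < 0.3).astype(int)    # random attributed network
np.fill_diagonal(A, 0)
X = rng.normal(size=(8, 4))                   # node attributes
print([round(anomaly_score(A, X, i, rng), 2) for i in range(8)])
```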
arXiv Detail & Related papers (2021-02-27T03:17:20Z)
- Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches in terms of both fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)