Enhancing Interpretability in Generative AI Through Search-Based Data Influence Analysis
- URL: http://arxiv.org/abs/2504.01771v1
- Date: Wed, 02 Apr 2025 14:29:37 GMT
- Title: Enhancing Interpretability in Generative AI Through Search-Based Data Influence Analysis
- Authors: Theodoros Aivalis, Iraklis A. Klampanos, Antonis Troumpoukis, Joemon M. Jose,
- Abstract summary: Generative AI models offer powerful capabilities but often lack transparency, making it difficult to interpret their output.<n>This work introduces a search-inspired approach to improve the interpretability of these models by analysing the influence of training data on their outputs.
- Score: 2.7096399168532015
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative AI models offer powerful capabilities but often lack transparency, making it difficult to interpret their output. This is critical in cases involving artistic or copyrighted content. This work introduces a search-inspired approach to improve the interpretability of these models by analysing the influence of training data on their outputs. Our method provides observational interpretability by focusing on a model's output rather than on its internal state. We consider both raw data and latent-space embeddings when searching for the influence of data items in generated content. We evaluate our method by retraining models locally and by demonstrating the method's ability to uncover influential subsets in the training data. This work lays the groundwork for future extensions, including user-based evaluations with domain experts, which is expected to improve observational interpretability further.
Related papers
- Explaining the Unexplained: Revealing Hidden Correlations for Better Interpretability [1.8274323268621635]
Real Explainer (RealExp) is an interpretability method that decouples the Shapley Value into individual feature importance and feature correlation importance.<n>RealExp enhances interpretability by precisely quantifying both individual feature contributions and their interactions.
arXiv Detail & Related papers (2024-12-02T10:50:50Z) - Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance [61.06245197347139]
We propose a novel approach to explain the behavior of a black-box model under feature shifts.
We refer to our method that combines concepts from Optimal Transport and Shapley Values as Explanatory Performance Estimation.
arXiv Detail & Related papers (2024-08-24T18:28:19Z) - Prospector Heads: Generalized Feature Attribution for Large Models & Data [82.02696069543454]
We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods.
We demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data.
arXiv Detail & Related papers (2024-02-18T23:01:28Z) - Towards Explainable Artificial Intelligence (XAI): A Data Mining
Perspective [35.620874971064765]
This work takes a "data-centric" view, examining how data collection, processing, and analysis contribute to explainable AI (XAI)
We categorize existing work into three categories subject to their purposes: interpretations of deep models, influences of training data, and insights of domain knowledge.
Specifically, we distill XAI methodologies into data mining operations on training and testing data across modalities.
arXiv Detail & Related papers (2024-01-09T06:27:09Z) - Unveiling the Potential of Probabilistic Embeddings in Self-Supervised
Learning [4.124934010794795]
Self-supervised learning has played a pivotal role in advancing machine learning by allowing models to acquire meaningful representations from unlabeled data.
We investigate the impact of probabilistic modeling on the information bottleneck, shedding light on a trade-off between compression and preservation of information in both representation and loss space.
Our findings suggest that introducing an additional bottleneck in the loss space can significantly enhance the ability to detect out-of-distribution examples.
arXiv Detail & Related papers (2023-10-27T12:01:16Z) - ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP)
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z) - On Counterfactual Data Augmentation Under Confounding [30.76982059341284]
Counterfactual data augmentation has emerged as a method to mitigate confounding biases in the training data.
These biases arise due to various observed and unobserved confounding variables in the data generation process.
We show how our simple augmentation method helps existing state-of-the-art methods achieve good results.
arXiv Detail & Related papers (2023-05-29T16:20:23Z) - Striving for data-model efficiency: Identifying data externalities on
group performance [75.17591306911015]
Building trustworthy, effective, and responsible machine learning systems hinges on understanding how differences in training data and modeling decisions interact to impact predictive performance.
We focus on a particular type of data-model inefficiency, in which adding training data from some sources can actually lower performance evaluated on key sub-groups of the population.
Our results indicate that data-efficiency is a key component of both accurate and trustworthy machine learning.
arXiv Detail & Related papers (2022-11-11T16:48:27Z) - Addressing Bias in Visualization Recommenders by Identifying Trends in
Training Data: Improving VizML Through a Statistical Analysis of the Plotly
Community Feed [55.41644538483948]
Machine learning is a promising approach to visualization recommendation due to its high scalability and representational power.
Our research project aims to address training bias in machine learning visualization recommendation systems by identifying trends in the training data through statistical analysis.
arXiv Detail & Related papers (2022-03-09T18:36:46Z) - Generative Counterfactuals for Neural Networks via Attribute-Informed
Perturbation [51.29486247405601]
We design a framework to generate counterfactuals for raw data instances with the proposed Attribute-Informed Perturbation (AIP)
By utilizing generative models conditioned with different attributes, counterfactuals with desired labels can be obtained effectively and efficiently.
Experimental results on real-world texts and images demonstrate the effectiveness, sample quality as well as efficiency of our designed framework.
arXiv Detail & Related papers (2021-01-18T08:37:13Z) - Accurate and Robust Feature Importance Estimation under Distribution
Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z) - Interpreting Deep Models through the Lens of Data [5.174367472975529]
This paper presents an in-depth analysis of the methods which attempt to identify the influence of these data points on the resulting classifier.
We show that some interpretability methods can detect mislabels better than using a random approach, however, the sample selection based on the training loss showed a superior performance.
arXiv Detail & Related papers (2020-05-05T07:59:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.