PAMI: partition input and aggregate outputs for model interpretation
- URL: http://arxiv.org/abs/2302.03318v2
- Date: Wed, 8 Feb 2023 15:29:12 GMT
- Title: PAMI: partition input and aggregate outputs for model interpretation
- Authors: Wei Shi, Wentao Zhang, Weishi Zheng, Ruixuan Wang
- Abstract summary: This study proposes a simple yet effective visualization framework called PAMI, based on the observation that deep learning models often aggregate features from local regions when making predictions.
The basic idea is to mask the majority of the input and use the corresponding model output as the relative contribution of the preserved input part to the original model prediction.
Extensive experiments on multiple tasks confirm that the proposed method outperforms existing visualization approaches, locating class-specific input regions more precisely.
- Score: 69.42924964776766
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is an increasing demand for interpretation of model predictions, especially in high-risk applications. Various visualization approaches have been proposed to estimate the part of the input that is relevant to a specific model prediction. However, most approaches require model structure and parameter details to produce their visualizations, and considerable effort is generally needed to adapt each approach to multiple types of tasks, particularly when the model backbone and input format change across tasks. In this study, a simple yet effective visualization framework called PAMI is proposed, based on the observation that deep learning models often aggregate features from local regions when making predictions. The basic idea is to mask the majority of the input and use the corresponding model output as the relative contribution of the preserved input part to the original model prediction. Since only a set of model outputs is collected and aggregated for each input, PAMI requires no model details and can be applied to various prediction tasks with different model backbones and input formats. Extensive experiments on multiple tasks confirm that the proposed method finds class-specific input regions more precisely than existing visualization approaches and remains effective across different model backbones and input formats. The source code will be released publicly.
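The masking-and-aggregation idea lends itself to a short sketch. Below is a minimal Python illustration for an image classifier: the zero baseline, square patches, patch size, stride, and the `model` callable are all illustrative assumptions rather than the paper's exact protocol.

```python
import numpy as np

def pami_style_heatmap(model, image, target_class, patch=32, stride=16):
    """Sketch of PAMI-style attribution: mask most of the input, keep one
    local region at a time, and treat the model's output for the target
    class as that region's relative contribution.

    model: callable mapping an image of shape (H, W, C) to class scores.
    """
    H, W, _ = image.shape
    heat = np.zeros((H, W))
    counts = np.zeros((H, W))
    baseline = np.zeros_like(image)              # everything masked out
    for y in range(0, H - patch + 1, stride):
        for x in range(0, W - patch + 1, stride):
            masked = baseline.copy()
            masked[y:y + patch, x:x + patch] = image[y:y + patch, x:x + patch]
            score = model(masked)[target_class]  # only outputs are needed
            heat[y:y + patch, x:x + patch] += score
            counts[y:y + patch, x:x + patch] += 1
    return heat / np.maximum(counts, 1)          # aggregate overlapping scores
```

Because the loop touches only model outputs, the same procedure carries over to any backbone or input format that admits local masking, which is the portability the abstract emphasizes.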
Related papers
- Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach [25.927323251675386]
We leverage the meta-features associated with each entity as a source of world knowledge and employ entity representations from the models.
We propose using the consistency between these representations and the meta-features as a metric for evaluating pre-trained models.
Our method's effectiveness is demonstrated across various domains, including models trained on relational datasets, large language models, and image models.
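The blurb does not spell out the multi-head posterior metric itself, so the following is only a rough proxy for "consistency between representations and meta-features": an RSA-style correlation between the two pairwise-similarity structures. The function names and the use of cosine similarity are assumptions.

```python
import numpy as np
from scipy.stats import spearmanr

def consistency_score(embeddings, meta_features):
    """Illustrative consistency proxy: correlate the pairwise-similarity
    structure of entity embeddings with that of their meta-features
    (not the paper's multi-head posterior metric)."""
    def upper_cosine(X):
        X = X / np.linalg.norm(X, axis=1, keepdims=True)
        S = X @ X.T
        iu = np.triu_indices_from(S, k=1)
        return S[iu]                     # flattened upper triangle
    rho, _ = spearmanr(upper_cosine(embeddings), upper_cosine(meta_features))
    return rho
```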
arXiv Detail & Related papers (2024-01-02T17:08:26Z)
- Predictable MDP Abstraction for Unsupervised Model-Based RL [93.91375268580806]
We propose predictable MDP abstraction (PMA).
Instead of training a predictive model on the original MDP, we train a model on a transformed MDP with a learned action space.
We theoretically analyze PMA and empirically demonstrate that PMA leads to significant improvements over prior unsupervised model-based RL approaches.
arXiv Detail & Related papers (2023-02-08T07:37:51Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
Since the data behind each fine-tuned model is often unavailable, this creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
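As a point of reference for what "merging in parameter space" means, the simplest baseline is direct weight averaging; a minimal sketch follows. This is the common simple-average baseline, not the paper's proposed merging scheme, and it assumes all checkpoints share one architecture and key set.

```python
import torch

def merge_checkpoints(state_dicts, coeffs=None):
    """Weighted average of matching parameters across checkpoints: the
    simplest form of parameter-space merging (a baseline, not this
    paper's exact method)."""
    n = len(state_dicts)
    coeffs = coeffs if coeffs is not None else [1.0 / n] * n
    return {
        key: sum(c * sd[key].float() for c, sd in zip(coeffs, state_dicts))
        for key in state_dicts[0]
    }
```

Calling `model.load_state_dict(merge_checkpoints([sd_a, sd_b]))` would then yield a single fused model without touching any training data.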
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Post-Selection Confidence Bounds for Prediction Performance [2.28438857884398]
In machine learning, the selection of a promising model from a potentially large number of competing models and the assessment of its generalization performance are critical tasks.
We propose an algorithm for computing valid lower confidence bounds for multiple models that have been selected based on their prediction performance on the evaluation set.
arXiv Detail & Related papers (2022-10-24T13:28:43Z)
- Model ensemble instead of prompt fusion: a sample-specific knowledge transfer method for few-shot prompt tuning [85.55727213502402]
We focus on improving the few-shot performance of prompt tuning by transferring knowledge from soft prompts of source tasks.
We propose Sample-specific Ensemble of Source Models (SESoM).
SESoM learns to adjust the contribution of each source model for each target sample separately when ensembling source model outputs.
arXiv Detail & Related papers (2022-10-23T01:33:16Z)
- Attentional Prototype Inference for Few-Shot Segmentation [128.45753577331422]
We propose attentional prototype inference (API), a probabilistic latent variable framework for few-shot segmentation.
We define a global latent variable to represent the prototype of each object category, which we model as a probabilistic distribution.
We conduct extensive experiments on four benchmarks, where our method achieves performance that is at least competitive with, and often better than, state-of-the-art prototype-based methods.
arXiv Detail & Related papers (2021-05-14T06:58:44Z)
- Gaussian Function On Response Surface Estimation [12.35564140065216]
We propose a new framework for interpreting black-box machine learning models (both their features and samples) via a metamodeling technique.
The metamodel can be estimated from data generated by a trained complex model, by running computer experiments on data samples in the region of interest.
arXiv Detail & Related papers (2021-01-04T04:47:00Z)
- Predictive process mining by network of classifiers and clusterers: the PEDF model [0.0]
The PEDF model learns from event sequences, durations, and extra features.
The model requires extracting two sets of data from log files.
arXiv Detail & Related papers (2020-11-22T23:27:19Z)
- What do we expect from Multiple-choice QA Systems? [70.86513724662302]
We consider a top performing model on several Multiple Choice Question Answering (MCQA) datasets.
We evaluate it against a set of expectations one might have from such a model, using a series of zero-information perturbations of the model's inputs.
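For intuition, a zero-information perturbation can be as simple as deleting the part of the input the model is supposed to need. The paper's exact perturbation set is not given in this blurb, so the variants below are illustrative, as is the example's dictionary layout.

```python
def zero_information_variants(example):
    """Illustrative zero-information perturbations of an MCQA example
    (the paper's exact perturbations may differ). If accuracy stays well
    above chance on these variants, the model is likely exploiting
    dataset artifacts rather than answering the question."""
    q, opts = example["question"], example["options"]
    return {
        "no_question": {"question": "", "options": opts},
        "no_options": {"question": q, "options": [""] * len(opts)},
        "word_order_destroyed": {
            "question": " ".join(sorted(q.split())),  # sorts tokens alphabetically
            "options": opts,
        },
    }
```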
arXiv Detail & Related papers (2020-11-20T21:27:10Z)
- Pattern Similarity-based Machine Learning Methods for Mid-term Load Forecasting: A Comparative Study [0.0]
We use pattern similarity-based methods for forecasting monthly electricity demand exhibiting annual seasonality.
An integral part of these models is representing the time series with patterns of its sequences.
We consider four such models: the nearest neighbor model, fuzzy neighborhood model, kernel regression model, and general regression neural network.
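Of the four, the nearest neighbor variant is the easiest to sketch. Below is a minimal version assuming complete yearly cycles of monthly data; the mean-based normalization and the choice of k are illustrative, and the paper's pattern encodings are more elaborate.

```python
import numpy as np

def knn_pattern_forecast(series, k=3):
    """Nearest-neighbor pattern forecast for monthly data with annual
    seasonality (an illustrative version of the pattern-similarity idea).
    series: 1-D array of monthly values covering whole years."""
    years = series.reshape(-1, 12)
    # Normalize each yearly cycle into a "pattern" to remove its level.
    means = years.mean(axis=1, keepdims=True)
    patterns = years / means
    query = patterns[-1]
    # Find historical patterns most similar to the last observed year,
    # restricted to years whose successor pattern is known.
    dists = np.linalg.norm(patterns[:-1] - query, axis=1)
    nearest = np.argsort(dists)[:k]
    next_pattern = patterns[nearest + 1].mean(axis=0)
    return next_pattern * means[-1]      # rescale to the current level
```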
arXiv Detail & Related papers (2020-03-03T12:14:36Z)