Effective Guidance for Model Attention with Simple Yes-no Annotations
- URL: http://arxiv.org/abs/2410.22312v2
- Date: Fri, 15 Nov 2024 22:53:42 GMT
- Title: Effective Guidance for Model Attention with Simple Yes-no Annotations
- Authors: Seongmin Lee, Ali Payani, Duen Horng Chau
- Abstract summary: We present CRAYON (Correcting Reasoning with Annotations of Yes Or No), offering effective, scalable, and practical solutions to rectify model attention.
CRAYON achieves state-of-the-art performance, outperforming 12 methods across 3 benchmark datasets.
- Score: 22.186065674971083
- Abstract: Modern deep learning models often make predictions by focusing on irrelevant areas, leading to biased performance and limited generalization. Existing methods aimed at rectifying model attention require explicit labels for irrelevant areas or complex pixel-wise ground truth attention maps. We present CRAYON (Correcting Reasoning with Annotations of Yes Or No), offering effective, scalable, and practical solutions to rectify model attention using simple yes-no annotations. CRAYON empowers classical and modern model interpretation techniques to identify and guide model reasoning: CRAYON-ATTENTION directs classic interpretations based on saliency maps to focus on relevant image regions, while CRAYON-PRUNING removes irrelevant neurons identified by modern concept-based methods to mitigate their influence. Through extensive experiments with both quantitative and human evaluation, we showcase CRAYON's effectiveness, scalability, and practicality in refining model attention. CRAYON achieves state-of-the-art performance, outperforming 12 methods across 3 benchmark datasets, surpassing approaches that require more complex annotations.
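The abstract describes CRAYON only at a high level, but the basic recipe of turning a per-image yes-no annotation into a training signal can be illustrated concretely: penalize the model's saliency on images whose explanations an annotator marks "no" (i.e., the model attends to irrelevant regions). The following is a minimal PyTorch sketch of that idea, not the paper's implementation; the gradient-based saliency, the helper names (saliency_map, guided_loss), and the penalty form are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def saliency_map(model, images, labels):
    """Simple gradient-based saliency, normalized per image.
    Stands in for whatever interpretation method produces the explanation
    that annotators judge with yes/no (an assumption for this sketch)."""
    images = images.clone().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    # create_graph=True keeps the saliency differentiable w.r.t. model parameters
    grads, = torch.autograd.grad(loss, images, create_graph=True)
    sal = grads.abs().amax(dim=1)                      # (B, H, W)
    peak = sal.flatten(1).amax(dim=1).clamp_min(1e-8)
    return sal / peak[:, None, None]


def guided_loss(model, images, labels, is_no, alpha=1.0):
    """Cross-entropy plus a penalty that suppresses saliency on samples whose
    explanation was annotated 'no'. The penalty form is an illustrative
    assumption, not the exact CRAYON-ATTENTION objective."""
    cls_loss = F.cross_entropy(model(images), labels)
    sal = saliency_map(model, images, labels)
    no_mask = is_no.float()                            # (B,), 1.0 where annotation is "no"
    penalty = (sal.flatten(1).mean(dim=1) * no_mask).sum() / no_mask.sum().clamp_min(1.0)
    return cls_loss + alpha * penalty
```

During fine-tuning, guided_loss would replace the plain classification loss for batches that carry annotations. A CRAYON-PRUNING-style analogue would act at the neuron level instead, removing concept neurons whose visualizations are annotated "no" rather than adding a loss term.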
Related papers
- An Attention-based Framework for Fair Contrastive Learning [2.1605931466490795]
We propose a new method for fair contrastive learning that employs an attention mechanism to model bias-causing interactions.
Our attention mechanism avoids bias-causing samples that confound the model and focuses on bias-reducing samples that help learn semantically meaningful representations.
arXiv Detail & Related papers (2024-11-22T07:11:35Z)
- On Feature Decorrelation in Cloth-Changing Person Re-identification [32.27835236681253]
Cloth-changing person re-identification (CC-ReID) poses a significant challenge in computer vision.
Traditional methods to achieve this involve integrating multi-modality data or employing manually annotated clothing labels.
We introduce a novel regularization technique based on density ratio estimation.
arXiv Detail & Related papers (2024-10-07T22:25:37Z)
- Diffusion Model with Cross Attention as an Inductive Bias for Disentanglement [58.9768112704998]
Disentangled representation learning strives to extract the intrinsic factors within observed data.
We introduce a new perspective and framework, demonstrating that diffusion models with cross-attention can serve as a powerful inductive bias.
This is the first work to reveal the potent disentanglement capability of diffusion models with cross-attention, requiring no complex designs.
arXiv Detail & Related papers (2024-02-15T05:07:54Z)
- A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model architecture and parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
- Studying How to Efficiently and Effectively Guide Models with Explanations [52.498055901649025]
'Model guidance' is the idea of regularizing the models' explanations to ensure that they are "right for the right reasons".
We conduct an in-depth evaluation across various loss functions, attribution methods, models, and 'guidance depths' on the PASCAL VOC 2007 and MS COCO 2014 datasets.
Specifically, we guide the models via bounding box annotations, which are much cheaper to obtain than the commonly used segmentation masks.
arXiv Detail & Related papers (2023-03-21T15:34:50Z)
- Entity-Conditioned Question Generation for Robust Attention Distribution in Neural Information Retrieval [51.53892300802014]
We show that supervised neural information retrieval models are prone to learning sparse attention patterns over passage tokens.
Using a novel targeted synthetic data generation method, we teach neural IR models to attend more uniformly and robustly to all entities in a given passage.
arXiv Detail & Related papers (2022-04-24T22:36:48Z)
- Learning Self-Modulating Attention in Continuous Time Space with Applications to Sequential Recommendation [102.24108167002252]
We propose a novel attention network, named self-modulating attention, that models the complex and non-linearly evolving dynamic user preferences.
We empirically demonstrate the effectiveness of our method on top-N sequential recommendation tasks, and the results on three large-scale real-world datasets show that our model can achieve state-of-the-art performance.
arXiv Detail & Related papers (2022-03-30T03:54:11Z)
- Quadratic mutual information regularization in real-time deep CNN models [51.66271681532262]
A regularization method motivated by Quadratic Mutual Information is proposed.
Experiments on various binary classification problems are performed, indicating the effectiveness of the proposed models.
arXiv Detail & Related papers (2021-08-26T13:14:24Z)
- Improve the Interpretability of Attention: A Fast, Accurate, and Interpretable High-Resolution Attention Model [6.906621279967867]
We propose a novel Bilinear Representative Non-Parametric Attention (BR-NPA) strategy that captures the task-relevant human-interpretable information.
The proposed model can be easily adapted to a wide variety of modern deep models where classification is involved.
It is also more accurate, faster, and has a smaller memory footprint than typical neural attention modules.
arXiv Detail & Related papers (2021-06-04T15:57:37Z)
- Attention-gating for improved radio galaxy classification [0.0]
We introduce attention as a state-of-the-art mechanism for the classification of radio galaxies using convolutional neural networks.
We show how the selection of normalisation and aggregation methods used in attention-gating can affect the output of individual models.
The resulting attention maps can be used to interpret the classification choices made by the model.
arXiv Detail & Related papers (2020-12-02T14:49:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.