Decoding Layer Saliency in Language Transformers
- URL: http://arxiv.org/abs/2308.05219v1
- Date: Wed, 9 Aug 2023 20:53:22 GMT
- Title: Decoding Layer Saliency in Language Transformers
- Authors: Elizabeth M. Hou, Gregory Castanon
- Abstract summary: In visual networks where saliency is more well-studied, saliency is naturally localized through the convolutional layers of the network.
We adapt gradient-based saliency methods for these networks, propose a method for evaluating the degree of semantic coherence of each layer, and demonstrate consistent improvement on multiple benchmark classification datasets.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce a strategy for identifying textual saliency in
large-scale language models applied to classification tasks. In visual networks,
where saliency is better studied, it is naturally localized through
the convolutional layers of the network; however, the same is not true in
modern transformer-stack networks used to process natural language. We adapt
gradient-based saliency methods for these networks, propose a method for
evaluating the degree of semantic coherence of each layer, and demonstrate
consistent improvement over numerous other methods for textual saliency on
multiple benchmark classification datasets. Our approach requires no additional
training or access to labelled data, and is very computationally efficient
relative to alternative methods.
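The abstract's core idea, gradient-based saliency computed at each layer of a transformer classifier, can be illustrated with a short sketch. The following is a minimal example, assuming a HuggingFace-style fine-tuned BERT classifier and a simple gradient-times-activation score per token and layer; the model name is a placeholder, and the paper's exact scoring rule and its layer-wise semantic-coherence evaluation are not reproduced here.

```python
# Minimal sketch: per-layer gradient-times-activation token saliency for a
# transformer text classifier. Model name and scoring rule are illustrative
# assumptions, not the paper's exact formulation.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumes a classifier fine-tuned for the task; "bert-base-uncased" is a placeholder.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
model.eval()

text = "The plot was predictable but the acting was superb."
inputs = tokenizer(text, return_tensors="pt")

# Forward pass with hidden states exposed for every layer.
outputs = model(**inputs, output_hidden_states=True)
hidden_states = outputs.hidden_states            # (embeddings, layer 1, ..., layer L)
predicted_class = outputs.logits.argmax(dim=-1).item()
score = outputs.logits[0, predicted_class]       # scalar logit of the predicted class

# Gradient of the predicted-class logit with respect to each layer's hidden states.
grads = torch.autograd.grad(score, hidden_states)

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for layer_idx, (h, g) in enumerate(zip(hidden_states, grads)):
    # Gradient-times-activation, summed over the hidden dimension, gives one
    # saliency value per token at this layer.
    saliency = (h * g).sum(dim=-1).abs()[0]
    top_token = tokens[saliency.argmax().item()]
    print(f"layer {layer_idx:2d}: most salient token = {top_token}")
```

Comparing which tokens each layer highlights is one way to probe how semantically coherent the layer-wise attributions are, which is the kind of evaluation the abstract refers to.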
Related papers
- Mind the Gap: A Generalized Approach for Cross-Modal Embedding Alignment [0.0]
Retrieval-Augmented Generation (RAG) systems often struggle to retrieve relevant context across different text modalities because of semantic gaps.
We introduce a generalized projection-based method, inspired by adapter modules in transfer learning, that efficiently bridges these gaps (a sketch in this spirit appears after this list).
Our approach emphasizes speed, accuracy, and data efficiency, requiring minimal resources for training and inference.
arXiv Detail & Related papers (2024-10-30T20:28:10Z)
- PRIOR: Prototype Representation Joint Learning from Medical Images and Reports [19.336988866061294]
We present a prototype representation learning framework incorporating both global and local alignment between medical images and reports.
In contrast to standard global multi-modality alignment methods, we employ a local alignment module for fine-grained representation.
A sentence-wise prototype memory bank is constructed, enabling the network to focus on low-level localized visual and high-level clinical linguistic features.
arXiv Detail & Related papers (2023-07-24T07:49:01Z)
- SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation [94.11915008006483]
We propose SemAffiNet for point cloud semantic segmentation.
We conduct extensive experiments on the ScanNetV2 and NYUv2 datasets.
arXiv Detail & Related papers (2022-05-26T17:00:23Z)
- Learning Consistency from High-quality Pseudo-labels for Weakly Supervised Object Localization [7.602783618330373]
We propose a two-stage approach to learn more consistent localization.
In the first stage, we propose a mask-based pseudo label generator algorithm, and use the pseudo-supervised learning method to initialize an object localization network.
In the second stage, we propose a simple and effective method for evaluating the confidence of pseudo-labels based on classification discrimination.
arXiv Detail & Related papers (2022-03-18T09:05:51Z)
- Learning Contrastive Representation for Semantic Correspondence [150.29135856909477]
We propose a multi-level contrastive learning approach for semantic matching.
We show that image-level contrastive learning is a key component to encourage the convolutional features to find correspondence between similar objects.
arXiv Detail & Related papers (2021-09-22T18:34:14Z)
- Cross-lingual Text Classification with Heterogeneous Graph Neural Network [2.6936806968297913]
Cross-lingual text classification aims at training a classifier on the source language and transferring the knowledge to target languages.
Recent multilingual pretrained language models (mPLM) achieve impressive results in cross-lingual classification tasks.
We propose a simple yet effective method to incorporate heterogeneous information within and across languages for cross-lingual text classification.
arXiv Detail & Related papers (2021-05-24T12:45:42Z)
- Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation [53.49821324597837]
Weakly supervised semantic segmentation is a challenging problem that has been studied extensively in recent years.
We present a Context Decoupling Augmentation (CDA) method to change the inherent context in which the objects appear.
To validate the effectiveness of the proposed method, extensive experiments on PASCAL VOC 2012 dataset with several alternative network architectures demonstrate that CDA can boost various popular WSSS methods to the new state-of-the-art by a large margin.
arXiv Detail & Related papers (2021-03-02T15:05:09Z)
- Transformer Interpretability Beyond Attention Visualization [87.96102461221415]
Self-attention techniques, and specifically Transformers, are dominating the field of text processing.
In this work, we propose a novel way to compute relevancy for Transformer networks.
arXiv Detail & Related papers (2020-12-17T18:56:33Z)
- Region Comparison Network for Interpretable Few-shot Image Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z)
- Weakly-Supervised Semantic Segmentation by Iterative Affinity Learning [86.45526827323954]
Weakly-supervised semantic segmentation is a challenging task as no pixel-wise label information is provided for training.
We propose an iterative algorithm to learn such pairwise relations.
We show that the proposed algorithm performs favorably against the state-of-the-art methods.
arXiv Detail & Related papers (2020-02-19T10:32:03Z)
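As noted in the first related entry above ("Mind the Gap"), a projection-based adapter can bridge semantic gaps between embedding spaces. The sketch below shows one way such an adapter might look, trained on paired source/target embeddings with a cosine objective; the architecture, dimensions, and loss are illustrative assumptions, not that paper's implementation.

```python
# Minimal sketch of a projection-based embedding alignment adapter, in the
# spirit of the "Mind the Gap" entry above; everything here is an illustrative
# assumption rather than the paper's implementation.
import torch
import torch.nn as nn

class ProjectionAdapter(nn.Module):
    """Maps embeddings from a source space into a target embedding space."""
    def __init__(self, source_dim: int, target_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(source_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, target_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Toy paired embeddings stand in for real encoder outputs.
source = torch.randn(512, 384)   # e.g. query-encoder embeddings
target = torch.randn(512, 768)   # e.g. document-encoder embeddings

adapter = ProjectionAdapter(source_dim=384, target_dim=768)
optimizer = torch.optim.Adam(adapter.parameters(), lr=1e-3)

for step in range(100):
    projected = adapter(source)
    # Pull each projected source embedding toward its paired target embedding.
    loss = 1.0 - nn.functional.cosine_similarity(projected, target, dim=-1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Only the small adapter is trained, so both encoders stay frozen and the alignment step remains cheap at training and inference time.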
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.