Exploiting Local and Global Features in Transformer-based Extreme
Multi-label Text Classification
- URL: http://arxiv.org/abs/2204.00933v1
- Date: Sat, 2 Apr 2022 19:55:23 GMT
- Title: Exploiting Local and Global Features in Transformer-based Extreme
Multi-label Text Classification
- Authors: Ruohong Zhang, Yau-Shian Wang, Yiming Yang, Tom Vu, Likun Lei
- Abstract summary: We propose an approach that combines both the local and global features produced by Transformer models to improve the prediction power of the classifier.
Our experiments show that the proposed model either outperforms or is comparable to the state-of-the-art methods on benchmark datasets.
- Score: 28.28186933768281
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Extreme multi-label text classification (XMTC) is the task of tagging each
document with the relevant labels from a very large space of predefined
categories. Recently, large pre-trained Transformer models have brought
significant performance improvements to XMTC; they typically use the embedding
of the special CLS token to represent the semantics of the entire document as a
global feature vector and match it against candidate labels. However, we argue that
such a global feature vector may not be sufficient to represent different
granularity levels of semantics in the document, and that complementing it with
the local word-level features could bring additional gains. Based on this
insight, we propose an approach that combines both the local and global
features produced by Transformer models to improve the prediction power of the
classifier. Our experiments show that the proposed model either outperforms or
is comparable to the state-of-the-art methods on benchmark datasets.
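The abstract describes the local-global combination only at a high level, so the following PyTorch sketch illustrates one plausible reading of it: the CLS embedding supplies a global per-label score, and label-wise attention over the token embeddings supplies a complementary local score. This is a minimal illustration, not the authors' released code; the class and attribute names (LocalGlobalXMTC, label_queries, and so on) are assumptions made for the example.

```python
# Minimal sketch of combining global (CLS) and local (word-level) Transformer
# features for extreme multi-label classification. Assumes torch and the
# Hugging Face transformers library; all names here are illustrative.
import torch
import torch.nn as nn
from transformers import AutoModel

class LocalGlobalXMTC(nn.Module):
    def __init__(self, model_name: str, num_labels: int):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # Global view: score the CLS vector against every label.
        self.global_head = nn.Linear(hidden, num_labels)
        # Local view: one learned query per label, attending over tokens.
        self.label_queries = nn.Parameter(torch.randn(num_labels, hidden) * 0.02)
        self.local_head = nn.Linear(hidden, 1)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        tokens = out.last_hidden_state            # (B, T, H) local word features
        cls = tokens[:, 0]                        # (B, H) global document feature
        global_scores = self.global_head(cls)     # (B, L)
        # Label-wise attention: each label pools the tokens most relevant to it.
        attn = torch.einsum("bth,lh->btl", tokens, self.label_queries)
        attn = attn.masked_fill(attention_mask.unsqueeze(-1) == 0, -1e9)
        attn = attn.softmax(dim=1)                # normalize over tokens
        pooled = torch.einsum("btl,bth->blh", attn, tokens)   # (B, L, H)
        local_scores = self.local_head(pooled).squeeze(-1)    # (B, L)
        # Summation is one simple choice for fusing the two views.
        return global_scores + local_scores
```

In training, these per-label logits would typically be paired with a binary cross-entropy loss (nn.BCEWithLogitsLoss) over the label set, the standard objective for multi-label classification.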
Related papers
- Embracing Diversity: Interpretable Zero-shot classification beyond one vector per class [16.101460010750458]
We argue that to represent diversity within a class, zero-shot classification should move beyond a single vector.
We propose a method to encode and account for diversity within a class using inferred attributes, still in the zero-shot setting without retraining.
We find our method consistently outperforms standard zero-shot classification over a large suite of datasets.
arXiv Detail & Related papers (2024-04-25T16:29:06Z)
- MCTformer+: Multi-Class Token Transformer for Weakly Supervised Semantic Segmentation [90.73815426893034]
We propose a transformer-based framework that aims to enhance weakly supervised semantic segmentation.
We introduce a Multi-Class Token transformer, which incorporates multiple class tokens to enable class-aware interactions with the patch tokens.
A Contrastive-Class-Token (CCT) module is proposed to enhance the learning of discriminative class tokens.
arXiv Detail & Related papers (2023-08-06T03:30:20Z)
- TART: Improved Few-shot Text Classification Using Task-Adaptive Reference Transformation [23.02986307143718]
We propose a novel Task-Adaptive Reference Transformation (TART) network to enhance generalization.
Our model surpasses the state-of-the-art method by 7.4% and 5.4% in 1-shot and 5-shot classification on the 20 Newsgroups dataset.
arXiv Detail & Related papers (2023-06-03T18:38:02Z)
- Retrieval-augmented Multi-label Text Classification [20.100081284294973]
Multi-label text classification is a challenging task in settings of large label sets.
Retrieval augmentation aims to improve the sample efficiency of classification models.
We evaluate this approach on four datasets from the legal and biomedical domains.
arXiv Detail & Related papers (2023-05-22T14:16:23Z)
- HGFormer: Hierarchical Grouping Transformer for Domain Generalized Semantic Segmentation [113.6560373226501]
This work studies semantic segmentation under the domain generalization setting.
We propose a novel hierarchical grouping transformer (HGFormer) to explicitly group pixels to form part-level masks and then whole-level masks.
Experiments show that HGFormer yields more robust semantic segmentation results than per-pixel classification methods and flat grouping transformers.
arXiv Detail & Related papers (2023-05-22T13:33:41Z)
- Adversarial Adaptation for French Named Entity Recognition [21.036698406367115]
We propose a Transformer-based NER approach for French, using adversarial adaptation to similar-domain or general corpora.
Our approach allows learning better features using large-scale unlabeled corpora from the same domain or mixed domains.
We also show that adversarial adaptation to large-scale unlabeled corpora can help mitigate the performance dip incurred when using Transformer models pre-trained on smaller corpora.
arXiv Detail & Related papers (2023-01-12T18:58:36Z)
- Multi-class Token Transformer for Weakly Supervised Semantic Segmentation [94.78965643354285]
We propose a new transformer-based framework to learn class-specific object localization maps as pseudo labels for weakly supervised semantic segmentation (WSSS).
Inspired by the fact that the attended regions of the one-class token in the standard vision transformer can be leveraged to form a class-agnostic localization map, we investigate if the transformer model can also effectively capture class-specific attention for more discriminative object localization.
The proposed framework is shown to fully complement the Class Activation Mapping (CAM) method, leading to remarkably superior WSSS results on the PASCAL VOC and MS COCO datasets.
arXiv Detail & Related papers (2022-03-06T07:18:23Z)
- Scaling up Multi-domain Semantic Segmentation with Sentence Embeddings [81.09026586111811]
We propose an approach to semantic segmentation that achieves state-of-the-art supervised performance when applied in a zero-shot setting.
This is achieved by replacing each class label with a vector-valued embedding of a short paragraph that describes the class.
The resulting merged semantic segmentation dataset of over 2 million images enables training a model that achieves performance equal to that of state-of-the-art supervised methods on 7 benchmark datasets.
arXiv Detail & Related papers (2022-02-04T07:19:09Z)
- Generalized Funnelling: Ensemble Learning and Heterogeneous Document Embeddings for Cross-Lingual Text Classification [78.83284164605473]
Funnelling (Fun) is a recently proposed method for cross-lingual text classification.
We describe Generalized Funnelling (gFun) as a generalization of Fun.
We show that gFun substantially improves over Fun and over state-of-the-art baselines.
arXiv Detail & Related papers (2021-09-17T23:33:04Z)
- Learning to Predict Context-adaptive Convolution for Semantic Segmentation [66.27139797427147]
Long-range contextual information is essential for achieving high-performance semantic segmentation.
We propose a Context-adaptive Convolution Network (CaC-Net) to predict a spatially-varying feature weighting vector.
Our CaC-Net achieves superior segmentation performance on three public datasets.
arXiv Detail & Related papers (2020-04-17T13:09:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.