BERT-Flow-VAE: A Weakly-supervised Model for Multi-Label Text
Classification
- URL: http://arxiv.org/abs/2210.15225v1
- Date: Thu, 27 Oct 2022 07:18:56 GMT
- Title: BERT-Flow-VAE: A Weakly-supervised Model for Multi-Label Text
Classification
- Authors: Ziwen Liu, Josep Grau-Bove, Scott Allan Orr
- Abstract summary: We propose BERT-Flow-VAE (BFV), a Weakly-Supervised Multi-Label Text Classification model that reduces the need for full supervision.
Experimental results on 6 multi-label datasets show that BFV can substantially outperform other baseline WSMLTC models in key metrics.
- Score: 0.5156484100374058
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-label Text Classification (MLTC) is the task of categorizing documents
into one or more topics. Considering the large volumes of data and varying
domains of such tasks, fully supervised learning requires fully manually
annotated datasets, which are costly and time-consuming to produce. In this paper, we
propose BERT-Flow-VAE (BFV), a Weakly-Supervised Multi-Label Text
Classification (WSMLTC) model that reduces the need for full supervision. This
new model (1) produces BERT sentence embeddings and calibrates them using a
flow model, (2) generates an initial topic-document matrix by averaging results
of a seeded sparse topic model and a textual entailment model, which only
require the surface names of topics and 4-6 seed words per topic, and (3) adopts a
VAE framework to reconstruct the embeddings under the guidance of the
topic-document matrix. Finally, (4) it uses the means produced by the encoder
model in the VAE architecture as predictions for MLTC. Experimental results on
6 multi-label datasets show that BFV can substantially outperform other
baseline WSMLTC models in key metrics and achieve approximately 84% of the
performance of a fully-supervised model.
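Step (2) of the abstract, averaging the outputs of two weak labelers into an initial topic-document matrix, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the (documents x topics) orientation, the assumption that both models emit scores in [0, 1], and the 0.5 threshold used to read off labels are all hypothetical choices made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def average_score_matrices(seeded: Matrix, entailment: Matrix) -> Matrix:
    """Element-wise mean of two score matrices of identical shape.

    Rows are documents and columns are topics (an assumed orientation).
    """
    same_rows = len(seeded) == len(entailment)
    same_cols = all(len(a) == len(b) for a, b in zip(seeded, entailment))
    if not (same_rows and same_cols):
        raise ValueError("score matrices must have the same shape")
    return [
        [(s + e) / 2.0 for s, e in zip(row_s, row_e)]
        for row_s, row_e in zip(seeded, entailment)
    ]


def threshold_labels(scores: Matrix, threshold: float = 0.5) -> List[List[int]]:
    """Turn scores into multi-label 0/1 indicators (threshold is an assumption)."""
    return [[1 if s >= threshold else 0 for s in row] for row in scores]


if __name__ == "__main__":
    # Hypothetical scores for 2 documents and 2 topics.
    seeded_scores = [[0.9, 0.1], [0.2, 0.8]]
    entailment_scores = [[0.7, 0.3], [0.6, 0.4]]
    initial = average_score_matrices(seeded_scores, entailment_scores)
    print(threshold_labels(initial))
```

In the model described above, this averaged matrix only guides the VAE; the final predictions come from the encoder means, which the sketch does not attempt to reproduce.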
Related papers
- VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks [60.5257456681402]
We build universal embedding models capable of handling a wide range of downstream tasks.
Our contributions are twofold: (1) MMEB (Massive Multimodal Embedding Benchmark), which covers 4 meta-tasks (i.e. classification, visual question answering, multimodal retrieval, and visual grounding) and 36 datasets, including 20 training and 16 evaluation datasets, and (2) VLM2Vec (Vision-Language Model -> Vector), a contrastive training framework that converts any state-of-the-art vision-language model into an embedding model via training on MMEB.
arXiv Detail & Related papers (2024-10-07T16:14:05Z)
- A Small Claims Court for the NLP: Judging Legal Text Classification Strategies With Small Datasets [0.0]
This paper investigates the best strategies for optimizing the use of a small labeled dataset and large amounts of unlabeled data.
We use the records of demands to a Brazilian Public Prosecutor's Office, aiming to assign the descriptions to one of the subjects.
The best result was obtained with Unsupervised Data Augmentation (UDA), which jointly uses BERT, data augmentation, and strategies of semi-supervised learning.
arXiv Detail & Related papers (2024-09-09T18:10:05Z)
- TextSquare: Scaling up Text-Centric Visual Instruction Tuning [64.55339431760727]
We introduce a new approach for creating a massive, high-quality instruction-tuning dataset, Square-10M.
Our model, TextSquare, considerably surpasses previous open-source state-of-the-art text-centric MLLMs.
It even outperforms top-tier models like GPT4V and Gemini in 6 of 10 text-centric benchmarks.
arXiv Detail & Related papers (2024-04-19T11:38:08Z)
- TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data [73.29220562541204]
We consider harnessing the power of large language models (LLMs) to solve our task.
We develop a TAT-LLM language model by fine-tuning LLaMA 2 with the training data generated automatically from existing expert-annotated datasets.
arXiv Detail & Related papers (2024-01-24T04:28:50Z)
- FLIP: Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction [49.510163437116645]
Click-through rate (CTR) prediction serves as a core function module in personalized online services.
Traditional ID-based models for CTR prediction take as inputs the one-hot encoded ID features of tabular modality.
The emergence of Pretrained Language Models (PLMs) has given rise to another paradigm, which takes as inputs the sentences of textual modality.
We propose to conduct Fine-grained feature-level ALignment between ID-based Models and Pretrained Language Models (FLIP) for CTR prediction.
arXiv Detail & Related papers (2023-10-30T11:25:03Z)
- Attention is Not Always What You Need: Towards Efficient Classification of Domain-Specific Text [1.1508304497344637]
For large-scale IT corpora with hundreds of classes organized in a hierarchy, the task of accurate classification of classes at the higher level in the hierarchies is crucial.
In the business world, an efficient and explainable ML model is preferred over an expensive black-box model, especially if the performance increase is marginal.
Despite the widespread use of PLMs, there is a lack of a clear and well-justified need as to why these models are being employed for domain-specific text classification.
arXiv Detail & Related papers (2023-03-31T03:17:23Z)
- Benchmarking Multimodal AutoML for Tabular Data with Text Fields [83.43249184357053]
We assemble 18 multimodal data tables that each contain some text fields.
Our benchmark enables researchers to evaluate their own methods for supervised learning with numeric, categorical, and text features.
arXiv Detail & Related papers (2021-11-04T09:29:16Z)
- PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization [16.830963601598242]
We propose PRIMER, a pre-trained model for multi-document representation with focus on summarization.
Specifically, we adopt the Longformer architecture with proper input transformation and global attention to fit multi-document inputs.
Our model, PRIMER, outperforms current state-of-the-art models on most of these settings with large margins.
arXiv Detail & Related papers (2021-10-16T07:22:24Z)
- Students Need More Attention: BERT-based Attention Model for Small Data with Application to Automatic Patient Message Triage [65.7062363323781]
We propose a novel framework based on BioBERT (Bidirectional Representations from Transformers for Biomedical Text Mining).
We (i) introduce Label Embeddings for Self-Attention in each layer of BERT, which we call LESA-BERT, and (ii) distill LESA-BERT to smaller variants, aiming to reduce overfitting and model size when working on small datasets.
As an application, our framework is utilized to build a model for patient portal message triage that classifies the urgency of a message into three categories: non-urgent, medium and urgent.
arXiv Detail & Related papers (2020-06-22T03:39:00Z)