BERT-Flow-VAE: A Weakly-supervised Model for Multi-Label Text Classification
- URL: http://arxiv.org/abs/2210.15225v1
- Date: Thu, 27 Oct 2022 07:18:56 GMT
- Title: BERT-Flow-VAE: A Weakly-supervised Model for Multi-Label Text Classification
- Authors: Ziwen Liu, Josep Grau-Bove, Scott Allan Orr
- Abstract summary: We propose BERT-Flow-VAE (BFV), a Weakly-Supervised Multi-Label Text Classification model that reduces the need for full supervision.
Experimental results on 6 multi-label datasets show that BFV can substantially outperform other baseline WSMLTC models in key metrics.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-label Text Classification (MLTC) is the task of categorizing documents
into one or more topics. Considering the large volumes of data and varying
domains of such tasks, fully supervised learning requires fully annotated
datasets, whose manual creation is costly and time-consuming. In this paper, we
propose BERT-Flow-VAE (BFV), a Weakly-Supervised Multi-Label Text
Classification (WSMLTC) model that reduces the need for full supervision. This
new model (1) produces BERT sentence embeddings and calibrates them using a
flow model, (2) generates an initial topic-document matrix by averaging results
of a seeded sparse topic model and a textual entailment model which only
require surface name of topics and 4-6 seed words per topic, and (3) adopts a
VAE framework to reconstruct the embeddings under the guidance of the
topic-document matrix. Finally, (4) it uses the means produced by the encoder
model in the VAE architecture as predictions for MLTC. Experimental results on
6 multi-label datasets show that BFV can substantially outperform other
baseline WSMLTC models in key metrics and achieve approximately 84% of the
performance of a fully-supervised model.
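The four numbered steps of the BFV pipeline can be illustrated with a minimal NumPy sketch. The learned components are stubbed out (whitening stands in for the normalizing-flow calibration, and the VAE encoder is an untrained linear map), so this shows only the data flow between the steps, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder inputs: in the paper these come from a BERT encoder, a seeded
# sparse topic model, and a textual entailment model. Sizes are arbitrary
# (emb_dim is kept small here; BERT embeddings are typically 768-d).
n_docs, emb_dim, n_topics = 100, 32, 6
embeddings = rng.normal(size=(n_docs, emb_dim))       # raw BERT sentence embeddings
topic_scores_seeded = rng.random((n_docs, n_topics))  # seeded sparse topic model scores
topic_scores_entail = rng.random((n_docs, n_topics))  # textual entailment scores

# (1) Calibrate the embeddings. The paper uses a normalizing flow; whitening is
# a cheap stand-in that likewise maps the space toward an isotropic Gaussian.
mu = embeddings.mean(axis=0)
cov = np.cov(embeddings, rowvar=False)
U, s, _ = np.linalg.svd(cov)
calibrated = (embeddings - mu) @ (U / np.sqrt(s + 1e-8))

# (2) Initial topic-document matrix: average the two weak supervision signals.
topic_doc = 0.5 * (topic_scores_seeded + topic_scores_entail)

# (3)+(4) A VAE is trained to reconstruct `calibrated` while its latent means
# are guided toward `topic_doc`; after training, the encoder means serve as
# the multi-label predictions. Stubbed here as an untrained linear encoder.
enc_W = rng.normal(scale=0.1, size=(emb_dim, n_topics))
latent_means = calibrated @ enc_W
predictions = 1.0 / (1.0 + np.exp(-latent_means))     # per-topic probabilities

print(predictions.shape)  # (100, 6)
```

In the actual model the guidance in step (3) comes from the VAE training objective, so the predictions reflect both the weak topic signal and the reconstruction of the calibrated embeddings.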
Related papers
- TextSquare: Scaling up Text-Centric Visual Instruction Tuning [64.55339431760727]
We introduce a new approach for creating a massive, high-quality instruction-tuning dataset, Square-10M.
Our model, TextSquare, considerably surpasses the previous open-source state-of-the-art text-centric MLLMs.
It even outperforms top-tier models like GPT4V and Gemini in 6 of 10 text-centric benchmarks.
arXiv Detail & Related papers (2024-04-19T11:38:08Z)
- Multi-modal Auto-regressive Modeling via Visual Words [96.25078866446053]
We propose the concept of visual words, which maps the visual features to probability distributions over Large Multi-modal Models' vocabulary.
We further explore the distribution of visual features in the semantic space within LMM and the possibility of using text embeddings to represent visual information.
arXiv Detail & Related papers (2024-03-12T14:58:52Z)
- TAT-LLM: A Specialized Language Model for Discrete Reasoning over Tabular and Textual Data [77.66158066013924]
We consider harnessing the power of large language models (LLMs) to solve our task.
We develop a TAT-LLM language model by fine-tuning LLaMA 2 with the training data generated automatically from existing expert-annotated datasets.
arXiv Detail & Related papers (2024-01-24T04:28:50Z)
- TDeLTA: A Light-weight and Robust Table Detection Method based on Learning Text Arrangement [34.73880086005418]
We propose a novel, lightweight and robust Table Detection method based on Learning Text Arrangement, namely TDeLTA.
To locate the tables precisely, we design a text-classification task, classifying the text blocks into 4 categories according to their semantic roles in the tables.
Compared to several state-of-the-art methods, TDeLTA achieves competitive results with only 3.1M model parameters on the large-scale public datasets.
arXiv Detail & Related papers (2023-12-18T09:18:43Z)
- FLIP: Towards Fine-grained Alignment between ID-based Models and Pretrained Language Models for CTR Prediction [49.510163437116645]
We propose to conduct Fine-grained feature-level ALignment between ID-based Models and Pretrained Language Models (FLIP) for click-through rate (CTR) prediction.
Specifically, the masked data of one modality (i.e., tokens or features) has to be recovered with the help of the other modality, which establishes the feature-level interaction and alignment.
Experiments on three real-world datasets demonstrate that FLIP outperforms SOTA baselines, and is highly compatible for various ID-based models and PLMs.
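The masked cross-modality recovery described above can be roughly illustrated with a minimal NumPy sketch; all names, shapes, and the linear projector are illustrative assumptions, not FLIP's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical paired views of the same samples: tabular ID-based features and
# text-side features (names and shapes are illustrative assumptions).
n, d = 8, 16
id_feats = rng.normal(size=(n, d))
text_feats = rng.normal(size=(n, d))

# Mask a fixed subset of the ID-feature rows...
mask = np.zeros(n, dtype=bool)
mask[::2] = True
masked = id_feats.copy()
masked[mask] = 0.0

# ...and recover them from the text modality via a (stub) cross-modal projector.
proj = rng.normal(scale=0.1, size=(d, d))
recovered = masked.copy()
recovered[mask] = text_feats[mask] @ proj

# Training minimizes reconstruction error on the masked positions, which is
# what forces feature-level interaction and alignment between the modalities.
loss = np.mean((recovered[mask] - id_feats[mask]) ** 2)
print(recovered.shape)  # (8, 16)
```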
arXiv Detail & Related papers (2023-10-30T11:25:03Z)
- Attention is Not Always What You Need: Towards Efficient Classification of Domain-Specific Text [1.1508304497344637]
For large-scale IT corpora with hundreds of classes organized in a hierarchy, accurate classification at the higher levels of the hierarchy is crucial.
In the business world, an efficient and explainable ML model is preferred over an expensive black-box model, especially if the performance increase is marginal.
Despite the widespread use of PLMs, there is a lack of clear and well-justified rationale as to why these models are being employed for domain-specific text classification.
arXiv Detail & Related papers (2023-03-31T03:17:23Z)
- Benchmarking Multimodal AutoML for Tabular Data with Text Fields [83.43249184357053]
We assemble 18 multimodal data tables that each contain some text fields.
Our benchmark enables researchers to evaluate their own methods for supervised learning with numeric, categorical, and text features.
arXiv Detail & Related papers (2021-11-04T09:29:16Z)
- PRIMER: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization [16.830963601598242]
We propose PRIMER, a pre-trained model for multi-document representation with focus on summarization.
Specifically, we adopt the Longformer architecture with proper input transformation and global attention to fit multi-document inputs.
Our model, PRIMER, outperforms current state-of-the-art models on most of these settings with large margins.
arXiv Detail & Related papers (2021-10-16T07:22:24Z)
- VAULT: VAriable Unified Long Text Representation for Machine Reading Comprehension [31.639069657951747]
Existing models for Machine Reading Comprehension require complex architectures to model long texts with paragraph representation and classification.
We propose VAULT: a light-weight and parallel-efficient paragraph representation for MRC based on contextualized representation from long document input.
arXiv Detail & Related papers (2021-05-07T13:03:43Z)
- Students Need More Attention: BERT-based Attention Model for Small Data with Application to Automatic Patient Message Triage [65.7062363323781]
We propose a novel framework based on BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining).
(i) We introduce Label Embeddings for Self-Attention in each layer of BERT, which we call LESA-BERT, and (ii) we distill LESA-BERT into smaller variants to reduce overfitting and model size when working on small datasets.
As an application, our framework is utilized to build a model for patient portal message triage that classifies the urgency of a message into three categories: non-urgent, medium and urgent.
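The label-embedding idea above can be sketched as self-attention over a label-augmented sequence. This is one plausible reading of the summary, not the authors' implementation, and every name and shape below is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(2)

# Append one learned embedding per class to the token sequence so that, in a
# given layer, tokens and labels can attend to each other.
seq_len, d_model, n_labels = 10, 32, 3  # triage classes: non-urgent, medium, urgent
tokens = rng.normal(size=(seq_len, d_model))
label_emb = rng.normal(size=(n_labels, d_model))

x = np.vstack([tokens, label_emb])      # (seq_len + n_labels, d_model)

# Single-head self-attention over the label-augmented sequence.
Wq, Wk, Wv = (rng.normal(scale=0.1, size=(d_model, d_model)) for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv
scores = Q @ K.T / np.sqrt(d_model)
attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
attn /= attn.sum(axis=-1, keepdims=True)  # row-wise softmax
out = attn @ V

# The updated label positions can feed a classifier over the 3 urgency classes.
label_states = out[seq_len:]
print(label_states.shape)  # (3, 32)
```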
arXiv Detail & Related papers (2020-06-22T03:39:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.