Weakly Supervised Text Classification using Supervision Signals from a
Language Model
- URL: http://arxiv.org/abs/2205.06604v1
- Date: Fri, 13 May 2022 12:57:15 GMT
- Title: Weakly Supervised Text Classification using Supervision Signals from a
Language Model
- Authors: Ziqian Zeng, Weimin Ni, Tianqing Fang, Xiang Li, Xinran Zhao and
Yangqiu Song
- Abstract summary: We design a prompt which combines the document itself and "this article is talking about [MASK]"
A masked language model can generate words for the [MASK] token.
The generated words which summarize the content of a document can be utilized as supervision signals.
- Score: 33.5830441120473
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Solving text classification in a weakly supervised manner is important for
real-world applications where human annotations are scarce. In this paper, we
propose to query a masked language model with cloze style prompts to obtain
supervision signals. We design a prompt which combines the document itself and
"this article is talking about [MASK]." A masked language model can generate
words for the [MASK] token. The generated words which summarize the content of
a document can be utilized as supervision signals. We propose a latent variable
model to learn a word distribution learner which associates generated words to
pre-defined categories and a document classifier simultaneously without using
any annotated data. Evaluation on three datasets, AGNews, 20Newsgroups, and
UCINews, shows that our method can outperform baselines by 2%, 4%, and 3%.
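As a concrete illustration of the querying step, the sketch below feeds the paper's cloze prompt to a masked language model via the HuggingFace fill-mask pipeline. The choice of bert-base-uncased and the example document are assumptions for illustration; the abstract does not name a specific model.

```python
# Minimal sketch of the supervision-signal step described in the abstract,
# assuming bert-base-uncased as the masked language model. Not the authors'
# implementation.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

document = "Oil prices fell sharply after OPEC announced higher output."
prompt = document + " this article is talking about [MASK]."

# The top-k words predicted for [MASK] summarize the document's content
# and serve as weak supervision signals for the classifier.
for pred in fill_mask(prompt, top_k=5):
    print(pred["token_str"], round(pred["score"], 3))
```

The top-ranked words would then be mapped to the pre-defined categories by the word distribution learner in the latent variable model described above.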
Related papers
- NextLevelBERT: Masked Language Modeling with Higher-Level Representations for Long Documents [17.94934249657174]
NextLevelBERT is a Masked Language Model operating not on tokens, but on higher-level semantic representations in the form of text embeddings.
We find that next-level Masked Language Modeling is an effective technique to tackle long-document use cases and can outperform much larger embedding models as long as the required level of semantic detail is not too fine.
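The core idea lends itself to a rough sketch: embed chunks of a long document, mask one chunk embedding, and train a small transformer to reconstruct it. The chunk embedder and tiny encoder below are stand-ins, not the paper's actual architecture or objective.

```python
# Rough sketch of masked modeling over chunk embeddings instead of tokens.
# The embedder ("all-MiniLM-L6-v2", 384-dim) and the small encoder are
# illustrative stand-ins; NextLevelBERT's real setup differs in detail.
import torch
import torch.nn as nn
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def split_into_chunks(text, words_per_chunk=64):
    words = text.split()
    return [" ".join(words[i:i + words_per_chunk])
            for i in range(0, len(words), words_per_chunk)]

long_document = " ".join(["lorem ipsum"] * 300)       # stand-in long document
chunks = split_into_chunks(long_document)
emb = torch.tensor(embedder.encode(chunks))           # (num_chunks, 384)

next_level = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=384, nhead=6, batch_first=True),
    num_layers=2,
)

# Mask one chunk embedding and train the next-level model to reconstruct it.
masked = emb.clone()
masked[1] = 0.0                                       # the "masked" position
pred = next_level(masked.unsqueeze(0)).squeeze(0)
loss = nn.functional.mse_loss(pred[1], emb[1])
loss.backward()                                       # one reconstruction step
```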
arXiv Detail & Related papers (2024-02-27T16:56:30Z)
- Token Prediction as Implicit Classification to Identify LLM-Generated Text [37.89852204279844]
This paper introduces a novel approach for identifying the possible large language models (LLMs) involved in text generation.
Instead of adding an additional classification layer to a base LM, we reframe the classification task as a next-token prediction task.
We utilize the Text-to-Text Transfer Transformer (T5) model as the backbone for our experiments.
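A hedged sketch of this reframing: the class label is produced as generated text by the T5 decoder rather than read off a dedicated classification head. The task prefix and usage below are illustrative, not taken from the paper.

```python
# Sketch of classification as next-token prediction with T5: the label is
# emitted as text by the decoder instead of a classification layer. The
# prompt format and fine-tuning recipe here are assumptions.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# After fine-tuning on (text, label-string) pairs, the generated tokens act
# as the predicted class (e.g. which LLM produced the text).
text = "The generated passage exhibits unusually uniform sentence lengths."
inputs = tokenizer("classify source: " + text, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=4)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```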
arXiv Detail & Related papers (2023-11-15T06:33:52Z)
- Multi-Modal Classifiers for Open-Vocabulary Object Detection [104.77331131447541]
The goal of this paper is open-vocabulary object detection (OVOD).
We adopt a standard two-stage object detector architecture.
We explore three ways of specifying the classifier: language descriptions, image exemplars, or a combination of the two.
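One way to picture these three options is with CLIP embeddings, sketched below; CLIP and the simple embedding averaging are assumptions for illustration, not necessarily the paper's exact recipe.

```python
# Sketch of building an open-vocabulary classifier vector from a language
# description, from image exemplars, or from both, using CLIP embeddings.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# (1) classifier from a language description
text_in = processor(text=["a small bird with a red chest"], return_tensors="pt")
text_vec = model.get_text_features(**text_in)

# (2) classifier from image exemplars (blank stand-ins; use real crops)
exemplars = [Image.new("RGB", (224, 224)) for _ in range(2)]
img_in = processor(images=exemplars, return_tensors="pt")
img_vec = model.get_image_features(**img_in).mean(dim=0, keepdim=True)

# (3) combination of the two: average the L2-normalized embeddings
norm = lambda v: v / v.norm(dim=-1, keepdim=True)
classifier = norm(norm(text_vec) + norm(img_vec))
# Detector region features are then scored against `classifier`.
```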
arXiv Detail & Related papers (2023-06-08T18:31:56Z)
- I2MVFormer: Large Language Model Generated Multi-View Document Supervision for Zero-Shot Image Classification [108.83932812826521]
Large Language Models (LLM) trained on web-scale text show impressive abilities to repurpose their learned knowledge for a multitude of tasks.
Our proposed model, I2MVFormer, learns multi-view semantic embeddings for zero-shot image classification with these class views.
I2MVFormer establishes a new state-of-the-art on three public benchmark datasets for zero-shot image classification with unsupervised semantic embeddings.
arXiv Detail & Related papers (2022-12-05T14:11:36Z)
- Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information and have proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
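The underlying self-training loop can be sketched generically: train on the few labeled examples, pseudo-label the unlabeled data, keep only confident predictions, and retrain. SFLM applies this idea to prompt-based fine-tuning of a language model; the scikit-learn sketch below only demonstrates the pseudo-labeling mechanics.

```python
# Generic self-training loop (illustrative, not SFLM's implementation).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

labeled_texts = ["great movie", "terrible plot"]
labels = np.array([1, 0])
unlabeled_texts = ["loved every minute", "waste of time", "it was fine"]

vec = TfidfVectorizer().fit(labeled_texts + unlabeled_texts)
X_lab, X_unlab = vec.transform(labeled_texts), vec.transform(unlabeled_texts)

clf = LogisticRegression().fit(X_lab, labels)
for _ in range(3):                       # a few self-training rounds
    probs = clf.predict_proba(X_unlab)
    confident = probs.max(axis=1) > 0.8  # confidence threshold (a choice)
    if not confident.any():
        break
    X_aug = np.vstack([X_lab.toarray(), X_unlab[confident].toarray()])
    y_aug = np.concatenate([labels, probs[confident].argmax(axis=1)])
    clf = LogisticRegression().fit(X_aug, y_aug)
```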
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
- Generalized Funnelling: Ensemble Learning and Heterogeneous Document Embeddings for Cross-Lingual Text Classification [78.83284164605473]
Funnelling (Fun) is a recently proposed method for cross-lingual text classification.
We describe Generalized Funnelling (gFun) as a generalization of Fun.
We show that gFun substantially improves over Fun and over state-of-the-art baselines.
arXiv Detail & Related papers (2021-09-17T23:33:04Z)
- Text Classification Using Label Names Only: A Language Model Self-Training Approach [80.63885282358204]
Current text classification methods typically require a substantial number of human-labeled documents as training data.
We show that our model achieves around 90% accuracy on four benchmark datasets including topic and sentiment classification.
arXiv Detail & Related papers (2020-10-14T17:06:41Z)
- Watch, read and lookup: learning to spot signs from multiple supervisors [99.50956498009094]
Given a video of an isolated sign, our task is to identify whether and where it has been signed in a continuous, co-articulated sign language video.
We train a model using multiple types of available supervision by: (1) watching existing sparsely labelled footage; (2) reading associated subtitles which provide additional weak-supervision; and (3) looking up words in visual sign language dictionaries.
These three tasks are integrated into a unified learning framework using the principles of Noise Contrastive Estimation and Multiple Instance Learning.
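A toy sketch of how those two principles can combine: with no window-level labels (the multiple-instance setting), the best-matching temporal window stands in as the positive of a contrastive, NCE-style objective. The dimensions and temperature below are illustrative choices, not the paper's framework.

```python
# MIL + NCE sketch: the best-scoring window is treated as the positive
# because no individual window carries a label.
import torch
import torch.nn.functional as F

query = torch.randn(256)            # embedding of the isolated sign clip
windows = torch.randn(50, 256)      # embeddings of continuous-video windows

sims = F.cosine_similarity(query.unsqueeze(0), windows) / 0.07  # temperature
positive = sims.max()               # MIL: best window stands in as positive
loss = -(positive - torch.logsumexp(sims, dim=0))               # NCE-style loss
```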
arXiv Detail & Related papers (2020-10-08T14:12:56Z)
- Keyphrase Extraction with Span-based Feature Representations [13.790461555410747]
Keyphrases are capable of providing semantic metadata characterizing documents.
Three approaches address keyphrase extraction: (i) the traditional two-step ranking method, (ii) sequence labeling, and (iii) generation using neural networks.
In this paper, we propose a novel Span Keyphrase Extraction model that extracts span-based feature representations of keyphrases directly from all the content tokens.
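A toy sketch of span-based scoring: enumerate candidate spans over contextual token embeddings and score each with a small feed-forward head. The encoder choice and endpoint-concatenation features are assumptions, and the scorer shown is untrained; in practice it would be trained on labeled keyphrase spans.

```python
# Toy span-based keyphrase scoring sketch (not the paper's exact design).
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = AutoModel.from_pretrained("bert-base-uncased")
scorer = nn.Sequential(nn.Linear(2 * 768, 256), nn.ReLU(), nn.Linear(256, 1))

text = "Keyphrases provide semantic metadata characterizing documents."
inputs = tok(text, return_tensors="pt")
hidden = enc(**inputs).last_hidden_state.squeeze(0)     # (seq_len, 768)

spans, scores = [], []
for i in range(1, hidden.size(0) - 1):                  # skip [CLS]/[SEP]
    for j in range(i, min(i + 4, hidden.size(0) - 1)):  # spans up to 4 tokens
        feat = torch.cat([hidden[i], hidden[j]])        # endpoint features
        spans.append((i, j))
        scores.append(scorer(feat))

best = max(range(len(spans)), key=lambda k: scores[k].item())
i, j = spans[best]
print(tok.decode(inputs["input_ids"][0][i:j + 1]))
```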
arXiv Detail & Related papers (2020-02-13T09:48:31Z)
- Adapting Deep Learning for Sentiment Classification of Code-Switched Informal Short Text [1.6752182911522517]
We present a labeled dataset called MultiSenti for sentiment classification of code-switched informal short text.
We propose a deep learning-based model for sentiment classification of code-switched informal short text.
arXiv Detail & Related papers (2020-01-04T06:31:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.