Boosting Short Text Classification with Multi-Source Information Exploration and Dual-Level Contrastive Learning
- URL: http://arxiv.org/abs/2501.09214v1
- Date: Thu, 16 Jan 2025 00:26:15 GMT
- Title: Boosting Short Text Classification with Multi-Source Information Exploration and Dual-Level Contrastive Learning
- Authors: Yonghao Liu, Mengyu Li, Wei Pang, Fausto Giunchiglia, Lan Huang, Xiaoyue Feng, Renchu Guan
- Abstract summary: We propose a novel model named MI-DELIGHT for short text classification.
It first performs multi-source information exploration to alleviate sparsity issues.
Then, a graph learning approach is adopted to learn representations of the short texts.
- Score: 12.377363857246602
- Abstract: Short text classification, as a research subtopic in natural language processing, is particularly challenging due to semantic sparsity and the scarcity of labeled samples in practical scenarios. We propose a novel model named MI-DELIGHT for short text classification in this work. Specifically, it first performs multi-source information (i.e., statistical, linguistic, and factual information) exploration to alleviate the sparsity issues. Then, a graph learning approach is adopted to learn the representation of short texts, which are represented in graph form. Moreover, we introduce a dual-level (i.e., instance-level and cluster-level) contrastive learning auxiliary task to effectively capture contrastive information at different granularities within massive unlabeled data. Meanwhile, previous models merely perform the main task and auxiliary tasks in parallel, without considering the relationships among tasks. Therefore, we introduce a hierarchical architecture to explicitly model the correlations between tasks. We conduct extensive experiments across various benchmark datasets, demonstrating that MI-DELIGHT significantly surpasses previous competitive models. It even outperforms popular large language models on several datasets.
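The abstract does not spell out the objective, but a dual-level contrastive task of this kind is commonly built from an instance-level InfoNCE loss over paired embeddings plus a cluster-level loss over soft cluster assignments (as in Contrastive Clustering). The sketch below is a minimal, hypothetical PyTorch rendering under those assumptions; `z1`/`z2` and `p1`/`p2` are illustrative names, not MI-DELIGHT's actual code.

```python
import torch
import torch.nn.functional as F

def info_nce(a: torch.Tensor, b: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    # Symmetric InfoNCE: row i of `a` and row i of `b` are a positive pair,
    # all other rows in the batch serve as negatives.
    a = F.normalize(a, dim=1)
    b = F.normalize(b, dim=1)
    logits = a @ b.t() / temperature                    # [N, N] cosine similarities
    targets = torch.arange(a.size(0), device=a.device)  # positives on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

def dual_level_loss(z1, z2, p1, p2, temperature=0.5):
    # Instance level: contrast two views' embeddings z1, z2 of shape [N, d].
    # Cluster level: contrast soft assignments p1, p2 of shape [N, K]; each
    # column is treated as a cluster representation, so transposing makes
    # the K clusters the "instances" being contrasted.
    return (info_nce(z1, z2, temperature) +
            info_nce(p1.t(), p2.t(), temperature))
```

In a full pipeline, these losses would be weighted and combined with the supervised classification loss; the hierarchical architecture described above would additionally feed the contrastive representations into the main task rather than training all tasks purely in parallel.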
Related papers
- A Simple Graph Contrastive Learning Framework for Short Text Classification [23.36436403062214]
We propose a Simple graph contrastive learning framework for Short Text Classification (SimSTC).
Our method eliminates the need for data augmentation operations to generate contrastive views while still leveraging the benefits of multi-view contrastive learning (a minimal sketch of such a loss appears after this list).
Despite its simplicity, our model achieves outstanding performance, surpassing large language models on various datasets.
arXiv Detail & Related papers (2025-01-16T00:35:56Z)
- Investigating Self-Supervised Methods for Label-Efficient Learning [27.029542823306866]
We study different self-supervised pretext tasks, namely contrastive learning, clustering, and masked image modelling, for their low-shot capabilities.
We introduce a framework involving both masked image modelling and clustering as pretext tasks, which performs better across all low-shot downstream tasks.
When testing the model on full-scale datasets, we show performance gains in multi-class classification, multi-label classification, and semantic segmentation.
arXiv Detail & Related papers (2024-06-25T10:56:03Z)
- Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is their ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
- Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning [101.66860222415512]
Multi-Task Diffusion Model (MTDiff) is a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis.
For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D.
arXiv Detail & Related papers (2023-05-29T05:20:38Z)
- Ensemble Transfer Learning for Multilingual Coreference Resolution [60.409789753164944]
A problem that frequently occurs when working with a non-English language is the scarcity of annotated training data.
We design a simple but effective ensemble-based framework that combines various transfer learning techniques.
We also propose a low-cost transfer learning method that bootstraps coreference resolution models by utilizing Wikipedia anchor texts.
arXiv Detail & Related papers (2023-01-22T18:22:55Z)
- Mitigating Data Sparsity for Short Text Topic Modeling by Topic-Semantic Contrastive Learning [19.7066703371736]
We propose a novel short text topic modeling framework, the Topic-Semantic Contrastive Topic Model (TSCTM).
Our TSCTM outperforms state-of-the-art baselines regardless of the data augmentation availability, producing high-quality topics and topic distributions.
arXiv Detail & Related papers (2022-11-23T11:33:43Z)
- VLCDoC: Vision-Language Contrastive Pre-Training Model for Cross-Modal Document Classification [3.7798600249187295]
Multimodal learning from document data has achieved great success lately, as it allows semantically meaningful features to be pre-trained as a prior for learnable downstream tasks.
In this paper, we approach the document classification problem by learning cross-modal representations through language and vision cues.
The proposed method exploits high-level interactions and learns relevant semantic information from effective attention flows within and across modalities.
arXiv Detail & Related papers (2022-05-24T12:28:12Z)
- Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets [56.018551958004814]
This paper addresses the task of generating fluent descriptions by training on a non-uniform combination of data sources.
Large-scale datasets with noisy image-text pairs provide a sub-optimal source of supervision.
We propose to leverage and separate semantics and descriptive style by incorporating a style token and keywords extracted through a retrieval component.
arXiv Detail & Related papers (2021-11-24T19:00:05Z)
- Hierarchical Heterogeneous Graph Representation Learning for Short Text Classification [60.233529926965836]
We propose a new method called SHINE, based on graph neural networks (GNNs), for short text classification.
First, we model the short text dataset as a hierarchical heterogeneous graph consisting of word-level component graphs.
Then, we dynamically learn a short document graph that facilitates effective label propagation among similar short texts.
arXiv Detail & Related papers (2021-10-30T05:33:05Z)
- Topic Adaptation and Prototype Encoding for Few-Shot Visual Storytelling [81.33107307509718]
We propose a topic-adaptive storyteller to model the ability of inter-topic generalization.
We also propose a prototype encoding structure to model the ability of intra-topic derivation.
Experimental results show that topic adaptation and prototype encoding structure mutually bring benefit to the few-shot model.
arXiv Detail & Related papers (2020-08-11T03:55:11Z)
- BATS: A Spectral Biclustering Approach to Single Document Topic Modeling and Segmentation [17.003488045214972]
Existing topic modeling and text segmentation methodologies generally require large datasets for training, limiting their capabilities when only small collections of text are available.
In developing a methodology to handle single documents, we face two major challenges.
First is sparse information: with access to only one document, we cannot train traditional topic models or deep learning algorithms.
Second is significant noise: a considerable portion of words in any single document will produce only noise and not help discern topics or segments.
arXiv Detail & Related papers (2020-08-05T16:34:33Z)
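As a companion to the SimSTC entry above, here is a minimal, hypothetical sketch of an augmentation-free multi-view contrastive loss. It only assumes that each view (however it is constructed; the abstract does not say) yields an [N, d] embedding matrix whose rows are aligned across views; all names are illustrative, not SimSTC's actual code.

```python
import torch
import torch.nn.functional as F
from itertools import combinations

def multi_view_contrastive(views: list[torch.Tensor], temperature: float = 0.5) -> torch.Tensor:
    # `views` is a list of [N, d] embedding matrices, one per view, with row i
    # in every matrix corresponding to the same text. The same text seen through
    # two different views forms a positive pair, so no data augmentation is needed.
    targets = torch.arange(views[0].size(0), device=views[0].device)
    views = [F.normalize(v, dim=1) for v in views]
    losses = []
    for a, b in combinations(views, 2):      # every unordered pair of views
        logits = a @ b.t() / temperature
        losses.append(0.5 * (F.cross_entropy(logits, targets) +
                             F.cross_entropy(logits.t(), targets)))
    return torch.stack(losses).mean()
```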