Related papers: GLiClass: Generalist Lightweight Model for Sequence Classification Tasks

URL: http://arxiv.org/abs/2508.07662v1
Date: Mon, 11 Aug 2025 06:22:25 GMT
Title: GLiClass: Generalist Lightweight Model for Sequence Classification Tasks
Authors: Ihor Stepanov, Mykhailo Shtopko, Dmytro Vodianytskyi, Oleksandr Lukashov, Alexander Yavorskyi, Mykyta Yaroshenko,
Abstract summary: We propose GLiClass, a novel method that adapts the GLiNER architecture for sequence classification tasks. Our approach achieves strong accuracy and efficiency comparable to embedding-based methods, while maintaining the flexibility needed for zero-shot and few-shot learning scenarios.
Score: 49.2639069781367
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Classification is one of the most widespread tasks in AI applications, serving often as the first step in filtering, sorting, and categorizing data. Since modern AI systems must handle large volumes of input data and early pipeline stages can propagate errors downstream, achieving high efficiency and accuracy is critical. Moreover, classification requirements can change dynamically based on user needs, necessitating models with strong zero-shot capabilities. While generative LLMs have become mainstream for zero-shot classification due to their versatility, they suffer from inconsistent instruction following and computational inefficiency. Cross-encoders, commonly used as rerankers in RAG pipelines, face a different bottleneck: they must process text-label pairs sequentially, significantly reducing efficiency with large label sets. Embedding-based approaches offer good efficiency but struggle with complex scenarios involving logical and semantic constraints. We propose GLiClass, a novel method that adapts the GLiNER architecture for sequence classification tasks. Our approach achieves strong accuracy and efficiency comparable to embedding-based methods, while maintaining the flexibility needed for zero-shot and few-shot learning scenarios. Additionally, we adapted proximal policy optimization (PPO) for multi-label text classification, enabling training classifiers in data-sparse conditions or from human feedback.
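The efficiency contrast drawn in the abstract between cross-encoders and embedding-based approaches can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the bag-of-words `embed` function stands in for a real encoder, `CallCounter` is a hypothetical helper that counts model calls at inference time, and none of this is the GLiClass implementation. It only shows that pairwise scoring costs one call per label, while an embedding-based classifier encodes the text once and reuses precomputed label vectors.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CallCounter:
    """Hypothetical helper: counts model calls made at inference time."""

    def __init__(self):
        self.calls = 0


def cross_encoder_classify(text, labels, counter):
    """Score each (text, label) pair separately: one model call per label."""
    scores = {}
    for label in labels:
        counter.calls += 1  # a real cross-encoder runs a full forward pass here
        scores[label] = cosine(embed(text), embed(label))
    return max(scores, key=scores.get)


def embedding_classify(text, label_vectors, counter):
    """Encode the text once and compare against precomputed label vectors."""
    counter.calls += 1  # single text encoding, independent of label count
    text_vector = embed(text)
    return max(label_vectors, key=lambda lab: cosine(text_vector, label_vectors[lab]))


labels = ["sports news", "finance news", "weather report"]
text = "the finance markets rallied today"

pairwise_counter = CallCounter()
pairwise_result = cross_encoder_classify(text, labels, pairwise_counter)

# Label vectors are computed once, offline, and reused for every input text.
label_vectors = {label: embed(label) for label in labels}
embedding_counter = CallCounter()
embedding_result = embedding_classify(text, label_vectors, embedding_counter)
```

With three labels, the pairwise route makes three calls per input and the embedding route makes one. The gap grows linearly with the label set, which is the bottleneck the abstract attributes to cross-encoders.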

Related papers

Curate-Train-Refine: A Closed-Loop Agentic Framework for Zero Shot Classification [2.1937565888932653]
Large language models (LLMs) and high-capacity encoders have advanced zero and few-shot classification, but their inference cost and latency limit practical deployment. We propose training lightweight text classifiers using dynamically generated supervision from an LLM. Our method employs an iterative, agentic loop in which the LLM curates training data, analyzes model successes and failures, and synthesizes targeted examples to address observed errors.
arXiv Detail & Related papers (2026-01-23T08:04:09Z)
Fine-Tuning Causal LLMs for Text Classification: Embedding-Based vs. Instruction-Based Approaches [0.0]
We explore strategies to fine-tune decoder-only Large Language Models (LLMs) for downstream text classification under resource constraints. Two approaches are investigated: (1) attaching a classification head to a pre-trained causal LLM and fine-tuning on the task, and (2) instruction-tuning the LLM in a prompt->response format for classification.
arXiv Detail & Related papers (2025-12-14T13:02:06Z)
Enhancing Transformer-Based Rerankers with Synthetic Data and LLM-Based Supervision [0.13999481573773073]
Large Language Models (LLMs) excel at reranking due to their deep semantic understanding and reasoning. Fine-tuning smaller, task-specific models is a more efficient alternative but typically relies on scarce, manually labeled data. We propose a novel pipeline that eliminates the need for human-labeled query-document pairs.
arXiv Detail & Related papers (2025-09-23T09:47:27Z)
LAMDAS: LLM as an Implicit Classifier for Domain-specific Data Selection [32.35731324386828]
Adapting large language models (LLMs) to specific domains often faces a critical bottleneck: the scarcity of high-quality, human-curated data. Existing approaches, categorized as similarity-based and direct optimization methods, struggle to simultaneously achieve these goals. We introduce LAMDAS, a novel approach that leverages the pre-trained LLM itself as an implicit classifier.
arXiv Detail & Related papers (2025-09-08T10:30:58Z)
Towards Privacy-Preserving Fine-Grained Visual Classification via Hierarchical Learning from Label Proportions [25.974006393027228]
This paper aims to enable accurate fine-grained recognition without direct access to instance labels. Unlike existing LLP-based methods, our framework explicitly exploits the hierarchical nature of fine-grained datasets.
arXiv Detail & Related papers (2025-05-29T03:18:25Z)
Applying LLMs to Active Learning: Towards Cost-Efficient Cross-Task Text Classification without Manually Labeled Data [0.0]
We propose an approach that integrates large language models (LLMs) into an active learning framework to achieve high cross-task text classification performance. Our approach retains over 93% of its classification performance while requiring only approximately 6% of the computational time and monetary cost. These findings provide new insights into the efficient utilization of LLMs and active learning algorithms in text classification tasks, paving the way for their broader application.
arXiv Detail & Related papers (2025-02-24T06:43:19Z)
Prompt Tuned Embedding Classification for Multi-Label Industry Sector Allocation [2.024620791810963]
This study benchmarks the performance of Prompt Tuning and baselines for multi-label text classification. It is applied to classifying companies into an investment firm's proprietary industry taxonomy. We confirm that the model's performance is consistent across both well-known and less-known companies.
arXiv Detail & Related papers (2023-09-21T13:45:32Z)
Dynamic Perceiver for Efficient Visual Recognition [87.08210214417309]
We propose Dynamic Perceiver (Dyn-Perceiver) to decouple the feature extraction procedure and the early classification task. A feature branch serves to extract image features, while a classification branch processes a latent code assigned for classification tasks. Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features.
arXiv Detail & Related papers (2023-06-20T03:00:22Z)
Dynamic Conceptional Contrastive Learning for Generalized Category Discovery [76.82327473338734]
Generalized category discovery (GCD) aims to automatically cluster partially labeled data. Unlabeled data contain instances that are not only from known categories of the labeled data but also from novel categories. One effective way for GCD is applying self-supervised learning to learn discriminative representations for unlabeled data. We propose a Dynamic Conceptional Contrastive Learning framework, which can effectively improve clustering accuracy.
arXiv Detail & Related papers (2023-03-30T14:04:39Z)
Enhancing Classification with Hierarchical Scalable Query on Fusion Transformer [0.4129225533930965]
This paper proposes a method to boost fine-grained classification through a hierarchical approach via learnable independent query embeddings. We exploit the idea of hierarchy to learn query embeddings that are scalable across all levels. Our method is able to outperform the existing methods with an improvement of 11% on fine-grained classification.
arXiv Detail & Related papers (2023-02-28T11:00:55Z)
Fast Few-Shot Classification by Few-Iteration Meta-Learning [173.32497326674775]
We introduce a fast optimization-based meta-learning method for few-shot classification. Our strategy enables important aspects of the base learner objective to be learned during meta-training. We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach.
arXiv Detail & Related papers (2020-10-01T15:59:31Z)
Prior Guided Feature Enrichment Network for Few-Shot Segmentation [64.91560451900125]
State-of-the-art semantic segmentation methods require sufficient labeled data to achieve good results. Few-shot segmentation is proposed to tackle this problem by learning a model that quickly adapts to new classes with a few labeled support samples. These frameworks still face the challenge of generalization ability reduction on unseen classes due to inappropriate use of high-level semantic information.
arXiv Detail & Related papers (2020-08-04T10:41:32Z)
Progressive Identification of True Labels for Partial-Label Learning [112.94467491335611]
Partial-label learning (PLL) is a typical weakly supervised learning problem, where each training instance is equipped with a set of candidate labels among which only one is the true label. Most existing methods are elaborately designed as constrained optimizations that must be solved in specific manners, making their computational complexity a bottleneck for scaling up to big data. This paper proposes a novel framework of classifier with flexibility on the model and optimization algorithm.
arXiv Detail & Related papers (2020-02-19T08:35:15Z)
Fase-AL -- Adaptation of Fast Adaptive Stacking of Ensembles for Supporting Active Learning [0.0]
This work presents the FASE-AL algorithm which induces classification models with non-labeled instances using Active Learning. The algorithm achieves promising results in terms of the percentage of correctly classified instances.
arXiv Detail & Related papers (2020-01-30T17:25:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.