Related papers: Fine-Tuning Causal LLMs for Text Classification: Embedding-Based vs. Instruction-Based Approaches

URL: http://arxiv.org/abs/2512.12677v1
Date: Sun, 14 Dec 2025 13:02:06 GMT
Title: Fine-Tuning Causal LLMs for Text Classification: Embedding-Based vs. Instruction-Based Approaches
Authors: Amirhossein Yousefiramandi, Ciaran Cooney
Abstract summary: We explore strategies to fine-tune decoder-only Large Language Models (LLMs) for downstream text classification under resource constraints. Two approaches are investigated: (1) attaching a classification head to a pre-trained causal LLM and fine-tuning on the task, and (2) instruction-tuning the LLM in a prompt->response format for classification.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We explore efficient strategies to fine-tune decoder-only Large Language Models (LLMs) for downstream text classification under resource constraints. Two approaches are investigated: (1) attaching a classification head to a pre-trained causal LLM and fine-tuning on the task (using the LLM's final token embedding as a sequence representation), and (2) instruction-tuning the LLM in a prompt->response format for classification. To enable single-GPU fine-tuning of models up to 8B parameters, we combine 4-bit model quantization with Low-Rank Adaptation (LoRA) for parameter-efficient training. Experiments on two datasets - a proprietary single-label dataset and the public WIPO-Alpha patent dataset (extreme multi-label classification) - show that the embedding-based method significantly outperforms the instruction-tuned method in F1-score, and is very competitive with - even surpassing - fine-tuned domain-specific models (e.g. BERT) on the same tasks. These results demonstrate that directly leveraging the internal representations of causal LLMs, along with efficient fine-tuning techniques, yields impressive classification performance under limited computational resources. We discuss the advantages of each approach while outlining practical guidelines and future directions for optimizing LLM fine-tuning in classification scenarios.
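
The embedding-based approach in the abstract (a classification head on the LLM's final token embedding) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the toy hidden states, and the toy head weights are all hypothetical, and right-padded sequences are assumed so that the last real token sits at index `sum(mask) - 1`.

```python
from typing import List

Vector = List[float]


def last_token_embedding(hidden: List[Vector], mask: List[int]) -> Vector:
    """Return the hidden state of the last non-padding token.

    Assumes right padding: mask is 1 for real tokens, then 0 for padding.
    """
    real_tokens = sum(mask)
    if real_tokens == 0:
        raise ValueError("sequence has no real tokens")
    return hidden[real_tokens - 1]


def linear_head(x: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Compute one logit per class: dot(weights[c], x) + bias[c]."""
    return [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, bias)
    ]


def classify(hidden: List[Vector], mask: List[int],
             weights: List[Vector], bias: Vector) -> int:
    """Single-label prediction: index of the largest logit."""
    logits = linear_head(last_token_embedding(hidden, mask), weights, bias)
    return max(range(len(logits)), key=logits.__getitem__)


# Toy stand-in for the LLM's final-layer hidden states (3 positions, dim 2).
# The third position is padding and must be ignored by the pooling step.
hidden = [[0.1, 0.2], [0.9, -0.3], [0.0, 0.0]]
mask = [1, 1, 0]

# Hypothetical head for 3 classes; in practice these weights are trained.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
bias = [0.0, 0.0, 0.0]

predicted = classify(hidden, mask, weights, bias)
```

The argmax here fits a single-label task. For a multi-label setting such as the WIPO-Alpha task mentioned above, a common alternative is to apply a per-class sigmoid to the logits and threshold each one independently; whether the paper does exactly this is not stated in the abstract.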

Related papers

Can LLM Annotations Replace User Clicks for Learning to Rank? [112.2254432364736]
Large-scale supervised data is essential for training modern ranking models, but obtaining high-quality human annotations is costly. Click data has been widely used as a low-cost alternative, and with recent advances in large language models (LLMs), LLM-based relevance annotation has emerged as another promising annotation. Experiments on both a public dataset, TianGong-ST, and an industrial dataset, Baidu-Click, show that click-supervised models perform better on high-frequency queries. We explore two training strategies -- data scheduling and frequency-aware multi-objective learning -- that integrate both supervision signals.
arXiv Detail & Related papers (2025-11-10T02:26:14Z)
LAMDAS: LLM as an Implicit Classifier for Domain-specific Data Selection [32.35731324386828]
Adapting large language models (LLMs) to specific domains often faces a critical bottleneck: the scarcity of high-quality, human-curated data. Existing approaches, categorized as similarity-based and direct optimization methods, struggle to simultaneously achieve these goals. We introduce LAMDAS, a novel approach that leverages the pre-trained LLM itself as an implicit classifier.
arXiv Detail & Related papers (2025-09-08T10:30:58Z)
GLiClass: Generalist Lightweight Model for Sequence Classification Tasks [49.2639069781367]
We propose GLiClass, a novel method that adapts the GLiNER architecture for sequence classification tasks. Our approach achieves strong accuracy and efficiency comparable to embedding-based methods, while maintaining the flexibility needed for zero-shot and few-shot learning scenarios.
arXiv Detail & Related papers (2025-08-11T06:22:25Z)
Don't Just Demo, Teach Me the Principles: A Principle-Based Multi-Agent Prompting Strategy for Text Classification [4.811763060654019]
We present PRINCIPLE-BASED PROMPTING, a simple but effective multi-agent prompting strategy for text classification. Our approach achieves substantial performance gains (1.55% - 19.37%) over zero-shot prompting on macro-F1 score. Our multi-agent PRINCIPLE-BASED PROMPTING approach also shows on-par or better performance compared to demonstration-based few-shot prompting approaches.
arXiv Detail & Related papers (2025-02-11T01:10:13Z)
When Do LLMs Help With Node Classification? A Comprehensive Analysis [21.120619437937382]
We develop a comprehensive testbed for node classification using Large Language Models (LLMs). It includes 10 homophilic datasets, 4 heterophilic datasets, 8 LLM-based algorithms, 8 classic baselines, and 3 learning paradigms. Our findings uncover 8 insights, e.g., (1) LLM-based methods can significantly outperform traditional methods in a semi-supervised setting, while the advantage is marginal in a supervised setting.
arXiv Detail & Related papers (2025-02-02T15:56:05Z)
Data Quality Enhancement on the Basis of Diversity with Large Language Models for Text Classification: Uncovered, Difficult, and Noisy [5.225010551503337]
This paper proposes a data quality enhancement (DQE) method for text classification based on large language models (LLMs). Experimental results demonstrate that our method effectively enhances the performance of LLMs in text classification tasks. Our method has achieved state-of-the-art performance in several open-source classification tasks.
arXiv Detail & Related papers (2024-12-09T15:28:39Z)
How to Make LLMs Strong Node Classifiers? [70.14063765424012]
Language Models (LMs) are challenging the dominance of domain-specific models, such as Graph Neural Networks (GNNs) and Graph Transformers (GTs). We propose a novel approach that empowers off-the-shelf LMs to achieve performance comparable to state-of-the-art (SOTA) GNNs on node classification tasks.
arXiv Detail & Related papers (2024-10-03T08:27:54Z)
Bridging LLMs and KGs without Fine-Tuning: Intermediate Probing Meets Subgraph-Aware Entity Descriptions [49.36683223327633]
Large Language Models (LLMs) encapsulate extensive world knowledge and exhibit powerful context modeling capabilities. We propose a novel framework that synergizes the strengths of LLMs with robust knowledge representation to enable effective and efficient KGC. We achieve a 47% relative improvement over previous methods based on non-fine-tuned LLMs and, to our knowledge, are the first to achieve classification performance comparable to fine-tuned LLMs.
arXiv Detail & Related papers (2024-08-13T10:15:55Z)
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models [38.41524186248607]
We introduce NV-Embed, incorporating architectural designs, training procedures, and curated datasets. For model architecture, we propose a latent attention layer to obtain pooled embeddings. For training algorithm, we introduce a two-stage contrastive instruction-tuning method.
arXiv Detail & Related papers (2024-05-27T17:59:45Z)
CALRec: Contrastive Alignment of Generative LLMs for Sequential Recommendation [18.986613405565514]
Large Language Models (LLMs) pretrained on vast corpora of text are applied to sequential recommendation. We propose a two-stage LLM finetuning framework that finetunes a pretrained LLM in a two-tower fashion using a mixture of two contrastive losses and a language modeling loss. Our model significantly outperforms many state-of-the-art baselines.
arXiv Detail & Related papers (2024-05-03T18:51:19Z)
CLAMP: Contrastive LAnguage Model Prompt-tuning [89.96914454453791]
We show that large language models can achieve good image classification performance when adapted this way. Our approach beats state-of-the-art mLLMs by 13% and slightly outperforms contrastive learning with a custom text model.
arXiv Detail & Related papers (2023-12-04T05:13:59Z)
Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks. Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients. We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
arXiv Detail & Related papers (2022-03-09T17:26:53Z)

This list is automatically generated from the titles and abstracts of the papers on this site.