Prompting and Fine-Tuning Open-Sourced Large Language Models for Stance
Classification
- URL: http://arxiv.org/abs/2309.13734v2
- Date: Tue, 5 Mar 2024 21:26:54 GMT
- Title: Prompting and Fine-Tuning Open-Sourced Large Language Models for Stance
Classification
- Authors: Iain J. Cruickshank and Lynnette Hui Xian Ng
- Abstract summary: Stance classification has long been a focal point of research in domains ranging from social science to machine learning.
Current stance detection methods rely predominantly on manual annotation of sentences, followed by training a supervised machine learning model.
We investigate the use of Large Language Models as a stance detection methodology that can reduce or even eliminate the need for manual annotations.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stance classification, the task of predicting the viewpoint of an author on a
subject of interest, has long been a focal point of research in domains ranging
from social science to machine learning. Current stance detection methods rely
predominantly on manual annotation of sentences, followed by training a
supervised machine learning model. However, this manual annotation process
requires laborious annotation effort, and thus hampers its potential to
generalize across different contexts. In this work, we investigate the use of
Large Language Models (LLMs) as a stance detection methodology that can reduce
or even eliminate the need for manual annotations. We investigate 10
open-source models and 7 prompting schemes, finding that LLMs are competitive
with in-domain supervised models but are not necessarily consistent in their
performance. We also fine-tuned the LLMs, but discovered that the fine-tuning
process does not necessarily lead to better performance. In general, we
discover that LLMs do not routinely outperform smaller supervised machine
learning models, and thus call for stance detection to become a benchmark task
that LLMs are also optimized for. The code used in this study is available at
https://github.com/ijcruic/LLM-Stance-Labeling
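The prompting-based labeling the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's actual templates: the prompt wording, label set, and function names here are assumptions, and a real pipeline would send the prompt to an LLM and pass its reply to the parser.

```python
# Minimal sketch of zero-shot stance classification via prompting.
# build_prompt formats the query for an LLM; parse_stance maps the
# model's free-text reply onto a fixed label set.

LABELS = ("favor", "against", "neutral")

def build_prompt(statement: str, target: str) -> str:
    """Format a zero-shot stance-classification prompt for an LLM."""
    return (
        f"Statement: {statement}\n"
        f"Target: {target}\n"
        "What is the stance of the statement toward the target? "
        "Answer with one word: favor, against, or neutral.\n"
        "Answer:"
    )

def parse_stance(reply: str) -> str:
    """Map a free-text model reply to one of the fixed labels.

    Uses simple substring matching; a production system would need
    more robust normalization of the model output.
    """
    text = reply.strip().lower()
    for label in LABELS:
        if label in text:
            return label
    return "neutral"  # fall back when the reply is unparseable
```

A supervised pipeline would instead require labeled training sentences; here the only task-specific inputs are the statement, the target, and the prompt template, which is what lets this approach reduce or eliminate manual annotation.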
Related papers
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
- Soft Prompting for Unlearning in Large Language Models [11.504012974208466]
This work focuses on investigating machine unlearning for Large Language Models motivated by data protection regulations.
We propose a framework, Soft Prompting for Unlearning (SPUL), that learns prompt tokens that can be appended to an arbitrary query to induce unlearning.
arXiv Detail & Related papers (2024-06-17T19:11:40Z) - Show, Don't Tell: Aligning Language Models with Demonstrated Feedback [54.10302745921713]
Demonstration ITerated Task Optimization (DITTO) directly aligns language model outputs to a user's demonstrated behaviors.
We evaluate DITTO's ability to learn fine-grained style and task alignment across domains such as news articles, emails, and blog posts.
arXiv Detail & Related papers (2024-06-02T23:13:56Z)
- Are you still on track!? Catching LLM Task Drift with Activations [55.75645403965326]
Large Language Models (LLMs) are routinely used in retrieval-augmented applications to orchestrate tasks and process inputs from users and other sources.
This opens the door to prompt injection attacks, where the LLM receives and acts upon instructions from supposedly data-only sources, thus deviating from the user's original instructions.
We define this as task drift, and we propose to catch it by scanning and analyzing the LLM's activations.
We show that this approach generalizes surprisingly well to unseen task domains, such as prompt injections, jailbreaks, and malicious instructions, without being trained on any of these attacks.
arXiv Detail & Related papers (2024-06-02T16:53:21Z)
- LLM-augmented Preference Learning from Natural Language [19.700169351688768]
Large Language Models (LLMs) are equipped to deal with larger context lengths.
LLMs can consistently outperform the state of the art (SotA) when the target text is large.
Few-shot learning yields better performance than zero-shot learning.
arXiv Detail & Related papers (2023-10-12T17:17:27Z)
- Scaling Sentence Embeddings with Large Language Models [43.19994568210206]
In this work, we propose an in-context learning-based method aimed at improving sentence embeddings performance.
Our approach involves adapting the previous prompt-based representation method for autoregressive models.
When scaling model size, we find that scaling beyond tens of billions of parameters harms performance on semantic textual similarity tasks.
arXiv Detail & Related papers (2023-07-31T13:26:03Z)
- TIM: Teaching Large Language Models to Translate with Comparison [78.66926087162672]
We propose a novel framework using examples in comparison to teach LLMs to learn translation.
Our approach involves presenting the model with examples of correct and incorrect translations and using a preference loss to guide the model's learning.
Our findings offer a new perspective on fine-tuning LLMs for translation tasks and provide a promising solution for generating high-quality translations.
arXiv Detail & Related papers (2023-07-10T08:15:40Z)
- Language models are weak learners [71.33837923104808]
We show that prompt-based large language models can operate effectively as weak learners.
We incorporate these models into a boosting approach, which can leverage the knowledge within the model to outperform traditional tree-based boosting.
Results illustrate the potential for prompt-based LLMs to function not just as few-shot learners themselves, but as components of larger machine learning pipelines.
arXiv Detail & Related papers (2023-06-25T02:39:19Z)
- Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models [75.75038268227554]
Self-Checker is a framework comprising a set of plug-and-play modules that facilitate fact-checking.
This framework provides a fast and efficient way to construct fact-checking systems in low-resource environments.
arXiv Detail & Related papers (2023-05-24T01:46:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.