Can Large Language Models Address Open-Target Stance Detection?
- URL: http://arxiv.org/abs/2409.00222v4
- Date: Mon, 30 Sep 2024 17:37:16 GMT
- Title: Can Large Language Models Address Open-Target Stance Detection?
- Authors: Abu Ubaida Akash, Ahmed Fahmy, Amine Trabelsi
- Abstract summary: Open-Target Stance Detection (OTSD) is the most realistic task setting, in which targets are neither seen during training nor provided as input.
We evaluate Large Language Models (LLMs) GPT-4o, GPT-3.5, Llama-3, and Mistral, comparing their performance to the only existing work, Target-Stance Extraction (TSE).
Our experiments reveal that LLMs outperform TSE in target generation both when the real target is explicitly mentioned in the text and when it is not.
- Score: 0.7032245866317618
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stance detection (SD) identifies a text's position towards a target, typically labeled as favor, against, or none. We introduce Open-Target Stance Detection (OTSD), the most realistic task setting, in which targets are neither seen during training nor provided as input. We evaluate Large Language Models (LLMs) GPT-4o, GPT-3.5, Llama-3, and Mistral, comparing their performance to the only existing work, Target-Stance Extraction (TSE), which benefits from predefined targets. Unlike TSE, OTSD removes the dependency on a predefined list, making target generation and evaluation more challenging. We also provide a metric for evaluating target quality that correlates well with human judgment. Our experiments reveal that LLMs outperform TSE in target generation both when the real target is explicitly mentioned in the text and when it is not. Likewise, for stance detection, LLMs excel in explicit cases and achieve comparable performance in non-explicit cases overall.
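As a rough illustration of the two-stage setup the abstract implies (first generate a target from the text, then classify the stance towards it), here is a minimal sketch assuming a generic `llm` callable that maps a prompt string to a completion; the prompt wording and the `detect_open_target_stance` helper are illustrative assumptions, not the prompts used in the paper.

```python
from typing import Callable, Tuple

def detect_open_target_stance(text: str, llm: Callable[[str], str]) -> Tuple[str, str]:
    """Two-stage open-target stance detection sketch:
    1) ask the model to name the target the text takes a position on,
    2) ask it to label the stance towards that generated target."""
    target_prompt = (
        "Read the text below and name, in a few words, the single target "
        "(entity, topic, or claim) it takes a position on.\n\n"
        f"Text: {text}\nTarget:"
    )
    target = llm(target_prompt).strip()

    stance_prompt = (
        f"Text: {text}\nTarget: {target}\n"
        "What is the text's stance towards the target? "
        "Answer with exactly one of: favor, against, none.\nStance:"
    )
    stance = llm(stance_prompt).strip().lower()
    if stance not in {"favor", "against", "none"}:
        stance = "none"  # fall back when the model strays from the label set
    return target, stance

# Example usage with a stubbed model (replace with a real LLM call):
if __name__ == "__main__":
    fake_llm = lambda prompt: "favor" if "Stance:" in prompt else "carbon taxes"
    print(detect_open_target_stance("A carbon tax is long overdue.", fake_llm))
```

Because no predefined target list exists in OTSD, evaluation then hinges on judging whether the generated target matches the intended one, which is what the proposed target-quality metric addresses.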
Related papers
- Evaluating the Goal-Directedness of Large Language Models [17.08087240111954]
We evaluate goal-directedness on tasks that require information gathering, cognitive effort, and plan execution.
Our evaluations of LLMs from Google DeepMind, OpenAI, and Anthropic show that goal-directedness is relatively consistent across tasks.
arXiv Detail & Related papers (2025-04-16T08:07:08Z) - A Framework for Evaluating LLMs Under Task Indeterminacy [49.298107503257036]
Large language model (LLM) evaluations often assume there is a single correct response -- a gold label -- for each item in the evaluation corpus.
We develop a framework for evaluating LLMs under task indeterminacy.
arXiv Detail & Related papers (2024-11-21T00:15:44Z) - Fine-tuned Large Language Models (LLMs): Improved Prompt Injection Attacks Detection [6.269725911814401]
Large language models (LLMs) are becoming a popular tool as they have significantly advanced in their capability to tackle a wide range of language-based tasks.
However, LLM applications are highly vulnerable to prompt injection attacks, which poses a critical problem.
This project explores the security vulnerabilities in relation to prompt injection attacks.
arXiv Detail & Related papers (2024-10-28T00:36:21Z) - Stanceformer: Target-Aware Transformer for Stance Detection [59.69858080492586]
Stance Detection involves discerning the stance expressed in a text towards a specific subject or target.
Prior works have relied on existing transformer models that lack the capability to prioritize targets effectively.
We introduce Stanceformer, a target-aware transformer model that incorporates enhanced attention towards the targets during both training and inference.
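The entry above says Stanceformer adds enhanced attention towards the target; as a loose sketch of that general idea (not the paper's actual mechanism), one can add a positive additive bias to the attention logits for key positions belonging to the target. The mask construction and the bias value below are assumptions.

```python
import torch
import torch.nn.functional as F

def target_biased_attention(q, k, v, target_mask, bias=2.0):
    """Scaled dot-product attention with an additive bias on target positions.

    q, k, v:      (batch, seq_len, dim) query/key/value tensors
    target_mask:  (batch, seq_len) bool tensor, True where a token belongs to the target
    bias:         how strongly to boost attention towards target tokens (assumed value)
    """
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)    # (batch, seq, seq)
    scores = scores + bias * target_mask.unsqueeze(1).float()  # boost columns of target keys
    weights = F.softmax(scores, dim=-1)
    return weights @ v

# Toy check: one sequence of 4 tokens, tokens 3-4 marked as the target.
q = k = v = torch.randn(1, 4, 8)
mask = torch.tensor([[False, False, True, True]])
print(target_biased_attention(q, k, v, mask).shape)  # torch.Size([1, 4, 8])
```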
arXiv Detail & Related papers (2024-10-09T17:24:28Z) - Predicting User Stances from Target-Agnostic Information using Large Language Models [6.9337465525334405]
Large Language Models' (LLMs) ability to predict a user's stance on a target given a collection of his/her target-agnostic social media posts is investigated.
arXiv Detail & Related papers (2024-09-22T11:21:16Z) - Chain of Stance: Stance Detection with Large Language Models [3.528201746844624]
Stance detection is an active task in natural language processing (NLP).
We propose a new prompting method called Chain of Stance (CoS).
arXiv Detail & Related papers (2024-08-03T16:30:51Z) - SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors [64.9938658716425]
Existing evaluations of large language models' (LLMs) ability to recognize and reject unsafe user requests face three limitations.
First, existing methods often use coarse-grained taxonomies of unsafe topics and over-represent some fine-grained topics.
Second, linguistic characteristics and formatting of prompts are often overlooked, like different languages, dialects, and more -- which are only implicitly considered in many evaluations.
Third, existing evaluations rely on large LLMs for evaluation, which can be expensive.
arXiv Detail & Related papers (2024-06-20T17:56:07Z) - Towards Effective Evaluations and Comparisons for LLM Unlearning Methods [97.2995389188179]
This paper seeks to refine the evaluation of machine unlearning for large language models.
It addresses two key challenges -- the robustness of evaluation metrics and the trade-offs between competing goals.
arXiv Detail & Related papers (2024-06-13T14:41:00Z) - Envisioning Outlier Exposure by Large Language Models for Out-of-Distribution Detection [71.93411099797308]
Out-of-distribution (OOD) samples are crucial when deploying machine learning models in open-world scenarios.
We propose to tackle this constraint by leveraging the expert knowledge and reasoning capability of large language models (LLMs) to envision potential Outlier Exposure, termed EOE.
EOE can be generalized to different tasks, including far, near, and fine-grained OOD detection.
EOE achieves state-of-the-art performance across different OOD tasks and can be effectively scaled to the ImageNet-1K dataset.
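The EOE entry describes drawing on an LLM's knowledge to envision potential outliers. A very rough sketch of that flavor of approach (an assumption about how such a pipeline might look, not the paper's method) asks an LLM for plausible outlier class names and compares a sample's similarity to known classes versus the imagined ones; `llm`, `embed`, and the prompt are placeholders.

```python
from typing import Callable, List
import numpy as np

def eoe_style_ood_score(
    sample_embedding: np.ndarray,
    id_labels: List[str],
    llm: Callable[[str], str],
    embed: Callable[[str], np.ndarray],
) -> float:
    """Score how OOD a sample looks by comparing its similarity to
    in-distribution labels vs. LLM-imagined outlier labels (illustrative only)."""
    prompt = (
        "List 5 visually similar but different categories that are NOT any of: "
        + ", ".join(id_labels) + ". One per line."
    )
    outlier_labels = [l.strip() for l in llm(prompt).splitlines() if l.strip()]

    def sim(label: str) -> float:
        e = embed(label)
        return float(sample_embedding @ e /
                     (np.linalg.norm(sample_embedding) * np.linalg.norm(e)))

    id_sims = np.array([sim(l) for l in id_labels])
    ood_sims = np.array([sim(l) for l in outlier_labels])
    # Higher score -> the sample resembles imagined outliers more than known classes.
    return float(ood_sims.max() - id_sims.max())
```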
arXiv Detail & Related papers (2024-06-02T17:09:48Z) - Target Span Detection for Implicit Harmful Content [18.84674403712032]
We focus on identifying implied targets of hate speech, essential for recognizing subtler hate speech and enhancing the detection of harmful content on digital platforms.
We collect and annotate target spans in three prominent implicit hate speech datasets: SBIC, DynaHate, and IHC.
Our experiments indicate that Implicit-Target-Span provides a challenging test bed for target span detection methods.
arXiv Detail & Related papers (2024-03-28T21:15:15Z) - The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition [74.04775677110179]
In-context Learning (ICL) has emerged as a powerful paradigm for performing natural language tasks with Large Language Models (LLMs).
We show that LLMs have strong yet inconsistent priors in emotion recognition that ossify their predictions.
Our results suggest that caution is needed when using ICL with larger LLMs for affect-centered tasks outside their pre-training domain.
arXiv Detail & Related papers (2024-03-25T19:07:32Z) - Don't Go To Extremes: Revealing the Excessive Sensitivity and Calibration Limitations of LLMs in Implicit Hate Speech Detection [29.138463029748547]
This paper explores the capability of Large Language Models to detect implicit hate speech and express confidence in their responses.
Our findings highlight that LLMs exhibit two extremes: (1) LLMs display excessive sensitivity towards groups or topics that may cause fairness issues, resulting in misclassifying benign statements as hate speech, and (2) their confidence scores are excessively concentrated in a fixed range, regardless of the dataset's complexity.
arXiv Detail & Related papers (2024-02-18T00:04:40Z) - Can We Identify Stance Without Target Arguments? A Study for Rumour Stance Classification [10.19051099694573]
We show that rumour stance classification datasets contain a considerable amount of real-world data whose stance could be naturally inferred directly from the replies.
We propose a simple yet effective framework to enhance reasoning with the targets, achieving state-of-the-art performance on two benchmark datasets.
arXiv Detail & Related papers (2023-03-22T15:44:15Z) - Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information [100.03188187735624]
We introduce a novel approach based on PLMs and pointwise V-information (PVI), a metric that can measure the usefulness of a datapoint for training a model.
Our method first fine-tunes a PLM on a small seed of training data and then synthesizes new datapoints - utterances that correspond to given intents.
Our method is thus able to leverage the expressive power of large language models to produce diverse training data.
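For context on the PVI metric mentioned above, pointwise V-information compares a model's log-probability of the label given the input against a null-input counterpart; the sketch below only shows that arithmetic, assuming two probability callables, and is not the paper's filtering pipeline.

```python
import math
from typing import Callable

def pvi(x: str, y: str,
        p_with_input: Callable[[str, str], float],
        p_null_input: Callable[[str], float]) -> float:
    """Pointwise V-information of a datapoint (x, y):
        PVI(x -> y) = -log2 p_g'(y | null) + log2 p_g(y | x)
    where p_g is a model fine-tuned with inputs and p_g' is fine-tuned without them.
    Higher PVI means the input makes the label easier to predict (a more useful datapoint)."""
    return -math.log2(p_null_input(y)) + math.log2(p_with_input(x, y))

# Toy usage with stubbed probabilities:
print(pvi("book a table for two", "book_restaurant",
          p_with_input=lambda x, y: 0.9,
          p_null_input=lambda y: 0.2))  # ~2.17 bits
```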
arXiv Detail & Related papers (2023-02-10T07:37:49Z) - Few-Shot Stance Detection via Target-Aware Prompt Distillation [48.40269795901453]
This paper is inspired by the potential capability of pre-trained language models (PLMs) serving as knowledge bases and few-shot learners.
PLMs can provide essential contextual information for the targets and enable few-shot learning via prompts.
Considering the crucial role of the target in the stance detection task, we design target-aware prompts and propose a novel verbalizer.
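As a rough illustration of the target-aware prompt and verbalizer idea in the entry above (the paper's actual templates and label words are not reproduced here), a cloze-style prompt places the target next to a mask token and a verbalizer maps the word predicted there back to a stance label; the template and label words below are assumptions.

```python
# Hypothetical target-aware cloze prompt and verbalizer (illustrative assumptions).
TEMPLATE = "{text} The attitude towards {target} is [MASK]."

VERBALIZER = {           # predicted word at [MASK] -> stance label
    "supportive": "favor",
    "positive":   "favor",
    "opposed":    "against",
    "negative":   "against",
    "neutral":    "none",
}

def build_prompt(text: str, target: str) -> str:
    return TEMPLATE.format(text=text, target=target)

def verbalize(predicted_word: str) -> str:
    return VERBALIZER.get(predicted_word.lower(), "none")

prompt = build_prompt("Wind farms ruin the landscape.", "wind energy")
print(prompt)                 # feed to a masked LM and read the word predicted at [MASK]
print(verbalize("opposed"))   # -> "against"
```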
arXiv Detail & Related papers (2022-06-27T12:04:14Z) - Generative multitask learning mitigates target-causing confounding [61.21582323566118]
We propose a simple and scalable approach to causal representation learning for multitask learning.
The improvement comes from mitigating unobserved confounders that cause the targets, but not the input.
Our results on the Attributes of People and Taskonomy datasets reflect the conceptual improvement in robustness to prior probability shift.
arXiv Detail & Related papers (2022-02-08T20:42:14Z) - Meta-Learning with Context-Agnostic Initialisations [86.47040878540139]
We introduce a context-adversarial component into the meta-learning process.
This produces an initialisation for fine-tuning to target which is context-agnostic and task-generalised.
We evaluate our approach on three commonly used meta-learning algorithms and two problems.
arXiv Detail & Related papers (2020-07-29T08:08:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.