Related papers: Are You Sure You're Positive? Consolidating Chain-of-Thought Agents with Uncertainty Quantification for Aspect-Category Sentiment Analysis

URL: http://arxiv.org/abs/2508.17258v1
Date: Sun, 24 Aug 2025 08:51:16 GMT
Title: Are You Sure You're Positive? Consolidating Chain-of-Thought Agents with Uncertainty Quantification for Aspect-Category Sentiment Analysis
Authors: Filippos Ventirozos, Peter Appleby, Matthew Shardlow
Abstract summary: We argue that leveraging large language models in a zero-shot setting is beneficial where the time and resources required for dataset annotation are limited. We propose novel techniques that combine multiple chain-of-thought agents by leveraging large language models' token-level uncertainty scores.
Score: 4.14197005718384
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Aspect-category sentiment analysis provides granular insights by identifying specific themes within product reviews that are associated with particular opinions. Supervised learning approaches dominate the field. However, data is scarce and expensive to annotate for new domains. We argue that leveraging large language models in a zero-shot setting is beneficial where the time and resources required for dataset annotation are limited. Furthermore, annotation bias may lead to strong results using supervised methods but transfer poorly to new domains in contexts that lack annotations and demand reproducibility. In our work, we propose novel techniques that combine multiple chain-of-thought agents by leveraging large language models' token-level uncertainty scores. We experiment with the 3B and 70B+ parameter size variants of Llama and Qwen models, demonstrating how these approaches can fulfil practical needs and opening a discussion on how to gauge accuracy in label-scarce conditions.

Related papers

Normalisation of SWIFT Message Counterparties with Feature Extraction and Clustering [0.0]
We propose a hybrid string similarity, topic modelling, hierarchical clustering and rule-based pipeline to facilitate clustering of transaction counterparties. The approach retains most of the interpretability found in rule-based systems, as the former adds an additional level of cluster refinement to the latter. When only a subset of the population needs to be investigated, such as in sanctions investigations, the approach allows for better control of the risks of missing entity variations.
arXiv Detail & Related papers (2025-08-24T12:41:44Z)
IDEAL: Influence-Driven Selective Annotations Empower In-Context Learners in Large Language Models [63.15355173909631]
This paper introduces an influence-driven selective annotation method. It aims to minimize annotation costs while improving the quality of in-context examples. Experiments confirm the superiority of the proposed method on various benchmarks.
arXiv Detail & Related papers (2023-10-16T22:53:54Z)
Generative Judge for Evaluating Alignment [84.09815387884753]
We propose a generative judge with 13B parameters, Auto-J, designed to address these challenges. Our model is trained on user queries and LLM-generated responses under massive real-world scenarios. Experimentally, Auto-J outperforms a series of strong competitors, including both open-source and closed-source models.
arXiv Detail & Related papers (2023-10-09T07:27:15Z)
Combining Data Generation and Active Learning for Low-Resource Question Answering [23.755283239897132]
We propose a novel approach that combines data augmentation via question-answer generation with Active Learning to improve performance in low-resource settings. Our findings show that our novel approach, where humans are incorporated in a data generation approach, boosts performance in the low-resource, domain-specific setting.
arXiv Detail & Related papers (2022-11-27T16:31:33Z)
Domain Adaptive Segmentation of Electron Microscopy with Sparse Point Annotations [2.5137859989323537]
We develop a highly annotation-efficient approach with competitive performance. We focus on weakly-supervised domain adaptation (WDA) with a type of extremely sparse and weak annotation. We show that our model with only 15% point annotations can achieve performance comparable to that of supervised models.
arXiv Detail & Related papers (2022-10-24T10:50:37Z)
Mention Annotations Alone Enable Efficient Domain Adaptation for Coreference Resolution [8.08448832546021]
We show that annotating mentions alone is nearly twice as fast as annotating full coreference chains. Our approach facilitates annotation-efficient transfer and results in a 7-14% improvement in average F1 without increasing annotator time.
arXiv Detail & Related papers (2022-10-14T07:57:27Z)
Entity Disambiguation with Entity Definitions [50.01142092276296]
Local models have recently attained astounding performances in Entity Disambiguation (ED). Previous works limited their studies to using, as the textual representation of each candidate, only its Wikipedia title. In this paper, we address this limitation and investigate to what extent more expressive textual representations can mitigate it. We report a new state of the art on 2 out of 6 benchmarks we consider and strongly improve the generalization capability over unseen patterns.
arXiv Detail & Related papers (2022-10-11T17:46:28Z)
Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future [63.99570204416711]
We reimplement 18 methods for detecting potential annotation errors and evaluate them on 9 English datasets. We define a uniform evaluation setup including a new formalization of the annotation error detection task. We release our datasets and implementations in an easy-to-use and open source software package.
arXiv Detail & Related papers (2022-06-05T22:31:45Z)
Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases [55.45617404586874]
We propose a few-shot instruction-based method for prompting pre-trained language models (LMs). We show that large LMs can detect different types of fine-grained biases with similar and sometimes superior accuracy to fine-tuned models.
arXiv Detail & Related papers (2021-12-15T04:19:52Z)
Adaptive Active Learning for Coreference Resolution [37.261220564076964]
Recent developments in incremental coreference resolution allow for a novel approach to active learning in this setting. By lowering the data barrier for coreference, coreference resolvers can rapidly adapt to a series of previously unconsidered domains.
arXiv Detail & Related papers (2021-04-15T17:21:51Z)
Analysis and Evaluation of Language Models for Word Sense Disambiguation [18.001457030065712]
Transformer-based language models have taken many fields in NLP by storm. BERT can accurately capture high-level sense distinctions, even when a limited number of examples is available for each word sense. BERT and its derivatives dominate most of the existing evaluation benchmarks.
arXiv Detail & Related papers (2020-08-26T15:07:07Z)
UniT: Unified Knowledge Transfer for Any-shot Object Detection and Segmentation [52.487469544343305]
Methods for object detection and segmentation rely on large scale instance-level annotations for training. We propose an intuitive and unified semi-supervised model that is applicable to a range of supervision.
arXiv Detail & Related papers (2020-06-12T22:45:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.