Multi-label and Multi-target Sampling of Machine Annotation for
Computational Stance Detection
- URL: http://arxiv.org/abs/2311.04495v1
- Date: Wed, 8 Nov 2023 06:54:34 GMT
- Title: Multi-label and Multi-target Sampling of Machine Annotation for
Computational Stance Detection
- Authors: Zhengyuan Liu, Hai Leong Chieu, Nancy F. Chen
- Abstract summary: We introduce a multi-label and multi-target sampling strategy to optimize the annotation quality.
Experimental results on the benchmark stance detection corpora show that our method can significantly improve performance and learning efficacy.
- Score: 44.90471123149513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data collection from manual labeling provides domain-specific and
task-aligned supervision for data-driven approaches, and a critical mass of
well-annotated resources is required to achieve reasonable performance in
natural language processing tasks. However, manual annotations are often
challenging to scale up in terms of time and budget, especially when domain
knowledge, capturing subtle semantic features, and reasoning steps are needed.
In this paper, we investigate the efficacy of leveraging large language models
on automated labeling for computational stance detection. We empirically
observe that while large language models show strong potential as an
alternative to human annotators, their sensitivity to task-specific
instructions and their intrinsic biases pose intriguing yet unique challenges
in machine annotation. We introduce a multi-label and multi-target sampling
strategy to optimize the annotation quality. Experimental results on the
benchmark stance detection corpora show that our method can significantly
improve performance and learning efficacy.
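The abstract does not spell out the sampling procedure, but a rough, hypothetical sketch of what multi-label, multi-target machine annotation with frequency-based aggregation could look like is below. The `annotate` callable stands in for an LLM query, and the stub annotator and all names are purely illustrative, not the authors' implementation:

```python
from collections import Counter
from typing import Callable, Iterable

def multi_target_sample(
    text: str,
    targets: Iterable[str],
    annotate: Callable[[str, str], list[str]],
    n_samples: int = 3,
) -> dict[str, str]:
    """Query the annotator several times per target (multi-label outputs
    are allowed) and keep the most frequent stance label for each target."""
    consensus = {}
    for target in targets:
        votes = Counter()
        for _ in range(n_samples):
            # The annotator may return several labels for one (text, target) pair.
            for label in annotate(text, target):
                votes[label] += 1
        consensus[target] = votes.most_common(1)[0][0]
    return consensus

# Deterministic stub standing in for an LLM annotator (purely illustrative).
def stub_annotate(text: str, target: str) -> list[str]:
    return ["favor"] if target in text else ["none"]

labels = multi_target_sample(
    "We should ban plastic bags.",
    ["plastic bags", "carbon taxes"],
    stub_annotate,
)
print(labels)  # {'plastic bags': 'favor', 'carbon taxes': 'none'}
```

Aggregating repeated, multi-label annotations across targets in this way is one plausible means of smoothing out the instruction sensitivity and intrinsic biases the paper attributes to LLM annotators.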
Related papers
- Leveraging Mixture of Experts for Improved Speech Deepfake Detection [53.69740463004446]
Speech deepfakes pose a significant threat to personal security and content authenticity.
We introduce a novel approach for enhancing speech deepfake detection performance using a Mixture of Experts architecture.
arXiv Detail & Related papers (2024-09-24T13:24:03Z)
- Leveraging Large Language Models for Mobile App Review Feature Extraction [4.879919005707447]
This study explores the hypothesis that encoder-only large language models can enhance feature extraction from mobile app reviews.
By leveraging crowdsourced annotations from an industrial context, we redefine feature extraction as a supervised token classification task.
Empirical evaluations demonstrate that this method improves the precision and recall of extracted features and enhances performance efficiency.
arXiv Detail & Related papers (2024-08-02T07:31:57Z)
- Weakly-Supervised Cross-Domain Segmentation of Electron Microscopy with Sparse Point Annotation [1.124958340749622]
We introduce a multitask learning framework to leverage correlations among the counting, detection, and segmentation tasks.
We develop a cross-position cut-and-paste for label augmentation and an entropy-based pseudo-label selection.
The proposed model significantly outperforms UDA methods and achieves performance comparable to its fully supervised counterpart.
arXiv Detail & Related papers (2024-03-31T12:22:23Z)
- An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models [55.01592097059969]
Supervised finetuning on instruction datasets has played a crucial role in achieving remarkable zero-shot generalization capabilities.
Active learning is effective in identifying useful subsets of samples to annotate from an unlabeled pool.
We propose using experimental design to circumvent the computational bottlenecks of active learning.
arXiv Detail & Related papers (2024-01-12T16:56:54Z)
- Deep Active Learning with Noisy Oracle in Object Detection [5.5165579223151795]
We propose a composite active learning framework including a label review module for deep object detection.
We show that spending part of the annotation budget to partially correct noisy annotations in the actively sampled dataset leads to early improvements in model performance.
In our experiments, incorporating label reviews at an equal annotation budget improves object detection performance by up to 4.5 mAP points.
arXiv Detail & Related papers (2023-09-30T13:28:35Z)
- ALLSH: Active Learning Guided by Local Sensitivity and Hardness [98.61023158378407]
We propose to retrieve unlabeled samples with a local sensitivity and hardness-aware acquisition function.
Our method achieves consistent gains over the commonly used active learning strategies in various classification tasks.
arXiv Detail & Related papers (2022-05-10T15:39:11Z)
- Information-Theoretic Odometry Learning [83.36195426897768]
We propose a unified information-theoretic framework for learning-motivated methods aimed at odometry estimation.
The proposed framework provides an elegant tool for performance evaluation and understanding in information-theoretic language.
arXiv Detail & Related papers (2022-03-11T02:37:35Z)
- Pretext Tasks selection for multitask self-supervised speech representation learning [23.39079406674442]
This paper introduces a method to select a group of pretext tasks among a set of candidates.
Experiments conducted on speaker recognition and automatic speech recognition validate our approach.
arXiv Detail & Related papers (2021-07-01T16:36:29Z)
- Adaptive Self-training for Few-shot Neural Sequence Labeling [55.43109437200101]
We develop techniques to address the label scarcity challenge for neural sequence labeling models.
Self-training serves as an effective mechanism to learn from large amounts of unlabeled data.
Meta-learning enables adaptive sample re-weighting to mitigate error propagation from noisy pseudo-labels.
arXiv Detail & Related papers (2020-10-07T22:29:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.