Towards Automating Text Annotation: A Case Study on Semantic Proximity Annotation using GPT-4
- URL: http://arxiv.org/abs/2407.04130v1
- Date: Thu, 4 Jul 2024 19:16:44 GMT
- Title: Towards Automating Text Annotation: A Case Study on Semantic Proximity Annotation using GPT-4
- Authors: Sachin Yadav, Tejaswi Choppa, Dominik Schlechtweg
- Abstract summary: This paper reuses human annotation guidelines along with some annotated data to design automatic prompts.
We implement the prompting strategies into an open-source text annotation tool, enabling easy online use via the OpenAI API.
- Score: 4.40960504549418
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This paper explores using GPT-3.5 and GPT-4 to automate the data annotation process with automatic prompting techniques. The main aim of this paper is to reuse human annotation guidelines along with some annotated data to design automatic prompts for LLMs, focusing on the semantic proximity annotation task. Automatic prompts are compared to customized prompts. We further implement the prompting strategies into an open-source text annotation tool, enabling easy online use via the OpenAI API. Our study reveals the crucial role of accurate prompt design and suggests that prompting GPT-4 with human-like instructions is not straightforwardly possible for the semantic proximity task. We show that small modifications to the human guidelines already improve the performance, suggesting possible ways for future research.
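The workflow the abstract describes — reusing human annotation guidelines together with a few annotated examples to build a prompt, then querying GPT-4 via the OpenAI API — can be sketched as follows. This is a minimal illustration: the function names, the guideline wording, and the 1–4 judgment scale are assumptions for the example, not taken from the paper.

```python
# Sketch: assemble a few-shot annotation prompt from human guidelines plus
# labeled examples, then send it to the OpenAI chat completions endpoint.

GUIDELINES = (
    "Rate the semantic proximity of the two word usages on a scale "
    "from 1 (unrelated) to 4 (identical)."
)

def build_prompt(guidelines, examples, usage_pair):
    """Build an annotation prompt from guidelines and (pair, label) examples."""
    lines = [guidelines, ""]
    for (u1, u2), label in examples:
        lines.append(f"Usage 1: {u1}\nUsage 2: {u2}\nJudgment: {label}\n")
    u1, u2 = usage_pair
    lines.append(f"Usage 1: {u1}\nUsage 2: {u2}\nJudgment:")
    return "\n".join(lines)

def annotate(client, prompt, model="gpt-4"):
    """Query the model with the assembled prompt; `client` is an OpenAI client,
    e.g. `client = OpenAI(api_key=...)` from the `openai` package."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()
```

Keeping prompt construction separate from the API call mirrors the paper's comparison of automatic versus customized prompts: only `build_prompt` needs to change between strategies.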
Related papers
- Keeping Humans in the Loop: Human-Centered Automated Annotation with Generative AI [0.0]
We use GPT-4 to replicate 27 annotation tasks across 11 password-protected datasets.
For each task, we compare GPT-4 annotations against human-annotated ground-truth labels and against annotations from separate supervised classification models fine-tuned on human-generated labels.
Our findings underscore the importance of a human-centered workflow and careful evaluation standards.
arXiv Detail & Related papers (2024-09-14T15:27:43Z) - GPT Assisted Annotation of Rhetorical and Linguistic Features for Interpretable Propaganda Technique Detection in News Text [1.2699007098398802]
This study codifies 22 rhetorical and linguistic features identified in literature related to the language of persuasion.
RhetAnn, a web application, was specifically designed to minimize the otherwise considerable mental effort required from annotators.
A small set of annotated data was used to fine-tune GPT-3.5, a generative large language model (LLM), to annotate the remaining data.
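Fine-tuning GPT-3.5 on a small annotated set, as described above, starts from training data in the chat-style JSONL format expected by the OpenAI fine-tuning API. A minimal sketch of that preparation step is below; the system message and label format are illustrative assumptions, not details from the paper.

```python
import json

def to_finetune_jsonl(annotated, path):
    """Write (text, label) pairs as chat-format JSONL for OpenAI fine-tuning:
    one JSON object per line, each with a messages list."""
    with open(path, "w", encoding="utf-8") as f:
        for text, label in annotated:
            record = {
                "messages": [
                    {"role": "system",
                     "content": "Annotate the rhetorical features of the text."},
                    {"role": "user", "content": text},
                    {"role": "assistant", "content": label},
                ]
            }
            f.write(json.dumps(record) + "\n")
```

The resulting file would then be uploaded and passed to a fine-tuning job (e.g. via `client.fine_tuning.jobs.create`) before the tuned model annotates the remaining data.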
arXiv Detail & Related papers (2024-07-16T15:15:39Z) - Efficient Prompting Methods for Large Language Models: A Survey [50.171011917404485]
Prompting has become a mainstream paradigm for adapting large language models (LLMs) to specific natural language processing tasks.
However, this approach brings the additional computational burden of model inference, as well as the human effort needed to guide and control the behavior of LLMs.
We present the basic concepts of prompting, review the advances for efficient prompting, and highlight future research directions.
arXiv Detail & Related papers (2024-04-01T12:19:08Z) - APT-Pipe: A Prompt-Tuning Tool for Social Data Annotation using ChatGPT [28.976911675881826]
We propose APT-Pipe, an automated prompt-tuning pipeline.
We test it across twelve distinct text classification datasets.
We find that prompts tuned by APT-Pipe help ChatGPT achieve higher weighted F1-score on nine out of twelve experimented datasets.
arXiv Detail & Related papers (2024-01-24T10:09:11Z) - Weaving Pathways for Justice with GPT: LLM-driven automated drafting of interactive legal applications [0.0]
We describe three approaches to automating the completion of court forms: a generative AI approach that uses GPT-3 to iteratively prompt the user to answer questions, a constrained template-driven approach that uses GPT-4-turbo to generate a draft of questions subject to human review, and a hybrid of the two.
We conclude that the hybrid model of constrained automated drafting with human review is best suited to the task of authoring guided interviews.
arXiv Detail & Related papers (2023-12-14T18:20:59Z) - Prompt Engineering or Fine Tuning: An Empirical Assessment of Large Language Models in Automated Software Engineering Tasks [8.223311621898983]
GPT-4 with conversational prompts showed drastic improvement compared to GPT-4 with automatic prompting strategies.
Fully automated prompt engineering with no human in the loop requires further study and improvement.
arXiv Detail & Related papers (2023-10-11T00:21:00Z) - Automatic Prompt Optimization with "Gradient Descent" and Beam Search [64.08364384823645]
Large Language Models (LLMs) have shown impressive performance as general purpose agents, but their abilities remain highly dependent on prompts.
We propose a simple and nonparametric solution to this problem, Automatic Prompt Optimization (APO).
APO uses minibatches of data to form natural language "gradients" that criticize the current prompt.
The gradients are then "propagated" into the prompt by editing the prompt in the opposite semantic direction of the gradient.
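One round of the APO procedure described above can be sketched as a plain function: collect the current prompt's errors on a minibatch, ask an LLM for a natural-language "gradient" (a critique), then ask it to edit the prompt against that critique. The `llm` and `task_fn` callables and all names here are stand-ins, not the paper's implementation, and the full method additionally explores several candidate edits with beam search.

```python
def apo_step(prompt, minibatch, llm, task_fn):
    """One Automatic Prompt Optimization round:
    1. find (input, label) pairs the current prompt gets wrong,
    2. obtain a natural-language "gradient" criticizing the prompt,
    3. "propagate" it by editing the prompt in the opposite semantic direction.
    """
    errors = [(x, y) for x, y in minibatch if task_fn(prompt, x) != y]
    if not errors:
        return prompt  # no gradient signal; keep the prompt as-is
    critique = llm(
        f"The prompt:\n{prompt}\nfailed on these examples:\n{errors}\n"
        "Describe what is wrong with the prompt."
    )
    revised = llm(
        f"Rewrite the prompt:\n{prompt}\nto fix this problem:\n{critique}"
    )
    return revised
```

In practice this step would be iterated, with beam search keeping the best-scoring candidate prompts across rounds.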
arXiv Detail & Related papers (2023-05-04T15:15:22Z) - Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
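The mechanism described above — a small tunable policy model producing an instance-specific stimulus that steers a frozen black-box LLM — reduces to a simple composition at inference time. This sketch uses hypothetical callables; the actual framework trains the policy model (e.g., with supervised learning and RL), which is not shown here.

```python
def directional_prompt(policy_model, llm, instance):
    """Generate an auxiliary directional stimulus (e.g., keyword hints) with a
    small policy model, prepend it to the input, and query the black-box LLM.
    The LLM itself is never adjusted; only the policy model is tunable."""
    stimulus = policy_model(instance)        # e.g. "Hint keywords: ..."
    prompt = f"{instance}\n{stimulus}\nOutput:"
    return llm(prompt)
```

The black-box LLM sees only the augmented prompt, so the same mechanism works with API-only models.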
arXiv Detail & Related papers (2023-02-22T17:44:15Z) - TEMPERA: Test-Time Prompting via Reinforcement Learning [57.48657629588436]
We propose Test-time Prompt Editing using Reinforcement learning (TEMPERA).
In contrast to prior prompt generation methods, TEMPERA can efficiently leverage prior knowledge.
Our method achieves a 5.33x average improvement in sample efficiency compared to traditional fine-tuning methods.
arXiv Detail & Related papers (2022-11-21T22:38:20Z) - Supporting Vision-Language Model Inference with Confounder-pruning Knowledge Prompt [71.77504700496004]
Vision-language models are pre-trained by aligning image-text pairs in a common space to deal with open-set visual concepts.
To boost the transferability of the pre-trained models, recent works adopt fixed or learnable prompts.
However, how and what prompts can improve inference performance remains unclear.
arXiv Detail & Related papers (2022-05-23T07:51:15Z) - Annotation Curricula to Implicitly Train Non-Expert Annotators [56.67768938052715]
Voluntary studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain.
This can be overwhelming at the outset, mentally taxing, and can introduce errors into the resulting annotations.
We propose annotation curricula, a novel approach to implicitly train annotators.
arXiv Detail & Related papers (2021-06-04T09:48:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.