Use Me Wisely: AI-Driven Assessment for LLM Prompting Skills Development
- URL: http://arxiv.org/abs/2503.02532v1
- Date: Tue, 04 Mar 2025 11:56:33 GMT
- Title: Use Me Wisely: AI-Driven Assessment for LLM Prompting Skills Development
- Authors: Dimitri Ognibene, Gregor Donabauer, Emily Theophilou, Cansu Koyuturk, Mona Yavari, Sathya Bursic, Alessia Telari, Alessia Testa, Raffaele Boiano, Davide Taibi, Davinia Hernandez-Leo, Udo Kruschwitz, Martin Ruskov,
- Abstract summary: Large language model (LLM)-powered chatbots have become popular across various domains, supporting a range of tasks and processes.<n>Yet, prompting is highly task- and domain-dependent, limiting the effectiveness of generic approaches.<n>In this study, we explore whether LLM-based methods can facilitate learning assessments by using ad-hoc guidelines and a minimal number of annotated prompt samples.
- Score: 5.559706293891474
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of large language model (LLM)-powered chatbots, such as ChatGPT, has become popular across various domains, supporting a range of tasks and processes. However, due to the intrinsic complexity of LLMs, effective prompting is more challenging than it may seem. This highlights the need for innovative educational and support strategies that are both widely accessible and seamlessly integrated into task workflows. Yet, LLM prompting is highly task- and domain-dependent, limiting the effectiveness of generic approaches. In this study, we explore whether LLM-based methods can facilitate learning assessments by using ad-hoc guidelines and a minimal number of annotated prompt samples. Our framework transforms these guidelines into features that can be identified within learners' prompts. Using these feature descriptions and annotated examples, we create few-shot learning detectors. We then evaluate different configurations of these detectors, testing three state-of-the-art LLMs and ensembles. We run experiments with cross-validation on a sample of original prompts, as well as tests on prompts collected from task-naive learners. Our results show how LLMs perform on feature detection. Notably, GPT- 4 demonstrates strong performance on most features, while closely related models, such as GPT-3 and GPT-3.5 Turbo (Instruct), show inconsistent behaviors in feature classification. These differences highlight the need for further research into how design choices impact feature selection and prompt detection. Our findings contribute to the fields of generative AI literacy and computer-supported learning assessment, offering valuable insights for both researchers and practitioners.
Related papers
- Can Large Language Models Match Tutoring System Adaptivity? A Benchmarking Study [0.0]
Large Language Models (LLMs) hold promise as dynamic instructional aids.
Yet, it remains unclear whether LLMs can replicate the adaptivity of intelligent tutoring systems (ITS)
arXiv Detail & Related papers (2025-04-07T23:57:32Z) - Enhancing LLM's Ability to Generate More Repository-Aware Unit Tests Through Precise Contextual Information Injection [4.367526927436771]
Large Language Models (LLMs) guided by prompt engineering have gained attention for their ability to handle a broad range of tasks.<n>LLMs may exhibit hallucinations when generating unit tests for focal methods or functions due to their lack of awareness regarding the project's global context.<n>We propose RATester, which enhances the LLM's ability to generate more repository-aware unit tests.
arXiv Detail & Related papers (2025-01-13T15:43:36Z) - C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z) - INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning [59.07490387145391]
Large language models (LLMs) have demonstrated impressive capabilities in various natural language processing tasks.
Their application to information retrieval (IR) tasks is still challenging due to the infrequent occurrence of many IR-specific concepts in natural language.
We introduce a novel instruction tuning dataset, INTERS, encompassing 20 tasks across three fundamental IR categories.
arXiv Detail & Related papers (2024-01-12T12:10:28Z) - Prompt Highlighter: Interactive Control for Multi-Modal LLMs [50.830448437285355]
This study targets a critical aspect of multi-modal LLMs' (LLMs&VLMs) inference: explicit controllable text generation.
We introduce a novel inference method, Prompt Highlighter, which enables users to highlight specific prompt spans to interactively control the focus during generation.
We find that, during inference, guiding the models with highlighted tokens through the attention weights leads to more desired outputs.
arXiv Detail & Related papers (2023-12-07T13:53:29Z) - Large Language Models Can be Lazy Learners: Analyze Shortcuts in
In-Context Learning [28.162661418161466]
Large language models (LLMs) have recently shown great potential for in-context learning.
This paper investigates the reliance of LLMs on shortcuts or spurious correlations within prompts.
We uncover a surprising finding that larger models are more likely to utilize shortcuts in prompts during inference.
arXiv Detail & Related papers (2023-05-26T20:56:30Z) - OverPrompt: Enhancing ChatGPT through Efficient In-Context Learning [49.38867353135258]
We propose OverPrompt, leveraging the in-context learning capability of LLMs to handle multiple task inputs.
Our experiments show that OverPrompt can achieve cost-efficient zero-shot classification without causing significant detriment to task performance.
arXiv Detail & Related papers (2023-05-24T10:08:04Z) - ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for
Document Information Extraction [56.790794611002106]
Large language models (LLMs) have demonstrated remarkable results in various natural language processing (NLP) tasks with in-context learning.
We propose a simple but effective in-context learning framework called ICL-D3IE.
Specifically, we extract the most difficult and distinct segments from hard training documents as hard demonstrations.
arXiv Detail & Related papers (2023-03-09T06:24:50Z) - CINS: Comprehensive Instruction for Few-shot Learning in Task-oriented
Dialog Systems [56.302581679816775]
This paper proposes Comprehensive Instruction (CINS) that exploits PLMs with task-specific instructions.
We design a schema (definition, constraint, prompt) of instructions and their customized realizations for three important downstream tasks in ToD.
Experiments are conducted on these ToD tasks in realistic few-shot learning scenarios with small validation data.
arXiv Detail & Related papers (2021-09-10T03:23:06Z) - Prompt-Learning for Fine-Grained Entity Typing [40.983849729537795]
We investigate the application of prompt-learning on fine-grained entity typing in fully supervised, few-shot and zero-shot scenarios.
We propose a self-supervised strategy that carries out distribution-level optimization in prompt-learning to automatically summarize the information of entity types.
arXiv Detail & Related papers (2021-08-24T09:39:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.