Metaphor identification using large language models: A comparison of RAG, prompt engineering, and fine-tuning
- URL: http://arxiv.org/abs/2509.24866v2
- Date: Wed, 01 Oct 2025 14:06:17 GMT
- Title: Metaphor identification using large language models: A comparison of RAG, prompt engineering, and fine-tuning
- Authors: Matteo Fuoli, Weihang Huang, Jeannette Littlemore, Sarah Turner, Ellen Wilding
- Abstract summary: This study investigates the potential of large language models (LLMs) to automate metaphor identification in full texts. We compare three methods: (i) retrieval-augmented generation (RAG), where the model is provided with a codebook and instructed to annotate texts based on its rules and examples; (ii) prompt engineering, where we design task-specific verbal instructions; and (iii) fine-tuning, where the model is trained on hand-coded texts to optimize performance.
- Score: 0.6524460254566904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Metaphor is a pervasive feature of discourse and a powerful lens for examining cognition, emotion, and ideology. Large-scale analysis, however, has been constrained by the need for manual annotation due to the context-sensitive nature of metaphor. This study investigates the potential of large language models (LLMs) to automate metaphor identification in full texts. We compare three methods: (i) retrieval-augmented generation (RAG), where the model is provided with a codebook and instructed to annotate texts based on its rules and examples; (ii) prompt engineering, where we design task-specific verbal instructions; and (iii) fine-tuning, where the model is trained on hand-coded texts to optimize performance. Within prompt engineering, we test zero-shot, few-shot, and chain-of-thought strategies. Our results show that state-of-the-art closed-source LLMs can achieve high accuracy, with fine-tuning yielding a median F1 score of 0.79. A comparison of human and LLM outputs reveals that most discrepancies are systematic, reflecting well-known grey areas and conceptual challenges in metaphor theory. We propose that LLMs can be used to at least partly automate metaphor identification and can serve as a testbed for developing and refining metaphor identification protocols and the theory that underpins them.
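As a rough illustration of the comparison the abstract describes, below is a minimal Python sketch of how the three prompt-engineering variants (zero-shot, few-shot, chain-of-thought) might be phrased for metaphor identification, together with the token-level F1 metric used to score model output against human annotation. All prompt wording, example sentences, and labels are hypothetical stand-ins, not the authors' actual codebook, prompts, or data; the chain-of-thought template loosely follows the basic-versus-contextual meaning test common in metaphor identification protocols such as MIP.

```python
# Hedged sketch: three prompting strategies for metaphor identification,
# plus the F1 metric used to compare them. All prompts and examples are
# illustrative placeholders, not the paper's actual materials.

ZERO_SHOT = (
    "Identify every metaphorically used word in the text below. "
    "Return the words as a comma-separated list.\n\nText: {text}"
)

FEW_SHOT = (
    "Identify every metaphorically used word in the text.\n\n"
    "Example: 'She attacked my argument.' -> attacked\n"
    "Example: 'Prices are climbing.' -> climbing\n\n"
    "Text: {text}"
)

CHAIN_OF_THOUGHT = (
    "For each word in the text, state its basic meaning, then its "
    "contextual meaning, and decide whether the contextual meaning "
    "contrasts with the basic one (metaphor). Finally, list all "
    "metaphorically used words.\n\nText: {text}"
)

def f1(gold: set[str], predicted: set[str]) -> float:
    """Token-level F1 between human-coded and model-predicted metaphors."""
    if not gold or not predicted:
        return 0.0
    tp = len(gold & predicted)          # true positives
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

# Toy example: hypothetical model output vs. a human annotation.
gold = {"attacked", "climbing"}
pred = {"attacked", "soaring"}
print(f"F1 = {f1(gold, pred):.2f}")     # F1 = 0.50
```

On this toy example, precision and recall are both 0.5, giving F1 = 0.50; for scale, the paper reports that its best configuration (fine-tuning) reaches a median F1 of 0.79 against human coding.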
Related papers
- Unveiling LLMs' Metaphorical Understanding: Exploring Conceptual Irrelevance, Context Leveraging and Syntactic Influence [40.32545329527664]
Large Language Models (LLMs) demonstrate advanced capabilities in knowledge integration, contextual reasoning, and creative generation. This study examines LLMs' metaphor-processing abilities from three perspectives.
arXiv Detail & Related papers (2025-10-05T09:45:51Z)
- In-Context Watermarks for Large Language Models [71.29952527565749]
In-Context Watermarking (ICW) embeds watermarks into generated text solely through prompt engineering. We investigate four ICW strategies at different levels of granularity, each paired with a tailored detection method. Our experiments validate the feasibility of ICW as a model-agnostic, practical watermarking approach.
arXiv Detail & Related papers (2025-05-22T17:24:51Z)
- Enhancing multimodal analogical reasoning with Logic Augmented Generation [1.3654846342364308]
In this paper, we apply a logic-augmented generation (LAG) framework that leverages the explicit representation of a text through a semantic knowledge graph. This method generates extended knowledge graph triples representing implicit meaning, enabling systems to reason on unlabeled multimodal data regardless of the domain. The results show that this integrated approach surpasses current baselines, performs better than humans in understanding visual metaphors, and enables more explainable reasoning processes.
arXiv Detail & Related papers (2025-04-15T13:47:55Z)
- Contextualize-then-Aggregate: Circuits for In-Context Learning in Gemma-2 2B [51.74607395697567]
In-Context Learning (ICL) is an intriguing ability of large language models (LLMs). We use causal interventions to identify information flow in Gemma-2 2B for five naturalistic ICL tasks. We find that the model infers task information using a two-step strategy we call contextualize-then-aggregate.
arXiv Detail & Related papers (2025-03-31T18:33:55Z)
- A Dual-Perspective Metaphor Detection Framework Using Large Language Models [29.18537460293431]
We propose DMD, a novel dual-perspective framework for metaphor detection. It harnesses both implicit and explicit applications of metaphor theories to guide LLMs in metaphor detection. In comparison to previous methods, our framework offers more transparent reasoning processes and delivers more reliable predictions.
arXiv Detail & Related papers (2024-12-23T06:50:04Z)
- VladVA: Discriminative Fine-tuning of LVLMs [67.14293827774827]
Contrastively-trained Vision-Language Models (VLMs) like CLIP have become the de facto approach for discriminative vision-language representation learning. We propose to combine "the best of both worlds": a new training approach for discriminative fine-tuning of LVLMs.
arXiv Detail & Related papers (2024-12-05T17:54:27Z)
- Generative Context-aware Fine-tuning of Self-supervised Speech Models [54.389711404209415]
We study the use of context information generated by large language models (LLMs).
We propose an approach to distill the generated information during fine-tuning of self-supervised speech models.
We evaluate the proposed approach using the SLUE and Libri-light benchmarks for several downstream tasks: automatic speech recognition, named entity recognition, and sentiment analysis.
arXiv Detail & Related papers (2023-12-15T15:46:02Z)
- Leveraging Pre-trained Language Model for Speech Sentiment Analysis [58.78839114092951]
We explore the use of pre-trained language models to learn sentiment information of written texts for speech sentiment analysis.
We propose a pseudo label-based semi-supervised training strategy using a language model on an end-to-end speech sentiment approach.
arXiv Detail & Related papers (2021-06-11T20:15:21Z)
- Metaphor Generation with Conceptual Mappings [58.61307123799594]
We aim to generate a metaphoric sentence given a literal expression by replacing relevant verbs.
We propose to control the generation process by encoding conceptual mappings between cognitive domains.
We show that the unsupervised CM-Lex model is competitive with recent deep learning metaphor generation systems.
arXiv Detail & Related papers (2021-06-02T15:27:05Z)
- MERMAID: Metaphor Generation with Symbolism and Discriminative Decoding [22.756157298168127]
Based on a theoretically-grounded connection between metaphors and symbols, we propose a method to automatically construct a parallel corpus.
For the generation task, we incorporate a metaphor discriminator to guide the decoding of a sequence to sequence model fine-tuned on our parallel data.
A task-based evaluation shows that human-written poems enhanced with metaphors are preferred 68% of the time compared to poems without metaphors.
arXiv Detail & Related papers (2021-03-11T16:39:19Z)