Enhancing Rhetorical Figure Annotation: An Ontology-Based Web Application with RAG Integration
- URL: http://arxiv.org/abs/2412.13799v1
- Date: Wed, 18 Dec 2024 12:45:55 GMT
- Title: Enhancing Rhetorical Figure Annotation: An Ontology-Based Web Application with RAG Integration
- Authors: Ramona Kühn, Jelena Mitrović, Michael Granitzer,
- Abstract summary: We develop a web application called "Find your Figure"
It facilitates the identification and annotation of German rhetorical figures.
In addition, we improve the user experience with Retrieval Generation (RAG)
- Score: 0.6372911857214884
- License:
- Abstract: Rhetorical figures play an important role in our communication. They are used to convey subtle, implicit meaning, or to emphasize statements. We notice them in hate speech, fake news, and propaganda. By improving the systems for computational detection of rhetorical figures, we can also improve tasks such as hate speech and fake news detection, sentiment analysis, opinion mining, or argument mining. Unfortunately, there is a lack of annotated data, as well as qualified annotators that would help us build large corpora to train machine learning models for the detection of rhetorical figures. The situation is particularly difficult in languages other than English, and for rhetorical figures other than metaphor, sarcasm, and irony. To overcome this issue, we develop a web application called "Find your Figure" that facilitates the identification and annotation of German rhetorical figures. The application is based on the German Rhetorical ontology GRhOOT which we have specially adapted for this purpose. In addition, we improve the user experience with Retrieval Augmented Generation (RAG). In this paper, we present the restructuring of the ontology, the development of the web application, and the built-in RAG pipeline. We also identify the optimal RAG settings for our application. Our approach is one of the first to practically use rhetorical ontologies in combination with RAG and shows promising results.
Related papers
- GEM-RAG: Graphical Eigen Memories For Retrieval Augmented Generation [3.2027710059627545]
We introduce Graphical Eigen Memories For Retrieval Augmented Generation (GEM-RAG)
GEM-RAG works by tagging each chunk of text in a given text corpus with LLM generated utility'' questions.
We evaluate GEM-RAG, using both UnifiedQA and GPT-3.5 Turbo as the LLMs, with SBERT, and OpenAI's text encoders on two standard QA tasks.
arXiv Detail & Related papers (2024-09-23T21:42:47Z) - Reading with Intent [7.623508712778745]
RAG systems that rely on the open internet as their knowledge source have to contend with the complexities of human-generated content.
We introduce a prompting system designed to enhance the model's ability to interpret and generate responses in the presence of sarcasm.
arXiv Detail & Related papers (2024-08-20T20:47:27Z) - Multi-turn Response Selection with Commonsense-enhanced Language Models [32.921901489497714]
We design a Siamese network where a pre-trained Language model merges with a Graph neural network (SinLG)
SinLG takes advantage of Pre-trained Language Models (PLMs) to catch the word correlations in the context and response candidates.
The GNN aims to assist the PLM in fine-tuning, and arousing its related memories to attain better performance.
arXiv Detail & Related papers (2024-07-26T03:13:47Z) - Improved Contextual Recognition In Automatic Speech Recognition Systems
By Semantic Lattice Rescoring [4.819085609772069]
We propose a novel approach for enhancing contextual recognition within ASR systems via semantic lattice processing.
Our solution consists of using Hidden Markov Models and Gaussian Mixture Models (HMM-GMM) along with Deep Neural Networks (DNN) models for better accuracy.
We demonstrate the effectiveness of our proposed framework on the LibriSpeech dataset with empirical analyses.
arXiv Detail & Related papers (2023-10-14T23:16:05Z) - DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning [89.92601337474954]
Pragmatic reasoning plays a pivotal role in deciphering implicit meanings that frequently arise in real-life conversations.
We introduce a novel challenge, DiPlomat, aiming at benchmarking machines' capabilities on pragmatic reasoning and situated conversational understanding.
arXiv Detail & Related papers (2023-06-15T10:41:23Z) - VD-PCR: Improving Visual Dialog with Pronoun Coreference Resolution [79.05412803762528]
The visual dialog task requires an AI agent to interact with humans in multi-round dialogs based on a visual environment.
We propose VD-PCR, a novel framework to improve Visual Dialog understanding with Pronoun Coreference Resolution.
With the proposed implicit and explicit methods, VD-PCR achieves state-of-the-art experimental results on the VisDial dataset.
arXiv Detail & Related papers (2022-05-29T15:29:50Z) - Multilingual Extraction and Categorization of Lexical Collocations with
Graph-aware Transformers [86.64972552583941]
We put forward a sequence tagging BERT-based model enhanced with a graph-aware transformer architecture, which we evaluate on the task of collocation recognition in context.
Our results suggest that explicitly encoding syntactic dependencies in the model architecture is helpful, and provide insights on differences in collocation typification in English, Spanish and French.
arXiv Detail & Related papers (2022-05-23T16:47:37Z) - Dialogue Meaning Representation for Task-Oriented Dialogue Systems [51.91615150842267]
We propose Dialogue Meaning Representation (DMR), a flexible and easily extendable representation for task-oriented dialogue.
Our representation contains a set of nodes and edges with inheritance hierarchy to represent rich semantics for compositional semantics and task-specific concepts.
We propose two evaluation tasks to evaluate different machine learning based dialogue models, and further propose a novel coreference resolution model GNNCoref for the graph-based coreference resolution task.
arXiv Detail & Related papers (2022-04-23T04:17:55Z) - BERTuit: Understanding Spanish language in Twitter through a native
transformer [70.77033762320572]
We present bfBERTuit, the larger transformer proposed so far for Spanish language, pre-trained on a massive dataset of 230M Spanish tweets.
Our motivation is to provide a powerful resource to better understand Spanish Twitter and to be used on applications focused on this social network.
arXiv Detail & Related papers (2022-04-07T14:28:51Z) - Data Expansion using Back Translation and Paraphrasing for Hate Speech
Detection [1.192436948211501]
We present a new deep learning-based method that fuses a Back Translation method, and a Paraphrasing technique for data augmentation.
We evaluate our proposal on five publicly available datasets; namely, AskFm corpus, Formspring dataset, Warner and Waseem dataset, Olid, and Wikipedia toxic comments dataset.
arXiv Detail & Related papers (2021-05-25T09:52:42Z) - Reasoning in Dialog: Improving Response Generation by Context Reading
Comprehension [49.92173751203827]
In multi-turn dialog, utterances do not always take the full form of sentences.
We propose to improve the response generation performance by examining the model's ability to answer a reading comprehension question.
arXiv Detail & Related papers (2020-12-14T10:58:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.