Automatic Scoring of Dream Reports' Emotional Content with Large
Language Models
- URL: http://arxiv.org/abs/2302.14828v1
- Date: Tue, 28 Feb 2023 18:23:17 GMT
- Title: Automatic Scoring of Dream Reports' Emotional Content with Large
Language Models
- Authors: Lorenzo Bertolini, Valentina Elce, Adriana Michalak, Giulio Bernardi,
Julie Weeds
- Abstract summary: The study of dream content typically relies on the analysis of verbal reports provided by dreamers upon awakening from their sleep.
This task is classically performed through manual scoring provided by trained annotators, at a great time expense.
While a consistent body of work suggests that natural language processing (NLP) tools can support the automatic analysis of dream reports, proposed methods lacked the ability to reason over a report's full context and required extensive data pre-processing.
In this work, we address these limitations by adopting large language models (LLMs) to study and replicate the manual annotation of dream reports, using a mixture of off-the-shelf and bespoke approaches, with a focus on references to reports' emotions.
- Score: 3.1761323820497656
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the field of dream research, the study of dream content typically relies
on the analysis of verbal reports provided by dreamers upon awakening from
their sleep. This task is classically performed through manual scoring provided
by trained annotators, at a great time expense. While a consistent body of work
suggests that natural language processing (NLP) tools can support the automatic
analysis of dream reports, proposed methods lacked the ability to reason over a
report's full context and required extensive data pre-processing. Furthermore,
in most cases, these methods were not validated against standard manual scoring
approaches. In this work, we address these limitations by adopting large
language models (LLMs) to study and replicate the manual annotation of dream
reports, using a mixture of off-the-shelf and bespoke approaches, with a focus
on references to reports' emotions. Our results show that the off-the-shelf
method achieves low performance, probably owing to inherent linguistic
differences between reports collected from different (groups of) individuals. On
the other hand, the proposed bespoke text classification method achieves high
performance, which is robust against potential biases. Overall, these
observations indicate that our approach could find application in the analysis
of large dream datasets and may favour reproducibility and comparability of
results across studies.
Related papers
- A Bayesian Approach to Harnessing the Power of LLMs in Authorship Attribution [57.309390098903]
Authorship attribution aims to identify the origin or author of a document.
Large Language Models (LLMs) with their deep reasoning capabilities and ability to maintain long-range textual associations offer a promising alternative.
Our results on the IMDb and blog datasets show an impressive 85% accuracy in one-shot authorship classification across ten authors.
arXiv Detail & Related papers (2024-10-29T04:14:23Z)
- Mitigating Biases to Embrace Diversity: A Comprehensive Annotation Benchmark for Toxic Language [0.0]
This study introduces a prescriptive annotation benchmark grounded in humanities research to ensure consistent, unbiased labeling of offensive language.
We contribute two newly annotated datasets that achieve higher inter-annotator agreement between human and language model (LLM) annotations.
arXiv Detail & Related papers (2024-10-17T08:10:24Z)
- Towards More Effective Table-to-Text Generation: Assessing In-Context Learning and Self-Evaluation with Open-Source Models [0.0]
This study explores the effectiveness of various in-context learning strategies in language models (LMs) across benchmark datasets.
We employ a large language model (LLM) self-evaluation approach using chain-of-thought reasoning and assess its correlation with human-aligned metrics like BERTScore.
Our findings highlight the significant impact of examples in improving table-to-text generation and suggest that, while LLM self-evaluation has potential, its current alignment with human judgment could be enhanced.
arXiv Detail & Related papers (2024-06-11T14:28:24Z)
- Exploring Large Language Models for Relevance Judgments in Tetun [0.03683202928838613]
This paper explores the feasibility of using large language models (LLMs) to automate relevance assessments.
LLMs are employed to automate relevance judgment tasks, by providing a series of query-document pairs in Tetun as the input text.
Our investigation reveals results that align closely with those reported in studies of high-resource languages.
arXiv Detail & Related papers (2024-03-21T08:27:49Z)
- Sequence-to-Sequence Language Models for Character and Emotion Detection in Dream Narratives [0.0]
This paper presents the first study on character and emotion detection in the English portion of the open DreamBank corpus of dream narratives.
Our results show that language models can effectively address this complex task.
We evaluate the impact of model size, prediction order of characters, and the consideration of proper names and character traits.
arXiv Detail & Related papers (2023-12-08T01:17:28Z)
- How to Determine the Most Powerful Pre-trained Language Model without Brute Force Fine-tuning? An Empirical Survey [23.757740341834126]
We show that H-Score generally performs well with superiorities in effectiveness and efficiency.
We also outline the difficulties of consideration of training details, applicability to text generation, and consistency to certain metrics which shed light on future directions.
arXiv Detail & Related papers (2023-10-09T07:27:15Z)
- Generative Judge for Evaluating Alignment [84.09815387884753]
We propose a generative judge with 13B parameters, Auto-J, designed to address these challenges.
Our model is trained on user queries and LLM-generated responses under massive real-world scenarios.
Experimentally, Auto-J outperforms a series of strong competitors, including both open-source and closed-source models.
arXiv Detail & Related papers (2023-09-02T00:32:55Z)
- Bias and Fairness in Large Language Models: A Survey [73.87651986156006]
We present a comprehensive survey of bias evaluation and mitigation techniques for large language models (LLMs).
We first consolidate, formalize, and expand notions of social bias and fairness in natural language processing.
We then unify the literature by proposing three intuitive taxonomies, two for bias evaluation, and one for mitigation.
arXiv Detail & Related papers (2023-07-16T15:18:25Z)
- Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling [70.23876429382969]
We propose a benchmark that can evaluate intra-sentence discourse properties across a diverse set of NLP tasks.
Disco-Bench consists of 9 document-level testsets in the literature domain, which contain rich discourse phenomena.
For linguistic analysis, we also design a diagnostic test suite that can examine whether the target models learn discourse knowledge.
arXiv Detail & Related papers (2022-12-20T19:52:41Z)
- mFACE: Multilingual Summarization with Factual Consistency Evaluation [79.60172087719356]
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets.
Despite promising results, current models still suffer from generating factually inconsistent summaries.
We leverage factual consistency evaluation models to improve multilingual summarization.
arXiv Detail & Related papers (2022-11-07T16:19:42Z)
- Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study on fairness disparities in peer review with the help of large language models (LMs).
We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) conference from 2017 to date.
We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, author, and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.