DialogueLLM: Context and Emotion Knowledge-Tuned Large Language Models
for Emotion Recognition in Conversations
- URL: http://arxiv.org/abs/2310.11374v4
- Date: Wed, 17 Jan 2024 02:44:56 GMT
- Title: DialogueLLM: Context and Emotion Knowledge-Tuned Large Language Models
for Emotion Recognition in Conversations
- Authors: Yazhou Zhang, Mengyao Wang, Youxi Wu, Prayag Tiwari, Qiuchi Li, Benyou
Wang, Jing Qin
- Abstract summary: Large language models (LLMs) have shown extraordinary efficacy across numerous downstream natural language processing (NLP) tasks.
We propose DialogueLLM, a context and emotion knowledge-tuned LLM obtained by fine-tuning LLaMA models.
We offer a comprehensive evaluation of our proposed model on three benchmark emotion recognition in conversations (ERC) datasets.
- Score: 28.15933355881604
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) and their variants have shown extraordinary
efficacy across numerous downstream natural language processing (NLP) tasks,
which has presented a new vision for the development of NLP. Despite their
remarkable performance in natural language generation (NLG), LLMs lack a
distinct focus on the emotion understanding domain. As a result, using LLMs for
emotion recognition may lead to suboptimal precision. Another limitation of
LLMs is that they are typically trained without leveraging
multi-modal information. To overcome these limitations, we propose DialogueLLM,
a context and emotion knowledge-tuned LLM that is obtained by fine-tuning LLaMA
models with 13,638 multi-modal (i.e., texts and videos) emotional dialogues.
The visual information is treated as supplementary knowledge for constructing
high-quality instructions. We offer a comprehensive evaluation of our proposed
model on three benchmark emotion recognition in conversations (ERC) datasets
and compare the results against state-of-the-art (SOTA) baselines and other SOTA
LLMs. Additionally, DialogueLLM-7B can be easily trained using LoRA on a 40GB
A100 GPU in 5 hours, facilitating reproducibility for other researchers.
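To make the training recipe concrete, here is a minimal sketch, in Python with Hugging Face transformers and peft, of how an ERC instruction example might be assembled from a dialogue turn plus a video-derived caption, and how LoRA adapters could be attached to a LLaMA-style model. The base checkpoint, prompt wording, emotion label set, and hyperparameters are illustrative assumptions, not the authors' released configuration.

```python
# Minimal sketch (not the authors' released code): one possible way to build an
# ERC instruction from a dialogue turn plus a video-derived caption, and to attach
# LoRA adapters to a LLaMA-style model with Hugging Face transformers + peft.
# Checkpoint name, prompt wording, label set, and hyperparameters are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, TaskType, get_peft_model

EMOTIONS = ["neutral", "joy", "sadness", "anger", "surprise", "fear", "disgust"]

def build_instruction(context: list[str], utterance: str, visual_caption: str) -> str:
    """Format one training example; the caption stands in for the supplementary
    knowledge extracted from the video modality."""
    history = "\n".join(f"- {turn}" for turn in context)
    return (
        "Below is a conversation and a description of the speaker's visual scene.\n"
        f"Conversation so far:\n{history}\n"
        f"Visual description: {visual_caption}\n"
        f"Target utterance: {utterance}\n"
        f"Classify the speaker's emotion as one of: {', '.join(EMOTIONS)}."
    )

base = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")

lora_cfg = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16, lora_alpha=32, lora_dropout=0.05,   # illustrative values
    target_modules=["q_proj", "v_proj"],      # LLaMA attention projections
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only the low-rank adapters are trainable
```

Each formatted prompt, paired with its gold emotion label, would then be tokenized and passed to a standard causal-language-modelling trainer; a parameter-efficient setup of this kind is consistent with the reported 5-hour LoRA run on a single 40GB A100.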
Related papers
- Beyond Silent Letters: Amplifying LLMs in Emotion Recognition with Vocal Nuances [3.396456345114466]
We propose SpeechCueLLM, a method that translates speech characteristics into natural language descriptions.
We evaluate SpeechCueLLM on two datasets: IEMOCAP and MELD, showing significant improvements in emotion recognition accuracy.
arXiv Detail & Related papers (2024-07-31T03:53:14Z)
- Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing [56.75702900542643]
We introduce AlphaLLM for the self-improvement of large language models.
It integrates Monte Carlo Tree Search (MCTS) with LLMs to establish a self-improving loop.
Our experimental results show that AlphaLLM significantly enhances the performance of LLMs without additional annotations.
arXiv Detail & Related papers (2024-04-18T15:21:34Z)
- FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition [56.76951887823882]
Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks.
We present FAC$^2$E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation.
arXiv Detail & Related papers (2024-02-29T21:05:37Z)
- Empowering Language Models with Active Inquiry for Deeper Understanding [31.11672018840381]
We introduce LaMAI (Language Model with Active Inquiry), designed to endow large language models with interactive engagement.
LaMAI uses active learning techniques to raise the most informative questions, fostering a dynamic bidirectional dialogue.
Our empirical studies, across a variety of complex datasets, demonstrate the effectiveness of LaMAI.
arXiv Detail & Related papers (2024-02-06T05:24:16Z)
- Boosting Large Language Model for Speech Synthesis: An Empirical Study [86.89548753080432]
Large language models (LLMs) have made significant advancements in natural language processing and are concurrently extending the language ability to other modalities, such as speech and vision.
We conduct a comprehensive empirical exploration of boosting LLMs with the ability to generate speech, by combining pre-trained LLM LLaMA/OPT and text-to-speech synthesis model VALL-E.
We compare three integration methods between LLMs and speech models, including directly fine-tuned LLMs, superposed layers of LLMs and VALL-E, and coupled LLMs and VALL-E using LLMs as a powerful text encoder.
arXiv Detail & Related papers (2023-12-30T14:20:04Z)
- Customising General Large Language Models for Specialised Emotion Recognition Tasks [24.822342337306363]
We investigate how large language models (LLMs) perform in linguistic emotion recognition.
Specifically, we take a publicly available and widely used LLM, the Chat General Language Model (ChatGLM), as an example.
We customise it for our target task using two different model adaptation techniques, i.e., deep prompt tuning and low-rank adaptation.
The experimental results show that the adapted LLM can easily outperform other state-of-the-art but specialised deep models.
arXiv Detail & Related papers (2023-10-22T08:09:13Z)
- Let Models Speak Ciphers: Multiagent Debate through Embeddings [84.20336971784495]
We introduce CIPHER (Communicative Inter-Model Protocol Through Embedding Representation) to address this issue.
By deviating from natural language, CIPHER offers an advantage of encoding a broader spectrum of information without any modification to the model weights.
This showcases the superiority and robustness of embeddings as an alternative "language" for communication among LLMs.
arXiv Detail & Related papers (2023-10-10T03:06:38Z)
- Are Large Language Models Really Robust to Word-Level Perturbations? [68.60618778027694]
We propose a novel rational evaluation approach that leverages pre-trained reward models as diagnostic tools.
Longer conversations reveal a language model's more comprehensive grasp of language, reflected in its proficiency at understanding questions.
Our results demonstrate that LLMs frequently exhibit vulnerability to word-level perturbations that are commonplace in daily language usage.
arXiv Detail & Related papers (2023-09-20T09:23:46Z)
- Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics [32.123919380959485]
Multi-modal large language models (MLLMs) are built on top of large language models (LLMs).
While they excel in multi-modal tasks, the pure NLP abilities of MLLMs are often underestimated and left untested.
We show that visual instruction tuning, a prevailing strategy for transitioning LLMs into MLLMs, unexpectedly and interestingly helps models attain both improved truthfulness and ethical alignment.
arXiv Detail & Related papers (2023-09-13T17:57:21Z)
- TouchStone: Evaluating Vision-Language Models by Language Models [91.69776377214814]
We propose an evaluation method that uses strong large language models as judges to comprehensively evaluate the various abilities of LVLMs.
We construct a comprehensive visual dialogue dataset TouchStone, consisting of open-world images and questions, covering five major categories of abilities and 27 subtasks.
We demonstrate that powerful LVLMs, such as GPT-4, can effectively score dialogue quality by leveraging their textual capabilities alone.
arXiv Detail & Related papers (2023-08-31T17:52:04Z)
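As a rough illustration of the judging setup described in the TouchStone entry above, the following sketch asks a text-only judge model to score an LVLM's answer given the question and a human-written description of the image in place of the pixels. The use of the OpenAI chat API, the prompt wording, and the 1-10 scale are assumptions for illustration, not the paper's protocol.

```python
# Sketch of an LLM-as-judge call (assumed setup, not the TouchStone implementation):
# the judge sees only text, with a fine-grained image description replacing the image.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def judge_answer(question: str, image_description: str, answer: str) -> str:
    prompt = (
        "You are evaluating a vision-language assistant.\n"
        f"Image description: {image_description}\n"
        f"Question: {question}\n"
        f"Assistant's answer: {answer}\n"
        "Rate the answer from 1 to 10 for correctness and helpfulness, then briefly justify the score."
    )
    response = client.chat.completions.create(
        model="gpt-4",  # judge model; any sufficiently strong LLM could stand in
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # deterministic scoring
    )
    return response.choices[0].message.content

# Hypothetical usage:
# print(judge_answer("What is the dog doing?",
#                    "A golden retriever chasing a frisbee in a park.",
#                    "The dog is running after a frisbee."))
```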