Related papers: The Colorful Future of LLMs: Evaluating and Improving LLMs as Emotional Supporters for Queer Youth

The Colorful Future of LLMs: Evaluating and Improving LLMs as Emotional Supporters for Queer Youth

URL: http://arxiv.org/abs/2402.11886v1
Date: Mon, 19 Feb 2024 06:54:55 GMT
Title: The Colorful Future of LLMs: Evaluating and Improving LLMs as Emotional Supporters for Queer Youth
Authors: Shir Lissak, Nitay Calderon, Geva Shenkman, Yaakov Ophir, Eyal Fruchter, Anat Brunstein Klomek and Roi Reichart
Abstract summary: This paper aims to explore the potential of Large Language Models to revolutionize emotional support for queers. We develop a novel ten-question scale that is inspired by psychological standards and expert input. We find that LLM responses are supportive and inclusive, outscoring humans. However, they tend to be generic, not empathetic enough, and lack personalization, resulting in nonreliable and potentially harmful advice.
Score: 14.751539420563752
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Queer youth face increased mental health risks, such as depression, anxiety, and suicidal ideation. Hindered by negative stigma, they often avoid seeking help and rely on online resources, which may provide incompatible information. Although access to a supportive environment and reliable information is invaluable, many queer youth worldwide have no access to such support. However, this could soon change due to the rapid adoption of Large Language Models (LLMs) such as ChatGPT. This paper aims to comprehensively explore the potential of LLMs to revolutionize emotional support for queers. To this end, we conduct a qualitative and quantitative analysis of LLM's interactions with queer-related content. To evaluate response quality, we develop a novel ten-question scale that is inspired by psychological standards and expert input. We apply this scale to score several LLMs and human comments to posts where queer youth seek advice and share experiences. We find that LLM responses are supportive and inclusive, outscoring humans. However, they tend to be generic, not empathetic enough, and lack personalization, resulting in nonreliable and potentially harmful advice. We discuss these challenges, demonstrate that a dedicated prompt can improve the performance, and propose a blueprint of an LLM-supporter that actively (but sensitively) seeks user context to provide personalized, empathetic, and reliable responses. Our annotated dataset is available for further research.

Related papers

"It Listens Better Than My Therapist": Exploring Social Media Discourse on LLMs as Mental Health Tool [1.223779595809275]
Large language models (LLMs) offer new capabilities in conversational fluency, empathy simulation, and availability. This study explores how users engage with LLMs as mental health tools by analyzing over 10,000 TikTok comments. Results show that nearly 20% of comments reflect personal use, with these users expressing overwhelmingly positive attitudes.
arXiv Detail & Related papers (2025-04-14T17:37:32Z)
Evaluation of LLM Vulnerabilities to Being Misused for Personalized Disinformation Generation [0.5070610131852027]
Large language models (LLMs) can be effectively misused for generating disinformation news articles. This study fills this gap by evaluation of vulnerabilities of recent open and closed LLMs. Our results demonstrate the need for stronger safety-filters and disclaimers.
arXiv Detail & Related papers (2024-12-18T09:48:53Z)
Look Before You Leap: Enhancing Attention and Vigilance Regarding Harmful Content with GuidelineLLM [53.79753074854936]
Large language models (LLMs) are increasingly vulnerable to emerging jailbreak attacks. This vulnerability poses significant risks to real-world applications. We propose a novel defensive paradigm called GuidelineLLM.
arXiv Detail & Related papers (2024-12-10T12:42:33Z)
Embracing AI in Education: Understanding the Surge in Large Language Model Use by Secondary Students [53.20318273452059]
Large language models (LLMs) like OpenAI's ChatGPT have opened up new avenues in education. Despite school restrictions, our survey of over 300 middle and high school students revealed that a remarkable 70% of students have utilized LLMs. We propose a few ideas to address such issues, including subject-specific models, personalized learning, and AI classrooms.
arXiv Detail & Related papers (2024-11-27T19:19:34Z)
NewsInterview: a Dataset and a Playground to Evaluate LLMs' Ground Gap via Informational Interviews [65.35458530702442]
We focus on journalistic interviews, a domain rich in grounding communication and abundant in data. We curate a dataset of 40,000 two-person informational interviews from NPR and CNN. LLMs are significantly less likely than human interviewers to use acknowledgements and to pivot to higher-level questions.
arXiv Detail & Related papers (2024-11-21T01:37:38Z)
Persuasion with Large Language Models: a Survey [49.86930318312291]
Large Language Models (LLMs) have created new disruptive possibilities for persuasive communication. In areas such as politics, marketing, public health, e-commerce, and charitable giving, such LLM Systems have already achieved human-level or even super-human persuasiveness. Our survey suggests that the current and future potential of LLM-based persuasion poses profound ethical and societal risks.
arXiv Detail & Related papers (2024-11-11T10:05:52Z)
LLM Echo Chamber: personalized and automated disinformation [0.0]
Large Language Models can spread persuasive, humanlike misinformation at scale, which could influence public opinion. This study examines these risks, focusing on LLMs ability to propagate misinformation as factual. To investigate this, we built the LLM Echo Chamber, a controlled digital environment simulating social media chatrooms, where misinformation often spreads. This setup, evaluated by GPT4 for persuasiveness and harmfulness, sheds light on the ethical concerns surrounding LLMs and emphasizes the need for stronger safeguards against misinformation.
arXiv Detail & Related papers (2024-09-24T17:04:12Z)
I Need Help! Evaluating LLM's Ability to Ask for Users' Support: A Case Study on Text-to-SQL Generation [60.00337758147594]
This study explores the proactive ability of LLMs to seek user support. We propose metrics to evaluate the trade-off between performance improvements and user burden. Our experiments show that without external feedback, many LLMs struggle to recognize their need for user support.
arXiv Detail & Related papers (2024-07-20T06:12:29Z)
LLMs Assist NLP Researchers: Critique Paper (Meta-)Reviewing [106.45895712717612]
Large language models (LLMs) have shown remarkable versatility in various generative tasks. This study focuses on the topic of LLMs assist NLP Researchers. To our knowledge, this is the first work to provide such a comprehensive analysis.
arXiv Detail & Related papers (2024-06-24T01:30:22Z)
Can AI Relate: Testing Large Language Model Response for Mental Health Support [23.97212082563385]
Large language models (LLMs) are already being piloted for clinical use in hospital systems like NYU Langone, Dana-Farber and the NHS. We develop an evaluation framework for determining whether LLM response is a viable and ethical path forward for the automation of mental health treatment.
arXiv Detail & Related papers (2024-05-20T13:42:27Z)
Large Language Models are Capable of Offering Cognitive Reappraisal, if Guided [38.11184388388781]
Large language models (LLMs) have offered new opportunities for emotional support. This work takes a first step by engaging with cognitive reappraisals. We conduct a first-of-its-kind expert evaluation of an LLM's zero-shot ability to generate cognitive reappraisal responses.
arXiv Detail & Related papers (2024-04-01T17:56:30Z)
Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation [28.74445806009475]
This work initially analyzes the results of large language models (LLMs) on ESConv. We observe that exhibiting high preference for specific strategies hinders effective emotional support. Our findings emphasize that (1) low preference for specific strategies hinders the progress of emotional support, (2) external assistance helps reduce preference bias, and (3) existing LLMs alone cannot become good emotional supporters.
arXiv Detail & Related papers (2024-02-20T18:21:32Z)
Factuality of Large Language Models: A Survey [29.557596701431827]
We critically analyze existing work with the aim to identify the major challenges and their associated causes. We analyze the obstacles to automated factuality evaluation for open-ended text generation.
arXiv Detail & Related papers (2024-02-04T09:36:31Z)
Are Large Language Models Really Robust to Word-Level Perturbations? [68.60618778027694]
We propose a novel rational evaluation approach that leverages pre-trained reward models as diagnostic tools. Longer conversations manifest the comprehensive grasp of language models in terms of their proficiency in understanding questions. Our results demonstrate that LLMs frequently exhibit vulnerability to word-level perturbations that are commonplace in daily language usage.
arXiv Detail & Related papers (2023-09-20T09:23:46Z)
Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate [85.3444184685235]
We propose a Multi-Agent Debate (MAD) framework, in which multiple agents express their arguments in the state of "tit for tat" and a judge manages the debate process to obtain a final solution. Our framework encourages divergent thinking in LLMs which would be helpful for tasks that require deep levels of contemplation.
arXiv Detail & Related papers (2023-05-30T15:25:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.