SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable
Responses Created Through Human-Machine Collaboration
- URL: http://arxiv.org/abs/2305.17696v1
- Date: Sun, 28 May 2023 11:51:20 GMT
- Title: SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable
Responses Created Through Human-Machine Collaboration
- Authors: Hwaran Lee, Seokhee Hong, Joonsuk Park, Takyoung Kim, Meeyoung Cha,
Yejin Choi, Byoung Pil Kim, Gunhee Kim, Eun-Ju Lee, Yong Lim, Alice Oh,
Sangchul Park and Jung-Woo Ha
- Abstract summary: SQuARe is a large-scale Korean dataset of 49k sensitive questions paired with 42k acceptable and 46k non-acceptable responses.
The dataset was constructed leveraging HyperCLOVA in a human-in-the-loop manner based on real news headlines.
- Score: 75.62448812759968
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The potential social harms that large language models pose, such as
generating offensive content and reinforcing biases, are steeply rising.
Existing works focus on coping with this concern while interacting with
ill-intentioned users, such as those who explicitly make hate speech or elicit
harmful responses. However, discussions on sensitive issues can become toxic
even if the users are well-intentioned. For safer models in such scenarios, we
present the Sensitive Questions and Acceptable Response (SQuARe) dataset, a
large-scale Korean dataset of 49k sensitive questions with 42k acceptable and
46k non-acceptable responses. The dataset was constructed leveraging HyperCLOVA
in a human-in-the-loop manner based on real news headlines. Experiments show
that acceptable response generation significantly improves for HyperCLOVA and
GPT-3, demonstrating the efficacy of this dataset.
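To make the dataset's structure concrete, here is a minimal sketch of working with SQuARe-style records in Python. The field names (question, response, label) are illustrative assumptions; the released dataset's actual schema may differ.

```python
# Minimal sketch of working with SQuARe-style records. The field names
# ("question", "response", "label") are assumptions for illustration;
# the released dataset's actual schema may differ.
from dataclasses import dataclass


@dataclass
class SquareRecord:
    question: str   # sensitive question grounded in a news-headline topic
    response: str   # model- or human-written response
    label: str      # "acceptable" or "non-acceptable"


records = [
    SquareRecord("Should policy X be adopted?",
                 "There are differing views on this; here are the main arguments...",
                 "acceptable"),
    SquareRecord("Should policy X be adopted?",
                 "Anyone who disagrees is a fool.",
                 "non-acceptable"),
]

# Keep only acceptable (question, response) pairs, e.g. as fine-tuning
# targets for acceptable response generation, the setting the paper evaluates.
train_pairs = [(r.question, r.response) for r in records if r.label == "acceptable"]
print(train_pairs)
```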
Related papers
- When Large Language Models contradict humans? Large Language Models' Sycophantic Behaviour [0.8133739801185272]
We show that Large Language Models (LLMs) exhibit sycophantic tendencies when responding to queries involving subjective opinions and statements.
By contrast, on queries with an objectively correct answer, LLMs at various scales seem not to follow the users' hints, instead showing confidence in delivering the correct answers.
arXiv Detail & Related papers (2023-11-15T22:18:33Z)
- Decoding the Silent Majority: Inducing Belief Augmented Social Graph with Large Language Model for Response Forecasting [74.68371461260946]
SocialSense is a framework that induces a belief-centered graph on top of an existing social network, along with graph-based propagation to capture social dynamics.
Our method surpasses the existing state of the art in experimental evaluations for both zero-shot and supervised settings.
arXiv Detail & Related papers (2023-10-20T06:17:02Z)
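As a rough illustration of the graph-based propagation idea above, the following toy sketch smooths belief vectors over an induced social graph. It is a generic propagation scheme with invented inputs, not the authors' SocialSense implementation.

```python
# Toy sketch of propagating belief vectors over a belief-centered social
# graph, in the spirit of SocialSense's graph-based propagation. Generic
# illustration only; the adjacency and embeddings are made up.
import numpy as np

# Adjacency over 4 hypothetical users/belief nodes (undirected edges).
adj = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

beliefs = np.random.rand(4, 8)  # one 8-dim belief embedding per node

# Row-normalize so each node averages over its neighbors, then mix each
# node's own belief with the propagated neighborhood signal.
norm_adj = adj / adj.sum(axis=1, keepdims=True)
for _ in range(3):  # a few propagation steps
    beliefs = 0.5 * beliefs + 0.5 * norm_adj @ beliefs

print(beliefs.shape)  # (4, 8): smoothed belief representations
```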
- A Benchmark for Understanding Dialogue Safety in Mental Health Support [15.22008156903607]
This paper aims to develop a theoretically and factually grounded taxonomy that prioritizes the positive impact on help-seekers.
We analyze the dataset using popular language models, including BERT-base, RoBERTa-large, and ChatGPT.
The developed dataset and findings serve as valuable benchmarks for advancing research on dialogue safety in mental health support.
arXiv Detail & Related papers (2023-07-31T07:33:16Z)
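Since the benchmark above is analyzed with classifiers such as BERT-base and RoBERTa-large, a minimal Hugging Face sketch of scoring an utterance follows. The binary label set is a placeholder (the paper's taxonomy is finer-grained), and the classification head here is untrained.

```python
# Minimal sketch of scoring a dialogue utterance with a RoBERTa classifier,
# in the spirit of the paper's BERT-base/RoBERTa-large analysis. The labels
# are placeholders, and the freshly initialized head gives arbitrary outputs
# until fine-tuned on the benchmark.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

labels = ["safe", "unsafe"]  # placeholder labels, not the paper's taxonomy
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=len(labels)
)

utterance = "You should just give up."
inputs = tokenizer(utterance, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(labels[logits.argmax(dim=-1).item()])
```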
- Leveraging Implicit Feedback from Deployment Data in Dialogue [83.02878726357523]
We study improving social conversational agents by learning from natural dialogue between users and a deployed model.
We leverage signals such as user response length, sentiment, and the reactions in future human utterances in the collected dialogue episodes.
arXiv Detail & Related papers (2023-07-26T11:34:53Z)
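The signals named above (response length, sentiment, reactions in future utterances) suggest a simple implicit-reward heuristic. The toy scorer below illustrates one such combination; the weights and the tiny sentiment lexicon are assumptions, not the paper's formulation.

```python
# Toy implicit-feedback scorer for a bot turn, using the kinds of signals
# the paper names: the user's next-response length and its sentiment.
# The 0.5/0.5 weighting and the tiny lexicon are illustrative assumptions.
import string

POSITIVE = {"thanks", "great", "helpful", "nice"}
NEGATIVE = {"wrong", "bad", "useless", "stop"}


def implicit_reward(next_user_utterance: str) -> float:
    tokens = [t.strip(string.punctuation) for t in next_user_utterance.lower().split()]
    length_signal = min(len(tokens) / 20.0, 1.0)  # longer replies ~ engagement
    sentiment = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return 0.5 * length_signal + 0.5 * max(-1.0, min(1.0, float(sentiment)))


print(implicit_reward("thanks, that was really helpful!"))  # higher reward
print(implicit_reward("stop"))                              # lower reward
```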
- Measuring the Effect of Influential Messages on Varying Personas [67.1149173905004]
We present a new task, Response Forecasting on Personas for News Media, to estimate the response a persona might have upon seeing a news message.
The proposed task not only introduces personalization in the modeling but also predicts the sentiment polarity and intensity of each response.
This enables more accurate and comprehensive inference on the mental state of the persona.
arXiv Detail & Related papers (2023-05-25T21:01:00Z)
- SaFeRDialogues: Taking Feedback Gracefully after Conversational Safety Failures [9.38317687250036]
This work proposes SaFeRDialogues, a task and dataset of graceful responses to feedback about safety failures.
We collect a dataset of 10k dialogues demonstrating safety failures, feedback signaling them, and a response acknowledging the feedback.
We show how fine-tuning on this dataset results in conversations that human raters deem considerably more likely to lead to a civil conversation.
arXiv Detail & Related papers (2021-10-14T16:41:25Z)
- Characterizing User Susceptibility to COVID-19 Misinformation on Twitter [40.0762273487125]
This study attempts to answer who constitutes the population vulnerable to online misinformation during the pandemic.
We distinguish different types of users, ranging from social bots to humans with various levels of engagement with COVID-related misinformation.
We then identify users' online features and situational predictors that correlate with their susceptibility to COVID-19 misinformation.
arXiv Detail & Related papers (2021-09-20T13:31:15Z)
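As a sketch of the kind of correlational analysis described above, the following fits a logistic regression from hypothetical user features to a synthetic susceptibility label. The feature names and data are invented for illustration and are not the study's variables.

```python
# Toy sketch of relating user features to misinformation susceptibility
# with logistic regression. Features and data are invented placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Hypothetical per-user features: [account_age_years, daily_posts, log_followers]
X = rng.normal(size=(200, 3))
# Synthetic "susceptible" label correlated with posting frequency.
y = (X[:, 1] + 0.5 * rng.normal(size=200) > 0).astype(int)

clf = LogisticRegression().fit(X, y)
for name, coef in zip(["account_age_years", "daily_posts", "log_followers"],
                      clf.coef_[0]):
    print(f"{name}: {coef:+.2f}")  # sign/magnitude ~ association with susceptibility
```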
- Exploiting Unsupervised Data for Emotion Recognition in Conversations [76.01690906995286]
Emotion Recognition in Conversations (ERC) aims to predict the emotional state of speakers in conversations.
The available supervised data for the ERC task is limited.
We propose a novel approach to leverage unsupervised conversation data.
arXiv Detail & Related papers (2020-10-02T13:28:47Z)
- Counterfactual Off-Policy Training for Neural Response Generation [94.76649147381232]
We propose to explore potential responses by counterfactual reasoning.
Training on the counterfactual responses under the adversarial learning framework helps to explore the high-reward area of the potential response space.
An empirical study on the DailyDialog dataset shows that our approach significantly outperforms the HRED model.
arXiv Detail & Related papers (2020-04-29T22:46:28Z)
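As a toy illustration of the counterfactual exploration idea above, the sketch below perturbs an observed response into counterfactual candidates, scores them with a placeholder reward, and keeps the best one as a training target. It is entirely illustrative and not the paper's HRED or adversarial-learning setup.

```python
# Toy sketch of counterfactual response exploration: generate alternative
# responses for an observed context, score them with a placeholder reward,
# and keep the highest-reward candidate as a training target.
import random


def counterfactual_variants(response: str, n: int = 3) -> list[str]:
    # Hypothetical perturbation: swap in alternative phrasings instead of
    # editing the observed response directly.
    alternatives = ["Sure, happy to help!",
                    "Could you tell me more about that?",
                    "That sounds really tough."]
    return random.sample(alternatives, n)


def reward(context: str, response: str) -> float:
    # Placeholder reward: prefer longer, question-asking responses.
    return len(response.split()) + (2.0 if "?" in response else 0.0)


context = "I had a rough day at work."
observed = "Oh."
candidates = [observed] + counterfactual_variants(observed)
best = max(candidates, key=lambda r: reward(context, r))
print(best)  # high-reward counterfactual kept as a training target
```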
This list is automatically generated from the titles and abstracts of the papers on this site.