Chain of Explanation: New Prompting Method to Generate Higher Quality
Natural Language Explanation for Implicit Hate Speech
- URL: http://arxiv.org/abs/2209.04889v1
- Date: Sun, 11 Sep 2022 15:04:11 GMT
- Title: Chain of Explanation: New Prompting Method to Generate Higher Quality
Natural Language Explanation for Implicit Hate Speech
- Authors: Fan Huang, Haewoon Kwak, Jisun An
- Abstract summary: We propose the Chain of Explanation Prompting method, inspired by the chain-of-thought study of Wei et al. (2022), to generate high-quality NLE for implicit hate speech.
We build a benchmark based on selected mainstream Pre-trained Language Models (PLMs), with evaluation metrics covering lexical, semantic, and faithfulness aspects.
To further evaluate the quality of the generated NLE from a human perspective, we hire human annotators to score the informativeness and clarity of the generated NLE.
- Score: 8.761064812847078
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies have exploited advanced generative language models to generate
Natural Language Explanations (NLE) for why a certain text could be hateful. We
propose the Chain of Explanation Prompting method, inspired by the chain of
thoughts study \cite{wei2022chain}, to generate high-quality NLE for implicit
hate speech. We build a benchmark based on the selected mainstream Pre-trained
Language Models (PLMs), including GPT-2, GPT-Neo, OPT, T5, and BART, with
various evaluation metrics from lexical, semantic, and faithful aspects. To
further evaluate the quality of the generated NLE from human perceptions, we
hire human annotators to score the informativeness and clarity of the generated
NLE. Then, we inspect which automatic evaluation metric could be best
correlated with the human-annotated informativeness and clarity metric scores.
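To make the setup concrete, here is a minimal sketch of how a Chain-of-Explanation-style prompt could be assembled and passed to one of the benchmarked PLMs (GPT-2, via the HuggingFace transformers library). The prompt template, the example post, the target group, and the sampling settings are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of Chain-of-Explanation-style prompting for NLE generation.
# The prompt wording below is an assumption for illustration, not the exact
# template from the paper; GPT-2 is one of the PLMs the paper benchmarks.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # the paper also benchmarks GPT-Neo, OPT, T5, and BART

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def build_coe_prompt(post: str, target_group: str) -> str:
    """Hypothetical chain-style prompt: state the post and its target group
    first, then ask the model to explain step by step why it is hateful."""
    return (
        f"Post: {post}\n"
        f"Targeted group: {target_group}\n"
        "Explain step by step why this post is implicitly hateful.\n"
        "Explanation:"
    )

prompt = build_coe_prompt(
    post="They always bring crime wherever they settle.",  # illustrative example
    target_group="immigrants",
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token by default
)
# Decode only the newly generated tokens, i.e. the explanation itself.
explanation = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(explanation)
```

Seq2seq models from the paper's list (T5, BART) would instead use AutoModelForSeq2SeqLM and decode the full output rather than slicing off the prompt. A sketch of the metric-versus-human correlation analysis mentioned above follows after the related-papers list.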
Related papers
- Unsupervised Approach to Evaluate Sentence-Level Fluency: Do We Really Need Reference? [3.2528685897001455]
This paper adapts an existing unsupervised technique for measuring text fluency without the need for any reference.
Our approach leverages various word embeddings and trains language models using Recurrent Neural Network (RNN) architectures.
To assess the performance of the models, we conduct a comparative analysis across 10 Indic languages.
arXiv Detail & Related papers (2023-12-03T20:09:23Z)
- Automatic Evaluation of Generative Models with Instruction Tuning [14.369719297698694]
A recent paradigm fine-tunes pre-trained language models to emulate human judgements for a particular task and evaluation criterion.
Inspired by the generalization ability of instruction-tuned models, we propose a learned metric based on instruction tuning.
arXiv Detail & Related papers (2023-10-30T23:00:52Z)
- Generative Spoken Language Model based on continuous word-sized audio tokens [52.081868603603844]
We introduce a Generative Spoken Language Model based on word-size continuous-valued audio embeddings.
The resulting model is the first generative language model based on word-size continuous embeddings.
arXiv Detail & Related papers (2023-10-08T16:46:14Z)
- Reranking for Natural Language Generation from Logical Forms: A Study based on Large Language Models [47.08364281023261]
Large language models (LLMs) have demonstrated impressive capabilities in natural language generation.
However, their output quality can be inconsistent, posing challenges for generating natural language from logical forms (LFs).
arXiv Detail & Related papers (2023-09-21T17:54:58Z)
- Situated Natural Language Explanations [54.083715161895036]
Natural language explanations (NLEs) are among the most accessible tools for explaining decisions to humans.
Existing NLE research perspectives do not take the audience into account.
Situated NLE provides a perspective and facilitates further research on the generation and evaluation of explanations.
arXiv Detail & Related papers (2023-08-27T14:14:28Z)
- G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment [64.01972723692587]
We present G-Eval, a framework that uses large language models with chain-of-thought (CoT) prompting and a form-filling paradigm to assess the quality of NLG outputs.
We show that G-Eval with GPT-4 as the backbone model achieves a Spearman correlation of 0.514 with human judgements on the summarization task, outperforming all previous methods by a large margin.
arXiv Detail & Related papers (2023-03-29T12:46:54Z)
- GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator [114.8954615026781]
We propose a GAN-style model for encoder-decoder pre-training by introducing an auxiliary discriminator.
GanLM is trained with two pre-training objectives: replaced token detection and replaced token denoising.
Experiments on language generation benchmarks show that GanLM, with its strong language understanding capability, outperforms various strong pre-trained language models.
arXiv Detail & Related papers (2022-12-20T12:51:11Z)
- Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study [17.338923885534193]
We present an extensive study on the use of pre-trained language models for the task of automatic Counter Narrative (CN) generation.
We first present a comparative study to determine whether there is a particular Language Model (or class of LMs) and a particular decoding mechanism that are the most appropriate to generate CNs.
Findings show that autoregressive models combined with stochastic decoding strategies are the most promising.
arXiv Detail & Related papers (2022-04-04T12:44:47Z)
- Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models [86.02610674750345]
Adversarial GLUE (AdvGLUE) is a new multi-task benchmark to explore and evaluate the vulnerabilities of modern large-scale language models under various types of adversarial attacks.
We apply 14 adversarial attack methods to GLUE tasks to construct AdvGLUE, which is further validated by humans for reliable annotations.
All the language models and robust training methods we tested perform poorly on AdvGLUE, with scores lagging far behind the benign accuracy.
arXiv Detail & Related papers (2021-11-04T12:59:55Z)
- Towards Language Modelling in the Speech Domain Using Sub-word Linguistic Units [56.52704348773307]
We propose a novel LSTM-based generative speech LM based on linguistic units including syllables and phonemes.
With a limited dataset, orders of magnitude smaller than that required by contemporary generative models, our model closely approximates babbling speech.
We show the effect of training with auxiliary text LMs, multitask learning objectives, and auxiliary articulatory features.
arXiv Detail & Related papers (2021-10-31T22:48:30Z)
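Both the paper above and G-Eval judge automatic metrics by how well they correlate with human ratings. Below is a minimal sketch of that kind of analysis, using entirely hypothetical scores rather than values from any paper.

```python
# Correlating automatic metric scores with human ratings of generated NLE.
# All numbers below are hypothetical placeholders for illustration only.
from scipy.stats import pearsonr, spearmanr

# One entry per generated explanation: an automatic metric score (e.g. BLEU,
# ROUGE, or BERTScore) and the corresponding human informativeness rating.
metric_scores = [0.42, 0.35, 0.61, 0.28, 0.55, 0.47]
human_informativeness = [3, 2, 5, 2, 4, 4]  # e.g. ratings on a 1-5 scale

rho, rho_p = spearmanr(metric_scores, human_informativeness)
r, r_p = pearsonr(metric_scores, human_informativeness)

print(f"Spearman rho = {rho:.3f} (p = {rho_p:.3f})")
print(f"Pearson  r   = {r:.3f} (p = {r_p:.3f})")
```

The metric whose scores track the human informativeness and clarity ratings most closely (highest correlation) is the best automatic proxy for human evaluation.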