Multitask Instruction-based Prompting for Fallacy Recognition
- URL: http://arxiv.org/abs/2301.09992v1
- Date: Tue, 24 Jan 2023 13:39:23 GMT
- Title: Multitask Instruction-based Prompting for Fallacy Recognition
- Authors: Tariq Alhindi, Tuhin Chakrabarty, Elena Musi and Smaranda Muresan
- Abstract summary: We show the ability of a multitask prompting approach to recognize 28 unique fallacies across domains and genres.
We also study the effect of model size and prompt choice by analyzing the per-class (i.e., fallacy type) results.
- Score: 35.10919984256853
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fallacies are used as seemingly valid arguments to support a position and
persuade the audience about its validity. Recognizing fallacies is an
intrinsically difficult task both for humans and machines. Moreover, a big
challenge for computational models lies in the fact that fallacies are
formulated differently across the datasets with differences in the input format
(e.g., question-answer pair, sentence with fallacy fragment), genre (e.g.,
social media, dialogue, news), as well as types and number of fallacies (from 5
to 18 types per dataset). To move towards solving the fallacy recognition task,
we approach these differences across datasets as multiple tasks and show how
instruction-based prompting in a multitask setup based on the T5 model improves
the results against approaches built for a specific dataset such as T5, BERT or
GPT-3. We show the ability of this multitask prompting approach to recognize 28
unique fallacies across domains and genres and study the effect of model size
and prompt choice by analyzing the per-class (i.e., fallacy type) results.
Finally, we analyze the effect of annotation quality on model performance, and
the feasibility of complementing this approach with external knowledge.
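The idea of treating each dataset as a separate task under one instruction format can be sketched as below. This is only an illustration under stated assumptions: the dataset keys, label lists, input templates, and instruction wording are hypothetical and are not taken from the paper. The sketch shows how differently formatted inputs (a question-answer pair, a sentence with a fallacy fragment) could be cast into the same text-to-text shape that a model such as T5 consumes.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Example:
    dataset: str            # hypothetical dataset key
    fields: Dict[str, str]  # raw input fields, which differ per dataset
    label: str              # gold fallacy type


# Hypothetical label inventories; real datasets have 5 to 18 types each.
LABELS: Dict[str, List[str]] = {
    "dialogue_qa": ["ad hominem", "red herring", "appeal to emotion"],
    "news_fragment": ["loaded language", "false cause", "red herring"],
}

# Hypothetical per-dataset input templates (one per input format).
TEMPLATES: Dict[str, str] = {
    "dialogue_qa": "Question: {question} Answer: {answer}",
    "news_fragment": "Sentence: {sentence} Fragment: {fragment}",
}


def to_text_pair(ex: Example) -> Tuple[str, str]:
    """Cast one example into an (instruction prompt, target text) pair."""
    options = ", ".join(LABELS[ex.dataset])
    body = TEMPLATES[ex.dataset].format(**ex.fields)
    source = (
        "Given the text below, which of the following fallacies "
        f"does it contain: {options}? {body}"
    )
    return source, ex.label


# A mixed batch: examples from both "tasks" share one output format.
batch = [
    Example(
        "dialogue_qa",
        {"question": "Why should we trust him?", "answer": "He is a fool."},
        "ad hominem",
    ),
    Example(
        "news_fragment",
        {"sentence": "The reckless plan will ruin us.", "fragment": "reckless"},
        "loaded language",
    ),
]
pairs = [to_text_pair(ex) for ex in batch]

# Union of fallacy types across the hypothetical datasets.
unique_labels = sorted({lab for labs in LABELS.values() for lab in labs})
```

Because every example becomes a plain (source, target) string pair, the pairs from all datasets can be shuffled together and fed to a single sequence-to-sequence model, with the list of candidate labels in the prompt telling the model which dataset's fallacy scheme applies.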
Related papers
- PUB: A Pragmatics Understanding Benchmark for Assessing LLMs' Pragmatics Capabilities [40.55743949223173]
Pragmatics Understanding Benchmark (PUB) is a dataset consisting of fourteen tasks in four pragmatics phenomena.
PUB includes a total of 28k data points, 6.1k of which have been created by us, and the rest are adapted from existing datasets.
Our study indicates that fine-tuning for instruction-following and chat significantly enhances the pragmatics capabilities of smaller language models.
arXiv Detail & Related papers (2024-01-13T13:46:14Z)
- Large Language Models are Few-Shot Training Example Generators: A Case Study in Fallacy Recognition [49.38757847011105]
Computational fallacy recognition faces challenges due to diverse genres, domains, and types of fallacies found in datasets.
We aim to enhance existing models for fallacy recognition by incorporating additional context and by leveraging large language models to generate synthetic data.
Our evaluation results demonstrate consistent improvements across fallacy types, datasets, and generators.
arXiv Detail & Related papers (2023-11-16T04:17:47Z)
- Robust and Explainable Identification of Logical Fallacies in Natural Language Arguments [5.850977561881791]
We formalize prior theoretical work on logical fallacies into a comprehensive three-stage evaluation framework.
We employ three families of robust and explainable methods based on prototype reasoning, instance-based reasoning, and knowledge injection.
We extensively evaluate these methods on our datasets, focusing on their robustness and explainability.
arXiv Detail & Related papers (2022-12-12T20:27:17Z)
- Exploring the Trade-off between Plausibility, Change Intensity and Adversarial Power in Counterfactual Explanations using Multi-objective Optimization [73.89239820192894]
We argue that automated counterfactual generation should regard several aspects of the produced adversarial instances.
We present a novel framework for the generation of counterfactual examples.
arXiv Detail & Related papers (2022-05-20T15:02:53Z)
- On Modality Bias Recognition and Reduction [70.69194431713825]
We study the modality bias problem in the context of multi-modal classification.
We propose a plug-and-play loss function method, whereby the feature space for each label is adaptively learned.
Our method yields remarkable performance improvements compared with the baselines.
arXiv Detail & Related papers (2022-02-25T13:47:09Z)
- Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious.
We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z)
- Improving Commonsense Causal Reasoning by Adversarial Training and Data Augmentation [14.92157586545743]
This paper presents a number of techniques for making models more robust in the domain of causal reasoning.
We show a statistically significant improvement in performance on both datasets, even with only a small number of additionally generated data points.
arXiv Detail & Related papers (2021-01-13T09:55:29Z)
- Differentiable Language Model Adversarial Attacks on Categorical Sequence Classifiers [0.0]
An adversarial attack paradigm explores various scenarios for the vulnerability of deep learning models.
We fine-tune a language model to serve as a generator of adversarial examples.
Our model works for diverse datasets on bank transactions, electronic health records, and NLP datasets.
arXiv Detail & Related papers (2020-06-19T11:25:36Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.