Learning From Revisions: Quality Assessment of Claims in Argumentation
at Scale
- URL: http://arxiv.org/abs/2101.10250v1
- Date: Mon, 25 Jan 2021 17:32:04 GMT
- Title: Learning From Revisions: Quality Assessment of Claims in Argumentation
at Scale
- Authors: Gabriella Skitalinskaya, Jonas Klaff and Henning Wachsmuth
- Abstract summary: We study claim quality assessment irrespective of discussed aspects by comparing different revisions of the same claim.
We propose two tasks: assessing which claim of a revision pair is better, and ranking all versions of a claim by quality.
- Score: 12.883536911500062
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Assessing the quality of arguments and of the claims the arguments are
composed of has become a key task in computational argumentation. However, even
if different claims share the same stance on the same topic, their assessment
depends on the prior perception and weighting of the different aspects of the
topic being discussed. This renders it difficult to learn topic-independent
quality indicators. In this paper, we study claim quality assessment
irrespective of discussed aspects by comparing different revisions of the same
claim. We compile a large-scale corpus with over 377k claim revision pairs of
various types from kialo.com, covering diverse topics from politics, ethics,
entertainment, and others. We then propose two tasks: (a) assessing which claim
of a revision pair is better, and (b) ranking all versions of a claim by
quality. Our first experiments with embedding-based logistic regression and
transformer-based neural networks show promising results, suggesting that
learned indicators generalize well across topics. In a detailed error analysis,
we give insights into what quality dimensions of claims can be assessed
reliably. We provide the data and scripts needed to reproduce all results.
Related papers
- Beyond Relevance: Evaluate and Improve Retrievers on Perspective Awareness [56.42192735214931]
Retrievers are expected not only to rely on the semantic relevance between documents and queries but also to recognize the nuanced intents or perspectives behind a user query.
In this work, we study whether retrievers can recognize and respond to different perspectives of the queries.
We show that current retrievers have limited awareness of subtly different perspectives in queries and can also be biased toward certain perspectives.
arXiv Detail & Related papers (2024-05-04T17:10:00Z)
- Argument Quality Assessment in the Age of Instruction-Following Large Language Models [45.832808321166844]
A critical task in any such application is the assessment of an argument's quality.
We identify the diversity of quality notions and the subjectiveness of their perception as the main hurdles towards substantial progress on argument quality assessment.
We argue that the capabilities of instruction-following large language models (LLMs) to leverage knowledge across contexts enable a much more reliable assessment.
arXiv Detail & Related papers (2024-03-24T10:43:21Z)
- To Revise or Not to Revise: Learning to Detect Improvable Claims for Argumentative Writing Support [20.905660642919052]
We explore the main challenges to identifying argumentative claims in need of specific revisions.
We propose a new sampling strategy based on revision distance.
We provide evidence that using contextual information and domain knowledge can further improve prediction results.
arXiv Detail & Related papers (2023-05-26T10:19:54Z)
- Contextualizing Argument Quality Assessment with Relevant Knowledge [11.367297319588411]
SPARK is a novel method for scoring argument quality based on contextualization via relevant knowledge.
We devise four augmentations that leverage large language models to provide feedback, infer hidden assumptions, supply a similar-quality argument, or give a counter-argument.
arXiv Detail & Related papers (2023-05-20T21:04:58Z)
- Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study on fairness disparities in peer review with the help of large language models (LMs).
We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) from 2017 to date.
We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, and author and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z)
- Towards a Unified Multi-Dimensional Evaluator for Text Generation [101.47008809623202]
We propose UniEval, a unified multi-dimensional evaluator for Natural Language Generation (NLG).
We re-frame NLG evaluation as a Boolean Question Answering (QA) task, and by guiding the model with different questions, we can use one evaluator to evaluate from multiple dimensions.
Experiments on three typical NLG tasks show that UniEval correlates substantially better with human judgments than existing metrics.
arXiv Detail & Related papers (2022-10-13T17:17:03Z)
- Towards a Holistic View on Argument Quality Prediction [3.182597245365433]
A decisive property of arguments is their strength or quality.
While there are works on the automated estimation of argument strength, their scope is narrow.
We assess the generalization capabilities of argument quality estimation across diverse domains, the interplay with related argument mining tasks, and the impact of emotions on perceived argument strength.
arXiv Detail & Related papers (2022-05-19T18:44:23Z)
- Creating a Domain-diverse Corpus for Theory-based Argument Quality Assessment [6.654552816487819]
We describe GAQCorpus, the first large, domain-diverse annotated corpus of theory-based argument quality (AQ).
We discuss how we designed the annotation task to reliably collect a large number of judgments with crowdsourcing.
Our work will inform research on theory-based argumentation annotation and enable the creation of more diverse corpora to support computational AQ assessment.
arXiv Detail & Related papers (2020-11-03T09:40:25Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn <sentiment, aspect> joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
- SubjQA: A Dataset for Subjectivity and Review Comprehension [52.13338191442912]
We investigate the relationship between subjectivity and question answering (QA).
We find that subjectivity is also an important feature in the case of QA, albeit with more intricate interactions between subjectivity and QA performance.
We release an English QA dataset (SubjQA) based on customer reviews, containing subjectivity annotations for questions and answer spans across 6 distinct domains.
arXiv Detail & Related papers (2020-04-29T15:59:30Z)