Intrinsic Quality Assessment of Arguments
- URL: http://arxiv.org/abs/2010.12473v1
- Date: Fri, 23 Oct 2020 15:16:10 GMT
- Title: Intrinsic Quality Assessment of Arguments
- Authors: Henning Wachsmuth and Till Werner
- Abstract summary: We study the intrinsic computational assessment of 15 dimensions, i.e., only learning from an argument's text.
We observe moderate but significant learning success for most dimensions.
- Score: 21.261009977405898
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Several quality dimensions of natural language arguments have been
investigated. Some are likely to be reflected in linguistic features (e.g., an
argument's arrangement), whereas others depend on context (e.g., relevance) or
topic knowledge (e.g., acceptability). In this paper, we study the intrinsic
computational assessment of 15 dimensions, i.e., only learning from an
argument's text. In systematic experiments with eight feature types on an
existing corpus, we observe moderate but significant learning success for most
dimensions. Rhetorical quality seems hardest to assess, and subjectivity
features turn out strong, although length bias in the corpus impedes full
validity. We also find that human assessors differ more clearly from each other
than from our approach.
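As a rough illustration of what intrinsic, text-only assessment of a single quality dimension could look like, here is a minimal sketch using a bag-of-words regressor. The toy arguments, scores, and feature choices are assumptions for illustration only, not the paper's corpus or its eight feature types.

```python
# Hypothetical sketch of intrinsic (text-only) quality assessment: predict one
# quality dimension score from surface features of the argument text alone.
# The toy data and feature choices below are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Toy (argument text, quality score) pairs, e.g. for a "cogency"-like dimension.
arguments = [
    "School uniforms reduce peer pressure, so they should be mandatory.",
    "Uniforms are bad because I said so.",
    "Standardized dress codes lower clothing costs, which eases inequality.",
    "No uniforms, period.",
]
scores = [2.5, 1.0, 3.0, 1.0]  # hypothetical 1-3 quality ratings

# Word n-grams as a stand-in for lexical and stylistic feature types.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    Ridge(alpha=1.0),
)
model.fit(arguments, scores)

print(model.predict(["Uniforms save time in the morning, so schools should adopt them."]))
```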
Related papers
- A scale of conceptual orality and literacy: Automatic text categorization in the tradition of "Nähe und Distanz" [0.0]
It is stipulated that written texts can be rated on a scale of conceptual orality and literacy by linguistic features.
This article establishes such a scale based on PCA and combines it with automatic analysis.
The scale is also discussed with a view to its use in corpus compilation and as a guide for analyses in larger corpora.
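A minimal sketch of the underlying idea, assuming a handful of hand-picked linguistic features (pronoun rate, word length, text length) rather than the article's actual feature set: standardize the feature vectors and read the first principal component as a scale position.

```python
# Hypothetical sketch of a one-dimensional orality/literacy scale via PCA.
# The texts and the three features are illustrative assumptions.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

texts = [
    "yeah i mean we just talked and it was fine you know",
    "The committee concluded that the proposed regulation requires further review.",
    "so um he said he'd come but then he didn't",
    "This study examines the correlation between policy adoption and fiscal outcomes.",
]

def features(text: str) -> list[float]:
    tokens = text.lower().split()
    pronouns = {"i", "you", "we", "he", "she"}
    return [
        sum(t in pronouns for t in tokens) / len(tokens),  # personal pronoun rate
        float(np.mean([len(t) for t in tokens])),          # mean word length
        float(len(tokens)),                                 # text length in tokens
    ]

X = StandardScaler().fit_transform([features(t) for t in texts])
scale = PCA(n_components=1).fit_transform(X).ravel()
for text, position in zip(texts, scale):
    print(f"{position:+.2f}  {text[:50]}")
```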
arXiv Detail & Related papers (2025-02-05T15:08:37Z) - ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models [53.00812898384698]
We argue that human evaluation of generative large language models (LLMs) should be a multidisciplinary undertaking.
We highlight how cognitive biases can conflate fluent information and truthfulness, and how cognitive uncertainty affects the reliability of rating scores such as Likert.
We propose the ConSiDERS-The-Human evaluation framework consisting of 6 pillars -- Consistency, Scoring Criteria, Differentiating, User Experience, Responsible, and Scalability.
arXiv Detail & Related papers (2024-05-28T22:45:28Z) - Which Argumentative Aspects of Hate Speech in Social Media can be reliably identified? [2.7647400328727256]
It is unclear which aspects of argumentation can be reliably identified and integrated in language models.
We show that some components can be identified with reasonable reliability.
We propose adaptations of those categories that can be more reliably reproduced.
arXiv Detail & Related papers (2023-06-05T15:50:57Z) - Modeling Appropriate Language in Argumentation [34.90028129715041]
We operationalize appropriate language in argumentation for the first time.
We derive a new taxonomy of 14 dimensions that determine inappropriate language in online discussions.
arXiv Detail & Related papers (2023-05-24T09:17:05Z) - Natural Language Decompositions of Implicit Content Enable Better Text Representations [56.85319224208865]
We introduce a method for the analysis of text that takes implicitly communicated content explicitly into account.
We use a large language model to produce sets of propositions that are inferentially related to the text that has been observed.
Our results suggest that modeling the meanings behind observed language, rather than the literal text alone, is a valuable direction for NLP.
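A minimal sketch of the decomposition idea, assuming a chat-completion-style LLM API (the openai client and prompt wording here are assumptions, not the paper's setup): prompt the model to list propositions the text implies but does not state.

```python
# Hypothetical sketch: ask an LLM for propositions implicitly communicated by a text.
# The client, model name, and prompt are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes an API key is configured in the environment

text = "We can't keep pretending the bus system works for people outside downtown."

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{
        "role": "user",
        "content": (
            "List, one per line, propositions that this statement implicitly "
            f"communicates but does not state outright:\n\n{text}"
        ),
    }],
)
print(response.choices[0].message.content)
```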
arXiv Detail & Related papers (2023-05-23T23:45:20Z) - How Do In-Context Examples Affect Compositional Generalization? [86.57079616209474]
In this paper, we present CoFe, a test suite to investigate in-context compositional generalization.
We find that the compositional generalization performance can be easily affected by the selection of in-context examples.
Our systematic experiments indicate that in-context examples should be structurally similar to the test case, diverse from each other, and individually simple.
arXiv Detail & Related papers (2023-05-08T16:32:18Z) - Towards a Holistic View on Argument Quality Prediction [3.182597245365433]
A decisive property of arguments is their strength or quality.
While there are works on the automated estimation of argument strength, their scope is narrow.
We assess the generalization capabilities of argument quality estimation across diverse domains, the interplay with related argument mining tasks, and the impact of emotions on perceived argument strength.
arXiv Detail & Related papers (2022-05-19T18:44:23Z) - Learning From Revisions: Quality Assessment of Claims in Argumentation at Scale [12.883536911500062]
We study claim quality assessment irrespective of discussed aspects by comparing different revisions of the same claim.
We propose two tasks: assessing which claim of a revision pair is better, and ranking all versions of a claim by quality.
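A minimal sketch of the pairwise task, under two assumptions that are made purely for illustration and do not reflect the paper's models: the later revision of a claim is treated as the better one, and a revision pair is represented by a difference of TF-IDF vectors.

```python
# Hypothetical sketch of pairwise claim quality: given two revisions of a claim,
# predict which one is better. Data and representation are illustrative assumptions.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

pairs = [
    ("Guns bad.", "Easy access to firearms correlates with higher homicide rates."),
    ("Vaccines are good because everyone says so.",
     "Large trials show vaccines sharply reduce severe illness."),
]
texts = [t for pair in pairs for t in pair]
vec = TfidfVectorizer().fit(texts)

def pair_features(a: str, b: str) -> np.ndarray:
    # Difference of TF-IDF vectors as a crude representation of the revision pair.
    return (vec.transform([b]) - vec.transform([a])).toarray().ravel()

# Label 1: second claim is better; label 0: first claim is better (swapped pair).
X = [pair_features(a, b) for a, b in pairs] + [pair_features(b, a) for a, b in pairs]
y = [1] * len(pairs) + [0] * len(pairs)

clf = LogisticRegression().fit(X, y)
print(clf.predict([pair_features(
    "Taxes are theft.",
    "Progressive taxation funds public goods that markets undersupply.",
)]))
```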
arXiv Detail & Related papers (2021-01-25T17:32:04Z) - A computational model implementing subjectivity with the 'Room Theory'. The case of detecting Emotion from Text [68.8204255655161]
This work introduces a new method to consider subjectivity and general context dependency in text analysis.
By using a similarity measure between words, we are able to extract the relative relevance of the elements in the benchmark.
This method could be applied to all the cases where evaluating subjectivity is relevant to understand the relative value or meaning of a text.
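A minimal sketch of the similarity-based relevance idea, with tiny hand-made word vectors standing in for real embeddings and a two-word benchmark standing in for the observer's frame of reference; all values are illustrative assumptions.

```python
# Hypothetical sketch: score how relevant each word of a text is to a subjective
# "benchmark" via cosine similarity between word vectors. The vectors below are
# toy stand-ins for pretrained embeddings.
import numpy as np

vectors = {
    "happy":    np.array([0.9, 0.1, 0.0]),
    "joyful":   np.array([0.8, 0.2, 0.1]),
    "deadline": np.array([0.1, 0.9, 0.2]),
    "party":    np.array([0.7, 0.3, 0.2]),
    "report":   np.array([0.0, 0.8, 0.4]),
}

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The benchmark word set representing, e.g., the emotion "joy".
benchmark = ["happy", "joyful"]

for word in ["party", "report", "deadline"]:
    relevance = max(cosine(vectors[word], vectors[b]) for b in benchmark)
    print(f"{word:10s} relevance to benchmark: {relevance:.2f}")
```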
arXiv Detail & Related papers (2020-05-12T21:26:04Z) - SubjQA: A Dataset for Subjectivity and Review Comprehension [52.13338191442912]
We investigate the relationship between subjectivity and question answering (QA).
We find that subjectivity is also an important feature in the case of QA, albeit with more intricate interactions between subjectivity and QA performance.
We release an English QA dataset (SubjQA) based on customer reviews, containing subjectivity annotations for questions and answer spans across 6 distinct domains.
arXiv Detail & Related papers (2020-04-29T15:59:30Z) - A Deep Neural Framework for Contextual Affect Detection [51.378225388679425]
A short and simple text carrying no emotion can convey strong emotions when read along with its context.
We propose a Contextual Affect Detection framework which learns the inter-dependence of words in a sentence.
arXiv Detail & Related papers (2020-01-28T05:03:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.