Creating a Domain-diverse Corpus for Theory-based Argument Quality
  Assessment
        - URL: http://arxiv.org/abs/2011.01589v1
 - Date: Tue, 3 Nov 2020 09:40:25 GMT
 - Title: Creating a Domain-diverse Corpus for Theory-based Argument Quality
  Assessment
 - Authors: Lily Ng, Anne Lauscher, Joel Tetreault, Courtney Napoles
 - Abstract summary: We describe GAQCorpus, the first large, domain-diverse annotated corpus of theory-based AQ.
We discuss how we designed the annotation task to reliably collect a large number of judgments with crowdsourcing.
Our work will inform research on theory-based argumentation annotation and enable the creation of more diverse corpora to support computational AQ assessment.
 - Score: 6.654552816487819
 - License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
 - Abstract:   Computational models of argument quality (AQ) have focused primarily on
assessing the overall quality or just one specific characteristic of an
argument, such as its convincingness or its clarity. However, previous work has
claimed that assessment based on theoretical dimensions of argumentation could
benefit writers, but developing such models has been limited by the lack of
annotated data. In this work, we describe GAQCorpus, the first large,
domain-diverse annotated corpus of theory-based AQ. We discuss how we designed
the annotation task to reliably collect a large number of judgments with
crowdsourcing, formulating theory-based guidelines that helped make subjective
judgments of AQ more objective. We demonstrate how to identify arguments and
adapt the annotation task for three diverse domains. Our work will inform
research on theory-based argumentation annotation and enable the creation of
more diverse corpora to support computational AQ assessment.
 
       
      
        Related papers
        - SpeechR: A Benchmark for Speech Reasoning in Large Audio-Language Models [60.72029578488467]
SpeechR is a unified benchmark for evaluating reasoning over speech in large audio-language models. It evaluates models along three key dimensions: factual retrieval, procedural inference, and normative judgment. Evaluations on eleven state-of-the-art LALMs reveal that high transcription accuracy does not translate into strong reasoning capabilities.
arXiv Detail & Related papers (2025-08-04T03:28:04Z)
- From Thinking to Output: Chain-of-Thought and Text Generation Characteristics in Reasoning Language Models [10.38327947136263]
This paper proposes a novel framework for analyzing the reasoning characteristics of four cutting-edge large reasoning models. A diverse dataset consists of real-world scenario-based questions covering logical deduction, causal inference, and multi-step problem-solving. The research results uncover various patterns of how these models balance exploration and exploitation, deal with problems, and reach conclusions.
arXiv Detail & Related papers (2025-06-20T14:02:16Z)
- PixelThink: Towards Efficient Chain-of-Pixel Reasoning [70.32510083790069]
PixelThink is a simple yet effective scheme that integrates externally estimated task difficulty and internally measured model uncertainty. It learns to compress reasoning length in accordance with scene complexity and predictive confidence. Experimental results demonstrate that the proposed approach improves both reasoning efficiency and overall segmentation performance.
arXiv Detail & Related papers (2025-05-29T17:55:49Z)
- Identifying Aspects in Peer Reviews [61.374437855024844]
We develop a data-driven schema for deriving fine-grained aspects from a corpus of peer reviews.
We introduce a dataset of peer reviews augmented with aspects and show how it can be used for community-level review analysis.
arXiv Detail & Related papers (2025-04-09T14:14:42Z)
- Trade-offs in Large Reasoning Models: An Empirical Analysis of Deliberative and Adaptive Reasoning over Foundational Capabilities [101.77467538102924]
Recent advancements in Large Reasoning Models (LRMs) have demonstrated remarkable performance in specialized reasoning tasks.
We show that acquiring deliberative reasoning capabilities significantly reduces the foundational capabilities of LRMs.
We demonstrate that adaptive reasoning -- employing modes like Zero-Thinking, Less-Thinking, and Summary-Thinking -- can effectively alleviate these drawbacks.
arXiv Detail & Related papers (2025-03-23T08:18:51Z)
- The Foundations of Tokenization: Statistical and Computational Concerns [51.370165245628975]
Tokenization is a critical step in the NLP pipeline.
Despite its recognized importance as a standard representation method in NLP, the theoretical underpinnings of tokenization are not yet fully understood.
The present paper contributes to addressing this theoretical gap by proposing a unified formal framework for representing and analyzing tokenizer models.
arXiv Detail & Related papers (2024-07-16T11:12:28Z)
- Argument Quality Assessment in the Age of Instruction-Following Large Language Models [45.832808321166844]
A critical task in any such application is the assessment of an argument's quality.
We identify the diversity of quality notions and the subjectiveness of their perception as the main hurdles towards substantial progress on argument quality assessment.
We argue that the capabilities of instruction-following large language models (LLMs) to leverage knowledge across contexts enable a much more reliable assessment.
arXiv Detail & Related papers (2024-03-24T10:43:21Z)
- Generation of Explanations for Logic Reasoning [0.0]
The research is centred on employing GPT-3.5-turbo to automate the analysis of a fortiori arguments.
This thesis makes significant contributions to the fields of artificial intelligence and logical reasoning.
arXiv Detail & Related papers (2023-11-22T15:22:04Z)
- Coherent Entity Disambiguation via Modeling Topic and Categorical Dependency [87.16283281290053]
Previous entity disambiguation (ED) methods adopt a discriminative paradigm, where prediction is made based on matching scores between mention context and candidate entities.
We propose CoherentED, an ED system equipped with novel designs aimed at enhancing the coherence of entity predictions.
We achieve new state-of-the-art results on popular ED benchmarks, with an average improvement of 1.3 F1 points.
arXiv Detail & Related papers (2023-11-06T16:40:13Z)
- Generative Judge for Evaluating Alignment [84.09815387884753]
We propose a generative judge with 13B parameters, Auto-J, designed to address these challenges.
Our model is trained on user queries and LLM-generated responses under massive real-world scenarios.
Experimentally, Auto-J outperforms a series of strong competitors, including both open-source and closed-source models.
arXiv Detail & Related papers (2023-10-09T07:27:15Z)
- Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study on fairness disparities in peer review with the help of large language models (LMs).
We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) conference from 2017 to date.
We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, and author and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z)
- Learning From Revisions: Quality Assessment of Claims in Argumentation at Scale [12.883536911500062]
We study claim quality assessment irrespective of discussed aspects by comparing different revisions of the same claim.
We propose two tasks: assessing which claim of a revision pair is better, and ranking all versions of a claim by quality.
arXiv Detail & Related papers (2021-01-25T17:32:04Z)
- Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding [71.2260967797055]
We propose a weakly-supervised approach for aspect-based sentiment analysis.
We learn ⟨sentiment, aspect⟩ joint topic embeddings in the word embedding space.
We then use neural models to generalize the word-level discriminative information.
arXiv Detail & Related papers (2020-10-13T21:33:24Z)
- A Generalised Approach for Encoding and Reasoning with Qualitative Theories in Answer Set Programming [3.963609604649393]
A family of ASP encodings is proposed which can handle any qualitative calculus with binary relations.
This paper is under consideration for acceptance in TPLP.
arXiv Detail & Related papers (2020-08-04T13:31:25Z)
- Rhetoric, Logic, and Dialectic: Advancing Theory-based Argument Quality Assessment in Natural Language Processing [6.654552816487819]
We present GAQCorpus: the first large-scale English multi-domain (community Q&A forums, debate forums, review forums) corpus annotated with theory-based AQ scores.
We demonstrate the feasibility of large-scale AQ annotation, show that exploiting relations between dimensions yields performance improvements, and explore the synergies between theory-based prediction and practical AQ assessment.
arXiv Detail & Related papers (2020-06-01T10:39:50Z)
- Evaluations and Methods for Explanation through Robustness Analysis [117.7235152610957]
We establish a novel set of evaluation criteria for such feature based explanations by analysis.
We obtain new explanations that are loosely necessary and sufficient for a prediction.
We extend the explanation to extract the set of features that would move the current prediction to a target class.
arXiv Detail & Related papers (2020-05-31T05:52:05Z)
        This list is automatically generated from the titles and abstracts of the papers in this site.
       
     