Rhetoric, Logic, and Dialectic: Advancing Theory-based Argument Quality
Assessment in Natural Language Processing
- URL: http://arxiv.org/abs/2006.00843v2
- Date: Tue, 3 Nov 2020 09:26:07 GMT
- Title: Rhetoric, Logic, and Dialectic: Advancing Theory-based Argument Quality
Assessment in Natural Language Processing
- Authors: Anne Lauscher, Lily Ng, Courtney Napoles, Joel Tetreault
- Abstract summary: We present GAQCorpus: the first large-scale English multi-domain (community Q&A forums, debate forums, review forums) corpus annotated with theory-based AQ scores.
We demonstrate the feasibility of large-scale AQ annotation, show that exploiting relations between dimensions yields performance improvements, and explore the synergies between theory-based prediction and practical AQ assessment.
- Score: 6.654552816487819
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Though preceding work in computational argument quality (AQ) mostly focuses
on assessing overall AQ, researchers agree that writers would benefit from
feedback targeting individual dimensions of argumentation theory. However, a
large-scale theory-based corpus and corresponding computational models are
missing. We fill this gap by conducting an extensive analysis covering three
diverse domains of online argumentative writing and presenting GAQCorpus: the
first large-scale English multi-domain (community Q&A forums, debate forums,
review forums) corpus annotated with theory-based AQ scores. We then propose
the first computational approaches to theory-based assessment, which can serve
as strong baselines for future work. We demonstrate the feasibility of
large-scale AQ annotation, show that exploiting relations between dimensions
yields performance improvements, and explore the synergies between theory-based
prediction and practical AQ assessment.
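Reading "exploiting relations between dimensions" as joint modeling, one minimal way to realize it is a shared encoder with one regression head per quality dimension. The PyTorch sketch below is an illustrative assumption only: the dimension names follow the common cogency/effectiveness/reasonableness taxonomy, and the feature extractor, layer sizes, and 1-5 score range are placeholders rather than the authors' actual models.
```python
import torch
import torch.nn as nn

DIMENSIONS = ["cogency", "effectiveness", "reasonableness", "overall"]

class MultiDimAQRegressor(nn.Module):
    def __init__(self, input_dim: int = 768, hidden_dim: int = 256):
        super().__init__()
        # A shared encoder lets the per-dimension heads exploit correlations
        # between the quality dimensions.
        self.shared = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.1),
        )
        self.heads = nn.ModuleDict({d: nn.Linear(hidden_dim, 1) for d in DIMENSIONS})

    def forward(self, text_features):
        h = self.shared(text_features)
        return {d: head(h).squeeze(-1) for d, head in self.heads.items()}

# Toy usage: random vectors stand in for real sentence-encoder features.
model = MultiDimAQRegressor()
features = torch.randn(8, 768)                            # batch of 8 arguments
targets = {d: torch.rand(8) * 4 + 1 for d in DIMENSIONS}  # illustrative 1-5 scores
preds = model(features)
loss = sum(nn.functional.mse_loss(preds[d], targets[d]) for d in DIMENSIONS)
loss.backward()
```
Because all heads share the encoder, gradients from one dimension shape the representation used by the others, which is one simple way correlations between dimensions can help.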
Related papers
- The Foundations of Tokenization: Statistical and Computational Concerns [51.370165245628975]
Tokenization is a critical step in the NLP pipeline.
Despite its recognized importance as a standard representation method in NLP, the theoretical underpinnings of tokenization are not yet fully understood.
The present paper contributes to addressing this theoretical gap by proposing a unified formal framework for representing and analyzing tokenizer models.
arXiv Detail & Related papers (2024-07-16T11:12:28Z)
- eRST: A Signaled Graph Theory of Discourse Relations and Organization [14.074017875514787]
We present a new theoretical framework for computational discourse analysis, based on an expansion of Rhetorical Structure Theory (RST).
The framework encompasses discourse relation graphs with tree-breaking, non-projective and concurrent relations, as well as implicit and explicit signals which give explainable rationales to our analyses.
We present and evaluate a freely available corpus of English annotated according to our framework, encompassing 12 spoken and written genres with over 200K tokens.
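To make the kind of object described here concrete, the following sketch shows one possible in-memory shape for a signaled discourse relation graph. All class and field names are invented for illustration and do not follow the actual eRST schema or corpus format.
```python
from dataclasses import dataclass, field

@dataclass
class Signal:
    # Explicit or implicit evidence for a relation, e.g. a discourse marker.
    kind: str                  # e.g. "discourse_marker", "syntactic", "implicit"
    token_ids: list[int]       # document-level token positions anchoring the signal

@dataclass
class Relation:
    source: int                # id of the unit the relation points from
    target: int                # id of the unit the relation points to
    label: str                 # e.g. "cause", "elaboration"
    primary: bool              # False for tree-breaking or concurrent edges
    signals: list[Signal] = field(default_factory=list)

@dataclass
class DiscourseGraph:
    units: dict[int, str]      # elementary discourse units by id
    relations: list[Relation]  # may hold several relations over the same units

doc = DiscourseGraph(
    units={0: "The trail was closed", 1: "because of the storm."},
    relations=[Relation(source=1, target=0, label="cause", primary=True,
                        signals=[Signal("discourse_marker", [4])])],
)
```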
arXiv Detail & Related papers (2024-03-20T12:52:38Z)
- Reinforcement learning for question answering in programming domain using public community scoring as a human feedback [0.0]
We investigate enhancing the performance of GPT Neo 125M in Community Question Answering (CQA), with a focus on programming.
Two distinct reward model training strategies are employed for fine-tuning with Proximal Policy Optimization (PPO).
An auxiliary scoring mechanism is introduced, which demonstrates the limitations of conventional linguistic metrics in evaluating responses in the programming domain.
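As a rough illustration of how public community scoring can serve as a feedback signal, the toy sketch below turns vote counts into (chosen, rejected) preference pairs of the kind a pairwise reward model is typically trained on. The data, field names, and margin are invented; the paper's two reward-model strategies and its PPO setup are not reproduced here.
```python
from itertools import combinations

# Toy records: community vote score per candidate answer to one programming question.
answers = [
    {"id": "a1", "text": "Use functools.lru_cache to memoize the function.", "votes": 42},
    {"id": "a2", "text": "Just recompute the value every time.",             "votes": -3},
    {"id": "a3", "text": "Cache results in a dict keyed by the arguments.",  "votes": 17},
]

def preference_pairs(candidates, margin=5):
    """Yield (chosen, rejected) text pairs whenever the vote gap is large enough."""
    for a, b in combinations(candidates, 2):
        if a["votes"] - b["votes"] >= margin:
            yield a["text"], b["text"]
        elif b["votes"] - a["votes"] >= margin:
            yield b["text"], a["text"]

# These pairs could train a pairwise reward model whose scores then drive PPO fine-tuning.
for chosen, rejected in preference_pairs(answers):
    print(f"PREFER: {chosen!r}  OVER: {rejected!r}")
```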
arXiv Detail & Related papers (2024-01-19T18:49:36Z)
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results on multiple datasets offer compelling support for our theoretical assertions.
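For readers unfamiliar with the underlying tool, nonlinear quantile regression usually means minimizing the pinball (check) loss with a nonlinear model. The sketch below fits a tiny MLP to a toy conditional quantile and is only a generic illustration of that technique, not the paper's counterfactual inference framework.
```python
import torch
import torch.nn as nn

def pinball_loss(pred, target, tau):
    # Pinball / check loss for quantile level tau in (0, 1).
    diff = target - pred
    return torch.mean(torch.maximum(tau * diff, (tau - 1) * diff))

# Toy data: y = sin(x) plus heteroscedastic noise.
torch.manual_seed(0)
x = torch.rand(512, 1) * 6 - 3
y = torch.sin(x) + torch.randn_like(x) * (0.1 + 0.2 * x.abs())

tau = 0.9                                # target the conditional 90% quantile
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for _ in range(500):
    opt.zero_grad()
    loss = pinball_loss(net(x), y, tau)
    loss.backward()
    opt.step()
```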
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
- Investigating Fairness Disparities in Peer Review: A Language Model Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study on fairness disparities in peer review with the help of large language models (LMs).
We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) conference from 2017 to date.
We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, and author and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z)
- Investigating the Role of Centering Theory in the Context of Neural Coreference Resolution Systems [71.57556446474486]
We investigate the connection between centering theory (CT) and modern coreference resolution systems.
We show that high-quality neural coreference resolvers may not benefit much from explicitly modeling centering ideas.
We formulate a version of CT that also models recency and show that it captures coreference information better compared to vanilla CT.
arXiv Detail & Related papers (2022-10-26T12:55:26Z)
- Revise and Resubmit: An Intertextual Model of Text-based Collaboration in Peer Review [52.359007622096684]
Peer review is a key component of the publishing process in most fields of science.
Existing NLP studies focus on the analysis of individual texts.
Editorial assistance, however, often requires modeling interactions between pairs of texts.
arXiv Detail & Related papers (2022-04-22T16:39:38Z)
- Learning From Revisions: Quality Assessment of Claims in Argumentation at Scale [12.883536911500062]
We study claim quality assessment irrespective of discussed aspects by comparing different revisions of the same claim.
We propose two tasks: assessing which claim of a revision pair is better, and ranking all versions of a claim by quality.
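The first task ("which claim of a revision pair is better") maps naturally onto pairwise ranking. The sketch below scores claim representations with a small MLP and applies a margin ranking loss; all sizes and features are placeholders, and it is not the authors' model.
```python
import torch
import torch.nn as nn

class ClaimScorer(nn.Module):
    # Maps a claim representation to a scalar quality score (higher = better).
    def __init__(self, input_dim: int = 768):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, claim_features):
        return self.mlp(claim_features).squeeze(-1)

scorer = ClaimScorer()
loss_fn = nn.MarginRankingLoss(margin=0.5)

# Toy batch: features of the later (improved) revision vs. the earlier version.
better = torch.randn(16, 768)
worse = torch.randn(16, 768)
target = torch.ones(16)          # +1 means the first input should score higher
loss = loss_fn(scorer(better), scorer(worse), target)
loss.backward()
# Ranking all versions of a claim then reduces to sorting them by scorer output.
```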
arXiv Detail & Related papers (2021-01-25T17:32:04Z)
- Creating a Domain-diverse Corpus for Theory-based Argument Quality Assessment [6.654552816487819]
We describe GAQCorpus, the first large, domain-diverse annotated corpus of theory-based AQ.
We discuss how we designed the annotation task to reliably collect a large number of judgments with crowdsourcing.
Our work will inform research on theory-based argumentation annotation and enable the creation of more diverse corpora to support computational AQ assessment.
arXiv Detail & Related papers (2020-11-03T09:40:25Z)
- A Generalised Approach for Encoding and Reasoning with Qualitative Theories in Answer Set Programming [3.963609604649393]
A family of ASP encodings is proposed which can handle any qualitative calculus with binary relations.
This paper is under consideration for acceptance in TPLP.
arXiv Detail & Related papers (2020-08-04T13:31:25Z)
- Invariant Rationalization [84.1861516092232]
A typical rationalization criterion, i.e. maximum mutual information (MMI), finds the rationale that maximizes the prediction performance based only on the rationale.
We introduce a game-theoretic invariant rationalization criterion where the rationales are constrained to enable the same predictor to be optimal across different environments.
We show both theoretically and empirically that the proposed rationales can rule out spurious correlations, generalize better to different test scenarios, and align better with human judgments.
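One crude way to approximate the "same predictor optimal across environments" constraint in code is to penalize how unevenly a shared predictor performs across environments. The sketch below does that with a variance-of-risks penalty; it is a loose stand-in in the spirit of invariance-regularized training, not the paper's game-theoretic formulation.
```python
import torch

def invariance_penalty(per_env_losses):
    # If the selected rationale carried only invariant evidence, the shared
    # predictor's risk should be (roughly) the same in every environment.
    return torch.var(torch.stack(per_env_losses))

# Toy numbers: risk of one shared predictor on rationales from two environments.
loss_env_a = torch.tensor(0.42, requires_grad=True)
loss_env_b = torch.tensor(0.97, requires_grad=True)

objective = (loss_env_a + loss_env_b) / 2 + 10.0 * invariance_penalty([loss_env_a, loss_env_b])
objective.backward()
```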
arXiv Detail & Related papers (2020-03-22T00:50:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.