Is My Model Using The Right Evidence? Systematic Probes for Examining
Evidence-Based Tabular Reasoning
- URL: http://arxiv.org/abs/2108.00578v1
- Date: Mon, 2 Aug 2021 01:14:19 GMT
- Title: Is My Model Using The Right Evidence? Systematic Probes for Examining
Evidence-Based Tabular Reasoning
- Authors: Vivek Gupta, Riyaz A. Bhat, Atreya Ghosal, Manish Shrivastava, Maneesh
Singh, Vivek Srikumar
- Abstract summary: Neural models routinely report state-of-the-art performance across NLP tasks involving reasoning.
Our experiments demonstrate that a BERT-based model representative of today's state-of-the-art fails to properly reason on several counts.
- Score: 26.168211982441875
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While neural models routinely report state-of-the-art performance across NLP
tasks involving reasoning, they are often observed to not properly use and
reason on the evidence presented to them in their inputs. A model that
reasons properly is expected to attend to the right parts of the input, be
self-consistent in its predictions across examples, avoid spurious patterns in
the inputs, and ignore bias from its underlying pre-trained language model in
a nuanced, context-sensitive fashion (e.g., handling counterfactuals). Do
today's models do so? In this paper, we study this question using the problem
of reasoning on tabular data. The tabular nature of the input is particularly
suited for the study as it admits systematic probes targeting the properties
listed above. Our experiments demonstrate that a BERT-based model
representative of today's state-of-the-art fails to properly reason on the
following counts: it often (a) misses the relevant evidence, (b) suffers from
hypothesis and knowledge biases, and, (c) relies on annotation artifacts and
knowledge from pre-trained language models as primary evidence rather than
relying on reasoning on the premises in the tabular input.
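To make the probing idea concrete, here is a minimal, hypothetical sketch of one such check (not the paper's actual probe suite). It assumes a black-box predict(premise, hypothesis) function, and the helpers linearize, evidence_probe, and toy_predict are invented for illustration: the probe asks whether a prediction changes when the relevant table row is removed and stays stable when an irrelevant row is removed.

# Minimal evidence-deletion probe for tabular NLI (illustrative sketch only).
def linearize(table):
    """Turn a key-value table into a flat premise string."""
    return ". ".join(f"{key}: {value}" for key, value in table.items())

def evidence_probe(predict, table, hypothesis, evidence_key, control_key):
    full = predict(linearize(table), hypothesis)

    # Drop the row that actually supports or refutes the hypothesis.
    no_evidence = {k: v for k, v in table.items() if k != evidence_key}
    without_evidence = predict(linearize(no_evidence), hypothesis)

    # Drop a row that is irrelevant to the hypothesis.
    no_control = {k: v for k, v in table.items() if k != control_key}
    without_control = predict(linearize(no_control), hypothesis)

    return {
        # A model that uses the right evidence should change its answer
        # once the evidence row is gone ...
        "sensitive_to_evidence": without_evidence != full,
        # ... and should not change its answer when an irrelevant row is gone.
        "robust_to_irrelevant_row": without_control == full,
    }

if __name__ == "__main__":
    # Toy stand-in for a real model, for demonstration only: it answers
    # "entail" whenever any hypothesis token appears verbatim in the premise.
    def toy_predict(premise, hypothesis):
        return "entail" if any(tok in premise for tok in hypothesis.split()) else "neutral"

    table = {"Born": "1970", "Occupation": "Novelist", "Nationality": "Chilean"}
    print(evidence_probe(toy_predict, table,
                         "The person was born in 1970.",
                         evidence_key="Born", control_key="Occupation"))

A real study would run such probes over a full dataset with a trained tabular NLI model in place of toy_predict and aggregate the two flags into consistency rates.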
Related papers
- Faithfulness Tests for Natural Language Explanations [87.01093277918599]
Explanations of neural models aim to reveal a model's decision-making process for its predictions.
Recent work shows that current methods giving explanations such as saliency maps or counterfactuals can be misleading.
This work explores the challenging question of evaluating the faithfulness of natural language explanations.
arXiv Detail & Related papers (2023-05-29T11:40:37Z)
- Enhancing Tabular Reasoning with Pattern Exploiting Training [14.424742483714846]
Recent methods based on pre-trained language models have exhibited superior performance on tabular tasks.
In this work, we utilize Pattern-Exploiting Training (PET) on pre-trained language models to strengthen these reasoning models' pre-existing knowledge and reasoning abilities.
arXiv Detail & Related papers (2022-10-21T21:28:18Z)
- Logical Satisfiability of Counterfactuals for Faithful Explanations in NLI [60.142926537264714]
We introduce the methodology of Faithfulness-through-Counterfactuals.
It generates a counterfactual hypothesis based on the logical predicates expressed in the explanation.
It then evaluates if the model's prediction on the counterfactual is consistent with that expressed logic.
arXiv Detail & Related papers (2022-05-25T03:40:59Z)
- Visual Abductive Reasoning [85.17040703205608]
Abductive reasoning seeks the likeliest possible explanation for partial observations.
We propose a new task and dataset, Visual Abductive Reasoning (VAR), for examining the abductive reasoning ability of machine intelligence in everyday visual situations.
arXiv Detail & Related papers (2022-03-26T10:17:03Z)
- Interpretable Data-Based Explanations for Fairness Debugging [7.266116143672294]
Gopher is a system that produces compact, interpretable, and causal explanations for bias or unexpected model behavior.
We introduce the concept of causal responsibility that quantifies the extent to which intervening on training data by removing or updating subsets of it can resolve the bias.
Building on this concept, we develop an efficient approach for generating the top-k patterns that explain model bias.
arXiv Detail & Related papers (2021-12-17T20:10:00Z)
- Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations [97.91630330328815]
We conduct a crowdsourcing study, where participants interact with deception detection models that have been trained to distinguish between genuine and fake hotel reviews.
We observe that for a linear bag-of-words model, participants with access to the feature coefficients during training are able to cause a larger reduction in model confidence in the testing phase when compared to the no-explanation control.
arXiv Detail & Related papers (2021-12-17T18:29:56Z)
- Exploring Lexical Irregularities in Hypothesis-Only Models of Natural Language Inference [5.283529004179579]
Natural Language Inference (NLI) or Recognizing Textual Entailment (RTE) is the task of predicting the entailment relation between a pair of sentences.
Models that understand entailment should encode both the premise and the hypothesis.
Experiments by Poliak et al. revealed a strong preference of these models towards patterns observed only in the hypothesis; a minimal hypothesis-only baseline is sketched after this list.
arXiv Detail & Related papers (2021-01-19T01:08:06Z)
- Learning from others' mistakes: Avoiding dataset biases without modeling them [111.17078939377313]
State-of-the-art natural language processing (NLP) models often learn to model dataset biases and surface form correlations instead of features that target the intended task.
Previous work has demonstrated effective methods to circumvent these issues when knowledge of the bias is available.
We show a method for training models that learn to ignore these problematic correlations.
arXiv Detail & Related papers (2020-12-02T16:10:54Z)
- Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge [96.92252296244233]
Large pre-trained language models (LMs) acquire some reasoning capacity, but this ability is difficult to control.
We show that LMs can be trained to reliably perform systematic reasoning combining both implicit, pre-trained knowledge and explicit natural language statements.
Our work paves a path towards open-domain systems that constantly improve by interacting with users who can instantly correct a model by adding simple natural language statements.
arXiv Detail & Related papers (2020-06-11T17:02:20Z)
- CausaLM: Causal Model Explanation Through Counterfactual Language Models [33.29636213961804]
CausaLM is a framework for producing causal model explanations using counterfactual language representation models.
We show that language representation models such as BERT can effectively learn a counterfactual representation for a given concept of interest.
A byproduct of our method is a language representation model that is unaffected by the tested concept.
arXiv Detail & Related papers (2020-05-27T15:06:35Z)
- HypoNLI: Exploring the Artificial Patterns of Hypothesis-only Bias in Natural Language Inference [38.14399396661415]
We derive adversarial examples that target the hypothesis-only bias.
We investigate two debiasing approaches that exploit artificial-pattern modeling to mitigate this hypothesis-only bias.
arXiv Detail & Related papers (2020-03-05T16:46:35Z)
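The hypothesis-only entries above rely on the same diagnostic idea: train a classifier that never sees the premise and compare it to a majority-class baseline. The following is a minimal, hypothetical sketch of such a premise-blind baseline using scikit-learn; the toy hypotheses and labels are made up for illustration, and this is not the setup used in those papers.

# Hypothesis-only (premise-blind) NLI baseline: bag-of-words logistic regression
# over hypotheses alone.  If it clearly beats the majority-class baseline on a
# real dataset, the labels leak into the hypotheses (an annotation artifact).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Toy data, for illustration only.
train_hypotheses = ["A man is sleeping.", "A man is outdoors.",
                    "Nobody is playing.", "A woman is eating."]
train_labels     = ["contradiction", "entailment", "contradiction", "neutral"]
test_hypotheses  = ["A man is sleeping outside.", "Nobody is eating."]
test_labels      = ["neutral", "contradiction"]

vectorizer = TfidfVectorizer(ngram_range=(1, 2))
X_train = vectorizer.fit_transform(train_hypotheses)   # the premise is never used
X_test  = vectorizer.transform(test_hypotheses)

clf = LogisticRegression(max_iter=1000).fit(X_train, train_labels)
print("hypothesis-only accuracy:", accuracy_score(test_labels, clf.predict(X_test)))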
This list is automatically generated from the titles and abstracts of the papers in this site.