Teaching Probabilistic Logical Reasoning to Transformers
- URL: http://arxiv.org/abs/2305.13179v2
- Date: Fri, 9 Feb 2024 17:29:19 GMT
- Title: Teaching Probabilistic Logical Reasoning to Transformers
- Authors: Aliakbar Nafar, Kristen Brent Venable, Parisa Kordjamshidi
- Abstract summary: We evaluate the capability of transformer-based language models in making inferences over uncertain text.
We propose a novel end-to-end fine-tuning approach, Probabilistic Constraint Training.
- Score: 21.335836561959887
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we evaluate the capability of transformer-based language
models in making inferences over uncertain text that includes uncertain rules
of reasoning. We cover both Pre-trained Language Models (PLMs) and generative
Large Language Models (LLMs). Our evaluation results show that both generations
of language models struggle with reasoning over uncertain text. We propose a
novel end-to-end fine-tuning approach, Probabilistic Constraint Training (PCT),
that utilizes probabilistic logical rules as constraints in the fine-tuning
phase without relying on these rules in the inference stage. To assess the
effectiveness of PCT, we utilize the related corpora and, additionally, create
a new and more challenging benchmark that, unlike the previous ones, uses
instance-specific rules. Our study demonstrates that PCT improves
transformer-based language models' intrinsic reasoning and makes their
probabilistic logical reasoning process more explicit and explainable.
Furthermore, PCT equips these models to effectively handle novel situations,
including higher reasoning depth, new domains, and complex probabilistic
structures.
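The constraint mechanism described in the abstract can be illustrated with a minimal sketch. Assuming a rule of the form "if the premise holds with probability p, the conclusion should hold with probability at least p times the rule's confidence w", a differentiable violation penalty can be added to the fine-tuning loss. The function names, the ReLU-based penalty, and the weighting hyperparameter below are illustrative assumptions, not the paper's actual formulation.

```python
import torch

def constraint_penalty(p_premise, p_conclusion, rule_weight):
    # Violation of a probabilistic rule "premise -> conclusion (confidence w)":
    # zero when P(conclusion) >= P(premise) * w, positive otherwise.
    return torch.relu(p_premise * rule_weight - p_conclusion)

def pct_loss(task_loss, constraints, lam=0.1):
    # Hypothetical combined objective: the ordinary fine-tuning loss plus
    # the summed (differentiable) constraint violations, scaled by lam.
    penalty = sum(constraint_penalty(p, q, w) for p, q, w in constraints)
    return task_loss + lam * penalty

# Example: the model predicts P(bird)=0.9 and P(flies)=0.5, while the rule
# "birds fly" has confidence 0.8, so P(flies) should be at least 0.72.
loss = pct_loss(torch.tensor(0.3),
                [(torch.tensor(0.9), torch.tensor(0.5), 0.8)])
```

The key property, consistent with the abstract, is that the rules shape training only through this penalty; at inference time the model is queried directly, with no rule engine involved.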
Related papers
- Benchmarking Defeasible Reasoning with Large Language Models -- Initial Experiments and Future Directions [0.36868085124383626]
This paper proposes a benchmark that corresponds to various defeasible rule-based reasoning patterns.
We modified an existing benchmark for defeasible logic reasoners by translating defeasible rules into text suitable for Large Language Models.
We conducted preliminary experiments on nonmonotonic rule-based reasoning using ChatGPT and compared it with reasoning patterns defined by defeasible logic.
arXiv Detail & Related papers (2024-10-16T12:36:23Z)
- Enhancing adversarial robustness in Natural Language Inference using explanations [41.46494686136601]
We cast the spotlight on the underexplored task of Natural Language Inference (NLI).
We validate the usage of natural language explanation as a model-agnostic defence strategy through extensive experimentation.
We research the correlation of widely used language generation metrics with human perception, in order for them to serve as a proxy towards robust NLI models.
arXiv Detail & Related papers (2024-09-11T17:09:49Z)
- On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning [87.73401758641089]
Chain-of-thought (CoT) reasoning has improved the performance of modern language models (LMs).
We show that LMs can represent the same family of distributions over strings as probabilistic Turing machines.
arXiv Detail & Related papers (2024-06-20T10:59:02Z)
- Scaling Synthetic Logical Reasoning Datasets with Context-Sensitive Declarative Grammars [0.6537995248511139]
We present a declarative framework with flexible context-sensitive rules binding multiple languages.
We construct first-order logic problems by selecting up to 32 premises and one hypothesis.
We demonstrate that using semantic constraints during generation and careful English verbalization of predicates enhances logical reasoning without hurting natural English tasks.
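The premise-selection step mentioned above can be sketched as follows. The predicate and constant pools, the two premise shapes, and the sampling scheme are invented for illustration and are far simpler than the paper's context-sensitive declarative grammar.

```python
import random

# Hypothetical predicate and constant pools (assumptions, not the paper's).
PREDICATES = ["Red", "Round", "Heavy"]
CONSTANTS = ["a", "b", "c"]

def make_premise(rng):
    # Either a ground fact P(c) or a universal rule forall x. P(x) -> Q(x).
    if rng.random() < 0.5:
        return f"{rng.choice(PREDICATES)}({rng.choice(CONSTANTS)})"
    p, q = rng.sample(PREDICATES, 2)
    return f"forall x. {p}(x) -> {q}(x)"

def make_problem(rng, max_premises=32):
    # Sample between 1 and max_premises premises plus one ground hypothesis,
    # mirroring the "up to 32 premises and one hypothesis" setup.
    premises = [make_premise(rng) for _ in range(rng.randint(1, max_premises))]
    hypothesis = f"{rng.choice(PREDICATES)}({rng.choice(CONSTANTS)})"
    return premises, hypothesis

rng = random.Random(0)
premises, hypothesis = make_problem(rng)
```

A label (entailed or not) would then be obtained by running a first-order theorem prover over the sampled premises, which this sketch omits.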
arXiv Detail & Related papers (2024-06-16T18:10:49Z)
- How Truncating Weights Improves Reasoning in Language Models [49.80959223722325]
We study how certain global associations tend to be stored in specific weight components or Transformer blocks.
We analyze how this arises during training, both empirically and theoretically.
arXiv Detail & Related papers (2024-06-05T08:51:08Z)
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
- Improving Language Models Meaning Understanding and Consistency by Learning Conceptual Roles from Dictionary [65.268245109828]
Non-human-like behaviour of contemporary pre-trained language models (PLMs) is a leading cause of their diminished trustworthiness.
A striking phenomenon is the generation of inconsistent predictions, which produces contradictory results.
We propose a practical approach that alleviates the inconsistent behaviour issue by improving PLM awareness.
arXiv Detail & Related papers (2023-10-24T06:15:15Z)
- On Conditional and Compositional Language Model Differentiable Prompting [75.76546041094436]
Prompts have been shown to be an effective method to adapt a frozen Pretrained Language Model (PLM) to perform well on downstream tasks.
We propose a new model, Prompt Production System (PRopS), which learns to transform task instructions or input metadata, into continuous prompts.
arXiv Detail & Related papers (2023-07-04T02:47:42Z)
- Autoregressive Structured Prediction with Language Models [73.11519625765301]
We describe an approach to model structures as sequences of actions in an autoregressive manner with PLMs.
Our approach achieves the new state-of-the-art on all the structured prediction tasks we looked at.
arXiv Detail & Related papers (2022-10-26T13:27:26Z)
- Can Unsupervised Knowledge Transfer from Social Discussions Help Argument Mining? [25.43442712037725]
We propose a novel transfer learning strategy that leverages unsupervised, argumentative discourse-aware knowledge.
We utilize argumentation-rich social discussions from the ChangeMyView subreddit as a source of unsupervised, argumentative discourse-aware knowledge.
We introduce a novel prompt-based strategy for inter-component relation prediction that complements our proposed finetuning method.
arXiv Detail & Related papers (2022-03-24T06:48:56Z)
- Evaluating Pretrained Transformer Models for Entity Linking in Task-Oriented Dialog [1.4524096882720263]
We evaluate different pretrained transformer models (PTMs) for understanding short phrases of text.
Several of the PTMs produce sub-par results when compared to traditional techniques.
We find that some of their shortcomings can be addressed by using PTMs fine-tuned for text-similarity tasks.
arXiv Detail & Related papers (2021-12-15T18:20:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.