Hybrid Autoregressive Solver for Scalable Abductive Natural Language Inference
- URL: http://arxiv.org/abs/2107.11879v1
- Date: Sun, 25 Jul 2021 19:29:53 GMT
- Title: Hybrid Autoregressive Solver for Scalable Abductive Natural Language Inference
- Authors: Marco Valentino, Mokanarangan Thayaparan, Deborah Ferreira, André Freitas
- Abstract summary: We propose a hybrid abductive solver that autoregressively combines a dense bi-encoder with a sparse model of explanatory power.
Our experiments demonstrate that the proposed framework can achieve performance comparable with the state-of-the-art cross-encoder.
- Score: 2.867517731896504
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Regenerating natural language explanations for science questions is a
challenging task for evaluating complex multi-hop and abductive inference
capabilities. In this setting, Transformers trained on human-annotated
explanations achieve state-of-the-art performance when adopted as cross-encoder
architectures. However, while much attention has been devoted to the quality of
the constructed explanations, the problem of performing abductive inference at
scale is still under-studied. Being intrinsically non-scalable, the cross-encoder
architectural paradigm is not suitable for efficient multi-hop inference on
massive fact banks. To optimise both accuracy and inference time, we propose a
hybrid abductive solver that autoregressively combines a dense bi-encoder with
a sparse model of explanatory power, computed leveraging explicit patterns in
the explanations. Our experiments demonstrate that the proposed framework can
achieve performance comparable with the state-of-the-art cross-encoder while
being $\approx 50$ times faster and scalable to corpora of millions of facts.
Moreover, we study the impact of the hybridisation on semantic drift and
science question answering without additional training, showing that it boosts
the quality of the explanations and contributes to improved downstream
inference performance.
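To make the described loop concrete, here is a minimal sketch of the hybrid autoregressive solver. The `dense_encode` and `explanatory_power` functions are hypothetical stand-ins for the paper's dense bi-encoder and sparse explanatory-power model; only the control flow (score, select, re-condition on the partial explanation) reflects the abstract:

```python
import numpy as np

def dense_encode(text):
    """Hypothetical stand-in for the dense bi-encoder (a real system would
    use a trained sentence encoder)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=128)
    return v / np.linalg.norm(v)

def explanatory_power(fact, context):
    """Hypothetical sparse score: lexical overlap between a candidate fact
    and the hypothesis plus facts selected so far, a proxy for the paper's
    explicit explanatory patterns."""
    fact_terms = set(fact.lower().split())
    ctx_terms = {t for s in context for t in s.lower().split()}
    return len(fact_terms & ctx_terms) / (len(fact_terms) + 1e-9)

def hybrid_solve(hypothesis, facts, steps=3, lam=0.5):
    """Autoregressive loop: at each step, rank every fact by a dense
    bi-encoder score plus a sparse explanatory-power score, keep the best,
    then re-condition the query on the updated partial explanation."""
    explanation = []
    query_vec = dense_encode(hypothesis)
    for _ in range(steps):
        candidates = [f for f in facts if f not in explanation]
        scores = [
            float(query_vec @ dense_encode(f))
            + lam * explanatory_power(f, [hypothesis] + explanation)
            for f in candidates
        ]
        explanation.append(candidates[int(np.argmax(scores))])
        query_vec = dense_encode(" ".join([hypothesis] + explanation))
    return explanation

print(hybrid_solve("friction makes a rolling ball slow down",
                   ["friction acts against motion",
                    "a ball is a kind of object",
                    "gravity pulls objects down",
                    "friction causes objects to lose kinetic energy"]))
```

Because the bi-encoder embeds each fact independently of the hypothesis, fact embeddings can be pre-computed and indexed offline, which is where the reported speed-up and scalability over a cross-encoder come from.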
Related papers
- A Theoretical Perspective for Speculative Decoding Algorithm [60.79447486066416]
One effective way to accelerate inference is Speculative Decoding, which employs a small model to sample a sequence of draft tokens and a large model to validate them.
This paper tackles this gap by conceptualizing the decoding problem via a Markov chain abstraction and studying the key properties, output quality and inference acceleration, from a theoretical perspective.
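The core loop is easy to sketch; below is a minimal toy version with hypothetical `draft_model` and `target_model` distributions, using the standard accept/resample rule from the speculative sampling literature:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 50

def _dist(context, temperature):
    # Toy next-token distribution, deterministic in the context (hypothetical).
    g = np.random.default_rng(abs(hash(tuple(context))) % (2**32))
    logits = g.normal(size=VOCAB) / temperature
    e = np.exp(logits - logits.max())
    return e / e.sum()

def draft_model(context):   # small, cheap model
    return _dist(context, temperature=1.5)

def target_model(context):  # large, accurate model
    return _dist(context, temperature=1.0)

def speculative_step(context, k=4):
    """Sample k draft tokens cheaply, then validate each with the target model
    using the standard accept probability min(1, p(x)/q(x)); on the first
    rejection, resample from the residual max(p - q, 0) and stop."""
    drafts, q_dists, ctx = [], [], list(context)
    for _ in range(k):
        q = draft_model(ctx)
        x = int(rng.choice(VOCAB, p=q))
        drafts.append(x)
        q_dists.append(q)
        ctx.append(x)
    accepted = []
    for x, q in zip(drafts, q_dists):
        p = target_model(list(context) + accepted)
        if rng.random() < min(1.0, p[x] / q[x]):
            accepted.append(x)
        else:
            residual = np.maximum(p - q, 0.0)
            accepted.append(int(rng.choice(VOCAB, p=residual / residual.sum())))
            return accepted
    # All k drafts accepted: one extra token from the target comes for free.
    p = target_model(list(context) + accepted)
    accepted.append(int(rng.choice(VOCAB, p=p)))
    return accepted

print(speculative_step([1, 2, 3]))
```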
arXiv Detail & Related papers (2024-10-30T01:53:04Z)
- More Experts Than Galaxies: Conditionally-overlapping Experts With Biologically-Inspired Fixed Routing [5.846028298833611]
Conditionally Overlapping Mixture of ExperTs (COMET) is a general deep learning method that induces a modular, sparse architecture with an exponential number of overlapping experts.
We demonstrate the effectiveness of COMET on a range of tasks, including image classification, language modeling, and regression.
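Reading only from the abstract, the mechanism might look roughly like the sketch below, where a fixed (untrained) random projection routes each input to a sparse, input-dependent mask over hidden units; the paper's exact formulation may differ:

```python
import numpy as np

class COMETLayer:
    """Guessed-from-the-abstract sketch: a FIXED random projection routes each
    input to a sparse mask over hidden units, so every distinct mask acts as an
    overlapping 'expert' -- exponentially many in principle, with no learned router."""

    def __init__(self, d_in, d_hidden, k_active, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(size=(d_in, d_hidden)) * 0.1   # weights a real model would train
        self.R = rng.normal(size=(d_in, d_hidden))         # fixed routing, never trained
        self.k = k_active

    def __call__(self, x):
        scores = x @ self.R                        # routing scores per hidden unit
        mask = np.zeros_like(scores)
        mask[np.argsort(scores)[-self.k:]] = 1.0   # keep the k most responsive units
        return np.maximum(x @ self.W, 0.0) * mask  # only the routed sub-network fires

layer = COMETLayer(d_in=16, d_hidden=64, k_active=8)
out = layer(np.random.default_rng(1).normal(size=16))
print(int((out != 0).sum()))   # at most k_active units are active
```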
arXiv Detail & Related papers (2024-10-10T14:58:18Z)
- Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increases model performance.
Specifically, our framework introduces a new metric, i.e., explanation consistency, to reweight the training samples adaptively in model learning.
Our framework then promotes model learning by paying closer attention to the training samples whose explanations differ the most.
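A minimal sketch of such reweighting, with gradient saliency on a linear scorer standing in for the paper's heatmaps and cosine agreement under perturbation standing in for its consistency metric (both are assumptions):

```python
import numpy as np

def saliency(w, x):
    # Toy 'explanation' for a linear scorer f(x) = w @ x: per-feature contribution.
    return w * x

def explanation_consistency(w, x, noise=0.05, seed=0):
    """Cosine agreement between the explanation of an input and of a slightly
    perturbed copy -- an assumed stand-in for the paper's consistency metric."""
    rng = np.random.default_rng(seed)
    e1 = saliency(w, x)
    e2 = saliency(w, x + noise * rng.normal(size=x.shape))
    return float(e1 @ e2 / (np.linalg.norm(e1) * np.linalg.norm(e2) + 1e-9))

def sample_weights(samples, w):
    """Adaptive reweighting as the summary describes: samples whose
    explanations differ most (low consistency) get more training weight."""
    cons = np.array([explanation_consistency(w, x) for x in samples])
    weights = 1.0 - cons
    return weights / weights.sum()

rng = np.random.default_rng(1)
samples = [rng.normal(size=8) for _ in range(5)]
print(sample_weights(samples, w=rng.normal(size=8)))
```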
arXiv Detail & Related papers (2024-08-08T17:20:08Z)
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate the opacity of such models by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
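For a purely linear embedding, the second-order decomposition that BiLRP generalises can be written in closed form; the sketch below shows only that special case (the deep, LRP-based propagation of the actual method is not reproduced):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(32, 10))   # toy LINEAR embedding phi(x) = W @ x

def second_order_relevance(a, b):
    """In the linear case, the similarity phi(a) @ phi(b) = a^T (W^T W) b
    decomposes exactly into pairwise contributions R[i, j] = a_i * M[i, j] * b_j:
    the kind of second-order attribution BiLRP propagates through deep models."""
    M = W.T @ W
    return np.outer(a, b) * M

a, b = rng.normal(size=10), rng.normal(size=10)
R = second_order_relevance(a, b)
print(np.allclose(R.sum(), (W @ a) @ (W @ b)))  # True: relevances sum to the similarity
```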
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
- Disentangled Representation Learning with Transmitted Information Bottleneck [57.22757813140418]
We present DisTIB (Transmitted Information Bottleneck for Disentangled representation learning), a novel objective that navigates the balance between information compression and preservation.
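The paper's exact objective is not given here; as a point of reference, a generic variational information-bottleneck loss trades the same two terms like this:

```python
import numpy as np

def vib_style_loss(mu, logvar, log_likelihood, beta=1e-2):
    """Generic variational information-bottleneck objective (a reference
    point, NOT DisTIB's exact form): the likelihood term preserves task
    information, the KL term to a standard normal prior compresses it:
        L = -E[log p(y|z)] + beta * KL(q(z|x) || N(0, I))
    """
    kl = 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar)
    return -log_likelihood + beta * kl

mu, logvar = np.zeros(16), np.zeros(16)   # a perfectly compressed posterior
print(vib_style_loss(mu, logvar, log_likelihood=-1.2))  # -> 1.2 (KL term is 0)
```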
arXiv Detail & Related papers (2023-11-03T03:18:40Z)
- How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding [56.222097640468306]
We provide a mechanistic understanding of how transformers learn "semantic structure".
We show, through a combination of mathematical analysis and experiments on Wikipedia data, that the embedding layer and the self-attention layer encode the topical structure.
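A crude way to probe such topical structure in an embedding matrix is to compare within-topic and across-topic cosine similarity, as in this toy check (not the paper's analysis):

```python
import numpy as np

def within_vs_across(E, topics):
    """Average cosine similarity within vs. across topics -- a rough probe
    for topical structure in an embedding matrix."""
    E = E / np.linalg.norm(E, axis=1, keepdims=True)
    sims = E @ E.T
    eq = np.equal.outer(topics, topics)
    off_diag = ~np.eye(len(topics), dtype=bool)
    return sims[eq & off_diag].mean(), sims[~eq].mean()

rng = np.random.default_rng(0)
topics = np.array([0, 0, 0, 1, 1, 1])
centers = rng.normal(size=(2, 32))
E = centers[topics] + 0.3 * rng.normal(size=(6, 32))
print(within_vs_across(E, topics))   # within-topic similarity should be higher
```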
arXiv Detail & Related papers (2023-03-07T21:42:17Z)
- Enhancing Dual-Encoders with Question and Answer Cross-Embeddings for Answer Retrieval [29.16807969384253]
The Dual-Encoders model is a promising mechanism for answer retrieval in question answering (QA) systems.
We propose a framework to enhance the Dual-Encoders model with question-answer cross-embeddings and a novel Geometry Alignment Mechanism (GAM).
Our framework significantly improves the Dual-Encoders model and outperforms the state-of-the-art method on multiple answer retrieval datasets.
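The efficiency argument for dual encoders is the same one behind the bi-encoder in the main paper: questions and answers are embedded independently, so answer vectors can be pre-computed. A toy sketch with linear encoders (the cross-embedding and GAM components are training-time additions, omitted here):

```python
import numpy as np

rng = np.random.default_rng(0)
Wq = rng.normal(size=(64, 128)) * 0.05   # question encoder (toy linear stand-in)
Wa = rng.normal(size=(64, 128)) * 0.05   # answer encoder (toy linear stand-in)

def encode(W, features):
    v = W @ features
    return v / (np.linalg.norm(v) + 1e-9)

def retrieve(question, answers):
    """Dual-encoder retrieval: questions and answers are embedded by separate
    encoders, so all answer vectors can be pre-computed offline and a query
    costs a single matrix-vector product."""
    q = encode(Wq, question)
    A = np.stack([encode(Wa, a) for a in answers])   # computed offline in practice
    return int(np.argmax(A @ q))

question = rng.normal(size=128)
answers = [rng.normal(size=128) for _ in range(100)]
print(retrieve(question, answers))
```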
arXiv Detail & Related papers (2022-06-07T02:39:24Z)
- Hybrid Predictive Coding: Inferring, Fast and Slow [62.997667081978825]
We propose a hybrid predictive coding network that combines both iterative and amortized inference in a principled manner.
We demonstrate that our model is inherently sensitive to its uncertainty and adaptively balances iterative and amortized inference to obtain accurate beliefs using minimum computational expense.
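A minimal sketch of the fast/slow split, assuming a linear generative model and a pseudo-inverse as the amortized recognition network (both toy choices, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.normal(size=(20, 5)) * 0.5     # toy linear generative model: x_hat = G @ z
A = np.linalg.pinv(G)                  # toy amortized recognition network

def infer(x, iters=50, lr=0.05, tol=1e-4):
    """Fast/slow inference: one amortized pass gives an initial belief z,
    then iterative refinement descends the prediction error ||x - G z||^2,
    but only while that error (a crude uncertainty signal) stays high,
    so 'easy' inputs take the cheap path and hard ones the expensive one."""
    z = A @ x                          # fast: amortized guess
    for _ in range(iters):
        err = x - G @ z                # prediction error
        if np.mean(err ** 2) < tol:    # confident enough -> stop early
            break
        z = z + lr * (G.T @ err)       # slow: iterative refinement step
    return z

z_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
x = G @ z_true + 0.01 * rng.normal(size=20)
print(np.round(infer(x), 2))
```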
arXiv Detail & Related papers (2022-04-05T12:52:45Z)
- $\partial$-Explainer: Abductive Natural Language Inference via Differentiable Convex Optimization [2.624902795082451]
This paper presents a novel framework named $\partial$-Explainer (Diff-Explainer) that combines the best of both worlds by casting the constrained optimization problem as part of a deep neural network.
Our experiments show up to $\approx 10\%$ improvement over a non-differentiable solver while still providing explanations that support the inference.
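The contrast between a discrete solver and a differentiable relaxation can be shown in a few lines; note that this softmax relaxation is only an illustration, not the convex (ILP-relaxation) formulation Diff-Explainer actually uses:

```python
import numpy as np

def hard_select(scores, budget):
    """Discrete solver: pick the top-`budget` facts. Not differentiable --
    gradients cannot flow back into the scores."""
    chosen = np.zeros_like(scores)
    chosen[np.argsort(scores)[-budget:]] = 1.0
    return chosen

def soft_select(scores, budget, temperature=0.1):
    """Smooth relaxation: an entropy-regularised selection whose closed form
    is a scaled softmax. Because the map is smooth, the 'solver' can sit
    inside a network and be trained end-to-end."""
    z = np.exp((scores - scores.max()) / temperature)
    return budget * z / z.sum()

scores = np.array([0.1, 2.0, 1.9, -0.5])
print(hard_select(scores, budget=2))   # [0. 1. 1. 0.] -- no gradient
print(soft_select(scores, budget=2))   # smooth weights summing to 2
```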
arXiv Detail & Related papers (2021-05-07T17:49:19Z)
- Case-Based Abductive Natural Language Inference [4.726777092009554]
Case-Based Abductive Natural Language Inference (CB-ANLI) addresses unseen abductive inference problems by retrieving similar, previously solved cases and transferring their explanations.
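A generic case-based reasoning loop matching that description, with a hypothetical toy encoder (a sketch, not the paper's model):

```python
import numpy as np

def encode(text):
    # Hypothetical toy sentence encoder (a real system would use a trained model).
    g = np.random.default_rng(abs(hash(text)) % (2**32))
    v = g.normal(size=64)
    return v / np.linalg.norm(v)

def solve_by_case(query, case_base):
    """Generic case-based reasoning: retrieve the most similar previously
    solved problem and transfer its explanation as a starting point."""
    q = encode(query)
    sims = [float(q @ encode(c["hypothesis"])) for c in case_base]
    return case_base[int(np.argmax(sims))]["explanation"]

case_base = [
    {"hypothesis": "friction makes a rolling ball slow down",
     "explanation": ["friction acts against motion"]},
    {"hypothesis": "metals conduct electricity",
     "explanation": ["metals contain free electrons"]},
]
# With the toy hash encoder 'nearest' is arbitrary; a trained encoder would
# retrieve the semantically closest case.
print(solve_by_case("why does a ball on carpet stop?", case_base))
```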
arXiv Detail & Related papers (2020-09-30T09:50:39Z)