Hybrid Autoregressive Solver for Scalable Abductive Natural Language
Inference
- URL: http://arxiv.org/abs/2107.11879v1
- Date: Sun, 25 Jul 2021 19:29:53 GMT
- Title: Hybrid Autoregressive Solver for Scalable Abductive Natural Language
Inference
- Authors: Marco Valentino, Mokanarangan Thayaparan, Deborah Ferreira, André Freitas
- Abstract summary: We propose a hybrid abductive solver that autoregressively combines a dense bi-encoder with a sparse model of explanatory power.
Our experiments demonstrate that the proposed framework can achieve performance comparable with the state-of-the-art cross-encoder.
- Score: 2.867517731896504
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Regenerating natural language explanations for science questions is a
challenging task for evaluating complex multi-hop and abductive inference
capabilities. In this setting, Transformers trained on human-annotated
explanations achieve state-of-the-art performance when adopted as cross-encoder
architectures. However, while much attention has been devoted to the quality of
the constructed explanations, the problem of performing abductive inference at
scale is still under-studied. Because it is intrinsically not scalable, the cross-encoder
architectural paradigm is not suitable for efficient multi-hop inference on
massive fact banks. To maximise accuracy while minimising inference time, we propose a
hybrid abductive solver that autoregressively combines a dense bi-encoder with
a sparse model of explanatory power, computed leveraging explicit patterns in
the explanations. Our experiments demonstrate that the proposed framework can
achieve performance comparable with the state-of-the-art cross-encoder while
being $\approx 50$ times faster and scalable to corpora of millions of facts.
Moreover, we study the impact of the hybridisation on semantic drift and
science question answering without additional training, showing that it boosts
the quality of the explanations and contributes to improved downstream
inference performance.
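The autoregressive hybrid scoring described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a bag-of-words cosine similarity as a stand-in for the dense bi-encoder, and it assumes that explanatory power can be approximated by how often a fact is reused in the explanations of similar training questions. The names `bow`, `cosine`, `explanatory_power`, `hybrid_explain`, and the mixing weight `lam` are all hypothetical.

```python
import math
from collections import Counter


def bow(text):
    """Toy stand-in for a dense encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def explanatory_power(question, fact_id, training):
    """Assumed sparse signal: similarity-weighted count of how often a fact
    appears in the explanations of training questions."""
    q = bow(question)
    return sum(cosine(q, bow(tq)) for tq, expl in training if fact_id in expl)


def hybrid_explain(question, facts, training, steps=2, lam=0.5):
    """Select one fact per step; each selected fact is appended to the query
    used for the similarity term in the next step (the autoregressive part)."""
    selected = []
    query = question
    for _ in range(steps):
        candidates = [i for i in range(len(facts)) if i not in selected]
        if not candidates:
            break
        best = max(
            candidates,
            key=lambda i: lam * cosine(bow(query), bow(facts[i]))
            + (1 - lam) * explanatory_power(question, i, training),
        )
        selected.append(best)
        query = query + " " + facts[best]
    return selected


# Hypothetical toy data, invented for illustration only.
facts = [
    "friction produces heat",
    "friction is caused by objects rubbed together",
    "the sun is a star",
]
training = [("what happens when hands are rubbed together", {0, 1})]
question = "what produces heat when two sticks are rubbed together"

result = hybrid_explain(question, facts, training, steps=2)
print([facts[i] for i in result])
```

In a realistic setting the similarity term would come from precomputed bi-encoder embeddings searched with a nearest-neighbour index, which is what would make this kind of loop cheaper than rescoring every question-fact pair with a cross-encoder.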
Related papers
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
- Disentangled Representation Learning with Transmitted Information Bottleneck [73.0553263960709]
We present DisTIB (Transmitted Information Bottleneck for Disentangled representation learning), a novel objective that navigates the balance between information compression and preservation.
arXiv Detail & Related papers (2023-11-03T03:18:40Z)
- Faith and Fate: Limits of Transformers on Compositionality [109.79516190693415]
We investigate the limits of transformer large language models across three representative compositional tasks.
These tasks require breaking problems down into sub-steps and synthesizing these steps into a precise answer.
Our empirical findings suggest that transformer LLMs solve compositional tasks by reducing multi-step compositional reasoning into linearized subgraph matching.
arXiv Detail & Related papers (2023-05-29T23:24:14Z)
- How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding [56.222097640468306]
We provide a mechanistic understanding of how transformers learn "semantic structure".
We show, through a combination of mathematical analysis and experiments on Wikipedia data, that the embedding layer and the self-attention layer encode the topical structure.
arXiv Detail & Related papers (2023-03-07T21:42:17Z)
- Transformer Meets Boundary Value Inverse Problems [4.165221477234755]
A Transformer-based deep direct sampling method is proposed for solving a class of boundary value inverse problems.
A real-time reconstruction is achieved by evaluating the learned inverse operator between carefully designed data and reconstructed images.
arXiv Detail & Related papers (2022-09-29T17:45:25Z)
- Enhancing Dual-Encoders with Question and Answer Cross-Embeddings for Answer Retrieval [29.16807969384253]
Dual-Encoders is a promising mechanism for answer retrieval in question answering (QA) systems.
We propose a framework to enhance the Dual-Encoders model with question-answer cross-embeddings and a novel Geometry Alignment Mechanism (GAM).
Our framework significantly improves the Dual-Encoders model and outperforms the state-of-the-art method on multiple answer retrieval datasets.
arXiv Detail & Related papers (2022-06-07T02:39:24Z)
- Hybrid Predictive Coding: Inferring, Fast and Slow [62.997667081978825]
We propose a hybrid predictive coding network that combines both iterative and amortized inference in a principled manner.
We demonstrate that our model is inherently sensitive to its uncertainty and adaptively balances iterative and amortized inference to obtain accurate beliefs using minimum computational expense.
arXiv Detail & Related papers (2022-04-05T12:52:45Z)
- Adaptive Discrete Communication Bottlenecks with Dynamic Vector Quantization [76.68866368409216]
We propose learning to dynamically select discretization tightness conditioned on inputs.
We show that dynamically varying tightness in communication bottlenecks can improve model performance on visual reasoning and reinforcement learning tasks.
arXiv Detail & Related papers (2022-02-02T23:54:26Z)
- $\partial$-Explainer: Abductive Natural Language Inference via Differentiable Convex Optimization [2.624902795082451]
This paper presents a novel framework named $\partial$-Explainer (Diff-Explainer) that combines the best of both worlds by casting the constrained optimization as part of a deep neural network.
Our experiments show up to $\approx 10\%$ improvement over the non-differentiable solver while still providing explanations to support its inference.
arXiv Detail & Related papers (2021-05-07T17:49:19Z)
- Case-Based Abductive Natural Language Inference [4.726777092009554]
Case-Based Abductive Natural Language Inference (CB-ANLI)
arXiv Detail & Related papers (2020-09-30T09:50:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.