Related papers: Hybrid Autoregressive Solver for Scalable Abductive Natural Language Inference

Hybrid Autoregressive Solver for Scalable Abductive Natural Language Inference

URL: http://arxiv.org/abs/2107.11879v1
Date: Sun, 25 Jul 2021 19:29:53 GMT
Title: Hybrid Autoregressive Solver for Scalable Abductive Natural Language Inference
Authors: Marco Valentino, Mokanarangan Thayaparan, Deborah Ferreira, Andr\'e Freitas
Abstract summary: We propose a hybrid abductive solver that autoregressively combines a dense bi-encoder with a sparse model of explanatory power. Our experiments demonstrate that the proposed framework can achieve performance comparable with the state-of-the-art cross-encoder.
Score: 2.867517731896504
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Regenerating natural language explanations for science questions is a challenging task for evaluating complex multi-hop and abductive inference capabilities. In this setting, Transformers trained on human-annotated explanations achieve state-of-the-art performance when adopted as cross-encoder architectures. However, while much attention has been devoted to the quality of the constructed explanations, the problem of performing abductive inference at scale is still under-studied. As intrinsically not scalable, the cross-encoder architectural paradigm is not suitable for efficient multi-hop inference on massive facts banks. To maximise both accuracy and inference time, we propose a hybrid abductive solver that autoregressively combines a dense bi-encoder with a sparse model of explanatory power, computed leveraging explicit patterns in the explanations. Our experiments demonstrate that the proposed framework can achieve performance comparable with the state-of-the-art cross-encoder while being $\approx 50$ times faster and scalable to corpora of millions of facts. Moreover, we study the impact of the hybridisation on semantic drift and science question answering without additional training, showing that it boosts the quality of the explanations and contributes to improved downstream inference performance.

Related papers

On the Robustness of Transformers against Context Hijacking for Linear Classification [26.1838836907147]
Transformer-based Large Language Models (LLMs) have demonstrated powerful in-context learning capabilities. They can be disrupted by factually correct context, a phenomenon known as context hijacking. We show that a well-trained deeper transformer can achieve higher robustness, which aligns with empirical observations.
arXiv Detail & Related papers (2025-02-21T17:31:00Z)
A Smooth Transition Between Induction and Deduction: Fast Abductive Learning Based on Probabilistic Symbol Perception [81.30687085692576]
We introduce an optimization algorithm named as Probabilistic Symbol Perception (PSP), which makes a smooth transition between induction and deduction. Experiments demonstrate the promising results.
arXiv Detail & Related papers (2025-02-18T14:59:54Z)
A Theoretical Perspective for Speculative Decoding Algorithm [60.79447486066416]
One effective way to accelerate inference is emphSpeculative Decoding, which employs a small model to sample a sequence of draft tokens and a large model to validate. This paper tackles this gap by conceptualizing the decoding problem via markov chain abstraction and studying the key properties, emphoutput quality and inference acceleration, from a theoretical perspective.
arXiv Detail & Related papers (2024-10-30T01:53:04Z)
More Experts Than Galaxies: Conditionally-overlapping Experts With Biologically-Inspired Fixed Routing [5.846028298833611]
Conditionally Overlapping Mixture of ExperTs (COMET) is a general deep learning method that inducing a modular, sparse architecture with an exponential number of overlapping experts. We demonstrate the effectiveness of COMET on a range of tasks, including image classification, language modeling, and regression.
arXiv Detail & Related papers (2024-10-10T14:58:18Z)
Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increase the model performance. Specifically, our framework introduces a new metric, i.e., explanation consistency, to reweight the training samples adaptively in model learning. Our framework then promotes the model learning by paying closer attention to those training samples with a high difference in explanations.
arXiv Detail & Related papers (2024-08-08T17:20:08Z)
Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers. We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models. Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
Disentangled Representation Learning with Transmitted Information Bottleneck [57.22757813140418]
We present textbfDisTIB (textbfTransmitted textbfInformation textbfBottleneck for textbfDisd representation learning), a novel objective that navigates the balance between information compression and preservation.
arXiv Detail & Related papers (2023-11-03T03:18:40Z)
How Do Transformers Learn Topic Structure: Towards a Mechanistic Understanding [56.222097640468306]
We provide mechanistic understanding of how transformers learn "semantic structure" We show, through a combination of mathematical analysis and experiments on Wikipedia data, that the embedding layer and the self-attention layer encode the topical structure.
arXiv Detail & Related papers (2023-03-07T21:42:17Z)
Enhancing Dual-Encoders with Question and Answer Cross-Embeddings for Answer Retrieval [29.16807969384253]
Dual-Encoders is a promising mechanism for answer retrieval in question answering (QA) systems. We propose a framework to enhance the Dual-Encoders model with question answer cross-embeddings and a novel Geometry Alignment Mechanism (GAM) Our framework significantly improves Dual-Encoders model and outperforms the state-of-the-art method on multiple answer retrieval datasets.
arXiv Detail & Related papers (2022-06-07T02:39:24Z)
Hybrid Predictive Coding: Inferring, Fast and Slow [62.997667081978825]
We propose a hybrid predictive coding network that combines both iterative and amortized inference in a principled manner. We demonstrate that our model is inherently sensitive to its uncertainty and adaptively balances balances to obtain accurate beliefs using minimum computational expense.
arXiv Detail & Related papers (2022-04-05T12:52:45Z)
$\partial$-Explainer: Abductive Natural Language Inference via Differentiable Convex Optimization [2.624902795082451]
This paper presents a novel framework named $partial$-Explainer (Diff-Explainer) that combines the best of both worlds by casting the constrained optimization as part of a deep neural network. Our experiments show up to $approx 10%$ improvement over non-differentiable solver while still providing explanations for supporting its inference.
arXiv Detail & Related papers (2021-05-07T17:49:19Z)
Case-Based Abductive Natural Language Inference [4.726777092009554]
Case-Based Abductive Natural Language Inference (CB-ANLI) Case-Based Abductive Natural Language Inference (CB-ANLI)
arXiv Detail & Related papers (2020-09-30T09:50:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.