Delaying Interaction Layers in Transformer-based Encoders for Efficient
Open Domain Question Answering
- URL: http://arxiv.org/abs/2010.08422v1
- Date: Fri, 16 Oct 2020 14:36:38 GMT
- Title: Delaying Interaction Layers in Transformer-based Encoders for Efficient
Open Domain Question Answering
- Authors: Wissam Siblini, Mohamed Challal and Charlotte Pasqual
- Abstract summary: Open Domain Question Answering (ODQA) on a large-scale corpus of documents is a key challenge in computer science.
We propose a more direct and complementary solution, which consists in applying a generic change to the architecture of transformer-based models.
The resulting variants are competitive with the original models on the extractive task and, in the ODQA setting, allow a significant speedup and even a performance improvement in many cases.
- Score: 3.111078740559015
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Open Domain Question Answering (ODQA) on a large-scale corpus of documents
(e.g. Wikipedia) is a key challenge in computer science. Although
transformer-based language models such as BERT have shown on SQuAD the ability
to surpass humans at extracting answers from small passages of text, they suffer
from their high complexity when faced with a much larger search space. The most
common way to tackle this problem is to add a preliminary Information Retrieval
step to heavily filter the corpus and keep only the relevant passages. In this
paper, we propose a more direct and complementary solution, which consists in
applying a generic change to the architecture of transformer-based models to
delay the attention between subparts of the input and allow a more efficient
management of computations. The resulting variants are competitive with the
original models on the extractive task and, in the ODQA setting, allow a
significant speedup and even a performance improvement in many cases.
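To make the proposed change concrete, the sketch below shows one way the "delayed interaction" idea can be expressed in PyTorch: the lower encoder layers process the question and the passage with no attention between them, and only the upper layers attend over the concatenated sequence. This is an illustrative sketch, not the authors' implementation; the class name `DelayedInteractionEncoder`, the split of 9 independent layers out of 12, and the tensor shapes are assumptions made for the example.

```python
# Illustrative sketch of delayed interaction (assumed names and hyperparameters,
# not the paper's released code).
import torch
import torch.nn as nn


class DelayedInteractionEncoder(nn.Module):
    def __init__(self, d_model=768, nhead=12, num_layers=12, interaction_start=9):
        super().__init__()

        def make_layer():
            return nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)

        # Layers applied to question and passage separately (no cross-attention).
        self.independent_layers = nn.ModuleList(
            [make_layer() for _ in range(interaction_start)])
        # Remaining layers applied to the concatenated sequence.
        self.interaction_layers = nn.ModuleList(
            [make_layer() for _ in range(num_layers - interaction_start)])

    def forward(self, question_emb, passage_emb):
        # Lower layers: question and passage are encoded independently, so the
        # passage-side activations at this depth do not depend on the question.
        q, p = question_emb, passage_emb
        for layer in self.independent_layers:
            q, p = layer(q), layer(p)
        # Upper ("interaction") layers: full self-attention over the
        # concatenated question+passage sequence, as in a standard encoder.
        x = torch.cat([q, p], dim=1)
        for layer in self.interaction_layers:
            x = layer(x)
        return x


question = torch.randn(1, 16, 768)   # (batch, question_len, d_model)
passage = torch.randn(1, 200, 768)   # (batch, passage_len, d_model)
states = DelayedInteractionEncoder()(question, passage)
```

Under this layout, the passage activations produced by the independent layers could in principle be precomputed and cached for the whole corpus, leaving only the question branch and the few interaction layers to run per candidate passage at query time; this is one concrete reading of the "more efficient management of computations" mentioned in the abstract.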
Related papers
- HAFFormer: A Hierarchical Attention-Free Framework for Alzheimer's Disease Detection From Spontaneous Speech [42.688549469089985]
We construct a novel framework, the Hierarchical Attention-Free Transformer (HAFFormer), to better handle long speech for Alzheimer's Disease detection.
Specifically, we employ an attention-free Multi-Scale Depthwise Convolution module in place of self-attention, avoiding its expensive computation.
Extensive experiments on the ADReSS-M dataset show that HAFFormer achieves results (82.6% accuracy) competitive with other recent work.
arXiv Detail & Related papers (2024-05-07T02:19:16Z)
- Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs [75.40636935415601]
Deep learning often faces the challenge of efficiently processing dynamic inputs, such as sensor data or user inputs.
We take an incremental computing approach, looking to reuse calculations as the inputs change.
We apply this approach to the transformer architecture, creating an efficient incremental inference algorithm with complexity proportional to the fraction of modified inputs.
arXiv Detail & Related papers (2023-07-27T16:30:27Z)
- Dynamic Query Selection for Fast Visual Perceiver [42.07082299370995]
We show how to make Perceivers even more efficient by reducing the number of queries Q during inference while limiting the accuracy drop.
arXiv Detail & Related papers (2022-05-22T17:23:51Z)
- HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attentions for extractive summarization.
Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in ROUGE F1.
arXiv Detail & Related papers (2021-10-12T22:42:31Z)
- Long Document Ranking with Query-Directed Sparse Transformer [30.997237454078526]
We design Query-Directed Sparse attention that induces IR-axiomatic structures in transformer self-attention.
Our model, QDS-Transformer, enforces the principal properties desired in ranking.
Experiments on one fully supervised and three few-shot TREC document ranking benchmarks demonstrate the consistent and robust advantage of QDS-Transformer.
arXiv Detail & Related papers (2020-10-23T21:57:56Z)
- Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval [117.07047313964773]
We propose a simple and efficient multi-hop dense retrieval approach for answering complex open-domain questions.
Our method does not require access to any corpus-specific information, such as inter-document hyperlinks or human-annotated entity markers.
Our system also yields a much better efficiency-accuracy trade-off, matching the best published accuracy on HotpotQA while being 10 times faster at inference time.
arXiv Detail & Related papers (2020-09-27T06:12:29Z)
- Cluster-Former: Clustering-based Sparse Transformer for Long-Range Dependency Encoding [90.77031668988661]
Cluster-Former is a novel clustering-based sparse Transformer that performs attention across chunked sequences.
The proposed framework is built on two unique types of Transformer layer: the Sliding-Window Layer and the Cluster-Former Layer.
Experiments show that Cluster-Former achieves state-of-the-art performance on several major QA benchmarks.
arXiv Detail & Related papers (2020-09-13T22:09:30Z)
- The Cascade Transformer: an Application for Efficient Answer Sentence Selection [116.09532365093659]
We introduce the Cascade Transformer, a technique to adapt transformer-based models into a cascade of rankers.
When compared to a state-of-the-art transformer model, our approach reduces computation by 37% with almost no impact on accuracy.
arXiv Detail & Related papers (2020-05-05T23:32:01Z)
- Pre-training Tasks for Embedding-based Large-scale Retrieval [68.01167604281578]
We consider the large-scale query-document retrieval problem.
Given a query (e.g., a question), the goal is to return the set of relevant documents from a large document corpus.
We show that the key ingredient of learning a strong embedding-based Transformer model is the set of pre-training tasks.
arXiv Detail & Related papers (2020-02-10T16:44:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.