BERT Rankers are Brittle: a Study using Adversarial Document Perturbations
- URL: http://arxiv.org/abs/2206.11724v1
- Date: Thu, 23 Jun 2022 14:16:48 GMT
- Title: BERT Rankers are Brittle: a Study using Adversarial Document Perturbations
- Authors: Yumeng Wang, Lijun Lyu, Avishek Anand
- Abstract summary: Contextual ranking models based on BERT are well established for a wide range of passage and document ranking tasks.
We argue that BERT-rankers are not immune to adversarial attacks targeting retrieved documents given a query.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Contextual ranking models based on BERT are now well established for a wide
range of passage and document ranking tasks. However, the robustness of
BERT-based ranking models under adversarial inputs is under-explored. In this
paper, we argue that BERT-rankers are not immune to adversarial attacks
targeting retrieved documents given a query. Firstly, we propose algorithms for
adversarial perturbation of both highly relevant and non-relevant documents
using gradient-based optimization methods. The aim of our algorithms is to
add/replace a small number of tokens to a highly relevant or non-relevant
document to cause a large rank demotion or promotion. Our experiments show that
a small number of tokens can already result in a large change in the rank of a
document. Moreover, we find that BERT-rankers heavily rely on the document
start/head for relevance prediction, making the initial part of the document
more susceptible to adversarial attacks. More interestingly, we find a small
set of recurring adversarial words that when added to documents result in
successful rank demotion/promotion of any relevant/non-relevant document
respectively. Finally, our adversarial tokens also show particular topic
preferences within and across datasets, exposing potential biases from BERT
pre-training or downstream datasets.
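The abstract describes the gradient-based token perturbation only at a high level. As a loose illustration (not the paper's actual algorithm), the toy sketch below applies a HotFlip-style first-order approximation to pick a single token replacement that lowers a document's score; the embedding table `E` and the linear scorer `w` are hypothetical stand-ins for a BERT ranker and its vocabulary.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 50, 8
E = rng.normal(size=(vocab_size, dim))  # toy token embedding table
w = rng.normal(size=dim)                # toy linear "relevance" direction

def score(token_ids):
    """Relevance score of a document: mean token embedding dotted with w."""
    return float(E[token_ids].mean(axis=0) @ w)

def demote_one_token(token_ids):
    """Replace the single token whose swap most lowers the score, using the
    first-order approximation: for this linear scorer the gradient of the
    score w.r.t. each token embedding is w / n."""
    n = len(token_ids)
    grad = w / n
    # deltas[i, v] = estimated score change from swapping position i to token v
    deltas = (E @ grad)[None, :] - (E[token_ids] @ grad)[:, None]
    i, v = np.unravel_index(np.argmin(deltas), deltas.shape)
    new_ids = list(token_ids)
    new_ids[i] = int(v)
    return new_ids

doc = [3, 17, 25, 42]
perturbed = demote_one_token(doc)
```

With a real ranker, `grad` would instead come from backpropagating the relevance score to the token embeddings; for the toy linear scorer the first-order estimate is exact, so the chosen swap can never raise the score.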
Related papers
- Generating Natural Language Queries for More Effective Systematic Review Screening Prioritisation [53.77226503675752]
The current state of the art uses the final title of the review as a query to rank the documents using BERT-based neural rankers.
In this paper, we explore alternative sources of queries for prioritising screening, such as the Boolean query used to retrieve the documents to be screened and queries generated by instruction-based large-scale language models such as ChatGPT and Alpaca.
Our best approach is not only viable based on the information available at the time of screening, but also has similar effectiveness to the final title.
arXiv Detail & Related papers (2023-09-11T05:12:14Z)
- DAPR: A Benchmark on Document-Aware Passage Retrieval [57.45793782107218]
We propose and name this task Document-Aware Passage Retrieval (DAPR).
While analyzing the errors of the State-of-The-Art (SoTA) passage retrievers, we find the major errors (53.5%) are due to missing document context.
Our created benchmark enables future research on developing and comparing retrieval systems for the new task.
arXiv Detail & Related papers (2023-05-23T10:39:57Z)
- Shuffle & Divide: Contrastive Learning for Long Text [6.187839874846451]
We propose a self-supervised learning method for long text documents based on contrastive learning.
A key to our method is Shuffle and Divide (SaD), a simple text augmentation algorithm.
We have empirically evaluated our method by performing unsupervised text classification on the 20 Newsgroups, Reuters-21578, BBC, and BBCSport datasets.
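The SaD augmentation lends itself to a short sketch. The following is a minimal interpretation, with the caveat that the paper's exact splitting details may differ: shuffle the document's sentences, then divide them into two halves that serve as a positive pair for contrastive learning.

```python
import random

def shuffle_and_divide(sentences, seed=0):
    """Shuffle and Divide (SaD) text augmentation, sketched: shuffle a
    document's sentences, then split the result into two halves to form a
    positive pair for contrastive learning."""
    rng = random.Random(seed)
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

doc = ["s1", "s2", "s3", "s4", "s5", "s6"]
view_a, view_b = shuffle_and_divide(doc)
```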
arXiv Detail & Related papers (2023-04-19T02:02:29Z)
- Open Domain Multi-document Summarization: A Comprehensive Study of Model Brittleness under Retrieval [42.73076855699184]
Multi-document summarization (MDS) assumes a set of topic-related documents are provided as input.
We study this more challenging setting by formalizing the task and bootstrapping it using existing datasets, retrievers and summarizers.
arXiv Detail & Related papers (2022-12-20T18:41:38Z)
- GERE: Generative Evidence Retrieval for Fact Verification [57.78768817972026]
We propose GERE, the first system that retrieves evidence in a generative fashion.
The experimental results on the FEVER dataset show that GERE achieves significant improvements over the state-of-the-art baselines.
arXiv Detail & Related papers (2022-04-12T03:49:35Z)
- Augmenting Document Representations for Dense Retrieval with Interpolation and Perturbation [49.940525611640346]
The Document Augmentation for dense Retrieval (DAR) framework augments the representations of documents through interpolation and perturbation.
We validate the performance of DAR on retrieval tasks with two benchmark datasets, showing that the proposed DAR significantly outperforms relevant baselines on the dense retrieval of both the labeled and unlabeled documents.
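As a rough sketch of the general idea (the exact DAR formulation may differ), a document embedding can be augmented by interpolating with a neighboring document's embedding and adding Gaussian noise; `alpha` and `sigma` are hypothetical hyperparameters, not values from the paper.

```python
import numpy as np

def augment_embedding(doc_emb, other_emb, rng, alpha=0.5, sigma=0.1):
    """Embedding-level augmentation in the spirit of DAR: interpolate
    between two document representations, then add a Gaussian
    perturbation."""
    mixed = alpha * doc_emb + (1 - alpha) * other_emb
    return mixed + rng.normal(scale=sigma, size=doc_emb.shape)

rng = np.random.default_rng(0)
d1, d2 = rng.normal(size=8), rng.normal(size=8)
aug = augment_embedding(d1, d2, rng)
```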
arXiv Detail & Related papers (2022-03-15T09:07:38Z)
- Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback [29.719150565643965]
This paper proposes ANCE-PRF, a new query encoder that uses pseudo relevance feedback (PRF) to improve query representations for dense retrieval.
ANCE-PRF uses a BERT encoder that consumes the query and the top retrieved documents from a dense retrieval model, ANCE, and it learns to produce better query embeddings directly from relevance labels.
Analysis shows that the PRF encoder effectively captures the relevant and complementary information from PRF documents, while ignoring the noise with its learned attention mechanism.
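The underlying idea can be illustrated with classic Rocchio-style feedback as a simplified stand-in: ANCE-PRF learns the combination end-to-end with a BERT encoder, whereas the sketch below simply moves the query embedding toward the centroid of the top retrieved documents. `beta` is a hypothetical mixing weight, not a parameter from the paper.

```python
import numpy as np

def prf_query(query_emb, doc_embs, beta=0.5):
    """Rocchio-style pseudo relevance feedback: shift the query embedding
    toward the centroid of the top retrieved documents' embeddings."""
    centroid = np.mean(doc_embs, axis=0)
    return (1 - beta) * query_emb + beta * centroid

q = np.array([1.0, 0.0])
docs = np.array([[0.0, 2.0], [0.0, 4.0]])
new_q = prf_query(q, docs)
```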
arXiv Detail & Related papers (2021-08-30T18:10:26Z)
- Multilevel Text Alignment with Cross-Document Attention [59.76351805607481]
Existing alignment methods operate at a single, predefined level.
We propose a new learning approach that equips previously established hierarchical attention encoders for representing documents with a cross-document attention component.
arXiv Detail & Related papers (2020-10-03T02:52:28Z)
- Fine-Grained Relevance Annotations for Multi-Task Document Ranking and Question Answering [9.480648914353035]
We present FiRA, a novel dataset of fine-grained relevance annotations.
We extend the ranked retrieval annotations of the Deep Learning track of TREC 2019 with passage and word level graded relevance annotations for all relevant documents.
As an example, we evaluate the recently introduced TKL document ranking model. We find that although TKL exhibits state-of-the-art retrieval results for long documents, it misses many relevant passages.
arXiv Detail & Related papers (2020-08-12T14:59:50Z)
- Context-Based Quotation Recommendation [60.93257124507105]
We propose a novel context-aware quote recommendation system.
It generates a ranked list of quotable paragraphs and spans of tokens from a given source document.
We conduct experiments on a collection of speech transcripts and associated news articles.
arXiv Detail & Related papers (2020-05-17T17:49:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.