The Web Is Your Oyster -- Knowledge-Intensive NLP against a Very Large
Web Corpus
- URL: http://arxiv.org/abs/2112.09924v1
- Date: Sat, 18 Dec 2021 13:15:34 GMT
- Title: The Web Is Your Oyster -- Knowledge-Intensive NLP against a Very Large
Web Corpus
- Authors: Aleksandra Piktus and Fabio Petroni and Vladimir Karpukhin and Dmytro
Okhonko and Samuel Broscheit and Gautier Izacard and Patrick Lewis and Barlas
Oğuz and Edouard Grave and Wen-tau Yih and Sebastian Riedel
- Abstract summary: We propose a new setup for evaluating existing KI-NLP tasks in which we generalize the background corpus to a universal web snapshot.
We repurpose KILT, a standard KI-NLP benchmark initially developed for Wikipedia, and ask systems to use a subset of CCNet - the Sphere corpus.
We find that despite potential gaps of coverage, challenges of scale, lack of structure and lower quality, retrieval from Sphere enables a state-of-the-art retrieve-and-read system to match and even outperform Wikipedia-based models.
- Score: 76.9522248303716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In order to address the increasing demands of real-world applications, the
research for knowledge-intensive NLP (KI-NLP) should advance by capturing the
challenges of a truly open-domain environment: web scale knowledge, lack of
structure, inconsistent quality, and noise. To this end, we propose a new setup
for evaluating existing KI-NLP tasks in which we generalize the background
corpus to a universal web snapshot. We repurpose KILT, a standard KI-NLP
benchmark initially developed for Wikipedia, and ask systems to use a subset of
CCNet - the Sphere corpus - as a knowledge source. In contrast to Wikipedia,
Sphere is orders of magnitude larger and better reflects the full diversity of
knowledge on the Internet. We find that despite potential gaps of coverage,
challenges of scale, lack of structure and lower quality, retrieval from Sphere
enables a state-of-the-art retrieve-and-read system to match and even
outperform Wikipedia-based models on several KILT tasks - even if we
aggressively filter content that looks like Wikipedia. We also observe that
while a single dense passage index over Wikipedia can outperform a sparse BM25
version, on Sphere this is not yet possible. To facilitate further research
into this area, and minimise the community's reliance on proprietary black box
search engines, we will share our indices, evaluation metrics and
infrastructure.
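The abstract's sparse-versus-dense comparison can be made concrete with a small sketch. Below is a minimal, illustrative contrast between BM25 lexical scoring and dense inner-product retrieval on a toy corpus; the corpus, query, and the all-MiniLM-L6-v2 encoder are assumptions for demonstration only, not the paper's actual BM25/DPR setup over the Sphere corpus.

```python
# Minimal sketch: sparse (BM25) vs. dense retrieval over a toy corpus.
# Assumptions: rank_bm25 and sentence-transformers are installed; the
# corpus, query, and encoder are illustrative stand-ins, not the
# paper's web-scale Sphere/DPR configuration.
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer

corpus = [
    "The Eiffel Tower is located in Paris, France.",
    "BM25 is a sparse lexical ranking function.",
    "Dense retrievers embed queries and passages into vectors.",
]
query = "Where is the Eiffel Tower?"

# Sparse retrieval: BM25 over whitespace-tokenized text.
bm25 = BM25Okapi([doc.lower().split() for doc in corpus])
sparse_scores = bm25.get_scores(query.lower().split())

# Dense retrieval: inner product of query and passage embeddings.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = encoder.encode(corpus, convert_to_numpy=True)
query_vec = encoder.encode([query], convert_to_numpy=True)[0]
dense_scores = doc_vecs @ query_vec

for name, scores in (("BM25", sparse_scores), ("dense", dense_scores)):
    best = int(np.argmax(scores))
    print(f"{name}: top passage -> {corpus[best]!r}")
```

The abstract's observation is that this trade-off shifts with scale: a single dense index can outperform BM25 on Wikipedia, but on the much larger and noisier Sphere corpus it does not yet do so.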
Related papers
- Bidirectional Knowledge Reconfiguration for Lightweight Point Cloud
Analysis [74.00441177577295]
Point cloud analysis incurs high computational overhead, limiting its application on mobile or edge devices.
This paper explores feature distillation for lightweight point cloud models.
We propose bidirectional knowledge reconfiguration to distill informative contextual knowledge from the teacher to the student.
arXiv Detail & Related papers (2023-10-08T11:32:50Z)
- SOE-Net: A Self-Attention and Orientation Encoding Network for Point Cloud
based Place Recognition [50.9889997200743]
We tackle the problem of place recognition from point cloud data with a self-attention and orientation encoding network (SOE-Net).
SOE-Net fully explores the relationship between points and incorporates long-range context into point-wise local descriptors.
Experiments on various benchmark datasets demonstrate superior performance of the proposed network over the current state-of-the-art approaches.
arXiv Detail & Related papers (2020-11-24T22:28:25Z)
- Revisiting Rainbow: Promoting more Insightful and Inclusive Deep
Reinforcement Learning Research [15.710674189908614]
We argue that, despite the community's emphasis on large-scale environments, the traditional small-scale environments can still yield valuable scientific insights.
We revisit the paper that introduced the Rainbow algorithm and present new insights into its constituent algorithms.
arXiv Detail & Related papers (2020-11-20T15:23:40Z)
- Hierarchical Neural Architecture Search for Deep Stereo Matching [131.94481111956853]
We propose the first end-to-end hierarchical NAS framework for deep stereo matching.
Our framework incorporates task-specific human knowledge into the neural architecture search framework.
It ranks first in accuracy on the KITTI stereo 2012 and 2015 and the Middlebury benchmarks, as well as on the SceneFlow dataset.
arXiv Detail & Related papers (2020-10-26T11:57:37Z)
- CLASS: Cross-Level Attention and Supervision for Salient Objects
Detection [10.01397180778694]
We propose a novel deep network for accurate salient object detection (SOD), named CLASS.
In experiments, with the proposed cross-level attention (CLA) and cross-level supervision (CLS), our CLASS net consistently outperforms 13 state-of-the-art methods on five datasets.
arXiv Detail & Related papers (2020-09-23T03:10:12Z)
- KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT).
All tasks in KILT are grounded in the same snapshot of Wikipedia.
We find that a shared dense vector index coupled with a seq2seq model is a strong baseline.
arXiv Detail & Related papers (2020-09-04T15:32:19Z)
- Is deeper better? It depends on locality of relevant features [5.33024001730262]
We investigate the effect of increasing depth within an over-parameterized regime.
Experiments show that deeper is better for local labels, whereas shallower is better for global labels.
It is shown that the neural kernel does not correctly capture the depth dependence of the generalization performance.
arXiv Detail & Related papers (2020-05-26T02:44:18Z)
- Dense Residual Network: Enhancing Global Dense Feature Flow for
Character Recognition [75.4027660840568]
This paper explores how to enhance local and global dense feature flow by fully exploiting hierarchical features from all the convolution layers.
Technically, we propose an efficient and effective CNN framework, the Fast Dense Residual Network (FDRN), for text recognition.
arXiv Detail & Related papers (2020-01-23T06:55:08Z)
- Disentangling Trainability and Generalization in Deep Neural Networks [45.15453323967438]
We analyze the spectrum of the Neural Tangent Kernel (NTK) for trainability and generalization across a range of networks.
We find that CNNs without global average pooling behave almost identically to FCNs, but that CNNs with pooling have markedly different and often better generalization performance.
arXiv Detail & Related papers (2019-12-30T18:53:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.