Wizard of Search Engine: Access to Information Through Conversations
with Search Engines
- URL: http://arxiv.org/abs/2105.08301v1
- Date: Tue, 18 May 2021 06:35:36 GMT
- Title: Wizard of Search Engine: Access to Information Through Conversations
with Search Engines
- Authors: Pengjie Ren, Zhongkun Liu, Xiaomeng Song, Hongtao Tian, Zhumin Chen,
Zhaochun Ren and Maarten de Rijke
- Abstract summary: We make efforts to facilitate research on CIS from three aspects.
We formulate a pipeline for CIS with six sub-tasks: intent detection (ID), keyphrase extraction (KE), action prediction (AP), query selection (QS), passage selection (PS), and response generation (RG).
We release a benchmark dataset, called wizard of search engine (WISE), which allows for comprehensive and in-depth research on all aspects of CIS.
- Score: 58.53420685514819
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conversational information seeking (CIS) is playing an increasingly important
role in connecting people to information. Due to the lack of suitable resources,
previous studies on CIS are limited to the study of theoretical/conceptual
frameworks, laboratory-based user studies, or a particular aspect of CIS (e.g.,
asking clarifying questions). In this work, we make efforts to facilitate
research on CIS from three aspects. (1) We formulate a pipeline for CIS with
six sub-tasks: intent detection (ID), keyphrase extraction (KE), action
prediction (AP), query selection (QS), passage selection (PS), and response
generation (RG). (2) We release a benchmark dataset, called wizard of search
engine (WISE), which allows for comprehensive and in-depth research on all
aspects of CIS. (3) We design a neural architecture capable of training and
evaluating both jointly and separately on the six sub-tasks, and devise a
pre-train/fine-tune learning scheme that can reduce the requirements of WISE
in scale by making full use of available data. We report some useful
characteristics of CIS based on statistics of WISE. We also show that our best
performing model variant is able to achieve effective CIS as indicated by
several metrics. We release the dataset, the code, as well as the evaluation
scripts to facilitate future research by measuring further improvements in this
important research direction.
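
The six sub-tasks named in the abstract can be pictured as stages that each fill in one field of a shared per-turn state. The following is only a minimal sketch under stated assumptions: the abstract lists the sub-tasks but does not specify their inputs, outputs, or label sets, so the `TurnState` class, the `run_pipeline` function, the intent and action labels, and the keyword-overlap heuristics are all hypothetical stand-ins for the learned models the paper describes.

```python
import re
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class TurnState:
    """Hypothetical container for what each sub-task produces in one turn."""
    utterance: str
    intent: str = ""
    keyphrases: List[str] = field(default_factory=list)
    action: str = ""
    query: str = ""
    passage: str = ""
    response: str = ""


def _tokens(text: str) -> Set[str]:
    # Lowercased alphanumeric tokens; a toy substitute for real text processing.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def intent_detection(state: TurnState) -> None:
    # ID: assumed two-label scheme, not the label set used in WISE.
    is_question = state.utterance.strip().endswith("?")
    state.intent = "question" if is_question else "statement"


def keyphrase_extraction(state: TurnState) -> None:
    # KE: keep tokens longer than three characters as stand-in keyphrases.
    state.keyphrases = sorted(t for t in _tokens(state.utterance) if len(t) > 3)


def action_prediction(state: TurnState) -> None:
    # AP: assumed two system actions, chosen from the detected intent.
    state.action = "answer" if state.intent == "question" else "clarify"


def query_selection(state: TurnState, candidates: List[str]) -> None:
    # QS: pick the candidate query sharing the most tokens with the keyphrases.
    keys = set(state.keyphrases)
    state.query = max(candidates, key=lambda q: len(keys & _tokens(q)))


def passage_selection(state: TurnState, passages: List[str]) -> None:
    # PS: pick the passage sharing the most tokens with the selected query.
    query_tokens = _tokens(state.query)
    state.passage = max(passages, key=lambda p: len(query_tokens & _tokens(p)))


def response_generation(state: TurnState) -> None:
    # RG: template-based reply instead of a neural generator.
    if state.action == "answer":
        state.response = f"Here is what I found: {state.passage}"
    else:
        state.response = "Could you tell me more about what you are looking for?"


def run_pipeline(utterance: str, candidates: List[str], passages: List[str]) -> TurnState:
    # Runs the stages in the order the abstract lists them (an assumption).
    state = TurnState(utterance)
    intent_detection(state)
    keyphrase_extraction(state)
    action_prediction(state)
    query_selection(state, candidates)
    passage_selection(state, passages)
    response_generation(state)
    return state
```

Calling `run_pipeline` with a user utterance, a list of candidate queries, and a list of candidate passages returns a `TurnState` whose fields can be inspected one sub-task at a time, which mirrors the idea of evaluating the sub-tasks both jointly and separately.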
Related papers
- SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers [43.18330795060871]
SPIQA is a dataset specifically designed to interpret complex figures and tables within the context of scientific research articles.
We employ automatic and manual curation to create the dataset.
SPIQA comprises 270K questions divided into training, validation, and three different evaluation splits.
arXiv Detail & Related papers (2024-07-12T16:37:59Z)
- A Comprehensive Survey on Underwater Image Enhancement Based on Deep Learning [51.7818820745221]
Underwater image enhancement (UIE) presents a significant challenge within computer vision research.
Despite the development of numerous UIE algorithms, a thorough and systematic review is still absent.
arXiv Detail & Related papers (2024-05-30T04:46:40Z)
- RethinkingTMSC: An Empirical Study for Target-Oriented Multimodal Sentiment Classification [70.9087014537896]
Target-oriented Multimodal Sentiment Classification (TMSC) has gained significant attention among scholars.
To investigate the causes of this problem, we perform extensive empirical evaluation and in-depth analysis of the datasets.
arXiv Detail & Related papers (2023-10-14T14:52:37Z)
- AVIS: Autonomous Visual Information Seeking with Large Language Model Agent [123.75169211547149]
We propose an autonomous information seeking visual question answering framework, AVIS.
Our method leverages a Large Language Model (LLM) to dynamically strategize the utilization of external tools.
AVIS achieves state-of-the-art results on knowledge-intensive visual question answering benchmarks such as Infoseek and OK-VQA.
arXiv Detail & Related papers (2023-06-13T20:50:22Z)
- A Study of Situational Reasoning for Traffic Understanding [63.45021731775964]
We devise three novel text-based tasks for situational reasoning in the traffic domain.
We adopt four knowledge-enhanced methods that have shown generalization capability across language reasoning tasks in prior work.
We provide in-depth analyses of model performance on data partitions and examine model predictions categorically.
arXiv Detail & Related papers (2023-06-05T01:01:12Z)
- Investigating Neural Architectures by Synthetic Dataset Design [14.317837518705302]
Recent years have seen the emergence of many new neural network structures (architectures and layers).
We sketch a methodology to measure the effect of each structure on a network's ability, by designing ad hoc synthetic datasets.
We illustrate our methodology by building three datasets to evaluate each of the three following network properties.
arXiv Detail & Related papers (2022-04-23T10:50:52Z)
- RPT: Toward Transferable Model on Heterogeneous Researcher Data via Pre-Training [19.987304448524043]
We propose a multi-task self-supervised learning-based researcher data pre-training model named RPT.
We divide the researchers' data into semantic document sets and community graph.
We propose three self-supervised learning objectives to train the whole model.
arXiv Detail & Related papers (2021-10-08T03:42:09Z)
- Mining Implicit Relevance Feedback from User Behavior for Web Question Answering [92.45607094299181]
We make the first study to explore the correlation between user behavior and passage relevance.
Our approach significantly improves the accuracy of passage ranking without extra human labeled data.
In practice, this work has proved effective to substantially reduce the human labeling cost for the QA service in a global commercial search engine.
arXiv Detail & Related papers (2020-06-13T07:02:08Z)
- A New Perspective on Learning Context-Specific Independence [18.273290530700567]
Local structure such as context-specific independence (CSI) has received much attention in the probabilistic graphical model (PGM) literature.
In this paper, we provide a new perspective on how to learn CSIs from data.
arXiv Detail & Related papers (2020-06-12T01:11:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.