Location Aware Modular Biencoder for Tourism Question Answering
- URL: http://arxiv.org/abs/2401.02187v1
- Date: Thu, 4 Jan 2024 10:39:58 GMT
- Title: Location Aware Modular Biencoder for Tourism Question Answering
- Authors: Haonan Li, Martin Tomko, Timothy Baldwin
- Abstract summary: We propose treating the QA task as a dense vector retrieval problem.
We encode questions and POIs separately and retrieve the most relevant POIs for a question by utilizing embedding space similarity.
Experiments on a real-world tourism QA dataset demonstrate that our approach is effective, efficient, and outperforms previous methods.
- Score: 33.5507972300392
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Answering real-world tourism questions that seek Point-of-Interest (POI)
recommendations is challenging, as it requires both spatial and non-spatial
reasoning over a large candidate pool. The traditional method of encoding each
pair of question and POI becomes inefficient when the number of candidates
increases, making it infeasible for real-world applications. To overcome this,
we propose treating the QA task as a dense vector retrieval problem, where we
encode questions and POIs separately and retrieve the most relevant POIs for a
question by utilizing embedding space similarity. We use pretrained language
models (PLMs) to encode textual information, and train a location encoder to
capture spatial information of POIs. Experiments on a real-world tourism QA
dataset demonstrate that our approach is effective, efficient, and outperforms
previous methods across all metrics. Enabled by the dense retrieval
architecture, we further build a global evaluation baseline, expanding the
search space by 20 times compared to previous work. We also explore several
factors that affect the model's performance through follow-up experiments.
Our code and model are publicly available at https://github.com/haonan-li/LAMB.
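
The retrieval step described in the abstract (encode questions and POIs separately, then rank POIs by embedding-space similarity) can be illustrated with a minimal sketch. This is not the authors' LAMB implementation: the POI names, the three-dimensional vectors, and the helper functions below are hypothetical stand-ins for the outputs of a trained question encoder and POI encoder, and cosine similarity is assumed as the similarity measure.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a vector to unit length so that dot product equals cosine similarity."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm else list(v)

def retrieve(question_vec, poi_index, k=2):
    """Return the names of the k POIs whose embeddings are most similar to the question."""
    q = normalize(question_vec)
    scored = [(dot(q, vec), name) for name, vec in poi_index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]

# Hypothetical, hand-written embeddings (assumption: a real system would
# obtain these from a trained POI encoder). They are computed once, offline.
POI_VECTORS = {
    "harbour_cafe": [0.9, 0.1, 0.0],
    "city_museum": [0.1, 0.9, 0.1],
    "night_market": [0.7, 0.2, 0.6],
}
POI_INDEX = {name: normalize(vec) for name, vec in POI_VECTORS.items()}

# Hypothetical question embedding (assumption: output of a question encoder).
question_vec = [1.0, 0.0, 0.2]
print(retrieve(question_vec, POI_INDEX, k=2))
```

Because the POI vectors are built once and reused, answering a new question costs one question encoding plus one similarity computation per candidate, rather than one joint encoding of every question-POI pair.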
Related papers
- Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent [102.31558123570437]
Multimodal Retrieval Augmented Generation (mRAG) plays an important role in mitigating the "hallucination" issue inherent in multimodal large language models (MLLMs).
We propose the first self-adaptive planning agent for multimodal retrieval, OmniSearch.
arXiv Detail & Related papers (2024-11-05T09:27:21Z)
- Multi-LLM QA with Embodied Exploration [55.581423861790945]
We investigate the use of Multi-Embodied LLM Explorers (MELE) for question-answering in an unknown environment.
Multiple LLM-based agents independently explore and then answer queries about a household environment.
We analyze different aggregation methods to generate a single, final answer for each query.
arXiv Detail & Related papers (2024-06-16T12:46:40Z)
- The Impacts of Data, Ordering, and Intrinsic Dimensionality on Recall in Hierarchical Navigable Small Worlds [0.09208007322096533]
The investigation focuses on HNSW's efficacy across a spectrum of datasets.
We discover that the recall of approximate HNSW search, in comparison to exact K Nearest Neighbours (KNN) search, is linked to the vector space's intrinsic dimensionality.
We observe that running popular benchmark datasets with HNSW instead of KNN can shift rankings by up to three positions for some models.
arXiv Detail & Related papers (2024-05-28T04:16:43Z)
- Building Interpretable and Reliable Open Information Retriever for New Domains Overnight [67.03842581848299]
Information retrieval is a critical component for many downstream tasks such as open-domain question answering (QA).
We propose an information retrieval pipeline that uses an entity/event linking model and a query decomposition model to focus more accurately on different information units of the query.
We show that, while being more interpretable and reliable, our proposed pipeline significantly improves passage coverages and denotation accuracies across five IR and QA benchmarks.
arXiv Detail & Related papers (2023-08-09T07:47:17Z)
- MFBE: Leveraging Multi-Field Information of FAQs for Efficient Dense Retrieval [1.7403133838762446]
We propose a bi-encoder-based query-FAQ matching model that leverages multiple combinations of FAQ fields.
Our model achieves around 27% and 20% better top-1 accuracy for the FAQ retrieval task on internal and open datasets.
arXiv Detail & Related papers (2023-02-23T12:02:49Z)
- UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph [89.98762327725112]
Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question.
We propose UniKGQA, a novel approach for the multi-hop KGQA task, by unifying retrieval and reasoning in both model architecture and parameter learning.
arXiv Detail & Related papers (2022-12-02T04:08:09Z)
- Semi-Structured Query Grounding for Document-Oriented Databases with Deep Retrieval and Its Application to Receipt and POI Matching [23.52046767195031]
We aim to address practical challenges when using embedding-based retrieval for the query grounding problem in semi-structured data.
We conduct extensive experiments to find the most effective combination of modules for the embedding and retrieval of both query and database entries.
The proposed model significantly outperforms the conventional manual pattern-based model while requiring much less development and maintenance cost.
arXiv Detail & Related papers (2022-02-23T05:32:34Z)
- A Reinforcement Learning Approach to the Orienteering Problem with Time Windows [0.0]
The Orienteering Problem with Time Windows (OPTW) is an optimization problem where the goal is to maximize the total score collected from different visited locations.
This study explores the use of Pointer Network models trained using reinforcement learning to solve the OPTW problem.
arXiv Detail & Related papers (2020-11-07T00:38:06Z)
- KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT).
All tasks in KILT are grounded in the same snapshot of Wikipedia.
We find that a shared dense vector index coupled with a seq2seq model is a strong baseline.
arXiv Detail & Related papers (2020-09-04T15:32:19Z)
- Mining Implicit Relevance Feedback from User Behavior for Web Question Answering [92.45607094299181]
We present the first study of the correlation between user behavior and passage relevance.
Our approach significantly improves the accuracy of passage ranking without extra human labeled data.
In practice, this work has proved effective in substantially reducing the human labeling cost for the QA service in a global commercial search engine.
arXiv Detail & Related papers (2020-06-13T07:02:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.