Related papers: Applying Transfer Learning for Improving Domain-Specific Search Experience Using Query to Question Similarity

Applying Transfer Learning for Improving Domain-Specific Search Experience Using Query to Question Similarity

URL: http://arxiv.org/abs/2101.02351v1
Date: Thu, 7 Jan 2021 03:27:32 GMT
Title: Applying Transfer Learning for Improving Domain-Specific Search Experience Using Query to Question Similarity
Authors: Ankush Chopra, Shruti Agrawal and Sohom Ghosh
Abstract summary: We discuss a framework for calculating similarities between a given input query and a set of predefined questions to retrieve the question which matches to it the most. We have used it for the financial domain, but the framework is generalized for any domain-specific search engine and can be used in other domains as well.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Search is one of the most common platforms used to seek information. However, users mostly get overloaded with results whenever they use such a platform to resolve their queries. Nowadays, direct answers to queries are being provided as a part of the search experience. The question-answer (QA) retrieval process plays a significant role in enriching the search experience. Most off-the-shelf Semantic Textual Similarity models work fine for well-formed search queries, but their performances degrade when applied to a domain-specific setting having incomplete or grammatically ill-formed search queries in prevalence. In this paper, we discuss a framework for calculating similarities between a given input query and a set of predefined questions to retrieve the question which matches to it the most. We have used it for the financial domain, but the framework is generalized for any domain-specific search engine and can be used in other domains as well. We use Siamese network [6] over Long Short-Term Memory (LSTM) [3] models to train a classifier which generates unnormalized and normalized similarity scores for a given pair of questions. Moreover, for each of these question pairs, we calculate three other similarity scores: cosine similarity between their average word2vec embeddings [15], cosine similarity between their sentence embeddings [7] generated using RoBERTa [17] and their customized fuzzy-match score. Finally, we develop a metaclassifier using Support Vector Machines [19] for combining these five scores to detect if a given pair of questions is similar. We benchmark our model's performance against existing State Of The Art (SOTA) models on Quora Question Pairs (QQP) dataset as well as a dataset specific to the financial domain.

Related papers

Robust Knowledge Extraction from Large Language Models using Social Choice Theory [18.634845632109496]
Large-language models (LLMs) can support a wide range of applications like conversational agents, creative writing or general query answering. They are ill-suited for query answering in high-stake domains like medicine because they are typically not robust. We propose using ranking queries repeatedly and to aggregate the queries using methods from social choice theory.
arXiv Detail & Related papers (2023-12-22T17:57:29Z)
Semantic Equivalence of e-Commerce Queries [6.232692545488813]
This paper introduces a framework to recognize and leverage query equivalence to enhance searcher and business outcomes. The proposed approach addresses three key problems: mapping queries to vector representations of search intent, identifying nearest neighbor queries expressing equivalent or similar intent, and optimizing for user or business objectives.
arXiv Detail & Related papers (2023-08-07T18:40:13Z)
QUADRo: Dataset and Models for QUestion-Answer Database Retrieval [97.84448420852854]
Given a database (DB) of question/answer (q/a) pairs, it is possible to answer a target question by scanning the DB for similar questions. We build a large scale DB of 6.3M q/a pairs, using public questions, and design a new system based on neural IR and a q/a pair reranker. We show that our DB-based approach is competitive with Web-based methods, i.e., a QA system built on top the BING search engine.
arXiv Detail & Related papers (2023-03-30T00:42:07Z)
Automated Query Generation for Evidence Collection from Web Search Engines [2.642698101441705]
It is widely accepted that so-called facts can be checked by searching for information on the Internet. This process requires a fact-checker to formulate a search query based on the fact and to present it to a search engine. We ask the question as to whether it is possible to automate the first step, that of query generation.
arXiv Detail & Related papers (2023-03-15T14:32:00Z)
UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph [89.98762327725112]
Multi-hop Question Answering over Knowledge Graph(KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question. We propose UniKGQA, a novel approach for multi-hop KGQA task, by unifying retrieval and reasoning in both model architecture and parameter learning.
arXiv Detail & Related papers (2022-12-02T04:08:09Z)
Session-Aware Query Auto-completion using Extreme Multi-label Ranking [61.753713147852125]
We take the novel approach of modeling session-aware query auto-completion as an e Multi-Xtreme Ranking (XMR) problem. We adapt a popular XMR algorithm for this purpose by proposing several modifications to the key steps in the algorithm. Our approach meets the stringent latency requirements for auto-complete systems while leveraging session information in making suggestions.
arXiv Detail & Related papers (2020-12-09T17:56:22Z)
Effective FAQ Retrieval and Question Matching With Unsupervised Knowledge Injection [10.82418428209551]
We propose a contextual language model for retrieving appropriate answers to frequently asked questions. We also explore to capitalize on domain-specific topically-relevant relations between words in an unsupervised manner. We evaluate variants of our approach on a publicly-available Chinese FAQ dataset, and further apply and contextualize it to a large-scale question-matching task.
arXiv Detail & Related papers (2020-10-27T05:03:34Z)
Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question. Most open QA systems have considered only retrieving information from unstructured text. We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z)
Match$^2$: A Matching over Matching Model for Similar Question Identification [74.7142127303489]
Community Question Answering (CQA) has become a primary means for people to acquire knowledge, where people are free to ask questions or submit answers. Similar question identification becomes a core task in CQA which aims to find a similar question from the archived repository whenever a new question is asked. It has long been a challenge to properly measure the similarity between two questions due to the inherent variation of natural language, i.e., there could be different ways to ask a same question or different questions sharing similar expressions. Traditional methods typically take a one-side usage, which leverages the answer as some expanded representation of the
arXiv Detail & Related papers (2020-06-21T05:59:34Z)
Robust Question Answering Through Sub-part Alignment [53.94003466761305]
We model question answering as an alignment problem. We train our model on SQuAD v1.1 and test it on several adversarial and out-of-domain datasets.
arXiv Detail & Related papers (2020-04-30T09:10:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.