Graph Enhanced BERT for Query Understanding
- URL: http://arxiv.org/abs/2204.06522v2
- Date: Fri, 17 Nov 2023 05:17:45 GMT
- Title: Graph Enhanced BERT for Query Understanding
- Authors: Juanhui Li, Yao Ma, Wei Zeng, Suqi Cheng, Jiliang Tang, Shuaiqiang
Wang, Dawei Yin
- Abstract summary: Query understanding plays a key role in exploring users' search intents and helping users locate the information they seek.
In recent years, pre-trained language models (PLMs) have advanced various natural language processing tasks.
We propose a novel graph-enhanced pre-training framework, GE-BERT, which can leverage both query content and the query graph.
- Score: 55.90334539898102
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Query understanding plays a key role in exploring users' search
intents and helping users locate the information they seek. However, it is
inherently challenging since it needs to capture semantic information from
short and ambiguous queries and often requires massive task-specific labeled
data. In recent years, pre-trained language models (PLMs) have advanced various
natural language processing tasks because they can extract general semantic
information from large-scale corpora. Therefore, there are unprecedented
opportunities to adopt PLMs for query understanding. However, there is a gap
between the goal of query understanding and existing pre-training strategies --
the goal of query understanding is to boost search performance while existing
strategies rarely consider this goal. Thus, directly applying them to query
understanding is sub-optimal. On the other hand, search logs record user
clicks between queries and URLs, providing rich information about users'
search behavior beyond query content alone. Therefore, in this paper, we aim
to fill this gap by exploring search logs. In particular, to incorporate search
logs into pre-training, we first construct a query graph where nodes are
queries and two queries are connected if they lead to clicks on the same URLs.
Then we propose a novel graph-enhanced pre-training framework, GE-BERT, which
can leverage both query content and the query graph. In other words, GE-BERT
can capture both the semantic information and the users' search behavioral
information of queries. Extensive experiments on various query understanding
tasks have demonstrated the effectiveness of the proposed framework.
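The query-graph construction described in the abstract (nodes are queries; an edge links two queries that led to clicks on the same URL) can be sketched roughly as follows. The function name and the toy click log are illustrative assumptions, not details from the paper:

```python
from collections import defaultdict
from itertools import combinations

def build_query_graph(click_log):
    """Build an undirected query graph from (query, clicked_url) pairs.

    Nodes are queries; two queries are connected if they led to clicks
    on at least one common URL (the co-click construction from the abstract).
    Returns the edge set as sorted query pairs.
    """
    # Group queries by the URL they led to.
    url_to_queries = defaultdict(set)
    for query, url in click_log:
        url_to_queries[url].add(query)

    # Connect every pair of queries that share a clicked URL.
    edges = set()
    for queries in url_to_queries.values():
        for q1, q2 in combinations(sorted(queries), 2):
            edges.add((q1, q2))
    return edges

# Hypothetical toy log of (query, clicked URL) pairs.
log = [
    ("cheap flights", "travel.example/deals"),
    ("budget airlines", "travel.example/deals"),
    ("budget airlines", "airlines.example"),
    ("python tutorial", "docs.example/python"),
]
edges = build_query_graph(log)
# "cheap flights" and "budget airlines" share a clicked URL, so they are linked;
# "python tutorial" shares no URL with any other query and stays isolated.
```

In the paper this graph is consumed during pre-training so GE-BERT can encode co-click structure alongside query text; the sketch above covers only the graph-building step.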
Related papers
- QueryBuilder: Human-in-the-Loop Query Development for Information Retrieval [12.543590253664492]
We present a novel, interactive system called QueryBuilder.
It allows a novice, English-speaking user to create queries with a small amount of effort.
It rapidly develops cross-lingual information retrieval queries corresponding to the user's information needs.
arXiv Detail & Related papers (2024-09-07T00:46:58Z)
- Database-Augmented Query Representation for Information Retrieval [59.57065228857247]
We present a novel retrieval framework called Database-Augmented Query representation (DAQu).
DAQu augments the original query with various (query-related) metadata across multiple tables.
We validate DAQu in diverse retrieval scenarios that can incorporate metadata from the relational database.
arXiv Detail & Related papers (2024-06-23T05:02:21Z)
- User Intent Recognition and Semantic Cache Optimization-Based Query Processing Framework using CFLIS and MGR-LAU [0.0]
This work analyzed the informational, navigational, and transactional intents in queries for enhanced QP.
For efficient QP, the data is structured using Epanechnikov Kernel-Ordering Points To Identify the Clustering Structure (EK-OPTICS).
The extracted features, detected intents, and structured data are fed into the Multi-head Gated Recurrent Learnable Attention Unit (MGR-LAU).
arXiv Detail & Related papers (2024-06-06T20:28:05Z)
- Enhanced Facet Generation with LLM Editing [5.4327243200369555]
In information retrieval, facet identification of a user query is an important task.
Previous studies can enhance facet prediction by leveraging retrieved documents and related queries obtained through a search engine.
However, there are challenges in extending it to other applications when a search engine operates as part of the model.
arXiv Detail & Related papers (2024-03-25T00:43:44Z)
- Decomposing Complex Queries for Tip-of-the-tongue Retrieval [72.07449449115167]
Complex queries describe content elements (e.g., book characters or events), that is, information beyond the document text.
This retrieval setting, called tip of the tongue (TOT), is especially challenging for models reliant on lexical and semantic overlap between query and document text.
We introduce a simple yet effective framework for handling such complex queries by decomposing the query into individual clues, routing those as sub-queries to specialized retrievers, and ensembling the results.
arXiv Detail & Related papers (2023-05-24T11:43:40Z)
- Neural Graph Reasoning: Complex Logical Query Answering Meets Graph Databases [63.96793270418793]
Complex logical query answering (CLQA) is a recently emerged task of graph machine learning.
We introduce the concept of Neural Graph Databases (NGDBs).
An NGDB consists of a Neural Graph Storage and a Neural Graph Engine.
arXiv Detail & Related papers (2023-03-26T04:03:37Z)
- UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph [89.98762327725112]
Multi-hop Question Answering over a Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question.
We propose UniKGQA, a novel approach for the multi-hop KGQA task, unifying retrieval and reasoning in both model architecture and parameter learning.
arXiv Detail & Related papers (2022-12-02T04:08:09Z)
- Query Understanding via Intent Description Generation [75.64800976586771]
We propose a novel Query-to-Intent-Description (Q2ID) task for query understanding.
Unlike existing ranking tasks, which leverage the query and its description to compute the relevance of documents, Q2ID is a reverse task that aims to generate a natural language intent description.
We demonstrate the effectiveness of our model by comparing with several state-of-the-art generation models on the Q2ID task.
arXiv Detail & Related papers (2020-08-25T08:56:40Z)
- Modeling Information Need of Users in Search Sessions [5.172625611483604]
We propose a sequence-to-sequence based neural architecture that leverages the set of past queries issued by users.
First, we employ our model to predict which words in the current query are important and would be retained in the next query.
We show that our intuitive strategy of capturing information need can yield superior performance at these tasks on two large real-world search log datasets.
arXiv Detail & Related papers (2020-01-03T15:25:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.