Learning to Retrieve for Job Matching
- URL: http://arxiv.org/abs/2402.13435v1
- Date: Wed, 21 Feb 2024 00:05:25 GMT
- Title: Learning to Retrieve for Job Matching
- Authors: Jianqiang Shen, Yuchin Juan, Shaobo Zhang, Ping Liu, Wen Pu, Sriram
Vasudevan, Qingquan Song, Fedor Borisyuk, Kay Qianqi Shen, Haichao Wei,
Yunxiang Ren, Yeou S. Chiou, Sicong Kuang, Yuan Yin, Ben Zheng, Muchen Wu,
Shaghayegh Gharghabi, Xiaoqing Wang, Huichao Xue, Qi Guo, Daniel Hewlett,
Luke Simon, Liangjie Hong, Wenjing Zhang
- Abstract summary: We discuss applying learning-to-retrieve technology to enhance LinkedIn's job search and recommendation systems.
We leverage confirmed hire data to construct a graph that evaluates a seeker's qualification for a job, and utilize learned links for retrieval.
In addition to a solution based on a conventional inverted index, we developed an on-GPU solution capable of supporting both KNN and term matching efficiently.
- Score: 22.007634436648427
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Web-scale search systems typically tackle the scalability challenge with a
two-step paradigm: retrieval and ranking. The retrieval step, also known as
candidate selection, often involves extracting standardized entities, creating
an inverted index, and performing term matching for retrieval. Such traditional
methods require manual and time-consuming development of query models. In this
paper, we discuss applying learning-to-retrieve technology to enhance LinkedIn's
job search and recommendation systems. In the realm of promoted jobs, the key
objective is to improve the quality of applicants, thereby delivering value to
recruiter customers. To achieve this, we leverage confirmed hire data to
construct a graph that evaluates a seeker's qualification for a job, and
utilize learned links for retrieval. Our learned model is easy to explain,
debug, and adjust. On the other hand, the focus for organic jobs is to optimize
seeker engagement. We accomplished this by training embeddings for personalized
retrieval, fortified by a set of rules derived from the categorization of
member feedback. In addition to a solution based on a conventional inverted
index, we developed an on-GPU solution capable of supporting both KNN and term
matching efficiently.
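As a rough illustration of how term matching and KNN can be combined in a single retrieval step, here is a minimal CPU-only sketch. It is not the paper's on-GPU implementation: the job data, the function names (`build_inverted_index`, `cosine`, `retrieve`), and the choice of cosine similarity are all assumptions made for this example.

```python
import math
from collections import defaultdict


def build_inverted_index(jobs):
    """Map each term to the set of job ids whose term set contains it."""
    index = defaultdict(set)
    for job_id, job in jobs.items():
        for term in job["terms"]:
            index[term].add(job_id)
    return index


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query_terms, query_vec, jobs, index, k=2):
    """Term matching narrows the candidates; brute-force KNN orders them."""
    # Term matching: keep only jobs that contain every required term.
    candidates = set(jobs)
    for term in query_terms:
        candidates &= index.get(term, set())
    # KNN: rank the surviving candidates by similarity to the query embedding.
    ranked = sorted(
        candidates,
        key=lambda job_id: cosine(query_vec, jobs[job_id]["vec"]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy data: "terms" stand in for standardized entities,
# "vec" stands in for a learned job embedding.
JOBS = {
    "j1": {"terms": {"engineer", "remote"}, "vec": [0.9, 0.1]},
    "j2": {"terms": {"engineer", "onsite"}, "vec": [0.8, 0.2]},
    "j3": {"terms": {"designer", "remote"}, "vec": [0.1, 0.9]},
    "j4": {"terms": {"engineer", "remote"}, "vec": [0.2, 0.8]},
}
INDEX = build_inverted_index(JOBS)
```

Calling `retrieve({"engineer", "remote"}, [1.0, 0.0], JOBS, INDEX)` first intersects the posting sets for both terms and then ranks the remaining jobs by embedding similarity. A production system would replace the brute-force scan with an approximate or GPU-based nearest-neighbor search.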
Related papers
- Learning to Rank for Multiple Retrieval-Augmented Models through Iterative Utility Maximization [21.115495457454365]
This paper investigates the design of a unified search engine to serve multiple retrieval-augmented generation (RAG) agents.
We introduce an iterative approach where the search engine generates retrieval results for these RAG agents and gathers feedback on the quality of the retrieved documents during an offline phase.
We adapt this approach to an online setting, allowing the search engine to refine its behavior based on real-time feedback from individual agents.
arXiv Detail & Related papers (2024-10-13T17:53:50Z)
- Query-oriented Data Augmentation for Session Search [71.84678750612754]
We propose query-oriented data augmentation to enrich search logs and empower the modeling.
We generate supplemental training pairs by altering the most important part of a search context.
We develop several strategies to alter the current query, resulting in new training data with varying degrees of difficulty.
arXiv Detail & Related papers (2024-07-04T08:08:33Z)
- List-aware Reranking-Truncation Joint Model for Search and Retrieval-augmented Generation [80.12531449946655]
We propose a Reranking-Truncation joint model (GenRT) that can perform the two tasks concurrently.
GenRT integrates reranking and truncation via a generative paradigm based on an encoder-decoder architecture.
Our method achieves SOTA performance on both reranking and truncation tasks for web search and retrieval-augmented LLMs.
arXiv Detail & Related papers (2024-02-05T06:52:53Z)
- Learning to Rank in Generative Retrieval [62.91492903161522]
Generative retrieval aims to generate identifier strings of relevant passages as the retrieval target.
We propose a learning-to-rank framework for generative retrieval, dubbed LTRGR.
This framework only requires an additional learning-to-rank training phase to enhance current generative retrieval systems.
arXiv Detail & Related papers (2023-06-27T05:48:14Z)
- Unified Embedding Based Personalized Retrieval in Etsy Search [0.206242362470764]
We propose learning a unified embedding model incorporating graph, transformer and term-based embeddings end to end.
Our personalized retrieval model significantly improves the overall search experience, as measured by a 5.58% increase in search purchase rate and a 2.63% increase in site-wide conversion rate.
arXiv Detail & Related papers (2023-06-07T23:24:50Z)
- Recommender Systems with Generative Retrieval [58.454606442670034]
We propose a novel generative retrieval approach, where the retrieval model autoregressively decodes the identifiers of the target candidates.
To that end, we create semantically meaningful codewords to serve as a Semantic ID for each item.
We show that recommender systems trained with the proposed paradigm significantly outperform the current SOTA models on various datasets.
arXiv Detail & Related papers (2023-05-08T21:48:17Z)
- Task Oriented Conversational Modelling With Subjective Knowledge [0.0]
DSTC-11 proposes a three-stage pipeline consisting of knowledge-seeking turn detection, knowledge selection, and response generation.
We propose entity retrieval methods that result in accurate and faster knowledge search.
Preliminary results show a 4% improvement in exact match score on the knowledge selection task.
arXiv Detail & Related papers (2023-03-30T20:23:49Z)
- Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking [56.80065604034095]
We introduce a kNN approach that re-ranks documents based on their similarity with the query and the documents the user considers relevant.
To evaluate our different integration strategies, we transform four existing information retrieval datasets into the relevance feedback scenario.
arXiv Detail & Related papers (2022-10-19T16:19:37Z)
- CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks [62.22920673080208]
A single-step generative model can dramatically simplify the search process and be optimized in an end-to-end manner.
We name the pre-trained generative retrieval model CorpusBrain, as all information about the corpus is encoded in its parameters without the need to construct an additional index.
arXiv Detail & Related papers (2022-08-16T10:22:49Z)
- Detection, Disambiguation, Re-ranking: Autoregressive Entity Linking as a Multi-Task Problem [46.028180604304985]
We propose an autoregressive entity linking model that is trained with two auxiliary tasks and learns to re-rank generated samples at inference time.
We show through ablation studies that each of the two auxiliary tasks increases performance, and that re-ranking is an important factor in the increase.
arXiv Detail & Related papers (2022-04-12T17:55:22Z)