Related papers: Neural Methods for Effective, Efficient, and Exposure-Aware Information Retrieval

Neural Methods for Effective, Efficient, and Exposure-Aware Information Retrieval

URL: http://arxiv.org/abs/2012.11685v2
Date: Fri, 19 Mar 2021 21:47:04 GMT
Title: Neural Methods for Effective, Efficient, and Exposure-Aware Information Retrieval
Authors: Bhaskar Mitra
Abstract summary: We present novel neural architectures and methods motivated by the specific needs and challenges of information retrieval. In many real-life IR tasks, the retrieval involves extremely large collections--such as the document index of a commercial Web search engine--containing billions of documents.
Score: 7.3371176873092585
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Neural networks with deep architectures have demonstrated significant performance improvements in computer vision, speech recognition, and natural language processing. The challenges in information retrieval (IR), however, are different from these other application areas. A common form of IR involves ranking of documents--or short passages--in response to keyword-based queries. Effective IR systems must deal with query-document vocabulary mismatch problem, by modeling relationships between different query and document terms and how they indicate relevance. Models should also consider lexical matches when the query contains rare terms--such as a person's name or a product model number--not seen during training, and to avoid retrieving semantically related but irrelevant results. In many real-life IR tasks, the retrieval involves extremely large collections--such as the document index of a commercial Web search engine--containing billions of documents. Efficient IR methods should take advantage of specialized IR data structures, such as inverted index, to efficiently retrieve from large collections. Given an information need, the IR system also mediates how much exposure an information artifact receives by deciding whether it should be displayed, and where it should be positioned, among other results. Exposure-aware IR systems may optimize for additional objectives, besides relevance, such as parity of exposure for retrieved items and content publishers. In this thesis, we present novel neural architectures and methods motivated by the specific needs and challenges of IR tasks.

Related papers

DynaSearcher: Dynamic Knowledge Graph Augmented Search Agent via Multi-Reward Reinforcement Learning [4.817888539036794]
DynaSearcher is an innovative search agent enhanced by dynamic knowledge graphs and multi-reward reinforcement learning (RL)<n>We employ a multi-reward RL framework for fine-grained control over training objectives such as retrieval accuracy, efficiency, and response quality.<n> Experimental results demonstrate that our approach achieves state-of-the-art answer accuracy on six multi-hop question answering datasets.
arXiv Detail & Related papers (2025-07-23T09:58:31Z)
Causal Retrieval with Semantic Consideration [6.967392207053045]
We propose CAWAI, a retrieval model that is trained with dual objectives: semantic and causal relations. Our experiments demonstrate that CAWAI outperforms various models on diverse causal retrieval tasks. We also show that CAWAI exhibits strong zero-shot generalization across scientific domain QA tasks.
arXiv Detail & Related papers (2025-04-07T03:04:31Z)
Leveraging Inter-Chunk Interactions for Enhanced Retrieval in Large Language Model-Based Question Answering [12.60063463163226]
IIER captures the internal connections between document chunks by considering three types of interactions: structural, keyword, and semantic. It identifies multiple seed nodes based on the target question and iteratively searches for relevant chunks to gather supporting evidence. It refines the context and reasoning chain, aiding the large language model in reasoning and answer generation.
arXiv Detail & Related papers (2024-08-06T02:39:55Z)
INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning [59.07490387145391]
Large language models (LLMs) have demonstrated impressive capabilities in various natural language processing tasks. Their application to information retrieval (IR) tasks is still challenging due to the infrequent occurrence of many IR-specific concepts in natural language. We introduce a novel instruction tuning dataset, INTERS, encompassing 20 tasks across three fundamental IR categories.
arXiv Detail & Related papers (2024-01-12T12:10:28Z)
Plot Retrieval as an Assessment of Abstract Semantic Association [131.58819293115124]
Text pairs in Plot Retrieval have less word overlap and more abstract semantic association. Plot Retrieval can be the benchmark for further research on the semantic association modeling ability of IR models.
arXiv Detail & Related papers (2023-11-03T02:02:43Z)
Large Language Models for Information Retrieval: A Survey [58.30439850203101]
Information retrieval has evolved from term-based methods to its integration with advanced neural models. Recent research has sought to leverage large language models (LLMs) to improve IR systems. We delve into the confluence of LLMs and IR systems, including crucial aspects such as query rewriters, retrievers, rerankers, and readers.
arXiv Detail & Related papers (2023-08-14T12:47:22Z)
Building Interpretable and Reliable Open Information Retriever for New Domains Overnight [67.03842581848299]
Information retrieval is a critical component for many down-stream tasks such as open-domain question answering (QA) We propose an information retrieval pipeline that uses entity/event linking model and query decomposition model to focus more accurately on different information units of the query. We show that, while being more interpretable and reliable, our proposed pipeline significantly improves passage coverages and denotation accuracies across five IR and QA benchmarks.
arXiv Detail & Related papers (2023-08-09T07:47:17Z)
Synergistic Interplay between Search and Large Language Models for Information Retrieval [141.18083677333848]
InteR allows RMs to expand knowledge in queries using LLM-generated knowledge collections. InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T11:58:15Z)
Exposing Query Identification for Search Transparency [69.06545074617685]
We explore the feasibility of approximate exposing query identification (EQI) as a retrieval task by reversing the role of queries and documents in two classes of search systems. We derive an evaluation metric to measure the quality of a ranking of exposing queries, as well as conducting an empirical analysis focusing on various practical aspects of approximate EQI.
arXiv Detail & Related papers (2021-10-14T20:19:27Z)
Keyword Extraction for Improved Document Retrieval in Conversational Search [10.798537120200006]
Mixed-initiative conversational search provides enormous advantages. incorporating additional information provided by the user from the conversation poses some challenges. We have collected two conversational keyword extraction datasets and propose an end-to-end document retrieval pipeline incorporating them.
arXiv Detail & Related papers (2021-09-13T13:55:37Z)
Multi-Perspective Semantic Information Retrieval [22.74453301532817]
This work introduces the concept of a Multi-Perspective IR system, which combines multiple deep learning and traditional IR models to better predict the relevance of a query-sentence pair. This work is evaluated on the BioASQ Biomedical IR + QA challenges.
arXiv Detail & Related papers (2020-09-03T21:56:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.