Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face
- URL: http://arxiv.org/abs/2302.14534v2
- Date: Sun, 24 Mar 2024 14:34:53 GMT
- Title: Spacerini: Plug-and-play Search Engines with Pyserini and Hugging Face
- Authors: Christopher Akiki, Odunayo Ogundepo, Aleksandra Piktus, Xinyu Zhang, Akintunde Oladipo, Jimmy Lin, Martin Potthast,
- Abstract summary: Spacerini is a tool that integrates the Pyserini toolkit for reproducible information retrieval research with Hugging Face.
Spacerini makes state-of-the-art sparse and dense retrieval models more accessible to non-IR practitioners.
- Score: 104.2943594704532
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present Spacerini, a tool that integrates the Pyserini toolkit for reproducible information retrieval research with Hugging Face to enable the seamless construction and deployment of interactive search engines. Spacerini makes state-of-the-art sparse and dense retrieval models more accessible to non-IR practitioners while minimizing deployment effort. This is useful for NLP researchers who want to better understand and validate their research by performing qualitative analyses of training corpora, for IR researchers who want to demonstrate new retrieval models integrated into the growing Pyserini ecosystem, and for third parties reproducing the work of other researchers. Spacerini is open source and includes utilities for loading, preprocessing, indexing, and deploying search engines locally and remotely. We demonstrate a portfolio of 13 search engines created with Spacerini for different use cases.
Related papers
- The Use of Generative Search Engines for Knowledge Work and Complex Tasks [26.583783763090732]
We analyze the types and complexity of tasks that people use Bing Copilot for compared to Bing Search.
Findings indicate that people use the generative search engine for more knowledge work tasks that are higher in cognitive complexity than were commonly done with a traditional search engine.
arXiv Detail & Related papers (2024-03-19T18:17:46Z) - GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training
Data Exploration [97.68234051078997]
We discuss how Pyserini can be integrated with the Hugging Face ecosystem of open-source AI libraries and artifacts.
We include a Jupyter Notebook-based walk through the core interoperability features, available on GitHub.
We present GAIA Search - a search engine built following previously laid out principles, giving access to four popular large-scale text collections.
arXiv Detail & Related papers (2023-06-02T12:09:59Z) - Demystifying Map Space Exploration for NPUs [4.817475305740601]
Map Space Exploration is the problem of finding optimized mappings of a Deep Neural Network (DNN) model.
We do a first-of-its-kind apples-to-apples comparison of search techniques leveraged by different mappers.
Next, we propose two new techniques that can augment existing mappers.
arXiv Detail & Related papers (2022-10-07T17:58:45Z) - CrossBeam: Learning to Search in Bottom-Up Program Synthesis [51.37514793318815]
We propose training a neural model to learn a hands-on search policy for bottom-up synthesis.
Our approach, called CrossBeam, uses the neural model to choose how to combine previously-explored programs into new programs.
We observe that CrossBeam learns to search efficiently, exploring much smaller portions of the program space compared to the state-of-the-art.
arXiv Detail & Related papers (2022-03-20T04:41:05Z) - Searching the Search Space of Vision Transformer [98.96601221383209]
Vision Transformer has shown great visual representation power in substantial vision tasks such as recognition and detection.
We propose to use neural architecture search to automate this process, by searching not only the architecture but also the search space.
We provide design guidelines of general vision transformers with extensive analysis according to the space searching process.
arXiv Detail & Related papers (2021-11-29T17:26:07Z) - Boosting Search Engines with Interactive Agents [25.89284695491093]
This paper presents first steps in designing agents that learn meta-strategies for contextual query refinements.
Agents are empowered with simple but effective search operators to exert fine-grained and transparent control over queries and search results.
arXiv Detail & Related papers (2021-09-01T13:11:57Z) - AutoSpace: Neural Architecture Search with Less Human Interference [84.42680793945007]
Current neural architecture search (NAS) algorithms still require expert knowledge and effort to design a search space for network construction.
We propose a novel differentiable evolutionary framework named AutoSpace, which evolves the search space to an optimal one.
With the learned search space, the performance of recent NAS algorithms can be improved significantly compared with using previously manually designed spaces.
arXiv Detail & Related papers (2021-03-22T13:28:56Z) - A New Neural Search and Insights Platform for Navigating and Organizing
AI Research [56.65232007953311]
We introduce a new platform, AI Research Navigator, that combines classical keyword search with neural retrieval to discover and organize relevant literature.
We give an overview of the overall architecture of the system and of the components for document analysis, question answering, search, analytics, expert search, and recommendations.
arXiv Detail & Related papers (2020-10-30T19:12:25Z) - Mapping Researchers with PeopleMap [11.466062262579495]
PeopleMap creates visual maps for researchers based on their research interests and publications.
Requiring only the researchers' Google Scholar profiles as input, PeopleMap generates and visualizes embeddings for the researchers.
PeopleMap has received positive feedback and enthusiasm for expanding its adoption across Georgia Tech.
arXiv Detail & Related papers (2020-08-31T20:46:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.