Related papers: Retrieving and Ranking Relevant JavaScript Technologies from Web Repositories

URL: http://arxiv.org/abs/2205.15086v1
Date: Mon, 30 May 2022 13:26:05 GMT
Title: Retrieving and Ranking Relevant JavaScript Technologies from Web Repositories
Authors: Hernan C. Vazquez, J. Andres Diaz Pace, Claudia Marcos and Santiago Vidal
Abstract summary: We propose a two-phase approach for assisting developers to retrieve and rank JS technologies. The first phase (ST-Retrieval) uses a meta-search technique for collecting JS technologies that meet the developer's needs. The second phase (ST-Rank) relies on a machine learning technique to infer, based on criteria used by other projects in the Web, a ranking of the output of ST-Retrieval.
Score: 0.3441021278275805
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The selection of software technologies is an important but complex task. We consider developers of JavaScript (JS) applications, for whom the assessment of JS libraries has become difficult and time-consuming due to the growing number of technology options available. A common strategy is to browse software repositories via search engines (e.g., NPM or Google), although this brings some problems. First, given a technology need, the engines might return a long list of results, which often causes information overload. Second, the results should be ranked according to criteria of interest for the developer. However, deciding how to weight these criteria to make a decision is not straightforward. In this work, we propose a two-phase approach for assisting developers to retrieve and rank JS technologies in a semi-automated fashion. The first phase (ST-Retrieval) uses a meta-search technique for collecting JS technologies that meet the developer's needs. The second phase (ST-Rank) relies on a machine learning technique to infer, based on criteria used by other projects in the Web, a ranking of the output of ST-Retrieval. We evaluated our approach with NPM and obtained satisfactory results in terms of the accuracy of the technologies retrieved and the order in which they were ranked.
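
Illustrative sketch (not the authors' implementation): the two phases described in the abstract can be pictured as "merge candidate lists from several engines, then order the merged candidates by weighted criteria". The abstract does not say how ST-Retrieval merges results or which model ST-Rank learns, so the code below makes two loud assumptions: it uses reciprocal rank fusion as a stand-in for the meta-search step, and it uses hand-set weights as a placeholder for the learned ranking. The package names, criteria, and numbers are all hypothetical toy values.

```python
from collections import defaultdict


def fuse_results(result_lists, k=60):
    """Phase 1 stand-in: merge ranked lists with reciprocal rank fusion.

    Assumption: RRF is used here only for illustration; the paper's
    meta-search technique is not specified in the abstract.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, name in enumerate(results, start=1):
            scores[name] += 1.0 / (k + rank)
    # Highest fused score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda name: (-scores[name], name))


def rank_candidates(candidates, features, weights):
    """Phase 2 stand-in: order candidates by a weighted sum of criteria.

    Assumption: in the paper the ranking is inferred by a machine learning
    technique; the fixed weights here are a placeholder for that model.
    """
    def score(name):
        return sum(w * features[name].get(c, 0.0) for c, w in weights.items())

    return sorted(candidates, key=lambda name: (-score(name), name))


# Hypothetical result lists from two search engines (toy data).
engine_a = ["lib-a", "lib-b", "lib-c"]
engine_b = ["lib-c", "lib-a", "lib-d"]

# Hypothetical criteria values, normalized to [0, 1] (toy data).
features = {
    "lib-a": {"popularity": 0.9, "maintenance": 0.2},
    "lib-b": {"popularity": 0.4, "maintenance": 0.9},
    "lib-c": {"popularity": 0.6, "maintenance": 0.6},
    "lib-d": {"popularity": 0.1, "maintenance": 0.3},
}
weights = {"popularity": 0.5, "maintenance": 0.5}

candidates = fuse_results([engine_a, engine_b])
ranking = rank_candidates(candidates, features, weights)
print(candidates)
print(ranking)
```

The split mirrors the abstract's separation of concerns: the first function only decides which technologies are candidates, while the second decides their order, so the weighting of criteria can change without re-running retrieval.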

Related papers

Generative Pre-trained Ranking Model with Over-parameterization at Web-Scale (Extended Abstract) [73.57710917145212]
Learning to rank is widely employed in web searches to prioritize pertinent webpages based on input queries. We propose a Generative Semi-Supervised Pre-trained (GS2P) model to address these challenges. We conduct extensive offline experiments on both a publicly available dataset and a real-world dataset collected from a large-scale search engine.
arXiv Detail & Related papers (2024-09-25T03:39:14Z)
Search Engines, LLMs or Both? Evaluating Information Seeking Strategies for Answering Health Questions [3.8984586307450093]
We compare different web search engines, Large Language Models (LLMs) and retrieval-augmented (RAG) approaches. We observed that the quality of webpages potentially responding to a health question does not decline as we navigate further down the ranked lists. According to our evaluation, web engines are less accurate than LLMs in finding correct answers to health questions.
arXiv Detail & Related papers (2024-07-17T10:40:39Z)
MeMemo: On-device Retrieval Augmentation for Private and Personalized Text Generation [36.50320728984937]
We introduce MeMemo, the first open-source JavaScript toolkit that adapts the state-of-the-art approximate nearest neighbor search technique HNSW to browser environments. MeMemo enables exciting new design and research opportunities, such as private and personalized content creation and interactive prototyping.
arXiv Detail & Related papers (2024-07-02T06:08:55Z)
Tree Search for Language Model Agents [69.43007235771383]
We propose an inference-time search algorithm for LM agents to perform exploration and multi-step planning in interactive web environments. Our approach is a form of best-first tree search that operates within the actual environment space. It is the first tree search algorithm for LM agents that shows effectiveness on realistic web tasks.
arXiv Detail & Related papers (2024-07-01T17:07:55Z)
STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases [93.96463520716759]
We develop STaRK, a large-scale semi-structured retrieval benchmark on Textual and Relational Knowledge Bases. Our benchmark covers three domains: product search, academic paper search, and queries in precision medicine. We design a novel pipeline to synthesize realistic user queries that integrate diverse relational information and complex textual properties.
arXiv Detail & Related papers (2024-04-19T22:54:54Z)
Enhanced Facet Generation with LLM Editing [5.4327243200369555]
In information retrieval, facet identification of a user query is an important task. Previous studies can enhance facet prediction by leveraging retrieved documents and related queries obtained through a search engine. However, there are challenges in extending it to other applications when a search engine operates as part of the model.
arXiv Detail & Related papers (2024-03-25T00:43:44Z)
Investigating Technology Usage Span by Analyzing Users' Q&A Traces in Stack Overflow [5.391288287087521]
It is crucial for software developers to find technologies that have a high usage span. C# and Java programming languages have a high usage span, followed by JavaScript. Our study also exposes emerging technologies such as SwiftUI, .NET-6.0, Visual Studio 2022, and the Blazor WebAssembly framework.
arXiv Detail & Related papers (2023-12-05T23:17:48Z)
GEO: Generative Engine Optimization [50.45232692363787]
We formalize the unified framework of generative engines (GEs). GEs use large language models (LLMs) to gather and summarize information to answer user queries. Generative engines typically satisfy queries by synthesizing information from multiple sources and summarizing it. We introduce Generative Engine Optimization (GEO), the first paradigm to aid content creators in improving their content visibility in generative engine responses.
arXiv Detail & Related papers (2023-11-16T10:06:09Z)
Large Language Models are Zero-Shot Rankers for Recommender Systems [76.02500186203929]
This work aims to investigate the capacity of large language models (LLMs) to act as the ranking model for recommender systems. We show that LLMs have promising zero-shot ranking abilities but struggle to perceive the order of historical interactions. We demonstrate that these issues can be alleviated using specially designed prompting and bootstrapping strategies.
arXiv Detail & Related papers (2023-05-15T17:57:39Z)
NeuralSearchX: Serving a Multi-billion-parameter Reranker for Multilingual Metasearch at a Low Cost [4.186775801993103]
We describe NeuralSearchX, a metasearch engine based on a multi-purpose large reranking model to merge results and highlight sentences. We show that our design choices led to a much more cost-effective system with competitive QPS while having close to state-of-the-art results on a wide range of public benchmarks.
arXiv Detail & Related papers (2022-10-26T16:36:53Z)
Efficient Neural Query Auto Completion [17.58784759652327]
Three major challenges are observed for a query auto completion system. Traditional QAC systems rely on handcrafted features such as the query candidate frequency in search logs. We propose an efficient neural QAC system with effective context modeling to overcome these challenges.
arXiv Detail & Related papers (2020-08-06T21:28:36Z)
Covidex: Neural Ranking Models and Keyword Search Infrastructure for the COVID-19 Open Research Dataset [87.47567807116204]
Covidex is a search engine that exploits the latest neural ranking models. It provides access to the COVID-19 Open Research dataset curated by the Allen Institute for AI.
arXiv Detail & Related papers (2020-07-14T16:26:01Z)

This list is automatically generated from the titles and abstracts of the papers on this site.