Related papers: Retrieval-Augmented Generation for Service Discovery: Chunking Strategies and Benchmarking

URL: http://arxiv.org/abs/2505.19310v1
Date: Sun, 25 May 2025 20:49:39 GMT
Title: Retrieval-Augmented Generation for Service Discovery: Chunking Strategies and Benchmarking
Authors: Robin D. Pesl, Jerin G. Mathew, Massimo Mecella, Marco Aiello
Abstract summary: We analyze the usage of Retrieval Augmented Generation for endpoint discovery and the chunking of state-of-practice OpenAPIs to reduce the input token length. To further reduce the input token length for the composition prompt and improve endpoint retrieval, we propose (ii) a Discovery Agent that only receives a summary of the most relevant endpoints and retrieves specification details on demand.
Score: 0.6749750044497732
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Integrating multiple (sub-)systems is essential to create advanced Information Systems. Difficulties mainly arise when integrating dynamic environments, e.g., the integration at design time of not yet existing services. This has been traditionally addressed using a registry that provides the API documentation of the endpoints. Large Language Models have been shown to be capable of automatically creating system integrations (e.g., as service composition) based on this documentation but require concise input due to input token limitations, especially regarding comprehensive API descriptions. Currently, it is unknown how best to preprocess these API descriptions. In the present work, we (i) analyze the usage of Retrieval Augmented Generation for endpoint discovery and the chunking, i.e., preprocessing, of state-of-practice OpenAPIs to reduce the input token length while preserving the most relevant information. To further reduce the input token length for the composition prompt and improve endpoint retrieval, we propose (ii) a Discovery Agent that only receives a summary of the most relevant endpoints and retrieves specification details on demand. We evaluate RAG for endpoint discovery using (iii) a proposed novel service discovery benchmark SOCBench-D representing a general setting across numerous domains and the real-world RestBench benchmark, first, for the different chunking possibilities and parameters measuring the endpoint retrieval accuracy. Then, we assess the Discovery Agent using the same test data set. The prototype shows how to successfully employ RAG for endpoint discovery to reduce the token count. Our experiments show that endpoint-based approaches outperform naive chunking methods for preprocessing. Relying on an agent significantly improves precision while being prone to decrease recall, disclosing the need for further reasoning capabilities.
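Illustration: the abstract contrasts endpoint-based chunking with naive chunking of OpenAPI documents. The Python sketch below is not the authors' implementation; it only illustrates the general idea of splitting a specification into one chunk per endpoint and ranking those chunks against a query. The names chunk_by_endpoint, retrieve and SPEC are hypothetical, the specification is a made-up toy example, and a bag-of-words cosine similarity stands in for the embedding model that a real RAG setup would presumably use.

```python
import math
import re
from collections import Counter

# HTTP methods that may appear as operation keys under an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}

# Toy specification (assumption: invented for illustration only).
SPEC = {
    "paths": {
        "/pets": {
            "get": {"summary": "List all pets"},
            "post": {"summary": "Create a pet"},
        },
        "/orders/{orderId}": {
            "parameters": [{"name": "orderId", "in": "path"}],
            "get": {"summary": "Get an order by id"},
        },
    }
}


def chunk_by_endpoint(spec: dict) -> list[dict]:
    """Return one chunk per (method, path) operation of an OpenAPI dict."""
    chunks = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            parts = [method.upper(), path,
                     operation.get("summary", ""),
                     operation.get("description", "")]
            chunks.append({
                "endpoint": f"{method.upper()} {path}",
                "text": " ".join(p for p in parts if p),
            })
    return chunks


def _bag(text: str) -> Counter:
    """Lowercased bag of alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[dict], k: int = 2) -> list[str]:
    """Return the k endpoint names whose chunk text best matches the query."""
    q = _bag(query)
    ranked = sorted(chunks, key=lambda c: _cosine(q, _bag(c["text"])), reverse=True)
    return [c["endpoint"] for c in ranked[:k]]


if __name__ == "__main__":
    print(retrieve("list pets", chunk_by_endpoint(SPEC), k=1))  # ['GET /pets']
```

Only the short endpoint names are returned here; in the spirit of the Discovery Agent described above, the full operation details could be looked up on demand for the selected endpoints instead of being placed in the prompt up front.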

Related papers

MetaGen Blended RAG: Higher Accuracy for Domain-Specific Q&A Without Fine-Tuning [0.0]
We propose an approach for Enterprise Search that focuses on enhancing the retriever for a domain-specific corpus through hybrid query indexes and metadata enrichment. This 'MetaGen Blended RAG' method constructs a metadata generation pipeline using key concepts, topics, and acronyms, and then creates a metadata-enriched hybrid index with boosted search queries. On the PubMedQA benchmark for the biomedical domain, the proposed method achieves 82% retrieval accuracy and 77% RAG accuracy, surpassing all previous RAG accuracy results without fine-tuning and sets a new benchmark for zero-shot results while outperforming much larger models like GPT3.5.
arXiv Detail & Related papers (2025-05-23T17:18:45Z)
Purifying, Labeling, and Utilizing: A High-Quality Pipeline for Small Object Detection [83.90563802153707]
PLUSNet is a high-quality small object detection framework. It comprises three components: the Hierarchical Feature (HFP) framework for purifying upstream features, the Multiple Criteria Label Assignment (MCLA) for improving the quality of midstream training samples, and the Frequency Decoupled Head (FDHead) for more effectively exploiting information to accomplish downstream tasks.
arXiv Detail & Related papers (2025-04-29T10:11:03Z)
Advanced System Integration: Analyzing OpenAPI Chunking for Retrieval-Augmented Generation [0.6749750044497732]
We analyze the usage of Retrieval Augmented Generation for endpoint discovery and chunking of OpenAPIs. We propose a Discovery Agent that only receives a summary of the most relevant endpoints and retrieves details on demand. Our experiments show that for preprocessing, LLM-based and format-specific approaches outperform naive chunking methods.
arXiv Detail & Related papers (2024-11-29T16:09:43Z)
Writing in the Margins: Better Inference Pattern for Long Context Retrieval [0.9404560827144429]
Writing in the Margins (WiM) is an inference pattern designed to optimize the handling of long input sequences in retrieval-oriented tasks. We show how the proposed pattern fits into an interactive retrieval design that provides end-users with ongoing updates about the progress of context processing.
arXiv Detail & Related papers (2024-08-27T09:34:38Z)
CoIR: A Comprehensive Benchmark for Code Information Retrieval Models [52.61625841028781]
COIR (Code Information Retrieval Benchmark) is a benchmark specifically designed to assess code retrieval capabilities. COIR comprises ten meticulously curated code datasets, spanning eight distinctive retrieval tasks across seven diverse domains. We evaluate nine widely used retrieval models using COIR, uncovering significant difficulties in performing code retrieval tasks even with state-of-the-art systems.
arXiv Detail & Related papers (2024-07-03T07:58:20Z)
Adaptive REST API Testing with Reinforcement Learning [54.68542517176757]
Current testing tools lack efficient exploration mechanisms, treating all operations and parameters equally. Current tools struggle when response schemas are absent in the specification or exhibit variants. We present an adaptive REST API testing technique that incorporates reinforcement learning to prioritize operations during exploration.
arXiv Detail & Related papers (2023-09-08T20:27:05Z)
Building Interpretable and Reliable Open Information Retriever for New Domains Overnight [67.03842581848299]
Information retrieval is a critical component for many down-stream tasks such as open-domain question answering (QA). We propose an information retrieval pipeline that uses entity/event linking model and query decomposition model to focus more accurately on different information units of the query. We show that, while being more interpretable and reliable, our proposed pipeline significantly improves passage coverages and denotation accuracies across five IR and QA benchmarks.
arXiv Detail & Related papers (2023-08-09T07:47:17Z)
Spatial-Temporal Graph Enhanced DETR Towards Multi-Frame 3D Object Detection [54.041049052843604]
We present STEMD, a novel end-to-end framework that enhances the DETR-like paradigm for multi-frame 3D object detection. First, to model the inter-object spatial interaction and complex temporal dependencies, we introduce the spatial-temporal graph attention network. Finally, it poses a challenge for the network to distinguish between the positive query and other highly similar queries that are not the best match.
arXiv Detail & Related papers (2023-07-01T13:53:14Z)
Global Pointer: Novel Efficient Span-based Approach for Named Entity Recognition [7.226094340165499]
Named entity recognition (NER) task aims at identifying entities from a piece of text that belong to predefined semantic types. The state-of-the-art solutions for flat entities NER commonly suffer from capturing the fine-grained semantic information in underlying texts. We propose a novel span-based NER framework, namely Global Pointer (GP), that leverages the relative positions through a multiplicative attention mechanism.
arXiv Detail & Related papers (2022-08-05T09:19:46Z)
Predictive Object-Centric Process Monitoring [10.219621548854343]
This thesis shows that a prediction method utilizing Generative Adversarial Networks (GAN), Long Short-Term Memory (LSTM), and Sequence to Sequence models (Seq2seq) can be augmented with the rich data contained in OCEL. This thesis provides a web interface to predict the next sequence of activities from user input.
arXiv Detail & Related papers (2022-07-20T16:30:47Z)
Towards Zero and Few-shot Knowledge-seeking Turn Detection in Task-orientated Dialogue Systems [40.74708947185302]
This work focuses on identifying user requests that are out of the scope of domain APIs. We propose a novel method, REDE, based on adaptive representation learning and density estimation. We demonstrate REDE's competitive performance on DSTC9 data and our newly collected test set.
arXiv Detail & Related papers (2021-09-18T03:33:19Z)
Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation [79.6596425920849]
This paper addresses the task of unsupervised video multi-object segmentation. We introduce a novel approach for more accurate and efficient unseen-temporal segmentation. We evaluate the proposed approach on DAVIS$_{17}$ and YouTube-VIS, and the results demonstrate that it outperforms state-of-the-art methods both in segmentation accuracy and inference speed.
arXiv Detail & Related papers (2021-04-10T14:39:44Z)
PyODDS: An End-to-end Outlier Detection System with Automated Machine Learning [55.32009000204512]
We present PyODDS, an automated end-to-end Python system for Outlier Detection with Database Support. Specifically, we define the search space in the outlier detection pipeline, and produce a search strategy within the given search space. It also provides unified interfaces and visualizations for users with or without data science or machine learning background.
arXiv Detail & Related papers (2020-03-12T03:30:30Z)

This list is automatically generated from the titles and abstracts of the papers in this site.