Let's Discover More API Relations: A Large Language Model-based AI Chain
for Unsupervised API Relation Inference
- URL: http://arxiv.org/abs/2311.01266v1
- Date: Thu, 2 Nov 2023 14:25:00 GMT
- Title: Let's Discover More API Relations: A Large Language Model-based AI Chain
for Unsupervised API Relation Inference
- Authors: Qing Huang, Yanbang Sun, Zhenchang Xing, Yuanlong Cao, Jieshan Chen,
Xiwei Xu, Huan Jin, Jiaxing Lu
- Abstract summary: We propose utilizing large language models (LLMs) as a neural knowledge base for API relation inference.
This approach leverages the entire Web used to pre-train LLMs as a knowledge base and is insensitive to the context and complexity of input texts.
We achieve an average F1 value of 0.76 across the three datasets, significantly higher than the state-of-the-art method's average F1 value of 0.40.
- Score: 19.05884373802318
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: APIs have intricate relations that can be described in text and represented
as knowledge graphs to aid software engineering tasks. Existing relation
extraction methods have limitations, such as a limited API text corpus and
sensitivity to the characteristics of the input text. To address these limitations,
we propose utilizing large language models (LLMs) (e.g., GPT-3.5) as a neural
knowledge base for API relation inference. This approach leverages the entire
Web used to pre-train LLMs as a knowledge base and is insensitive to the
context and complexity of input texts. To ensure accurate inference, we design
our analytic flow as an AI Chain with three AI modules: API FQN Parser, API
Knowledge Extractor, and API Relation Decider. The accuracies of the API FQN
Parser and API Relation Decider modules are 0.81 and 0.83, respectively. Using
the generative capacity of the LLM and our approach's inference capability, we
achieve an average F1 value of 0.76 across the three datasets, significantly
higher than the state-of-the-art method's average F1 value of 0.40. Compared to a
CoT-based method, our AI Chain design improves the inference reliability by
67%, and the AI-crowd-intelligence strategy enhances the robustness of our
approach by 26%.
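To make the pipeline concrete, here is a minimal sketch of how such a three-module AI chain could be wired together; the `chat()` helper and all prompts are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of the three-module AI chain described above.
# chat() and the prompts are illustrative assumptions, not the paper's code.

def chat(prompt: str) -> str:
    """Stand-in for a call to an LLM such as GPT-3.5."""
    raise NotImplementedError

def parse_fqn(mention: str) -> str:
    # Module 1: API FQN Parser - resolve a mention to a fully qualified name.
    return chat(f"Resolve the API mention '{mention}' to its fully "
                "qualified name. Answer with the FQN only.")

def extract_knowledge(fqn: str) -> str:
    # Module 2: API Knowledge Extractor - elicit what the LLM learned
    # about this API from its Web-scale pre-training.
    return chat(f"Describe the purpose, parameters, and behavior of {fqn}.")

def decide_relation(fqn_a: str, know_a: str, fqn_b: str, know_b: str) -> str:
    # Module 3: API Relation Decider - judge the relation between two APIs.
    return chat(f"Given:\n{fqn_a}: {know_a}\n{fqn_b}: {know_b}\n"
                "What relation holds between these two APIs?")

def infer_relation(mention_a: str, mention_b: str) -> str:
    a, b = parse_fqn(mention_a), parse_fqn(mention_b)
    return decide_relation(a, extract_knowledge(a), b, extract_knowledge(b))
```

Decomposing the task into small single-purpose LLM calls is what the abstract credits for the reliability gain over a single CoT prompt.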
Related papers
- Are Human Rules Necessary? Generating Reusable APIs with CoT Reasoning and In-Context Learning [14.351476383642016]
We propose a novel approach, named Code2API, to automatically perform APIzation for Stack Overflow code snippets.
Code2API does not require additional model training or any manually crafted rules.
It can be easily deployed on personal computers without relying on other external tools.
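As a rough, training-free illustration of CoT-plus-ICL APIzation (the demonstration pair and `chat()` helper below are hypothetical, not Code2API's actual prompts):

```python
# Hypothetical sketch of prompt-based APIzation: one in-context example
# plus a chain-of-thought instruction, with no model training or rules.

def chat(prompt: str) -> str:
    raise NotImplementedError  # stand-in for an LLM call

DEMO = """Snippet:
for (String s : list) System.out.println(s);
Reasoning: the snippet prints each element; input: a List<String>.
API:
public static void printAll(List<String> list) {
    for (String s : list) System.out.println(s);
}"""  # illustrative Java example pair

def apize(snippet: str) -> str:
    prompt = (f"{DEMO}\n\nSnippet:\n{snippet}\n"
              "Reasoning: think step by step about inputs, outputs, and "
              "side effects, then wrap the snippet as a reusable method.\nAPI:")
    return chat(prompt)
```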
arXiv Detail & Related papers (2024-05-06T14:22:17Z)
- APIGen: Generative API Method Recommendation [16.541442856821]
APIGen is a generative API recommendation approach based on enhanced in-context learning (ICL).
APIGen searches for similar posts to the programming queries from the lexical, syntactical, and semantic perspectives.
With the reasoning process, APIGen makes the recommended APIs better meet the programming requirements of queries.
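A toy sketch of the retrieval side of such an approach, assuming a corpus of (post, API) pairs; only lexical (token-overlap) similarity is shown, whereas APIGen also uses syntactic and semantic signals.

```python
# Illustrative retrieval-enhanced ICL prompt construction.
# Only lexical similarity is modeled here; this is a simplification.

def lexical_score(query: str, post: str) -> float:
    q, p = set(query.lower().split()), set(post.lower().split())
    return len(q & p) / max(len(q | p), 1)  # Jaccard overlap

def build_prompt(query: str, corpus: list[tuple[str, str]], k: int = 3) -> str:
    # corpus holds (post, recommended_api) pairs from prior Q&A data.
    ranked = sorted(corpus, key=lambda x: lexical_score(query, x[0]),
                    reverse=True)[:k]
    demos = "\n".join(f"Query: {p}\nAPI: {a}" for p, a in ranked)
    return f"{demos}\nQuery: {query}\nAPI:"
```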
arXiv Detail & Related papers (2024-01-29T02:35:42Z)
- In-Context Language Learning: Architectures and Algorithms [73.93205821154605]
We study in-context learning (ICL) through the lens of a new family of model problems we term in-context language learning (ICLL).
We evaluate a diverse set of neural sequence models on regular ICLL tasks.
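As a hedged illustration of what a regular ICLL task might look like (the language and prompt format here are assumptions, not the paper's benchmark):

```python
# Toy regular ICLL task: sample strings from a small regular language
# and ask a sequence model to continue consistently in-context.

import random

def sample_ab_plus(max_reps: int = 5) -> str:
    # Strings from the regular language (ab)+ .
    return "ab" * random.randint(1, max_reps)

def icl_prompt(n_demos: int = 8) -> str:
    demos = " ".join(sample_ab_plus() for _ in range(n_demos))
    # The model must infer the language from the demos and extend it:
    # a well-formed continuation of the trailing 'a' would begin with 'b'.
    return f"{demos} a"
```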
arXiv Detail & Related papers (2024-01-23T18:59:21Z)
- Leveraging Large Language Models to Improve REST API Testing [51.284096009803406]
RESTGPT takes as input an API specification, extracts machine-interpretable rules, and generates example parameter values from natural-language descriptions in the specification.
Our evaluations indicate that RESTGPT outperforms existing techniques in both rule extraction and value generation.
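A simplified sketch of the spec-to-values idea, assuming an OpenAPI-style JSON spec and a hypothetical `chat()` LLM helper:

```python
# Hedged sketch of spec-driven value generation in the spirit of RESTGPT.
# The spec layout follows OpenAPI conventions; chat() is an assumed helper.

import json

def chat(prompt: str) -> str:
    raise NotImplementedError

def example_values(spec_path: str) -> dict:
    spec = json.load(open(spec_path))
    values = {}
    for path, ops in spec.get("paths", {}).items():
        for op in ops.values():
            if not isinstance(op, dict):
                continue  # skip path-level keys that are not operations
            for p in op.get("parameters", []):
                desc = p.get("description", "")
                # Ask the LLM to turn the natural-language description
                # into a machine-usable rule plus a concrete example value.
                values[p["name"]] = chat(
                    f"Parameter '{p['name']}': {desc}\n"
                    "State any value constraint, then give one valid example.")
    return values
```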
arXiv Detail & Related papers (2023-12-01T19:53:23Z)
- Let's Chat to Find the APIs: Connecting Human, LLM and Knowledge Graph through AI Chain [21.27256145010061]
We propose a knowledge-guided query clarification approach for API recommendation.
We use a large language model (LLM) guided by knowledge graph (KG) to overcome out-of-vocabulary (OOV) failures.
Our approach is designed as an AI chain that consists of five steps, each handled by a separate LLM call.
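A much-compressed sketch of the knowledge-guided clarification idea (the facet representation and single-step loop below are assumptions; the paper itself uses a five-step chain):

```python
# Assumed simplification: a knowledge graph supplies candidate facets,
# and the LLM asks the user about the ones the query leaves open.

def chat(prompt: str) -> str:
    raise NotImplementedError

def clarify(query: str, kg_facets: dict[str, list[str]]) -> str:
    # kg_facets maps a facet (e.g. "data format") to option values from the KG.
    missing = [f for f in kg_facets if f not in query.lower()]
    if not missing:
        return query  # nothing to clarify
    facet = missing[0]
    # Return the clarifying question to pose back to the user.
    return chat(f"User query: '{query}'. Ask one question to pin down "
                f"the '{facet}' facet (options: {kg_facets[facet]}).")
```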
arXiv Detail & Related papers (2023-09-28T03:31:01Z)
- Improving Open Information Extraction with Large Language Models: A Study on Demonstration Uncertainty [52.72790059506241]
The Open Information Extraction (OIE) task aims to extract structured facts from unstructured text.
Despite the potential of large language models (LLMs) like ChatGPT as a general task solver, they lag behind state-of-the-art (supervised) methods in OIE tasks.
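For concreteness, a minimal sketch of demonstration-based OIE prompting, with a hypothetical `chat()` helper and an illustrative demonstration triple:

```python
# Minimal few-shot OIE prompt: one demonstration, then the target sentence.

def chat(prompt: str) -> str:
    raise NotImplementedError  # stand-in for an LLM call

DEMO = ("Sentence: Paris is the capital of France.\n"
        "Triples: (Paris; is the capital of; France)")

def extract_triples(sentence: str) -> str:
    # The model is expected to emit (subject; relation; object) triples.
    return chat(f"{DEMO}\nSentence: {sentence}\nTriples:")
```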
arXiv Detail & Related papers (2023-09-07T01:35:24Z)
- Interpretability at Scale: Identifying Causal Mechanisms in Alpaca [62.65877150123775]
We use Boundless DAS to efficiently search for interpretable causal structure in large language models while they follow instructions.
Our findings mark a first step toward faithfully understanding the inner workings of our ever-growing and most widely deployed language models.
arXiv Detail & Related papers (2023-05-15T17:15:40Z)
- On the Effectiveness of Pretrained Models for API Learning [8.788509467038743]
Developers frequently use APIs to implement certain functionalities, such as parsing Excel files or reading and writing text files line by line.
Developers can greatly benefit from automatic API usage sequence generation based on natural language queries for building applications in a faster and cleaner manner.
Existing approaches utilize information retrieval models to search for matching API sequences given a query or use RNN-based encoder-decoder to generate API sequences.
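A minimal information-retrieval baseline of the kind the summary mentions might look like the following; the corpus format and similarity measure are illustrative assumptions, not the paper's models.

```python
# IR-style baseline: return the API sequence whose paired query
# best matches the user's natural-language query.

from difflib import SequenceMatcher

def retrieve_api_sequence(query: str,
                          corpus: list[tuple[str, list[str]]]) -> list[str]:
    # corpus: (natural-language query, API call sequence) pairs.
    best = max(corpus,
               key=lambda qa: SequenceMatcher(None, query, qa[0]).ratio())
    return best[1]

# Example (hypothetical data):
# corpus = [("read a text file line by line",
#            ["FileReader.new", "BufferedReader.new",
#             "BufferedReader.readLine"])]
# retrieve_api_sequence("read file lines", corpus)
```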
arXiv Detail & Related papers (2022-04-05T20:33:24Z)
- Holistic Combination of Structural and Textual Code Information for Context based API Recommendation [28.74546332681778]
We propose a novel API recommendation approach called APIRec-CST (API Recommendation by Combining Structural and Textual code information).
APIRec-CST is a deep learning model that combines the API usage with the text information in source code based on an API Graph Network and a Code Token Network.
We show that our approach achieves a top-1, top-5, and top-10 accuracy and MRR of 60.3%, 81.5%, 87.7%, and 69.4%, and significantly outperforms an existing graph-based statistical approach.
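A hedged sketch of the fusion idea in PyTorch, where one embedding summarizes API-usage graph context and another the code tokens; the layer shapes and single linear head are assumptions, not the paper's architecture.

```python
# Two-view fusion: concatenate graph and token embeddings, score APIs.

import torch
import torch.nn as nn

class FusionRecommender(nn.Module):
    def __init__(self, graph_dim: int, token_dim: int, n_apis: int):
        super().__init__()
        self.score = nn.Linear(graph_dim + token_dim, n_apis)

    def forward(self, graph_emb: torch.Tensor,
                token_emb: torch.Tensor) -> torch.Tensor:
        # Concatenate the two views and rank the API vocabulary.
        fused = torch.cat([graph_emb, token_emb], dim=-1)
        return self.score(fused)  # logits over candidate APIs
```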
arXiv Detail & Related papers (2020-10-15T04:40:42Z)
- InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective [84.78604733927887]
Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks.
Recent studies show that such BERT-based models are vulnerable to textual adversarial attacks.
We propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models.
arXiv Detail & Related papers (2020-10-05T20:49:26Z)
- Logic-Guided Data Augmentation and Regularization for Consistent Question Answering [55.05667583529711]
This paper addresses the problem of improving the accuracy and consistency of responses to comparison questions.
Our method leverages logical and linguistic knowledge to augment labeled training data and then uses a consistency-based regularizer to train the model.
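A toy sketch of one such logical rule, symmetry of comparisons: swapping the compared entities flips the answer, yielding an extra labeled example for free (the rule and helper below are illustrative, not the paper's full method).

```python
# Logic-guided augmentation via the symmetry rule for comparison questions.

def augment_symmetric(question: str, a: str, b: str, label: str) -> tuple:
    # Swap the two compared entities and negate the yes/no label.
    flipped_q = (question.replace(a, "<TMP>")
                         .replace(b, a)
                         .replace("<TMP>", b))
    flipped_label = {"yes": "no", "no": "yes"}[label]
    return flipped_q, flipped_label

# augment_symmetric("Is the Nile longer than the Amazon?",
#                   "the Nile", "the Amazon", "yes")
# -> ("Is the Amazon longer than the Nile?", "no")
```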
arXiv Detail & Related papers (2020-04-21T17:03:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.