Bamboo: LLM-Driven Discovery of API-Permission Mappings in the Android Framework
- URL: http://arxiv.org/abs/2510.04078v1
- Date: Sun, 05 Oct 2025 07:50:44 GMT
- Title: Bamboo: LLM-Driven Discovery of API-Permission Mappings in the Android Framework
- Authors: Han Hu, Wei Minn, Yonghui Liu, Jiakun Liu, Ferdian Thung, Terry Yue Zhuo, Lwin Khin Shar, Debin Gao, David Lo
- Abstract summary: The official API documentation by Android chronically suffers from imprecision and incompleteness. Recent efforts in improving permission specification primarily leverage static and dynamic code analyses to uncover API-permission mappings. This paper introduces a pioneering approach utilizing large language models (LLMs) for a systematic examination of API-permission mappings.
- Score: 22.145558720584713
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The permission mechanism in the Android Framework is integral to safeguarding the privacy of users by managing users' and processes' access to sensitive resources and operations. As such, developers need to be equipped with an in-depth understanding of API permissions to build robust Android apps. Unfortunately, the official API documentation by Android chronically suffers from imprecision and incompleteness, causing developers to spend significant effort to accurately discern necessary permissions. This can lead to incorrect permission declarations in Android app development, potentially resulting in security violations and app failures. Recent efforts in improving permission specification primarily leverage static and dynamic code analyses to uncover API-permission mappings within the Android framework. Yet, these methodologies encounter substantial shortcomings, including poor adaptability to Android SDK and Framework updates, restricted code coverage, and a propensity to overlook essential API-permission mappings in intricate codebases. This paper introduces a pioneering approach utilizing large language models (LLMs) for a systematic examination of API-permission mappings. In addition to employing LLMs, we integrate a dual-role prompting strategy and an API-driven code generation approach into our mapping discovery pipeline, resulting in the development of the corresponding tool, Bamboo. We formulate three research questions to evaluate the efficacy of Bamboo against state-of-the-art baselines, assess the completeness of official SDK documentation, and analyze the evolution of permission-required APIs across different SDK releases. Our experimental results reveal that Bamboo identifies 2,234, 3,552, and 4,576 API-permission mappings in Android versions 6, 7, and 10, respectively, substantially outperforming existing baselines.
Related papers
- Framework-Aware Code Generation with API Knowledge Graph-Constructed Data: A Study on HarmonyOS [52.483888557864326]
APIKG4SYN is a framework designed to exploit API knowledge graphs for the construction of API-oriented question-code pairs. We build the first benchmark for HarmonyOS code generation using APIKG4SYN.
arXiv Detail & Related papers (2025-11-29T08:13:54Z)
- A Comprehensive Analysis of Evolving Permission Usage in Android Apps: Trends, Threats, and Ecosystem Insights [9.172402449557264]
Despite official Android platform documentation on proper permission usage, there are still many cases of permission abuse. This study provides a comprehensive analysis of the Android permission landscape. By distinguishing between benign and malicious applications, we uncover developers' evolving strategies.
arXiv Detail & Related papers (2025-08-04T02:54:10Z)
- ExploraCoder: Advancing code generation for multiple unseen APIs via planning and chained exploration [70.26807758443675]
ExploraCoder is a training-free framework that empowers large language models to invoke unseen APIs in code solution. Experimental results demonstrate that ExploraCoder significantly improves performance for models lacking prior API knowledge.
arXiv Detail & Related papers (2024-12-06T19:00:15Z)
- A Comprehensive Framework for Evaluating API-oriented Code Generation in Large Language Models [14.665460257371164]
Large language models (LLMs) like GitHub Copilot and ChatGPT have emerged as powerful tools for code generation.
We propose AutoAPIEval, a framework designed to evaluate the capabilities of LLMs in API-oriented code generation.
arXiv Detail & Related papers (2024-09-23T17:22:09Z)
- A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How [53.65636914757381]
API suggestion is a critical task in modern software development.
Recent advancements in large code models (LCMs) have shown promise in the API suggestion task.
arXiv Detail & Related papers (2024-09-20T03:12:35Z)
- FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking [57.53742155914176]
API call generation is the cornerstone of large language models' tool-using ability.
Existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request.
We propose an output-side optimization approach called FANTASE to address these limitations.
arXiv Detail & Related papers (2024-07-18T23:44:02Z)
- A Large-scale Investigation of Semantically Incompatible APIs behind Compatibility Issues in Android Apps [13.24503570840706]
We conduct a large-scale discovery of incompatible APIs in the Android Open Source Project (AOSP).
We propose a unified framework to detect incompatible APIs, especially for semantic changes.
Our approach detects 5,481 incompatible APIs spanning from version 4 to version 33.
arXiv Detail & Related papers (2024-06-25T10:12:37Z)
- SoAy: A Solution-based LLM API-using Methodology for Academic Information Seeking [59.59923482238048]
SoAy is a solution-based LLM API-using methodology for academic information seeking. It uses code with a solution as the reasoning method, where a solution is a pre-constructed API calling sequence. Results show a 34.58-75.99% performance improvement compared to state-of-the-art LLM API-based baselines.
arXiv Detail & Related papers (2024-05-24T02:44:14Z)
- Lightweight Syntactic API Usage Analysis with UCov [0.0]
We present a novel conceptual framework designed to assist library maintainers in understanding the interactions allowed by their APIs.
These customizable models enable library maintainers to improve their design ahead of release, reducing friction during evolution.
We implement these models for Java libraries in a new tool UCov and demonstrate its capabilities on three libraries exhibiting diverse styles of interaction.
arXiv Detail & Related papers (2024-02-19T10:33:41Z)
- Investigating Software Developers' Challenges for Android Permissions in Stack Overflow [0.9821874476902969]
This study investigates the permission-related challenges developers face on the crowdsourcing platform Stack Overflow.
We conducted qualitative and quantitative analyses on 3,327 permission-related questions and 3,271 corresponding answers.
Our study indicates the need for clear, consistent documentation to guide the use of permissions and reduce developer misunderstanding.
arXiv Detail & Related papers (2023-10-31T18:37:03Z)
- Private-Library-Oriented Code Generation with Large Language Models [52.73999698194344]
This paper focuses on utilizing large language models (LLMs) for code generation in private libraries.
We propose a novel framework that emulates the process of programmers writing private code.
We create four private library benchmarks, including TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval.
arXiv Detail & Related papers (2023-07-28T07:43:13Z)
- Evaluating Embedding APIs for Information Retrieval [51.24236853841468]
We evaluate the capabilities of existing semantic embedding APIs on domain generalization and multilingual retrieval.
We find that re-ranking BM25 results using the APIs is a budget-friendly approach and is most effective in English.
For non-English retrieval, re-ranking still improves the results, but a hybrid model with BM25 works best, albeit at a higher cost.
arXiv Detail & Related papers (2023-05-10T16:40:52Z)
- OpenAPI Specification Extended Security Scheme: A method to reduce the prevalence of Broken Object Level Authorization [0.0]
API Security is a topic for concern given the absence of standardized authorization in the OpenAPI standard.
This paper examines the number one vulnerability in API Security: Broken Object Level Authorization (BOLA).
arXiv Detail & Related papers (2022-12-13T14:28:06Z)