Evaluating Large Language Models in detecting Secrets in Android Apps
- URL: http://arxiv.org/abs/2510.18601v1
- Date: Tue, 21 Oct 2025 12:59:39 GMT
- Title: Evaluating Large Language Models in detecting Secrets in Android Apps
- Authors: Marco Alecci, Jordan Samhi, Tegawendé F. Bissyandé, Jacques Klein
- Abstract summary: Mobile apps often embed authentication secrets, such as API keys, tokens, and client IDs, to integrate with cloud services. Developers often hardcode these credentials into Android apps, exposing them to extraction through reverse engineering. We propose SecretLoc, an LLM-based approach for detecting hardcoded secrets in Android apps.
- Score: 11.963737068221436
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mobile apps often embed authentication secrets, such as API keys, tokens, and client IDs, to integrate with cloud services. However, developers often hardcode these credentials into Android apps, exposing them to extraction through reverse engineering. Once compromised, adversaries can exploit secrets to access sensitive data, manipulate resources, or abuse APIs, resulting in significant security and financial risks. Existing detection approaches, such as regex-based analysis, static analysis, and machine learning, are effective for identifying known patterns but are fundamentally limited: they require prior knowledge of credential structures, API signatures, or training data. In this paper, we propose SecretLoc, an LLM-based approach for detecting hardcoded secrets in Android apps. SecretLoc goes beyond pattern matching; it leverages contextual and structural cues to identify secrets without relying on predefined patterns or labeled training sets. Using a benchmark dataset from the literature, we demonstrate that SecretLoc detects secrets missed by regex-, static-, and ML-based methods, including previously unseen types of secrets. In total, we discovered 4828 secrets that were undetected by existing approaches, uncovering more than 10 "new" types of secrets, such as OpenAI API keys, GitHub Access Tokens, RSA private keys, and JWT tokens. We further extend our analysis to newly crawled apps from Google Play, where we uncovered and responsibly disclosed additional hardcoded secrets. Across a set of 5000 apps, we detected secrets in 2124 apps (42.5%), several of which were confirmed and remediated by developers after we contacted them. Our results reveal a dual-use risk: if analysts can uncover these secrets with LLMs, so can attackers. This underscores the urgent need for proactive secret management and stronger mitigation practices across the mobile ecosystem.
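The regex-based baselines that the abstract contrasts with work by matching token shapes against a fixed rule set, which is exactly why they miss secret types they have no pattern for. A minimal sketch of that pattern-matching style is below; the regexes are simplified illustrations, not the rule sets of any real scanner or of the approaches evaluated in the paper.

```python
import re

# Illustrative, simplified patterns for a few secret types mentioned in the
# paper. Real scanners (e.g., Gitleaks, TruffleHog) maintain far larger and
# more precise rule sets.
SECRET_PATTERNS = {
    "GitHub token": re.compile(r"gh[pousr]_[A-Za-z0-9]{36}"),
    "OpenAI API key": re.compile(r"sk-[A-Za-z0-9]{20,}"),
    "RSA private key": re.compile(r"-----BEGIN RSA PRIVATE KEY-----"),
    "JWT": re.compile(r"eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
}

def scan_for_secrets(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_string) pairs for every pattern hit in text."""
    hits = []
    for label, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(0)))
    return hits

# A hypothetical decompiled string from an app; the key value is fabricated.
sample = 'api_key = "sk-' + "A" * 24 + '"'
print(scan_for_secrets(sample))
```

Any secret whose shape is not in `SECRET_PATTERNS` passes through silently; an LLM-based detector such as SecretLoc instead relies on surrounding context (variable names, usage) rather than a fixed token grammar.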
Related papers
- Towards Copyright Protection for Knowledge Bases of Retrieval-augmented Language Models via Reasoning [58.57194301645823]
Large language models (LLMs) are increasingly integrated into real-world personalized applications. The valuable and often proprietary nature of the knowledge bases used in RAG introduces the risk of unauthorized usage by adversaries. Existing methods that can be generalized as watermarking techniques to protect these knowledge bases typically involve poisoning or backdoor attacks. We propose an approach for "harmless" copyright protection of knowledge bases.
arXiv Detail & Related papers (2025-02-10T09:15:56Z)
- Cryptanalysis via Machine Learning Based Information Theoretic Metrics [58.96805474751668]
We propose two novel applications of machine learning (ML) algorithms to perform cryptanalysis on any cryptosystem. These algorithms can be readily applied in an audit setting to evaluate the robustness of a cryptosystem. We show that our classification model correctly identifies the encryption schemes that are not IND-CPA secure, such as DES, RSA, and AES ECB, with high accuracy.
arXiv Detail & Related papers (2025-01-25T04:53:36Z)
- How Far are App Secrets from Being Stolen? A Case Study on Android [9.880229355258875]
Android apps can hold secret strings such as cloud service credentials or encryption keys. Leakage of such secret strings can induce severe consequences such as monetary losses or exposure of private user information. This study characterizes app secret leakage issues based on 575 potential app secrets sampled from 14,665 popular Android apps on Google Play.
arXiv Detail & Related papers (2025-01-14T03:15:31Z)
- Automatically Detecting Checked-In Secrets in Android Apps: How Far Are We? [4.619114660081147]
Developers often overlook the proper storage of such secrets, opting to put them directly into their projects. Such checked-in secrets can be easily extracted and exploited by malicious adversaries. Unlike open-source projects, the lack of direct access to the source code and the presence of obfuscation complicate checked-in secret detection for Android apps.
arXiv Detail & Related papers (2024-12-14T18:14:25Z)
- LeakAgent: RL-based Red-teaming Agent for LLM Privacy Leakage [78.33839735526769]
LeakAgent is a novel black-box red-teaming framework for privacy leakage. Our framework trains an open-source LLM through reinforcement learning as the attack agent to generate adversarial prompts. We show that LeakAgent significantly outperforms existing rule-based approaches in training data extraction and automated methods in system prompt leakage.
arXiv Detail & Related papers (2024-12-07T20:09:01Z)
- Secret Breach Prevention in Software Issue Reports [2.8747015994080285]
This paper presents a novel technique for secret breach detection in software issue reports. We highlight the challenges posed by noise, such as log files, URLs, commit IDs, stack traces, and dummy passwords. We propose an approach combining the strengths of state-of-the-art detectors with the contextual understanding of language models.
arXiv Detail & Related papers (2024-10-31T06:14:17Z)
- AssetHarvester: A Static Analysis Tool for Detecting Secret-Asset Pairs in Software Artifacts [4.778835435164734]
We present AssetHarvester, a static analysis tool to detect secret-asset pairs in a repository.
We curated a benchmark of 1,791 secret-asset pairs of four database types extracted from 188 public repositories to evaluate the performance of AssetHarvester.
Our findings indicate that data flow analysis employed in AssetHarvester detects secret-asset pairs with 0% false positives and aids in improving recall of secret detection tools.
arXiv Detail & Related papers (2024-03-28T00:24:49Z)
- Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory [82.7042006247124]
We show that even the most capable AI models reveal private information in contexts that humans would not, 39% and 57% of the time for the two models evaluated.
Our work underscores the immediate need to explore novel inference-time privacy-preserving approaches, based on reasoning and theory of mind.
arXiv Detail & Related papers (2023-10-27T04:15:30Z)
- A Comparative Study of Software Secrets Reporting by Secret Detection Tools [5.9347272469695245]
According to GitGuardian's monitoring of public GitHub repositories, secret leakage continued to accelerate, growing 67% in 2022 compared to 2021.
We present an evaluation of five open-source and four proprietary tools against a benchmark dataset.
The top three tools based on precision are GitHub Secret Scanner (75%), Gitleaks (46%), and Commercial X (25%); based on recall, they are Gitleaks (88%), SpectralOps (67%), and TruffleHog (52%).
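Precision and recall figures like those above reduce to simple ratios over true positives, false positives, and false negatives reported by each tool against the benchmark. A quick sketch of the arithmetic, using hypothetical counts chosen only for illustration:

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Compute precision and recall from detection counts.

    Precision: fraction of reported secrets that are true secrets.
    Recall: fraction of true secrets that the tool actually reported.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical counts: 88 correct detections, 12 false alarms, 12 misses.
p, r = precision_recall(88, 12, 12)
print(f"precision={p:.2f}, recall={r:.2f}")
```

A tool tuned for recall (like Gitleaks above) tolerates more false alarms to miss fewer real secrets, while a precision-tuned tool reports fewer but more trustworthy findings.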
arXiv Detail & Related papers (2023-07-03T02:32:09Z)
- Black-box Dataset Ownership Verification via Backdoor Watermarking [67.69308278379957]
We formulate the protection of released datasets as verifying whether they are adopted for training a (suspicious) third-party model.
We propose to embed external patterns via backdoor watermarking for the ownership verification to protect them.
Specifically, we exploit poison-only backdoor attacks (e.g., BadNets) for dataset watermarking and design a hypothesis-test-guided method for dataset verification.
arXiv Detail & Related papers (2022-08-04T05:32:20Z)
- Simple Transparent Adversarial Examples [65.65977217108659]
We introduce secret embedding and transparent adversarial examples as a simpler way to evaluate robustness.
Such examples pose a serious threat where APIs are used for high-stakes applications.
arXiv Detail & Related papers (2021-05-20T11:54:26Z)
- Mind the GAP: Security & Privacy Risks of Contact Tracing Apps [75.7995398006171]
Google and Apple have jointly provided an API for exposure notification in order to implement decentralized contact tracing apps using Bluetooth Low Energy.
We demonstrate that in real-world scenarios the GAP design is vulnerable to (i) profiling and possibly de-anonymizing persons, and (ii) relay-based wormhole attacks that can generate fake contacts.
arXiv Detail & Related papers (2020-06-10T16:05:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.