DePLOI: Applying NL2SQL to Synthesize and Audit Database Access Control
- URL: http://arxiv.org/abs/2402.07332v4
- Date: Thu, 22 May 2025 03:38:57 GMT
- Title: DePLOI: Applying NL2SQL to Synthesize and Audit Database Access Control
- Authors: Pranav Subramaniam, Sanjay Krishnan
- Abstract summary: This paper introduces a new access control model called Intent-Based Access Control for Databases (IBAC-DB). In IBAC-DB, access control policies are expressed using abstractions that scale to high numbers of database objects, and are traceable with respect to implementations. This paper proposes DePLOI, a system leveraging access control-specific task decompositions to accurately synthesize and audit access control implementation from IBAC-DB abstractions.
- Score: 6.2859996652179
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In every enterprise database, administrators must define an access control policy that specifies which users have access to which tables. Access control straddles two worlds: policy (organization-level principles that define who should have access) and process (database-level primitives that actually implement the policy). Assessing and enforcing process compliance with a policy is a manual and ad-hoc task. This paper introduces a new access control model called Intent-Based Access Control for Databases (IBAC-DB). In IBAC-DB, access control policies are expressed using abstractions that scale to high numbers of database objects, and are traceable with respect to implementations. This paper proposes DePLOI (Deployment Policy Linter for Organization Intents), an LLM-backed system leveraging access control-specific task decompositions to accurately synthesize and audit access control implementation from IBAC-DB abstractions. As DePLOI is the first system of its kind to our knowledge, this paper further proposes IBACBench, the first benchmark for evaluating the synthesis and auditing capabilities of DePLOI. IBACBench leverages a combination of current NL2SQL benchmarks, real-world role hierarchies and access control policies, and LLM-generated data. We find that DePLOI achieves high synthesis accuracies and auditing F1 scores overall, and greatly outperforms other LLM prompting strategies (e.g., by 10 F1 points).
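As a rough illustration of the policy-versus-process split described in the abstract, the sketch below turns a small role-by-table policy matrix into SQL `GRANT` statements and then audits a set of implemented grants against it. This is not DePLOI, which is LLM-backed and works from IBAC-DB abstractions; the `Grant` type, the `synthesize_grants` and `audit` functions, and the example roles and tables are all hypothetical names chosen for illustration.

```python
from __future__ import annotations

from typing import NamedTuple


class Grant(NamedTuple):
    """One database-level privilege (hypothetical type, for illustration only)."""
    role: str
    table: str
    privilege: str  # e.g. "SELECT", "INSERT"


# Policy matrix: (role, table) -> set of privileges the role should hold.
Policy = dict[tuple[str, str], set[str]]


def synthesize_grants(policy: Policy) -> list[str]:
    """Render the policy matrix as SQL GRANT statements, in a stable order."""
    statements = []
    for (role, table), privileges in sorted(policy.items()):
        if privileges:
            privs = ", ".join(sorted(privileges))
            statements.append(f"GRANT {privs} ON {table} TO {role};")
    return statements


def audit(policy: Policy, implemented: set[Grant]) -> dict[str, set[Grant]]:
    """Compare implemented grants with the policy.

    "missing": privileges the policy intends but the database lacks.
    "excess":  privileges present in the database but not in the policy.
    """
    intended = {
        Grant(role, table, privilege)
        for (role, table), privileges in policy.items()
        for privilege in privileges
    }
    return {"missing": intended - implemented, "excess": implemented - intended}


if __name__ == "__main__":
    policy = {
        ("analyst", "orders"): {"SELECT"},
        ("clerk", "orders"): {"SELECT", "INSERT"},
    }
    for statement in synthesize_grants(policy):
        print(statement)

    # Pretend the database currently holds these grants.
    implemented = {
        Grant("analyst", "orders", "SELECT"),
        Grant("analyst", "orders", "DELETE"),
    }
    print(audit(policy, implemented))
```

Expanding the policy into individual (role, table, privilege) triples makes the audit a plain set difference, which is one simple way to keep each discrepancy traceable to a specific policy entry.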
Related papers
- Permissioned LLMs: Enforcing Access Control in Large Language Models [14.935672762016972]
Permissioned LLMs (PermLLM) superimpose organizational data access control structures on query responses. PermLLM mechanisms build on Efficient Fine-Tuning to achieve the desired access control. We demonstrate the efficacy of our PermLLM mechanisms through extensive experiments on four public datasets.
arXiv Detail & Related papers (2025-05-28T20:47:02Z) - OrgAccess: A Benchmark for Role Based Access Control in Organization Scale LLMs [7.999158988904784]
Large Language Models (LLMs) serve as unified knowledge repositories and intelligent assistants in enterprise settings. Evaluating this crucial capability is inherently difficult due to the proprietary and sensitive nature of real-world corporate data and access control policies. We introduce a synthetic yet representative OrgAccess benchmark consisting of 40 distinct types of permissions commonly relevant across different organizational roles and levels.
arXiv Detail & Related papers (2025-05-25T14:30:15Z) - Can LLMs Generate Tabular Summaries of Science Papers? Rethinking the Evaluation Protocol [83.90769864167301]
Literature review tables are essential for summarizing and comparing collections of scientific papers.<n>We explore the task of generating tables that best fulfill a user's informational needs given a collection of scientific papers.<n>Our contributions focus on three key challenges encountered in real-world use: (i) User prompts are often under-specified; (ii) Retrieved candidate papers frequently contain irrelevant content; and (iii) Task evaluation should move beyond shallow text similarity techniques.
arXiv Detail & Related papers (2025-04-14T14:52:28Z) - Synthesizing Access Control Policies using Large Language Models [0.5762345156477738]
Cloud compute systems allow administrators to write access control policies that govern access to private data.
While policies are written in convenient languages, such as AWS Identity and Access Management Policy Language, manually written policies often become complex and error prone.
In this paper, we investigate whether and how well Large Language Models (LLMs) can be used to synthesize access control policies.
arXiv Detail & Related papers (2025-03-14T16:40:25Z) - Towards Evaluating Large Language Models for Graph Query Generation [49.49881799107061]
Large Language Models (LLMs) are revolutionizing the landscape of Generative Artificial Intelligence (GenAI).
This paper presents a comparative study addressing the challenge of generating queries, in a powerful language for interacting with graph databases, using open-access LLMs.
Our empirical analysis of query generation accuracy reveals that Claude Sonnet 3.5 outperforms its counterparts in this specific domain.
arXiv Detail & Related papers (2024-11-13T09:11:56Z) - Access control in a distributed micro-cloud environment [0.0]
Attribute-Based Access Control models come at the cost of high policy management complexity.
We propose an ABAC model that incorporates user and object hierarchies.
We develop a policy engine that supports the model and present a distributed cloud use case.
arXiv Detail & Related papers (2024-10-26T21:09:09Z) - IBAC Mathematics and Mechanics: The Case for 'Integer Based Access Control' of Data Security in the Age of AI and AI Automation [0.0]
Current methods for data access control, especially regarding AI and AI automation, face unique challenges in ensuring appropriate data access.
We introduce Integer-Based Access Control (IBAC), addressing the limitations of Role-Based Access Control (RBAC) and Attribute-Based Access Control (ABAC).
IBAC's mathematical foundations enable its application to relational and document authorization.
arXiv Detail & Related papers (2024-10-24T06:19:57Z) - Open-domain Implicit Format Control for Large Language Model Generation [52.83173553689678]
We introduce a novel framework for controlled generation in large language models (LLMs).
This study investigates LLMs' capabilities to follow open-domain, one-shot constraints and replicate the format of the example answers.
We also develop a dataset collection methodology for supervised fine-tuning that enhances the open-domain format control of LLMs without degrading output quality.
arXiv Detail & Related papers (2024-08-08T11:51:45Z) - Relational Database Augmented Large Language Model [59.38841050766026]
Large language models (LLMs) excel in many natural language processing (NLP) tasks.
They can only incorporate new knowledge through training or supervised fine-tuning processes.
This precise, up-to-date, and private information is typically stored in relational databases.
arXiv Detail & Related papers (2024-07-21T06:19:10Z) - Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More? [54.667202878390526]
Long-context language models (LCLMs) have the potential to revolutionize our approach to tasks traditionally reliant on external tools like retrieval systems or databases.
We introduce LOFT, a benchmark of real-world tasks requiring context up to millions of tokens designed to evaluate LCLMs' performance on in-context retrieval and reasoning.
Our findings reveal LCLMs' surprising ability to rival state-of-the-art retrieval and RAG systems, despite never having been explicitly trained for these tasks.
arXiv Detail & Related papers (2024-06-19T00:28:58Z) - Comparison of Access Control Approaches for Graph-Structured Data [0.0]
Graph-structured data requires advanced, flexible, and fine-grained access control due to its complex structure.
Several research works focus on protecting property graph-structured data, enforcing fine-grained access control, and proving the feasibility and applicability of their concept.
We select works from our systematic literature review on authorization and access control for different database models in addition to recent ones.
arXiv Detail & Related papers (2024-05-31T12:31:05Z) - Towards Modular LLMs by Building and Reusing a Library of LoRAs [64.43376695346538]
We study how to best build a library of adapters given multi-task data.
We introduce model-based clustering, MBC, a method that groups tasks based on the similarity of their adapter parameters.
To re-use the library, we present a novel zero-shot routing mechanism, Arrow, which enables dynamic selection of the most relevant adapters.
arXiv Detail & Related papers (2024-05-18T03:02:23Z) - TableLLM: Enabling Tabular Data Manipulation by LLMs in Real Office Usage Scenarios [52.73289223176475]
TableLLM is a robust large language model (LLM) with 13 billion parameters.
TableLLM is purpose-built for proficiently handling data manipulation tasks.
We have released the model checkpoint, source code, benchmarks, and a web application for user interaction.
arXiv Detail & Related papers (2024-03-28T11:21:12Z) - Sparsity-Aware Intelligent Massive Random Access Control in Open RAN: A Reinforcement Learning Based Approach [61.74489383629319]
Massive random access of devices in the emerging Open Radio Access Network (O-RAN) poses a great challenge to access control and management.
A reinforcement-learning (RL)-assisted scheme of closed-loop access control is proposed to preserve sparsity of access requests.
Deep-RL-assisted SAUD is proposed to resolve highly complex environments with continuous and high-dimensional state and action spaces.
arXiv Detail & Related papers (2023-03-05T12:25:49Z) - Distributed-Training-and-Execution Multi-Agent Reinforcement Learning for Power Control in HetNet [48.96004919910818]
We propose a multi-agent deep reinforcement learning (MADRL) based power control scheme for the HetNet.
To promote cooperation among agents, we develop a penalty-based Q learning (PQL) algorithm for MADRL systems.
In this way, an agent's policy can be learned by other agents more easily, resulting in a more efficient collaboration process.
arXiv Detail & Related papers (2022-12-15T17:01:56Z) - Baihe: SysML Framework for AI-driven Databases [33.47034563589278]
Using Baihe, an existing relational database system may be retrofitted to use learned components for query optimization or other common tasks.
Baihe's high-level architecture is based on the following requirements: separation from the core system, minimal third-party dependencies, robustness, stability, and fault tolerance.
arXiv Detail & Related papers (2021-12-29T09:00:07Z) - Supervised Off-Policy Ranking [145.3039527243585]
Off-policy evaluation (OPE) leverages data generated by other policies to evaluate a target policy.
We propose supervised off-policy ranking that learns a policy scoring model by correctly ranking training policies with known performance.
Our method outperforms strong baseline OPE methods in terms of both rank correlation and performance gap between the truly best and the best of the ranked top three policies.
arXiv Detail & Related papers (2021-07-03T07:01:23Z) - Learning Attribute-Based and Relationship-Based Access Control Policies with Unknown Values [0.6662800021628273]
This paper presents the first algorithms for mining ABAC and ReBAC policies from access control lists (ACLs) and incomplete information about entities.
We show that the core of this problem can be viewed as learning a concise three-valued logic formula from a set of labeled feature vectors containing unknowns.
arXiv Detail & Related papers (2020-08-19T13:56:29Z) - An Automatic Attribute Based Access Control Policy Extraction from Access Logs [5.142415132534397]
An attribute-based access control (ABAC) model provides a more flexible approach for addressing the authorization needs of complex and dynamic systems.
We present a methodology for automatically learning ABAC policy rules from access logs of a system to simplify the policy development process.
arXiv Detail & Related papers (2020-03-16T15:08:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.