Related papers: Using Large Language Models for the Interpretation of Building Regulations

URL: http://arxiv.org/abs/2407.21060v1
Date: Fri, 26 Jul 2024 08:30:47 GMT
Title: Using Large Language Models for the Interpretation of Building Regulations
Authors: Stefan Fuchs, Michael Witbrock, Johannes Dimyadi, Robert Amor
Abstract summary: Large language models (LLMs) can generate logically coherent text and source code responding to user prompts. This paper evaluates the performance of LLMs in translating building regulations into LegalRuleML in a few-shot learning setup.
Score: 7.013802453969655
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Compliance checking is an essential part of a construction project. The recent rapid uptake of building information models (BIM) in the construction industry has created more opportunities for automated compliance checking (ACC). BIM enables sharing of digital building design data that can be used for compliance checking with legal requirements, which are conventionally conveyed in natural language and not intended for machine processing. Creating a computable representation of legal requirements suitable for ACC is complex, costly, and time-consuming. Large language models (LLMs) such as the generative pre-trained transformers (GPT), GPT-3.5 and GPT-4, powering OpenAI's ChatGPT, can generate logically coherent text and source code responding to user prompts. This capability could be used to automate the conversion of building regulations into a semantic and computable representation. This paper evaluates the performance of LLMs in translating building regulations into LegalRuleML in a few-shot learning setup. By providing GPT-3.5 with only a few example translations, it can learn the basic structure of the format. Using a system prompt, we further specify the LegalRuleML representation and explore the existence of expert domain knowledge in the model. Such domain knowledge might be ingrained in GPT-3.5 through the broad pre-training but needs to be brought forth by careful contextualisation. Finally, we investigate whether strategies such as chain-of-thought reasoning and self-consistency could apply to this use case. As LLMs become more sophisticated, the increased common sense, logical coherence, and means to domain adaptation can significantly support ACC, leading to more efficient and effective checking processes.

Related papers

Re:Form -- Reducing Human Priors in Scalable Formal Software Verification with RL in LLMs: A Preliminary Study on Dafny [68.00108157244952]
Large Language Models (LLMs) trained with Reinforcement Learning (RL) face a significant challenge: their verification processes are neither reliable nor scalable. A promising yet largely uncharted alternative is formal language-based reasoning. Grounding LLMs in rigorous formal systems where generative models operate in formal language spaces (e.g., Dafny) enables the automatic and mathematically provable verification of their reasoning processes and outcomes.
arXiv Detail & Related papers (2025-07-22T08:13:01Z)
Leveraging Machine Learning and Enhanced Parallelism Detection for BPMN Model Generation from Text [75.77648333476776]
This paper introduces an automated pipeline for extracting BPMN models from text. A key contribution of this work is the introduction of a newly annotated dataset. We augment the dataset with 15 newly annotated documents containing 32 parallel gateways for model training.
arXiv Detail & Related papers (2025-07-11T07:25:55Z)
Large Language Model-Driven Code Compliance Checking in Building Information Modeling [3.2648052741820166]
This research addresses the time-consuming and error-prone nature of manual code compliance checking in Building Information Modeling. It introduces a Large Language Model (LLM)-driven approach to semi-automate this critical process. The developed system integrates LLMs such as GPT, Claude, Gemini, and Llama, with Revit software to interpret building codes, generate Python scripts, and perform semi-automated compliance checks.
arXiv Detail & Related papers (2025-06-25T15:50:34Z)
Large Language Models are Good Relational Learners [55.40941576497973]
We introduce Rel-LLM, a novel architecture that utilizes a graph neural network (GNN)-based encoder to generate structured relational prompts for large language models (LLMs). Unlike traditional text-based serialization approaches, our method preserves the inherent relational structure of databases while enabling LLMs to process and reason over complex entity relationships.
arXiv Detail & Related papers (2025-06-06T04:07:55Z)
Compliance-to-Code: Enhancing Financial Compliance Checking via Code Generation [36.166087396386445]
We present Compliance-to-Code, the first large-scale Chinese dataset dedicated to financial regulatory compliance. Covering 1,159 annotated clauses from 361 regulations across ten categories, each clause is modularly structured with four logical elements (subject, condition, constraint, and contextual information), along with regulation relations. We provide deterministic Python code mappings, detailed code reasoning, and code explanations to facilitate automated auditing.
arXiv Detail & Related papers (2025-05-26T10:38:32Z)
MoRE-LLM: Mixture of Rule Experts Guided by a Large Language Model [54.14155564592936]
We propose a Mixture of Rule Experts guided by a Large Language Model (MoRE-LLM). MoRE-LLM steers the discovery of local rule-based surrogates during training and their utilization for the classification task. The LLM is responsible for enhancing the domain knowledge alignment of the rules by correcting and contextualizing them.
arXiv Detail & Related papers (2025-03-26T11:09:21Z)
Design and implementation of tools to build an ontology of Security Requirements for Internet of Medical Things [2.446672595462589]
In the Internet of Medical Things (IoMT) world, manufacturers or third parties must be aware of the security requirements expressed by both laws and specifications. An ontology charting the relevant laws and specifications (for the European context) is very useful. Due to the very high number and size of the considered specification documents, we have put in place a methodology and tools to simplify the transition from natural text to an ontology.
arXiv Detail & Related papers (2025-01-06T15:04:45Z)
ARCEAK: An Automated Rule Checking Framework Enhanced with Architectural Knowledge [2.0159170788984024]
Automated Rule Checking (ARC) plays a crucial role in advancing the construction industry by addressing the laborious, inconsistent, and error-prone nature of traditional model review conducted by industry professionals. Our study introduces a novel approach that decomposes ARC into two distinct tasks: rule information extraction and verification code generation.
arXiv Detail & Related papers (2024-12-10T10:37:11Z)
A Multi-Agent Framework for Extensible Structured Text Generation in PLCs [9.555744065377148]
A high-level language adhering to the IEC 61131-3 standard is pivotal for PLCs. The lack of comprehensive and standardized documentation for the full semantics of ST has contributed to inconsistencies in how the language is implemented. We present AutoPLC, an LLM-based approach designed to automate the generation of vendor-specific ST code.
arXiv Detail & Related papers (2024-12-03T12:05:56Z)
KRAG Framework for Enhancing LLMs in the Legal Domain [0.48451657575793666]
This paper introduces Knowledge Representation Augmented Generation (KRAG). KRAG is a framework designed to enhance the capabilities of Large Language Models (LLMs) within domain-specific applications. We present Soft PROLEG, an implementation model under KRAG, which uses inference graphs to aid LLMs in delivering structured legal reasoning.
arXiv Detail & Related papers (2024-10-10T02:48:06Z)
Verbalized Machine Learning: Revisiting Machine Learning with Language Models [63.10391314749408]
We introduce the framework of verbalized machine learning (VML). VML constrains the parameter space to be human-interpretable natural language. We empirically verify the effectiveness of VML, and hope that VML can serve as a stepping stone to stronger interpretability.
arXiv Detail & Related papers (2024-06-06T17:59:56Z)
CELA: Cost-Efficient Language Model Alignment for CTR Prediction [70.65910069412944]
Click-Through Rate (CTR) prediction holds a paramount position in recommender systems. Recent efforts have sought to mitigate these challenges by integrating Pre-trained Language Models (PLMs). We propose Cost-Efficient Language Model Alignment (CELA) for CTR prediction.
arXiv Detail & Related papers (2024-05-17T07:43:25Z)
CODE-ACCORD: A Corpus of building regulatory data for rule generation towards automatic compliance checking [1.9950441865030422]
CODE-ACCORD is a dataset of 862 sentences from the building regulations of England and Finland. It supports a range of ML and Natural Language Processing (NLP) tasks, including text classification, entity recognition, and relation extraction.
arXiv Detail & Related papers (2024-03-04T17:21:19Z)
kNN-ICL: Compositional Task-Oriented Parsing Generalization with Nearest Neighbor In-Context Learning [50.40636157214161]
Task-Oriented Parsing (TOP) enables conversational assistants to interpret user commands expressed in natural language. LLMs have achieved impressive performance in synthesizing computer programs based on a natural language prompt. This paper focuses on harnessing the capabilities of LLMs for semantic parsing tasks.
arXiv Detail & Related papers (2023-12-17T17:26:50Z)
From Text to Structure: Using Large Language Models to Support the Development of Legal Expert Systems [0.6249768559720122]
Rule-based expert systems focused on legislation can support laypeople in understanding how legislation applies to them and provide them with helpful context and information. Here, we investigate to what degree large language models (LLMs), such as GPT-4, are able to automatically extract structured representations from legislation. We use LLMs to create pathways from legislation, according to the JusticeBot methodology for legal decision support systems, evaluate the pathways, and compare them to manually created pathways.
arXiv Detail & Related papers (2023-11-01T18:31:02Z)
Do Language Models Learn about Legal Entity Types during Pretraining? [4.604003661048267]
We show that Llama2 performs well on certain entities and exhibits potential for substantial improvement with optimized prompt templates. Llama2 appears to frequently overlook syntactic cues, a shortcoming less present in BERT-based architectures.
arXiv Detail & Related papers (2023-10-19T18:47:21Z)
Can Large Language Models Understand Real-World Complex Instructions? [54.86632921036983]
Large language models (LLMs) can understand human instructions, but struggle with complex instructions. Existing benchmarks are insufficient to assess LLMs' ability to understand complex instructions. We propose CELLO, a benchmark for evaluating LLMs' ability to follow complex instructions systematically.
arXiv Detail & Related papers (2023-09-17T04:18:39Z)
LLM-FuncMapper: Function Identification for Interpreting Complex Clauses in Building Codes via LLM [3.802984168589694]
LLM-FuncMapper is an approach to identifying predefined functions needed to interpret various regulatory clauses. Almost 100% of computer-processible clauses can be interpreted and represented as computer-executable code. This study is the first attempt to introduce LLMs for understanding and interpreting complex regulatory clauses.
arXiv Detail & Related papers (2023-08-17T01:58:04Z)
Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models [75.75038268227554]
Self-Checker is a framework comprising a set of plug-and-play modules that facilitate fact-checking. This framework provides a fast and efficient way to construct fact-checking systems in low-resource environments.
arXiv Detail & Related papers (2023-05-24T01:46:07Z)
Physics of Language Models: Part 1, Learning Hierarchical Language Structures [51.68385617116854]
Transformer-based language models are effective but complex, and understanding their inner workings is a significant challenge. We introduce a family of synthetic CFGs that produce hierarchical rules, capable of generating lengthy sentences. We demonstrate that generative models like GPT can accurately learn this CFG language and generate sentences based on it.
arXiv Detail & Related papers (2023-05-23T04:28:16Z)
nl2spec: Interactively Translating Unstructured Natural Language to Temporal Logics with Large Language Models [3.1143846686797314]
We present nl2spec, a framework for applying Large Language Models (LLMs) to derive formal specifications from unstructured natural language. We introduce a new methodology to detect and resolve the inherent ambiguity of system requirements in natural language. Users iteratively add, delete, and edit these sub-translations to amend erroneous formalizations, which is easier than manually redrafting the entire formalization.
arXiv Detail & Related papers (2023-03-08T20:08:53Z)
Pre-Trained Language Models for Interactive Decision-Making [72.77825666035203]
We describe a framework for imitation learning in which goals and observations are represented as a sequence of embeddings. We demonstrate that this framework enables effective generalization across different environments. For test tasks involving novel goals or novel scenes, initializing policies with language models improves task completion rates by 43.6%.
arXiv Detail & Related papers (2022-02-03T18:55:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.