Document Data Matching for Blockchain-Supported Real Estate
- URL: http://arxiv.org/abs/2512.24457v1
- Date: Tue, 30 Dec 2025 20:30:48 GMT
- Title: Document Data Matching for Blockchain-Supported Real Estate
- Authors: Henrique Lin, Tiago Dias, Miguel Correia,
- Abstract summary: This work presents a system that integrates optical character recognition (OCR), natural language processing (NLP), and verifiable credentials (VCs) to automate document extraction, verification, and management.<n>The approach standardizes heterogeneous document formats into VCs and applies automated data matching to detect inconsistencies, while the blockchain provides a decentralized trust layer that reinforces transparency and integrity.<n>The proposed framework demonstrates the potential to streamline real estate transactions, strengthen stakeholder trust, and enable scalable, secure digital processes.
- Score: 2.9873162504735133
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The real estate sector remains highly dependent on manual document handling and verification, making processes inefficient and prone to fraud. This work presents a system that integrates optical character recognition (OCR), natural language processing (NLP), and verifiable credentials (VCs) to automate document extraction, verification, and management. The approach standardizes heterogeneous document formats into VCs and applies automated data matching to detect inconsistencies, while the blockchain provides a decentralized trust layer that reinforces transparency and integrity. A prototype was developed that comprises (i) an OCR-NLP extraction pipeline trained on synthetic datasets, (ii) a backend for credential issuance and management, and (iii) a frontend supporting issuer, holder, and verifier interactions. Experimental results show that the models achieve competitive accuracy across multiple document types and that the end-to-end pipeline reduces verification time while preserving reliability. The proposed framework demonstrates the potential to streamline real estate transactions, strengthen stakeholder trust, and enable scalable, secure digital processes.
Related papers
- Composable Attestation: A Generalized Framework for Continuous and Incremental Trust in AI-Driven Distributed Systems [4.2822349607372265]
This paper presents composable attestation as a generalized cryptographic framework for Continuous and Incremental Trust in Distributed Systems.<n>We establish a rigorous mathematical foundation which is defining core properties of such attestation systems.<n>The framework's utility extends to applications such as secure AI model integrity verification, federated learning, and runtime trust assurance.
arXiv Detail & Related papers (2026-03-02T22:45:26Z) - Non-Fungible Blockchain Tokens for Traceable Online-Quality Assurance of Milled Workpieces [67.2286008549719]
This work presents a concept and implementation for the secure storage and transfer of quality-relevant data of milleds from online-quality assurance processes.<n>The concept enables traceability throughout the value chain, minimising the need for time-consuming and costly repetitive manual quality checks.
arXiv Detail & Related papers (2026-02-10T13:53:11Z) - Towards Verifiably Safe Tool Use for LLM Agents [53.55621104327779]
Large language model (LLM)-based AI agents extend capabilities by enabling access to tools such as data sources, APIs, search engines, code sandboxes, and even other agents.<n>LLMs may invoke unintended tool interactions and introduce risks, such as leaking sensitive data or overwriting critical records.<n>Current approaches to mitigate these risks, such as model-based safeguards, enhance agents' reliability but cannot guarantee system safety.
arXiv Detail & Related papers (2026-01-12T21:31:38Z) - Privacy-Preserving AI-Enabled Decentralized Learning and Employment Records System [2.756161833954979]
Learning and Employment Record (LER) systems are emerging as critical infrastructure for securely compiling and sharing educational and work achievements.<n>Existing blockchain-based platforms leverage verifiable credentials but typically lack automated skill-credential generation and the ability to incorporate unstructured evidence of learning.<n>This paper proposes a privacy-preserving, AI-enabled decentralized LER system to address these gaps.
arXiv Detail & Related papers (2026-01-06T05:18:03Z) - ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search [69.60882125603133]
We present ReliabilityRAG, a framework for adversarial robustness that explicitly leverages reliability information of retrieved documents.<n>Our work is a significant step towards more effective, provably robust defenses against retrieved corpus corruption in RAG.
arXiv Detail & Related papers (2025-09-27T22:36:42Z) - Context Lineage Assurance for Non-Human Identities in Critical Multi-Agent Systems [0.08316523707191924]
We introduce a cryptographically grounded mechanism for lineage verification, anchored in append-only Merkle tree structures.<n>Unlike traditional A2A models that primarily secure point-to-point interactions, our approach enables both agents and external verifiers to cryptographically validate multi-hop provenance.<n>In parallel, we augment the A2A agent card to incorporate explicit identity verification primitives, enabling both peer agents and human approvers to authenticate the legitimacy of NHI representations.
arXiv Detail & Related papers (2025-09-22T20:59:51Z) - CTCC: A Robust and Stealthy Fingerprinting Framework for Large Language Models via Cross-Turn Contextual Correlation Backdoor [21.71721725818432]
We introduce CTCC, a novel rule-driven fingerprinting framework.<n>CTCC encodes contextual correlations across multiple dialogue turns, such as counterfactual, rather than relying on token-level or single-turn triggers.<n>Our findings position CTCC as a reliable and practical solution for ownership verification in real-world LLM deployment scenarios.
arXiv Detail & Related papers (2025-09-05T05:59:50Z) - AI-Governed Agent Architecture for Web-Trustworthy Tokenization of Alternative Assets [3.0801485631077457]
Alternative Assets tokenization is transforming non-traditional financial instruments are represented and traded on the web.<n>This paper proposes an AI-governed agent architecture that integrates intelligent agents with blockchain to achieve web-trustworthy tokenization of alternative assets.
arXiv Detail & Related papers (2025-06-30T11:28:51Z) - Towards Robust Fact-Checking: A Multi-Agent System with Advanced Evidence Retrieval [1.515687944002438]
The rapid spread of misinformation in the digital era poses significant challenges to public discourse.<n>Traditional human-led fact-checking methods, while credible, struggle with the volume and velocity of online content.<n>This paper proposes a novel multi-agent system for automated fact-checking that enhances accuracy, efficiency, and explainability.
arXiv Detail & Related papers (2025-06-22T02:39:27Z) - OpenFactCheck: Building, Benchmarking Customized Fact-Checking Systems and Evaluating the Factuality of Claims and LLMs [59.836774258359945]
OpenFactCheck is a framework for building customized automatic fact-checking systems.<n>It allows users to easily customize an automatic fact-checker and verify the factual correctness of documents and claims.<n>CheckerEVAL is a solution for gauging the reliability of automatic fact-checkers' verification results using human-annotated datasets.
arXiv Detail & Related papers (2024-05-09T07:15:19Z) - Trustless Audits without Revealing Data or Models [49.23322187919369]
We show that it is possible to allow model providers to keep their model weights (but not architecture) and data secret while allowing other parties to trustlessly audit model and data properties.
We do this by designing a protocol called ZkAudit in which model providers publish cryptographic commitments of datasets and model weights.
arXiv Detail & Related papers (2024-04-06T04:43:06Z) - FedSOV: Federated Model Secure Ownership Verification with Unforgeable
Signature [60.99054146321459]
Federated learning allows multiple parties to collaborate in learning a global model without revealing private data.
We propose a cryptographic signature-based federated learning model ownership verification scheme named FedSOV.
arXiv Detail & Related papers (2023-05-10T12:10:02Z) - Auditing and Generating Synthetic Data with Controllable Trust Trade-offs [54.262044436203965]
We introduce a holistic auditing framework that comprehensively evaluates synthetic datasets and AI models.
It focuses on preventing bias and discrimination, ensures fidelity to the source data, assesses utility, robustness, and privacy preservation.
We demonstrate the framework's effectiveness by auditing various generative models across diverse use cases.
arXiv Detail & Related papers (2023-04-21T09:03:18Z) - Lights, Camera, Action! A Framework to Improve NLP Accuracy over OCR
documents [2.6201102730518606]
We demonstrate an effective framework for mitigating OCR errors for any downstream NLP task.
We first address the data scarcity problem for model training by constructing a document synthesis pipeline.
For the benefit of the community, we have made the document synthesis pipeline available as an open-source project.
arXiv Detail & Related papers (2021-08-06T00:32:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.