Bridging Legal Knowledge and AI: Retrieval-Augmented Generation with Vector Stores, Knowledge Graphs, and Hierarchical Non-negative Matrix Factorization
- URL: http://arxiv.org/abs/2502.20364v1
- Date: Thu, 27 Feb 2025 18:35:39 GMT
- Title: Bridging Legal Knowledge and AI: Retrieval-Augmented Generation with Vector Stores, Knowledge Graphs, and Hierarchical Non-negative Matrix Factorization
- Authors: Ryan C. Barron, Maksim E. Eren, Olga M. Serafimova, Cynthia Matuszek, Boian S. Alexandrov,
- Abstract summary: Agentic Generative AI, powered by Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG), Knowledge Graphs (KGs), and Vector Stores (VSs)<n>This technology excels at inferring relationships within vast unstructured or semi-structured datasets.<n>We introduce a generative AI system that integrates RAG, VS, and KG, constructed via Non-Negative Matrix Factorization (NMF)
- Score: 6.0045906216050815
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Agentic Generative AI, powered by Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG), Knowledge Graphs (KGs), and Vector Stores (VSs), represents a transformative technology applicable to specialized domains such as legal systems, research, recommender systems, cybersecurity, and global security, including proliferation research. This technology excels at inferring relationships within vast unstructured or semi-structured datasets. The legal domain here comprises complex data characterized by extensive, interrelated, and semi-structured knowledge systems with complex relations. It comprises constitutions, statutes, regulations, and case law. Extracting insights and navigating the intricate networks of legal documents and their relations is crucial for effective legal research. Here, we introduce a generative AI system that integrates RAG, VS, and KG, constructed via Non-Negative Matrix Factorization (NMF), to enhance legal information retrieval and AI reasoning and minimize hallucinations. In the legal system, these technologies empower AI agents to identify and analyze complex connections among cases, statutes, and legal precedents, uncovering hidden relationships and predicting legal trends-challenging tasks that are essential for ensuring justice and improving operational efficiency. Our system employs web scraping techniques to systematically collect legal texts, such as statutes, constitutional provisions, and case law, from publicly accessible platforms like Justia. It bridges the gap between traditional keyword-based searches and contextual understanding by leveraging advanced semantic representations, hierarchical relationships, and latent topic discovery. This framework supports legal document clustering, summarization, and cross-referencing, for scalable, interpretable, and accurate retrieval for semi-structured data while advancing computational law and AI.
Related papers
- Graph RAG for Legal Norms: A Hierarchical and Temporal Approach [0.0]
This article proposes an adaptation of Graph Retrieval Augmented Generation (Graph RAG) specifically designed for the analysis and comprehension of legal norms.
By combining structured knowledge graphs with contextually enriched text segments, Graph RAG offers a promising solution to address the inherent complexity and vast volume of legal data.
arXiv Detail & Related papers (2025-04-29T18:36:57Z) - Toward Agentic AI: Generative Information Retrieval Inspired Intelligent Communications and Networking [87.82985288731489]
Agentic AI has emerged as a key paradigm for intelligent communications and networking.<n>This article emphasizes the role of knowledge acquisition, processing, and retrieval in agentic AI for telecom systems.
arXiv Detail & Related papers (2025-02-24T06:02:25Z) - GeAR: Generation Augmented Retrieval [82.20696567697016]
Document retrieval techniques form the foundation for the development of large-scale information systems.<n>The prevailing methodology is to construct a bi-encoder and compute the semantic similarity.<n>We propose a new method called $textbfGe$neration that incorporates well-designed fusion and decoding modules.
arXiv Detail & Related papers (2025-01-06T05:29:00Z) - A Comprehensive Framework for Reliable Legal AI: Combining Specialized Expert Systems and Adaptive Refinement [0.0]
Article proposes a novel framework combining expert systems with a knowledge-based architecture to improve the precision and contextual relevance of AI-driven legal services.<n>This framework utilizes specialized modules, each focusing on specific legal areas, and incorporates structured operational guidelines to enhance decision-making.<n>The proposed approach demonstrates significant improvements over existing AI models, showcasing enhanced performance in legal tasks and offering a scalable solution to provide more accessible and affordable legal services.
arXiv Detail & Related papers (2024-12-29T14:00:11Z) - Unlocking Legal Knowledge with Multi-Layered Embedding-Based Retrieval [0.0]
We propose a multi-layered embedding-based retrieval method for legal and legislative texts.
Our method meets various information needs by allowing the Retrieval Augmented Generation system to provide accurate responses.
arXiv Detail & Related papers (2024-11-12T12:03:57Z) - Leveraging Knowledge Graphs and LLMs to Support and Monitor Legislative Systems [0.0]
This work investigates how Legislative Knowledge Graphs and LLMs can synergize and support legislative processes.
To this aim, we develop Legis AI Platform, an interactive platform focused on Italian legislation that enhances the possibility of conducting legislative analysis.
arXiv Detail & Related papers (2024-09-20T06:21:03Z) - Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents [55.63497537202751]
Article explores the convergence of connectionist and symbolic artificial intelligence (AI)
Traditionally, connectionist AI focuses on neural networks, while symbolic AI emphasizes symbolic representation and logic.
Recent advancements in large language models (LLMs) highlight the potential of connectionist architectures in handling human language as a form of symbols.
arXiv Detail & Related papers (2024-07-11T14:00:53Z) - Report of the 1st Workshop on Generative AI and Law [78.62063815165968]
This report presents the takeaways of the inaugural Workshop on Generative AI and Law (GenLaw)
A cross-disciplinary group of practitioners and scholars from computer science and law convened to discuss the technical, doctrinal, and policy challenges presented by law for Generative AI.
arXiv Detail & Related papers (2023-11-11T04:13:37Z) - Constructing a Knowledge Graph for Vietnamese Legal Cases with
Heterogeneous Graphs [5.168558598888541]
This paper presents a knowledge graph construction method for legal case documents and related laws.
Our approach consists of three main steps: data crawling, information extraction, and knowledge graph deployment.
arXiv Detail & Related papers (2023-09-16T18:31:47Z) - SAILER: Structure-aware Pre-trained Language Model for Legal Case
Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system.
Most existing language models have difficulty understanding the long-distance dependencies between different structures.
We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z) - Finding the Law: Enhancing Statutory Article Retrieval via Graph Neural
Networks [3.5880535198436156]
We propose a novel graph-augmented dense statute retriever (G-DSR) model that incorporates the structure of legislation via a graph neural network to improve dense retrieval performance.
Experimental results show that our approach outperforms strong retrieval baselines on a real-world expert-annotated SAR dataset.
arXiv Detail & Related papers (2023-01-30T12:59:09Z) - Towards an Interface Description Template for AI-enabled Systems [77.34726150561087]
Reuse is a common system architecture approach that seeks to instantiate a system architecture with existing components.
There is currently no framework that guides the selection of necessary information to assess their portability to operate in a system different than the one for which the component was originally purposed.
We present ongoing work on establishing an interface description template that captures the main information of an AI-enabled component.
arXiv Detail & Related papers (2020-07-13T20:30:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.