LAPIS: Language Model-Augmented Police Investigation System
- URL: http://arxiv.org/abs/2407.20248v2
- Date: Wed, 31 Jul 2024 05:16:30 GMT
- Title: LAPIS: Language Model-Augmented Police Investigation System
- Authors: Heedou Kim, Dain Kim, Jiwoo Lee, Chanwoong Yoon, Donghee Choi, Mogan Gim, Jaewoo Kang,
- Abstract summary: We introduce LAPIS (Language Model Augmented Police Investigation System), an automated system that assists police officers to perform rational and legal investigative actions.
We constructed a finetuning dataset and retrieval knowledgebase specialized in crime investigation legal reasoning task.
Experimental results show LAPIS' potential in providing reliable legal guidance for police officers, even better than the proprietary GPT-4 model.
- Score: 16.579861300355343
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Crime situations are race against time. An AI-assisted criminal investigation system, providing prompt but precise legal counsel is in need for police officers. We introduce LAPIS (Language Model Augmented Police Investigation System), an automated system that assists police officers to perform rational and legal investigative actions. We constructed a finetuning dataset and retrieval knowledgebase specialized in crime investigation legal reasoning task. We extended the dataset's quality by incorporating manual curation efforts done by a group of domain experts. We then finetuned the pretrained weights of a smaller Korean language model to the newly constructed dataset and integrated it with the crime investigation knowledgebase retrieval approach. Experimental results show LAPIS' potential in providing reliable legal guidance for police officers, even better than the proprietary GPT-4 model. Qualitative analysis on the rationales generated by LAPIS demonstrate the model's reasoning ability to leverage the premises and derive legally correct conclusions.
Related papers
- Experimenting with Legal AI Solutions: The Case of Question-Answering for Access to Justice [32.550204238857724]
We propose a human-centric legal NLP pipeline, covering data sourcing, inference, and evaluation.
We release a dataset, LegalQA, with real and specific legal questions spanning from employment law to criminal law.
We show that retrieval-augmented generation from only 850 citations in the train set can match or outperform internet-wide retrieval.
arXiv Detail & Related papers (2024-09-12T02:40:28Z) - CLERC: A Dataset for Legal Case Retrieval and Retrieval-Augmented Analysis Generation [44.67578050648625]
We transform a large open-source legal corpus into a dataset supporting information retrieval (IR) and retrieval-augmented generation (RAG)
This dataset CLERC is constructed for training and evaluating models on their ability to (1) find corresponding citations for a given piece of legal analysis and to (2) compile the text of these citations into a cogent analysis that supports a reasoning goal.
arXiv Detail & Related papers (2024-06-24T23:57:57Z) - InternLM-Law: An Open Source Chinese Legal Large Language Model [72.2589401309848]
InternLM-Law is a specialized LLM tailored for addressing diverse legal queries related to Chinese laws.
We meticulously construct a dataset in the Chinese legal domain, encompassing over 1 million queries.
InternLM-Law achieves the highest average performance on LawBench, outperforming state-of-the-art models, including GPT-4, on 13 out of 20 subtasks.
arXiv Detail & Related papers (2024-06-21T06:19:03Z) - Empowering Prior to Court Legal Analysis: A Transparent and Accessible Dataset for Defensive Statement Classification and Interpretation [5.646219481667151]
This paper introduces a novel dataset tailored for classification of statements made during police interviews, prior to court proceedings.
We introduce a fine-tuned DistilBERT model that achieves state-of-the-art performance in distinguishing truthful from deceptive statements.
We also present an XAI interface that empowers both legal professionals and non-specialists to interact with and benefit from our system.
arXiv Detail & Related papers (2024-05-17T11:22:27Z) - SAILER: Structure-aware Pre-trained Language Model for Legal Case
Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system.
Most existing language models have difficulty understanding the long-distance dependencies between different structures.
We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z) - Criminal Investigation Tracker with Suspect Prediction using Machine
Learning [0.0]
This study provides a novel approach for crime prediction based on real-world data, and criminality incorporation.
An automated approach to identifying offenders in Sri Lanka would be better than the current system.
arXiv Detail & Related papers (2023-02-21T03:24:17Z) - LaMDA: Language Models for Dialog Applications [75.75051929981933]
LaMDA is a family of Transformer-based neural language models specialized for dialog.
Fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements.
arXiv Detail & Related papers (2022-01-20T15:44:37Z) - Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents [56.40163943394202]
We release the Longformer-based pre-trained language model, named as Lawformer, for Chinese legal long documents understanding.
We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering.
arXiv Detail & Related papers (2021-05-09T09:39:25Z) - The effect of differential victim crime reporting on predictive policing
systems [84.86615754515252]
We show how differential victim crime reporting rates can lead to outcome disparities in common crime hot spot prediction models.
Our results suggest that differential crime reporting rates can lead to a displacement of predicted hotspots from high crime but low reporting areas to high or medium crime and high reporting areas.
arXiv Detail & Related papers (2021-01-30T01:57:22Z) - Multi-officer Routing for Patrolling High Risk Areas Jointly Learned
from Check-ins, Crime and Incident Response Data [6.295207672539996]
We formulate the dynamic crime patrol planning problem for multiple police officers using check-ins, crime, incident response data, and POI information.
We propose a joint learning and non-random optimisation method for the representation of possible solutions.
The performance of the proposed solution is verified and compared with several state-of-art methods using real-world datasets.
arXiv Detail & Related papers (2020-07-31T23:33:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.