Related papers: Conditioning Large Language Models on Legal Systems? Detecting Punishable Hate Speech

Conditioning Large Language Models on Legal Systems? Detecting Punishable Hate Speech

URL: http://arxiv.org/abs/2506.03009v1
Date: Tue, 03 Jun 2025 15:50:27 GMT
Title: Conditioning Large Language Models on Legal Systems? Detecting Punishable Hate Speech
Authors: Florian Ludwig, Torsten Zesch, Frederike Zufall,
Abstract summary: This paper examines different approaches to conditioning Large Language Models (LLMs) at multiple levels of abstraction in legal systems to detect potentially punishable hate speech.<n>We focus on the task of classifying whether a specific social media posts falls under the criminal offense of incitement to hatred as prescribed by the German Criminal Code.<n>The results show that there is still a significant performance gap between models and legal experts in the legal assessment of hate speech, regardless of the level of abstraction with which the models were conditioned.
Score: 3.4300974012019148
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: The assessment of legal problems requires the consideration of a specific legal system and its levels of abstraction, from constitutional law to statutory law to case law. The extent to which Large Language Models (LLMs) internalize such legal systems is unknown. In this paper, we propose and investigate different approaches to condition LLMs at different levels of abstraction in legal systems. This paper examines different approaches to conditioning LLMs at multiple levels of abstraction in legal systems to detect potentially punishable hate speech. We focus on the task of classifying whether a specific social media posts falls under the criminal offense of incitement to hatred as prescribed by the German Criminal Code. The results show that there is still a significant performance gap between models and legal experts in the legal assessment of hate speech, regardless of the level of abstraction with which the models were conditioned. Our analysis revealed, that models conditioned on abstract legal knowledge lacked deep task understanding, often contradicting themselves and hallucinating answers, while models using concrete legal knowledge performed reasonably well in identifying relevant target groups, but struggled with classifying target conducts.

Related papers

AUTOLAW: Enhancing Legal Compliance in Large Language Models via Case Law Generation and Jury-Inspired Deliberation [5.732271982985626]
AutoLaw is a novel violation detection framework for domain-specific large language models (LLMs)<n>It combines adversarial data generation with a jury-inspired deliberation process to enhance legal compliance of LLMs.<n>Our results highlight the framework's ability to adaptively probe legal misalignments and deliver reliable, context-aware judgments.
arXiv Detail & Related papers (2025-05-20T07:09:13Z)
LEXam: Benchmarking Legal Reasoning on 340 Law Exams [61.344330783528015]
LEXam is a novel benchmark derived from 340 law exams spanning 116 law school courses across a range of subjects and degree levels.<n>The dataset comprises 4,886 law exam questions in English and German, including 2,841 long-form, open-ended questions and 2,045 multiple-choice questions.
arXiv Detail & Related papers (2025-05-19T08:48:12Z)
AnnoCaseLaw: A Richly-Annotated Dataset For Benchmarking Explainable Legal Judgment Prediction [56.797874973414636]
AnnoCaseLaw is a first-of-its-kind dataset of 471 meticulously annotated U.S. Appeals Court negligence cases.<n>Our dataset lays the groundwork for more human-aligned, explainable Legal Judgment Prediction models.<n>Results demonstrate that LJP remains a formidable task, with application of legal precedent proving particularly difficult.
arXiv Detail & Related papers (2025-02-28T19:14:48Z)
Legal Evalutions and Challenges of Large Language Models [42.51294752406578]
We use the OPENAI o1 model as a case study to evaluate the performance of large models in applying legal provisions. We compare current state-of-the-art LLMs, including open-source, closed-source, and legal-specific models trained specifically for the legal domain.
arXiv Detail & Related papers (2024-11-15T12:23:12Z)
DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment [55.91429725404988]
We introduce DELTA, a discriminative model designed for legal case retrieval. We leverage shallow decoders to create information bottlenecks, aiming to enhance the representation ability. Our approach can outperform existing state-of-the-art methods in legal case retrieval.
arXiv Detail & Related papers (2024-03-27T10:40:14Z)
LLM vs. Lawyers: Identifying a Subset of Summary Judgments in a Large UK Case Law Dataset [0.0]
This study addresses the gap in the literature working with large legal corpora about how to isolate cases, in our case summary judgments, from a large corpus of UK court decisions. We use the Cambridge Law Corpus of 356,011 UK court decisions and determine that the large language model achieves a weighted F1 score of 0.94 versus 0.78 for keywords. We identify and extract 3,102 summary judgment cases, enabling us to map their distribution across various UK courts over a temporal span.
arXiv Detail & Related papers (2024-03-04T10:13:30Z)
SAILER: Structure-aware Pre-trained Language Model for Legal Case Retrieval [75.05173891207214]
Legal case retrieval plays a core role in the intelligent legal system. Most existing language models have difficulty understanding the long-distance dependencies between different structures. We propose a new Structure-Aware pre-traIned language model for LEgal case Retrieval.
arXiv Detail & Related papers (2023-04-22T10:47:01Z)
Exploiting Contrastive Learning and Numerical Evidence for Confusing Legal Judgment Prediction [46.71918729837462]
Given the fact description text of a legal case, legal judgment prediction aims to predict the case's charge, law article and penalty term. Previous studies fail to distinguish different classification errors with a standard cross-entropy classification loss. We propose a moco-based supervised contrastive learning to learn distinguishable representations. We further enhance the representation of the fact description with extracted crime amounts which are encoded by a pre-trained numeracy model.
arXiv Detail & Related papers (2022-11-15T15:53:56Z)
Legal Element-oriented Modeling with Multi-view Contrastive Learning for Legal Case Retrieval [3.909749182759558]
We propose an interaction-focused network for legal case retrieval with a multi-view contrastive learning objective. Case-view contrastive learning minimizes the hidden space distance between relevant legal case representations. We employ a legal element knowledge-aware indicator to detect legal elements of cases.
arXiv Detail & Related papers (2022-10-11T06:47:23Z)
Lawformer: A Pre-trained Language Model for Chinese Legal Long Documents [56.40163943394202]
We release the Longformer-based pre-trained language model, named as Lawformer, for Chinese legal long documents understanding. We evaluate Lawformer on a variety of LegalAI tasks, including judgment prediction, similar case retrieval, legal reading comprehension, and legal question answering.
arXiv Detail & Related papers (2021-05-09T09:39:25Z)
A Legal Approach to Hate Speech: Operationalizing the EU's Legal Framework against the Expression of Hatred as an NLP Task [2.248133901806859]
We propose a 'legal approach' to hate speech detection by operationalization of the decision as to whether a post is subject to criminal law. We show that, by breaking the legal assessment down into a series of simpler sub-decisions, even laypersons can annotate consistently.
arXiv Detail & Related papers (2020-04-07T14:13:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.