Towards Explainability and Fairness in Swiss Judgement Prediction:
Benchmarking on a Multilingual Dataset
- URL: http://arxiv.org/abs/2402.17013v1
- Date: Mon, 26 Feb 2024 20:42:40 GMT
- Title: Towards Explainability and Fairness in Swiss Judgement Prediction:
Benchmarking on a Multilingual Dataset
- Authors: Santosh T.Y.S.S, Nina Baumgartner, Matthias St\"urmer, Matthias
Grabmair, Joel Niklaus
- Abstract summary: This study delves into the realm of explainability and fairness in Legal Judgement Prediction (LJP) models.
We evaluate the explainability performance of state-of-the-art monolingual and multilingual BERT-based LJP models.
We introduce a novel evaluation framework, Lower Court Insertion (LCI), which allows us to quantify the influence of lower court information on model predictions.
- Score: 2.7463268699570134
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The assessment of explainability in Legal Judgement Prediction (LJP) systems
is of paramount importance in building trustworthy and transparent systems,
particularly considering the reliance of these systems on factors that may lack
legal relevance or involve sensitive attributes. This study delves into the
realm of explainability and fairness in LJP models, utilizing Swiss Judgement
Prediction (SJP), the only available multilingual LJP dataset. We curate a
comprehensive collection of rationales that `support' and `oppose' judgement
from legal experts for 108 cases in German, French, and Italian. By employing
an occlusion-based explainability approach, we evaluate the explainability
performance of state-of-the-art monolingual and multilingual BERT-based LJP
models, as well as models developed with techniques such as data augmentation
and cross-lingual transfer, which demonstrated prediction performance
improvement. Notably, our findings reveal that improved prediction performance
does not necessarily correspond to enhanced explainability performance,
underscoring the significance of evaluating models from an explainability
perspective. Additionally, we introduce a novel evaluation framework, Lower
Court Insertion (LCI), which allows us to quantify the influence of lower court
information on model predictions, exposing current models' biases.
Related papers
- Editable Fairness: Fine-Grained Bias Mitigation in Language Models [52.66450426729818]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases.
FAST surpasses state-of-the-art baselines with superior debiasing performance.
This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z) - TRACE: TRansformer-based Attribution using Contrastive Embeddings in LLMs [50.259001311894295]
We propose a novel TRansformer-based Attribution framework using Contrastive Embeddings called TRACE.
We show that TRACE significantly improves the ability to attribute sources accurately, making it a valuable tool for enhancing the reliability and trustworthiness of large language models.
arXiv Detail & Related papers (2024-07-06T07:19:30Z) - Enabling Discriminative Reasoning in LLMs for Legal Judgment Prediction [23.046342240176575]
We introduce the Ask-Discriminate-Predict (ADAPT) reasoning framework inspired by human reasoning.
ADAPT involves decomposing case facts, discriminating among potential charges, and predicting the final judgment.
Experiments conducted on two widely-used datasets demonstrate the superior performance of our framework in legal judgment prediction.
arXiv Detail & Related papers (2024-07-02T05:43:15Z) - Empowering Prior to Court Legal Analysis: A Transparent and Accessible Dataset for Defensive Statement Classification and Interpretation [5.646219481667151]
This paper introduces a novel dataset tailored for classification of statements made during police interviews, prior to court proceedings.
We introduce a fine-tuned DistilBERT model that achieves state-of-the-art performance in distinguishing truthful from deceptive statements.
We also present an XAI interface that empowers both legal professionals and non-specialists to interact with and benefit from our system.
arXiv Detail & Related papers (2024-05-17T11:22:27Z) - VALOR-EVAL: Holistic Coverage and Faithfulness Evaluation of Large Vision-Language Models [57.43276586087863]
Large Vision-Language Models (LVLMs) suffer from hallucination issues, wherein the models generate plausible-sounding but factually incorrect outputs.
Existing benchmarks are often limited in scope, focusing mainly on object hallucinations.
We introduce a multi-dimensional benchmark covering objects, attributes, and relations, with challenging images selected based on associative biases.
arXiv Detail & Related papers (2024-04-22T04:49:22Z) - Precedent-Enhanced Legal Judgment Prediction with LLM and Domain-Model
Collaboration [52.57055162778548]
Legal Judgment Prediction (LJP) has become an increasingly crucial task in Legal AI.
Precedents are the previous legal cases with similar facts, which are the basis for the judgment of the subsequent case in national legal systems.
Recent advances in deep learning have enabled a variety of techniques to be used to solve the LJP task.
arXiv Detail & Related papers (2023-10-13T16:47:20Z) - Explaining Language Models' Predictions with High-Impact Concepts [11.47612457613113]
We propose a complete framework for extending concept-based interpretability methods to NLP.
We optimize for features whose existence causes the output predictions to change substantially.
Our method achieves superior results on predictive impact, usability, and faithfulness compared to the baselines.
arXiv Detail & Related papers (2023-05-03T14:48:27Z) - Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models [51.3422222472898]
We document the capability of large language models (LLMs) like ChatGPT to predict stock price movements using news headlines.
We develop a theoretical model incorporating information capacity constraints, underreaction, limits-to-arbitrage, and LLMs.
arXiv Detail & Related papers (2023-04-15T19:22:37Z) - Knowledge is Power: Understanding Causality Makes Legal judgment
Prediction Models More Generalizable and Robust [3.555105847974074]
Legal Judgment Prediction (LJP) serves as legal assistance to mitigate the great work burden of limited legal practitioners.
Most existing methods apply various large-scale pre-trained language models finetuned in LJP tasks to obtain consistent improvements.
We discover that the state-of-the-art (SOTA) model makes judgment predictions according to irrelevant (or non-casual) information.
arXiv Detail & Related papers (2022-11-06T07:03:31Z) - Deconfounding Legal Judgment Prediction for European Court of Human
Rights Cases Towards Better Alignment with Experts [1.252149409594807]
This work demonstrates that Legal Judgement Prediction systems without expert-informed adjustments can be vulnerable to shallow, distracting surface signals.
To mitigate this, we use domain expertise to strategically identify statistically predictive but legally irrelevant information.
arXiv Detail & Related papers (2022-10-25T08:37:25Z) - Measuring Fairness of Text Classifiers via Prediction Sensitivity [63.56554964580627]
ACCUMULATED PREDICTION SENSITIVITY measures fairness in machine learning models based on the model's prediction sensitivity to perturbations in input features.
We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness.
arXiv Detail & Related papers (2022-03-16T15:00:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.