Knowledge is Power: Understanding Causality Makes Legal Judgment Prediction Models More Generalizable and Robust
- URL: http://arxiv.org/abs/2211.03046v2
- Date: Tue, 18 Apr 2023 12:06:18 GMT
- Title: Knowledge is Power: Understanding Causality Makes Legal Judgment Prediction Models More Generalizable and Robust
- Authors: Haotian Chen, Lingwei Zhang, Yiran Liu, Fanchao Chen, Yang Yu
- Abstract summary: Legal Judgment Prediction (LJP) serves as legal assistance to mitigate the great work burden of limited legal practitioners.
Most existing methods apply various large-scale pre-trained language models fine-tuned on LJP tasks to obtain consistent improvements.
We discover that the state-of-the-art (SOTA) model makes judgment predictions according to irrelevant (or non-causal) information.
- Score: 3.555105847974074
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Legal Judgment Prediction (LJP), which aims to predict a judgment from fact
descriptions according to the rule of law, serves as legal assistance to mitigate the heavy
workload of the limited number of legal practitioners. Most existing methods fine-tune various
large-scale pre-trained language models (PLMs) on LJP tasks to obtain consistent improvements.
However, we discover that the state-of-the-art (SOTA) model makes judgment predictions
according to irrelevant (or non-causal) information. Such violations of the rule of law not
only weaken the robustness and generalization ability of models but also lead to severe social
problems such as discrimination. In this paper, we use structural causal models (SCMs) to
theoretically analyze how LJP models learn to make decisions and why they can pass the
traditional testing paradigm without learning causality. Based on this analysis, we provide
two solutions that intervene on the data and on the model via causality, respectively.
Specifically, we first distinguish non-causal information by applying the open information
extraction (OIE) technique. We then propose the Causal Information Enhanced SAmpling Method
(CIESAM), which eliminates the non-causal information from the data. To validate our
theoretical analysis, we further propose the Causality-Aware Self-Attention Mechanism (CASAM),
which guides the model to learn the underlying causal knowledge in legal texts. CASAM learns
causal information with higher confidence than CIESAM does. Extensive experimental results
show that both proposed methods achieve state-of-the-art (SOTA) performance on three commonly
used legal-specific datasets. The stronger performance of CASAM further demonstrates that
causality is the key to the robustness and generalization ability of models.
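The listing includes no code, but the core idea behind CASAM, biasing self-attention away from tokens flagged as non-causal, can be sketched in a few lines. The sketch below is mine, not the authors': the function name, the additive-penalty formulation, and the binary causal_mask (which the paper would derive via OIE-based analysis) are all assumptions.

```python
import numpy as np

def causality_aware_attention(Q, K, V, causal_mask, alpha=4.0):
    """Scaled dot-product attention that suppresses keys flagged as
    non-causal: causal_mask[j] = 1 keeps token j, 0 down-weights it.
    alpha controls how strongly flagged tokens are penalized."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores = scores - alpha * (1.0 - causal_mask)[None, :]  # penalize non-causal keys
    scores = scores - scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Toy usage: 5 tokens, tokens 1 and 3 flagged as non-causal.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(5, 16)) for _ in range(3))
mask = np.array([1.0, 0.0, 1.0, 0.0, 1.0])
print(causality_aware_attention(Q, K, V, mask).shape)  # (5, 16)
```

A data-side analogue in the spirit of CIESAM would instead drop or resample the flagged spans before training, leaving the model architecture untouched.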
Related papers
- Precedent-Enhanced Legal Judgment Prediction with LLM and Domain-Model Collaboration [52.57055162778548]
Legal Judgment Prediction (LJP) has become an increasingly crucial task in Legal AI.
Precedents are previous legal cases with similar facts, which serve as the basis for judging subsequent cases in many national legal systems.
Recent advances in deep learning have enabled a variety of techniques to be used to solve the LJP task.
arXiv Detail & Related papers (2023-10-13T16:47:20Z) - Interpretable Imitation Learning with Dynamic Causal Relations [65.18456572421702]
We propose to expose captured knowledge in the form of a directed acyclic causal graph.
We also design this causal discovery process to be state-dependent, enabling it to model the dynamics in latent causal graphs.
The proposed framework is composed of three parts: a dynamic causal discovery module, a causality encoding module, and a prediction module, and is trained in an end-to-end manner.
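As a rough illustration of the three-part layout described above (module names, shapes, and internals below are my assumptions, not the paper's code), one simple way to keep a state-dependent discovered graph acyclic is to restrict its adjacency matrix to be strictly upper-triangular:

```python
import numpy as np

def dynamic_causal_discovery(state):
    """Toy state-dependent discovery: edge weights are derived from the
    state, and strict upper-triangularity guarantees a DAG."""
    z = state[:4]
    return np.triu(np.outer(z, z), k=1)

def causality_encoding(state, A):
    """Propagate each latent factor along the discovered edges."""
    z = state[:4]
    return z + A.T @ z

def prediction(encoded):
    return float(encoded.mean())

state = np.linspace(0.1, 0.8, 8)
A = dynamic_causal_discovery(state)   # the graph changes with the state
print(prediction(causality_encoding(state, A)))
```

In the actual framework the three modules are trained jointly end-to-end; here they are merely composed to show the data flow.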
arXiv Detail & Related papers (2023-09-30T20:59:42Z) - Learning for Counterfactual Fairness from Observational Data [62.43249746968616]
Fairness-aware machine learning aims to eliminate biases of learning models against subgroups defined by protected (sensitive) attributes such as race, gender, and age.
A prerequisite for existing methods to achieve counterfactual fairness is prior human knowledge of the causal model for the data.
In this work, we address the problem of counterfactually fair prediction from observational data, without a given causal model, by proposing a novel framework, CLAIRE.
arXiv Detail & Related papers (2023-07-17T04:08:29Z) - Did the Models Understand Documents? Benchmarking Models for Language
Understanding in Document-Level Relation Extraction [2.4665182280122577]
Document-level relation extraction (DocRE) has attracted increasing research interest recently.
While models achieve consistent performance gains in DocRE, their underlying decision rules are still understudied.
In this paper, we take the first step toward answering the question of whether models truly understand documents, and we introduce a new perspective for comprehensively evaluating a model.
arXiv Detail & Related papers (2023-06-20T08:52:05Z) - Preserving Commonsense Knowledge from Pre-trained Language Models via
Causal Inference [20.5696436171006]
Most existing studies attribute the loss of commonsense knowledge during fine-tuning to catastrophic forgetting, and they retain the pre-trained knowledge indiscriminately.
We frame fine-tuning as a causal graph and discover that the crux of catastrophic forgetting lies in the missing causal effects from the pre-trained data.
In the experiments, our method outperforms state-of-the-art fine-tuning methods on all six commonsense QA datasets.
arXiv Detail & Related papers (2023-06-19T09:06:44Z) - Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amount of world knowledge they internalize during pretraining.
How a model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z) - On the Trade-Off between Actionable Explanations and the Right to be
Forgotten [21.26254644739585]
We study the problem of recourse invalidation in the context of data deletion requests.
We show that removing as few as two data instances from the training set can invalidate up to 95 percent of all recourses output by popular state-of-the-art algorithms.
arXiv Detail & Related papers (2022-08-30T10:35:32Z) - Measuring Causal Effects of Data Statistics on Language Model's
`Factual' Predictions [59.284907093349425]
Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models.
We provide a language for describing how training data influences predictions, through a causal framework.
Our framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone.
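The paper's estimator is its own; as a generic illustration of estimating a causal effect from observational data alone, here is a textbook backdoor-adjustment sketch on synthetic data (the variables, probabilities, and effect size are invented for the example):

```python
import numpy as np

# T: whether a training-data statistic holds, Y: whether the prediction
# is 'factual', C: a confounder influencing both (all synthetic).
rng = np.random.default_rng(1)
n = 100_000
C = rng.integers(0, 2, n)                        # confounder
T = (rng.random(n) < 0.3 + 0.4 * C).astype(int)  # statistic correlates with C
Y = (rng.random(n) < 0.5 + 0.2 * T + 0.2 * C).astype(int)  # true effect of T is 0.2

naive = Y[T == 1].mean() - Y[T == 0].mean()
adjusted = sum(                                  # backdoor adjustment over C
    (Y[(T == 1) & (C == c)].mean() - Y[(T == 0) & (C == c)].mean()) * (C == c).mean()
    for c in (0, 1)
)
print(f"naive difference: {naive:.3f}, adjusted effect: {adjusted:.3f}")  # ~0.28 vs ~0.20
```

The naive difference absorbs the confounder's contribution, while stratifying on C recovers the true effect without any retraining, which is the kind of observational estimate the framework above targets.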
arXiv Detail & Related papers (2022-07-28T17:36:24Z) - Provably Robust Model-Centric Explanations for Critical Decision-Making [14.367217955827002]
We show that data-centric methods may yield brittle explanations of limited practical utility.
The model-centric framework, however, can offer actionable insights into risks of using AI models in practice.
arXiv Detail & Related papers (2021-10-26T18:05:49Z) - Bounding Information Leakage in Machine Learning [26.64770573405079]
This paper investigates fundamental bounds on information leakage.
We identify and bound the success rate of the worst-case membership inference attack.
We derive bounds on the mutual information between the sensitive attributes and model parameters.
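For intuition on why a mutual-information bound limits membership inference (this is the standard Fano-style argument, not necessarily the paper's exact derivation), take a uniform binary membership bit S and an adversary that observes the model parameters θ:

```latex
h(P_e) \;\ge\; H(S \mid \theta) \;=\; H(S) - I(S;\theta) \;=\; 1 - I(S;\theta)
```

where h is the binary entropy function and P_e is the adversary's error probability. Driving I(S; θ) toward zero therefore forces P_e toward 1/2, i.e., chance-level membership inference.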
arXiv Detail & Related papers (2021-05-09T08:49:14Z) - Beyond Trivial Counterfactual Explanations with Diverse Valuable
Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction.
We propose a counterfactual method that learns a perturbation in a disentangled latent space, constrained by a diversity-enforcing loss.
Our model improves the success rate of producing high-quality, valuable explanations compared to previous state-of-the-art methods.
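To make the two-term objective concrete, here is a toy sketch under my own assumptions (a linear classifier standing in for the model, a hinge term for flipping the prediction, and squared cosine similarity as the diversity penalty; none of this is the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d, K = 8, 3
w = rng.normal(size=d)            # toy classifier: predicts sign(w @ z)
z = rng.normal(size=d)            # latent code of one input
deltas = rng.normal(size=(K, d))  # K candidate counterfactual perturbations

def objective(deltas, lam=0.1):
    # hinge term: push each perturbed latent across the decision boundary
    flip = np.mean(np.maximum(0.0, w @ (z + deltas).T))
    # diversity term: penalize perturbations pointing in similar directions
    u = deltas / (np.linalg.norm(deltas, axis=1, keepdims=True) + 1e-8)
    cos = u @ u.T
    diversity = (np.sum(cos ** 2) - K) / (K * (K - 1))  # mean squared off-diagonal cosine
    return flip + lam * diversity

print(f"objective for random perturbations: {objective(deltas):.3f}")
```

Minimizing this over the deltas yields perturbations that all change the decision while remaining mutually distinct, which is the effect a diversity-enforcing loss is meant to achieve.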
arXiv Detail & Related papers (2021-03-18T12:57:34Z)