Knowledge is Power: Understanding Causality Makes Legal Judgment
Prediction Models More Generalizable and Robust
- URL: http://arxiv.org/abs/2211.03046v2
- Date: Tue, 18 Apr 2023 12:06:18 GMT
- Title: Knowledge is Power: Understanding Causality Makes Legal Judgment
Prediction Models More Generalizable and Robust
- Authors: Haotian Chen, Lingwei Zhang, Yiran Liu, Fanchao Chen, Yang Yu
- Abstract summary: Legal Judgment Prediction (LJP) serves as legal assistance that mitigates the heavy workload of a limited number of legal practitioners.
Most existing methods fine-tune various large-scale pre-trained language models on LJP tasks to obtain consistent improvements.
We discover that the state-of-the-art (SOTA) model makes judgment predictions according to irrelevant (or non-causal) information.
- Score: 3.555105847974074
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Legal Judgment Prediction (LJP), which aims to predict a judgment
from fact descriptions according to the rule of law, serves as legal assistance
that mitigates the heavy workload of a limited number of legal practitioners.
Most existing methods fine-tune various large-scale pre-trained language models
(PLMs) on LJP tasks to obtain consistent improvements. However, we discover
that the state-of-the-art (SOTA) model makes judgment predictions according to
irrelevant (or non-causal) information. This violation of the rule of law not
only weakens the robustness and generalization ability of models but also leads
to severe social problems such as discrimination. In this paper, we use
structural causal models (SCMs) to theoretically analyze how LJP models learn
to make decisions and why they can pass the traditional testing paradigm
without learning causality. Based on our analysis, we provide two
causality-based solutions that intervene on the data and on the model,
respectively. In detail, we first distinguish non-causal information by
applying the open information extraction (OIE) technique. We then propose the
Causal Information Enhanced SAmpling Method (CIESAM) to eliminate non-causal
information from the data. To validate our theoretical analysis, we further
propose the Causality-Aware Self-Attention Mechanism (CASAM), which guides the
model to learn the underlying causal knowledge in legal texts. CASAM learns
causal information with higher confidence than CIESAM. Extensive experimental
results show that both proposed methods achieve state-of-the-art (SOTA)
performance on three commonly used legal-specific datasets. The stronger
performance of CASAM further demonstrates that causality is the key to the
robustness and generalization ability of models.
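To make the attention-side intervention concrete, the sketch below shows one way a causality-aware attention bias could look: scaled dot-product attention whose scores are penalized for tokens that an upstream OIE-style filter has flagged as non-causal. The function name, the boolean mask interface, and the additive-penalty scheme are illustrative assumptions, not the paper's exact CASAM formulation.

```python
# Minimal sketch (not the paper's implementation) of biasing self-attention
# away from tokens flagged as non-causal by an upstream OIE-style filter.
import torch
import torch.nn.functional as F

def causality_biased_attention(q, k, v, causal_token_mask, penalty=-1e4):
    """Scaled dot-product attention with an additive bias that discourages
    attending to non-causal tokens.

    q, k, v:           (batch, heads, seq, dim) tensors
    causal_token_mask: (batch, seq) bool tensor, True where a token was judged
                       causal; a hypothetical interface for illustration only
    """
    d = q.size(-1)
    scores = torch.matmul(q, k.transpose(-2, -1)) / d ** 0.5   # (b, h, s, s)
    # Add a large negative bias to score columns of non-causal tokens so the
    # softmax assigns them near-zero attention weight.
    bias = (~causal_token_mask).float()[:, None, None, :] * penalty
    weights = F.softmax(scores + bias, dim=-1)
    return torch.matmul(weights, v)

# Toy usage: batch=1, heads=2, seq=5, dim=8; tokens 2 and 4 flagged non-causal.
q = k = v = torch.randn(1, 2, 5, 8)
mask = torch.tensor([[True, True, False, True, False]])
out = causality_biased_attention(q, k, v, mask)
print(out.shape)  # torch.Size([1, 2, 5, 8])
```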
Related papers
- Bayesian scaling laws for in-context learning [72.17734205418502]
In-context learning (ICL) is a powerful technique for getting language models to perform complex tasks with no training updates.
We show that ICL approximates a Bayesian learner and develop a family of novel Bayesian scaling laws for ICL.
arXiv Detail & Related papers (2024-10-21T21:45:22Z)
- A Simple Model of Inference Scaling Laws [1.3597551064547502]
We study scaling laws in the context of inference, specifically how performance improves with multiple inference attempts.
Our simple framework lays the groundwork for combining inference scaling with other known scaling laws.
arXiv Detail & Related papers (2024-10-21T18:00:06Z)
- Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration [74.09687562334682]
We introduce a novel training data attribution method called Debias and Denoise Attribution (DDA).
Our method significantly outperforms existing approaches, achieving an average AUC of 91.64%.
DDA exhibits strong generality and scalability across various sources and different-scale models like LLaMA2, QWEN2, and Mistral.
arXiv Detail & Related papers (2024-10-02T07:14:26Z)
- The Factuality of Large Language Models in the Legal Domain [8.111302195052641]
This paper investigates the factuality of large language models (LLMs) as knowledge bases in the legal domain.
We design a dataset of diverse factual questions about case law and legislation.
We then use the dataset to evaluate several LLMs under different evaluation methods, including exact, alias, and fuzzy matching.
arXiv Detail & Related papers (2024-09-18T08:30:20Z)
- Evaluating Human Alignment and Model Faithfulness of LLM Rationale [66.75309523854476]
We study how well large language models (LLMs) explain their generations through rationales.
We show that prompting-based methods are less "faithful" than attribution-based explanations.
arXiv Detail & Related papers (2024-06-28T20:06:30Z)
- Precedent-Enhanced Legal Judgment Prediction with LLM and Domain-Model Collaboration [52.57055162778548]
Legal Judgment Prediction (LJP) has become an increasingly crucial task in Legal AI.
Precedents are previous legal cases with similar facts, which serve as the basis for judging subsequent cases in national legal systems.
Recent advances in deep learning have enabled a variety of techniques to be used to solve the LJP task.
arXiv Detail & Related papers (2023-10-13T16:47:20Z)
- Did the Models Understand Documents? Benchmarking Models for Language Understanding in Document-Level Relation Extraction [2.4665182280122577]
Document-level relation extraction (DocRE) has attracted increasing research interest recently.
While models achieve consistent performance gains in DocRE, their underlying decision rules are still understudied.
In this paper, we take a first step toward answering this question and introduce a new perspective for comprehensively evaluating a model.
arXiv Detail & Related papers (2023-06-20T08:52:05Z)
- Preserving Commonsense Knowledge from Pre-trained Language Models via Causal Inference [20.5696436171006]
Most existing studies attribute the loss of pre-trained commonsense knowledge during fine-tuning to catastrophic forgetting, and they retain the pre-trained knowledge indiscriminately.
We frame fine-tuning into a causal graph and discover that the crux of catastrophic forgetting lies in the missing causal effects from the pretrained data.
In the experiments, our method outperforms state-of-the-art fine-tuning methods on all six commonsense QA datasets.
arXiv Detail & Related papers (2023-06-19T09:06:44Z)
- Large Language Models with Controllable Working Memory [64.71038763708161]
Large language models (LLMs) have led to a series of breakthroughs in natural language processing (NLP).
What further sets these models apart is the massive amounts of world knowledge they internalize during pretraining.
How the model's world knowledge interacts with the factual information presented in the context remains underexplored.
arXiv Detail & Related papers (2022-11-09T18:58:29Z)
- On the Trade-Off between Actionable Explanations and the Right to be Forgotten [21.26254644739585]
We study the problem of recourse invalidation in the context of data deletion requests.
We show that the removal of as little as 2 data instances from the training set can invalidate up to 95 percent of all recourses output by popular state-of-the-art algorithms.
arXiv Detail & Related papers (2022-08-30T10:35:32Z)
- Measuring Causal Effects of Data Statistics on Language Model's `Factual' Predictions [59.284907093349425]
Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models.
We provide a language for describing how training data influences predictions, through a causal framework.
Our framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone.
arXiv Detail & Related papers (2022-07-28T17:36:24Z)