LEVEN: A Large-Scale Chinese Legal Event Detection Dataset
- URL: http://arxiv.org/abs/2203.08556v1
- Date: Wed, 16 Mar 2022 11:40:02 GMT
- Title: LEVEN: A Large-Scale Chinese Legal Event Detection Dataset
- Authors: Feng Yao, Chaojun Xiao, Xiaozhi Wang, Zhiyuan Liu, Lei Hou,
Cunchao Tu, Juanzi Li, Yun Liu, Weixing Shen, Maosong Sun
- Abstract summary: We present LEVEN, a large-scale Chinese LEgal eVENt detection dataset, with 8,116 legal documents and 150,977 human-annotated event mentions in 108 event types.
LEVEN is the largest Legal Event Detection (LED) dataset and has dozens of times the data scale of others, which should significantly benefit the training and evaluation of LED methods.
- Score: 82.44096140591675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recognizing facts is the most fundamental step in making judgments, so
detecting events in legal documents is important to legal case analysis
tasks. However, existing Legal Event Detection (LED) datasets cover only an
incomplete set of event types and have limited annotated data, which restricts
the development of LED methods and their downstream applications. To alleviate
these issues, we present LEVEN, a large-scale Chinese LEgal eVENt detection
dataset, with 8,116 legal documents and 150,977 human-annotated event mentions
in 108 event types. Beyond charge-related events, LEVEN also covers general
events, which are critical for legal case understanding but neglected in
existing LED datasets. To our knowledge, LEVEN is the largest LED dataset and
has dozens of times the data scale of others, which should significantly benefit
the training and evaluation of LED methods. The results of extensive
experiments indicate that LED is challenging and needs further effort.
Moreover, we simply use legal events as side information to improve
downstream applications. The method improves precision by 2.2 points on
average in low-resource judgment prediction and mean average precision by
1.5 points in unsupervised case retrieval, which suggests that LED is
fundamental to these tasks. The source code and dataset can be obtained from
https://github.com/thunlp/LEVEN.
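The abstract reports the retrieval gain as mean average precision (MAP). As a reminder of what that metric measures, here is a minimal sketch of the standard MAP computation. It is not taken from the LEVEN code base: the function names, the case identifiers, and the toy rankings are all hypothetical and serve only as illustration.

```python
from typing import Dict, Sequence, Set


def average_precision(ranked_ids: Sequence[str], relevant_ids: Set[str]) -> float:
    """Average precision of one ranked list against a set of relevant ids."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, case_id in enumerate(ranked_ids, start=1):
        if case_id in relevant_ids:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def mean_average_precision(
    rankings: Dict[str, Sequence[str]],
    relevance: Dict[str, Set[str]],
) -> float:
    """Mean of the per-query average precision over all queries in `rankings`."""
    if not rankings:
        return 0.0
    scores = [
        average_precision(ranked, relevance.get(query_id, set()))
        for query_id, ranked in rankings.items()
    ]
    return sum(scores) / len(scores)


# Hypothetical toy data: two query cases with ranked candidate cases.
toy_rankings = {
    "q1": ["c1", "c2", "c3", "c4"],
    "q2": ["c5", "c6"],
}
toy_relevance = {
    "q1": {"c1", "c3"},
    "q2": {"c6"},
}

if __name__ == "__main__":
    print(round(mean_average_precision(toy_rankings, toy_relevance), 4))
```

On the toy data, query `q1` has relevant cases at ranks 1 and 3, giving an average precision of (1/1 + 2/3) / 2, and `q2` has its only relevant case at rank 2, giving 1/2. A gain of 1.5 points corresponds to an increase of 0.015 on this 0-to-1 scale.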
Related papers
- Event Stream based Sign Language Translation: A High-Definition Benchmark Dataset and A New Algorithm [46.002495818680934]
This paper proposes the use of high-definition Event streams for Sign Language Translation.
Event streams have a high dynamic range and dense temporal signals, which can withstand low illumination and motion blur well.
We propose a novel baseline method that fully leverages the Mamba model's ability to integrate temporal information of CNN features.
arXiv Detail & Related papers (2024-08-20T02:01:30Z)
- MAVEN-Fact: A Large-scale Event Factuality Detection Dataset [55.01875707021496]
We introduce MAVEN-Fact, a large-scale and high-quality EFD dataset based on the MAVEN dataset.
MAVEN-Fact includes factuality annotations of 112,276 events, making it the largest EFD dataset.
Experiments demonstrate that MAVEN-Fact is challenging for both conventional fine-tuned models and large language models (LLMs).
arXiv Detail & Related papers (2024-07-22T03:43:46Z) - Comparing Optical Flow and Deep Learning to Enable Computationally Efficient Traffic Event Detection with Space-Filling Curves [0.6322312717516407]
We compare Optical Flow (OF) and Deep Learning (DL) to feed computationally efficient event detection via space-filling curves on video data from a forward-facing, in-vehicle camera.
Our results show that the OF approach excels in specificity and reduces false positives, while the DL approach demonstrates superior sensitivity.
arXiv Detail & Related papers (2024-07-15T13:44:52Z)
- Improving Event Definition Following For Zero-Shot Event Detection [66.27883872707523]
Existing approaches on zero-shot event detection usually train models on datasets annotated with known event types.
We aim to improve zero-shot event detection by training models to better follow event definitions.
arXiv Detail & Related papers (2024-03-05T01:46:50Z)
- LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting [65.71129509623587]
Road traffic forecasting plays a critical role in smart city initiatives and has experienced significant advancements thanks to the power of deep learning.
However, the promising results achieved on current public datasets may not be applicable to practical scenarios.
We introduce the LargeST benchmark dataset, which includes a total of 8,600 sensors in California with a 5-year time coverage.
arXiv Detail & Related papers (2023-06-14T05:48:36Z)
- GLEN: General-Purpose Event Detection for Thousands of Types [80.99866527772512]
We build a general-purpose event detection dataset GLEN, which covers 205K event mentions with 3,465 different types.
GLEN is 20x larger in ontology than today's largest event dataset.
We also propose a new multi-stage event detection model CEDAR specifically designed to handle the large size in GLEN.
arXiv Detail & Related papers (2023-03-16T05:36:38Z)
- ClassActionPrediction: A Challenging Benchmark for Legal Judgment Prediction of Class Action Cases in the US [0.0]
We release for the first time a challenging LJP dataset focused on class action cases in the US.
It is the first dataset in the common law system that focuses on the harder and more realistic task involving the complaints as input instead of the often used facts summary written by the court.
Our Longformer model clearly outperforms the human baseline (63%), despite only considering the first 2,048 tokens. Furthermore, we perform a detailed error analysis and find that the Longformer model is significantly better calibrated than the human experts.
arXiv Detail & Related papers (2022-11-01T16:57:59Z)
- MAVEN: A Massive General Domain Event Detection Dataset [56.00401399384715]
Event detection (ED) is the first and most fundamental step for extracting event knowledge from plain text.
Existing datasets exhibit issues that limit further development of ED.
We present a MAssive eVENt detection dataset (MAVEN), which contains 4,480 Wikipedia documents, 118,732 event mention instances, and 168 event types.
arXiv Detail & Related papers (2020-04-28T15:25:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.