GLEN: General-Purpose Event Detection for Thousands of Types
- URL: http://arxiv.org/abs/2303.09093v3
- Date: Tue, 31 Oct 2023 17:21:45 GMT
- Title: GLEN: General-Purpose Event Detection for Thousands of Types
- Authors: Qiusi Zhan, Sha Li, Kathryn Conger, Martha Palmer, Heng Ji, Jiawei Han
- Abstract summary: We build a general-purpose event detection dataset GLEN, which covers 205K event mentions with 3,465 different types.
GLEN is 20x larger in ontology than today's largest event dataset.
We also propose a new multi-stage event detection model CEDAR specifically designed to handle the large size in GLEN.
- Score: 80.99866527772512
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: The progress of event extraction research has been hindered by the absence of
wide-coverage, large-scale datasets. To make event extraction systems more
accessible, we build a general-purpose event detection dataset GLEN, which
covers 205K event mentions with 3,465 different types, making it more than 20x
larger in ontology than today's largest event dataset. GLEN is created by
utilizing the DWD Overlay, which provides a mapping between Wikidata Qnodes and
PropBank rolesets. This enables us to use the abundant existing annotation for
PropBank as distant supervision. In addition, we propose a new multi-stage
event detection model CEDAR specifically designed to handle the large ontology
size in GLEN. We show that our model exhibits superior performance compared to
a range of baselines including InstructGPT. Finally, we perform error analysis
and show that label noise is still the largest challenge for improving
performance for this new dataset. Our dataset, code, and models are released at
\url{https://github.com/ZQS1943/GLEN}.
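The distant-supervision construction described in the abstract, mapping existing PropBank roleset annotations onto Wikidata Qnode event types via the DWD Overlay, can be sketched roughly as follows. This is a minimal illustration, not the paper's pipeline: the mapping entries, Qnode IDs, sentences, and function names are all placeholders, and the real DWD Overlay is far larger and richer.

```python
# Illustrative sketch of GLEN-style distant supervision: a roleset-to-Qnode
# mapping (standing in for the DWD Overlay) relabels PropBank-annotated
# trigger mentions with Qnode event types. All IDs below are placeholders.
DWD_OVERLAY = {
    "buy.01": ["Q0000001"],               # maps to a single event Qnode
    "attack.01": ["Q0000002", "Q0000003"],# maps to multiple candidate Qnodes
}

def distant_label(propbank_mentions):
    """Convert PropBank-annotated mentions into weakly labeled event mentions.

    Each input mention is (sentence, trigger, roleset). Rolesets mapping to
    several Qnodes yield multi-candidate (ambiguous) labels, one source of
    the label noise the paper's error analysis highlights.
    """
    labeled = []
    for sentence, trigger, roleset in propbank_mentions:
        qnodes = DWD_OVERLAY.get(roleset)
        if qnodes:  # rolesets outside the ontology are skipped
            labeled.append({
                "sentence": sentence,
                "trigger": trigger,
                "candidates": qnodes,
                "ambiguous": len(qnodes) > 1,
            })
    return labeled

mentions = [
    ("She bought a house.", "bought", "buy.01"),
    ("The army attacked at dawn.", "attacked", "attack.01"),
    ("He sneezed loudly.", "sneezed", "sneeze.01"),  # unmapped roleset
]
labels = distant_label(mentions)
```

Here the unmapped `sneeze.01` mention is dropped, the `buy.01` mention receives a clean single label, and the `attack.01` mention receives an ambiguous multi-candidate label that a multi-stage model like CEDAR would need to resolve.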
Related papers
- Plain-Det: A Plain Multi-Dataset Object Detector [22.848784430833835]
Plain-Det offers flexibility to accommodate new datasets, consistent performance across diverse datasets, and training efficiency.
We conduct extensive experiments on 13 downstream datasets and Plain-Det demonstrates strong generalization capability.
arXiv Detail & Related papers (2024-07-14T05:18:06Z)
- DailyDVS-200: A Comprehensive Benchmark Dataset for Event-Based Action Recognition [51.96660522869841]
DailyDVS-200 is a benchmark dataset tailored for the event-based action recognition community.
It covers 200 action categories across real-world scenarios, recorded by 47 participants, and comprises more than 22,000 event sequences.
DailyDVS-200 is annotated with 14 attributes, ensuring a detailed characterization of the recorded actions.
arXiv Detail & Related papers (2024-07-06T15:25:10Z)
- SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection [79.23689506129733]
We establish a new benchmark dataset and an open-source method for large-scale SAR object detection.
Our dataset, SARDet-100K, is a result of intense surveying, collecting, and standardizing 10 existing SAR detection datasets.
To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.
arXiv Detail & Related papers (2024-03-11T09:20:40Z)
- Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline [37.06330707742272]
We first propose a new long-term and large-scale frame-event single object tracking dataset, termed FELT.
It contains 742 videos with 1,594,474 paired RGB frames and event streams, making it the largest frame-event tracking dataset to date.
We propose a novel associative memory Transformer network as a unified backbone by introducing modern Hopfield layers into multi-head self-attention blocks to fuse both RGB and event data.
arXiv Detail & Related papers (2024-03-09T08:49:50Z)
- Improving Event Definition Following For Zero-Shot Event Detection [66.27883872707523]
Existing approaches on zero-shot event detection usually train models on datasets annotated with known event types.
We aim to improve zero-shot event detection by training models to better follow event definitions.
arXiv Detail & Related papers (2024-03-05T01:46:50Z)
- MAVEN-Arg: Completing the Puzzle of All-in-One Event Understanding Dataset with Event Argument Annotation [104.6065882758648]
MAVEN-Arg is the first all-in-one dataset supporting event detection, event argument extraction, and event relation extraction.
As an EAE benchmark, MAVEN-Arg offers three main advantages: (1) a comprehensive schema covering 162 event types and 612 argument roles, all with expert-written definitions and examples; (2) a large data scale, containing 98,591 events and 290,613 arguments obtained with laborious human annotation; and (3) the exhaustive annotation supporting all task variants of EAE.
arXiv Detail & Related papers (2023-11-15T16:52:14Z)
- BigDetection: A Large-scale Benchmark for Improved Object Detector Pre-training [44.32782190757813]
We construct a new large-scale benchmark termed BigDetection.
Our dataset has 600 object categories and contains over 3.4M training images with 36M bounding boxes.
arXiv Detail & Related papers (2022-03-24T17:57:29Z)
- Document-level Event Extraction via Heterogeneous Graph-based Interaction Model with a Tracker [23.990907956996413]
Document-level event extraction aims to recognize event information from a whole article.
Existing methods are not effective due to two challenges of this task: arguments scattered across sentences and multiple correlated events per document.
We propose Heterogeneous Graph-based Interaction Model with a Tracker.
arXiv Detail & Related papers (2021-05-31T12:45:03Z)
- Open Graph Benchmark: Datasets for Machine Learning on Graphs [86.96887552203479]
We present the Open Graph Benchmark (OGB) to facilitate scalable, robust, and reproducible graph machine learning (ML) research.
OGB datasets are large-scale, encompass multiple important graph ML tasks, and cover a diverse range of domains.
For each dataset, we provide a unified evaluation protocol using meaningful application-specific data splits and evaluation metrics.
arXiv Detail & Related papers (2020-05-02T03:09:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.