Fugu-MT Paper Translation (Abstract): Unsupervised Key Event Detection from Massive Text Corpora

Paper summary: Unsupervised Key Event Detection from Massive Text Corpora

arxiv url: http://arxiv.org/abs/2206.04153v1
Date: Wed, 8 Jun 2022 20:31:02 GMT
Status: translation complete
Last updated in system: 2022-06-11 05:45:21.595096
Title: Unsupervised Key Event Detection from Massive Text Corpora
Authors: Yunyi Zhang, Fang Guo, Jiaming Shen, Jiawei Han
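Abstract summary: This paper proposes a new task, key event detection at the intermediate level, which aims to detect key events from a news corpus. The task can bridge event understanding and structuring, and it is inherently challenging because of the thematic and temporal closeness of key events. The authors develop EvMine, an unsupervised key event detection framework that extracts temporally frequent peak phrases using a novel ttf-itf score.
Reference score (independently computed attention measure): 42.31889135421941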
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Automated event detection from news corpora is a crucial task towards mining fast-evolving structured knowledge. As real-world events have different granularities, from the top-level themes to key events and then to event mentions corresponding to concrete actions, there are generally two lines of research: (1) theme detection identifies from a news corpus major themes (e.g., "2019 Hong Kong Protests" vs. "2020 U.S. Presidential Election") that have very distinct semantics; and (2) action extraction extracts from one document mention-level actions (e.g., "the police hit the left arm of the protester") that are too fine-grained for comprehending the event. In this paper, we propose a new task, key event detection at the intermediate level, aiming to detect from a news corpus key events (e.g., "HK Airport Protest on Aug. 12-14"), each happening at a particular time/location and focusing on the same topic. This task can bridge event understanding and structuring and is inherently challenging because of the thematic and temporal closeness of key events and the scarcity of labeled data due to the fast-evolving nature of news articles. To address these challenges, we develop an unsupervised key event detection framework, EvMine, that (1) extracts temporally frequent peak phrases using a novel ttf-itf score, (2) merges peak phrases into event-indicative feature sets by detecting communities from our designed peak phrase graph that captures document co-occurrences, semantic similarities, and temporal closeness signals, and (3) iteratively retrieves documents related to each key event by training a classifier with automatically generated pseudo labels from the event-indicative feature sets and refining the detected key events using the retrieved documents. Extensive experiments and case studies show EvMine outperforms all the baseline methods and its ablations on two real-world news corpora.
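The abstract names a ttf-itf score for finding temporally frequent peak phrases but does not give its formula. The sketch below is only an illustration by analogy with tf-idf, not EvMine's definition: it assumes the score is a phrase's relative frequency inside a small window of days, multiplied by the log of (total days / days on which the phrase appears). The function name `ttf_itf`, the `window` parameter, and the sample data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def ttf_itf(docs, window=1):
    """Score (phrase, day) pairs with an ASSUMED tf-idf-style formula.

    docs: list of (day_index, [phrases]) pairs.
    Returns a dict mapping (phrase, day) to a float score.
    This is an illustrative guess, not the formula from the paper.
    """
    days = sorted({day for day, _ in docs})

    # Raw phrase counts for each day.
    counts = defaultdict(Counter)
    for day, phrases in docs:
        counts[day].update(phrases)

    # Number of distinct days on which each phrase appears.
    days_with = Counter()
    for day in days:
        for phrase in counts[day]:
            days_with[phrase] += 1

    scores = {}
    for day in days:
        # Pool counts over the window [day - window, day + window].
        local = Counter()
        for d in range(day - window, day + window + 1):
            local.update(counts.get(d, {}))
        total = sum(local.values())
        for phrase, freq in local.items():
            # Phrases spread over many days get a low inverse time frequency.
            itf = math.log(len(days) / days_with[phrase])
            scores[(phrase, day)] = (freq / total) * itf
    return scores


# Hypothetical toy corpus: (day index, phrases found in one article).
docs = [
    (0, ["protest", "airport"]),
    (1, ["protest", "airport", "airport"]),
    (2, ["protest"]),
    (5, ["protest", "election"]),
]
scores = ttf_itf(docs)
```

Under this assumed formula, a phrase that appears on every day (here "protest") scores zero, while a phrase concentrated in a short time span (here "airport" around day 1) scores higher there, which is the kind of temporal peak the abstract describes.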
