Fugu-MT Paper Translation (Summary): SSDAU: Structured Semantic Data Augmentation for Joint Entity and Relation Extraction

Paper summary: SSDAU: Structured Semantic Data Augmentation for Joint Entity and Relation Extraction

arxiv url: http://arxiv.org/abs/2605.23440v3
Date: Wed, 27 May 2026 02:26:44 GMT
Status: Translation complete
Last updated in system: 2026-05-28 17:38:54.888234
Title: SSDAU: Structured Semantic Data Augmentation for Joint Entity and Relation Extraction
Authors: Jiawei He, Mengyu Shi, Jiawei Liu, Dong Sun, Zhijie Wang, Chunrong Fang, Xikai Yang, Zhenyu Chen
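Abstract summary: This work proposes Structured Semantic Data Augmentation (SSDAU) to preserve the semantic structure of text during augmentation. SSDAU segments text based on entity labels and uses an encoder to capture the semantic features of entities. It then performs entity semantic restructuring to generate augmented data. Experiments show that SSDAU generates semantically consistent data with superior robustness against ambiguity.
Reference score (independently computed attention score): 19.139039630736946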
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Joint Entity and Relation Extraction (JERE) is highly susceptible to weak generalization due to low-quality training data. Data augmentation is a common strategy to enhance model generalization across different domains. However, existing data augmentation methods often overlook text relevance and may disrupt semantic structures and dependencies, making it difficult to generate effective augmented data for improving model generalization. In this paper, we propose Structured Semantic Data Augmentation (SSDAU), a novel method designed to preserve the semantic structure of text during augmentation. SSDAU segments text based on entity labels and employs an encoder to capture semantic features of entities through context awareness. It then performs entity semantic restructuring to generate augmented data. To distinguish semantically similar entities, SSDAU fuses contextualized embeddings with traditional similarity scores. To mitigate potential topic ambiguity and information loss, we apply the BERTTopic model to filter out irrelevant topics, ensuring topic consistency. We evaluate SSDAU on datasets with different annotation types and compare its performance on five representative JERE models against seven popular data augmentation baselines. Experiments demonstrate that SSDAU generates semantically consistent data with superior robustness against ambiguity (8.26% F1 decrease vs. 31.91% for baselines), significantly outperforming all existing methods across all metrics.
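The abstract says SSDAU fuses contextualized embeddings with traditional similarity scores but does not give the formula. The sketch below illustrates one possible reading only: a weighted sum of embedding cosine similarity and a character-trigram Jaccard score. The function names, the choice of Jaccard as the "traditional" score, and the weight `alpha` are all assumptions made for illustration, not the authors' method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def char_ngrams(text, n=3):
    """Set of lowercase character n-grams, with one space of padding on each side."""
    padded = f" {text.lower()} "
    return {padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))}


def jaccard(text_a, text_b):
    """Jaccard overlap of character trigrams (assumed stand-in for a 'traditional' score)."""
    grams_a, grams_b = char_ngrams(text_a), char_ngrams(text_b)
    union = grams_a | grams_b
    return len(grams_a & grams_b) / len(union) if union else 0.0


def fused_similarity(text_a, emb_a, text_b, emb_b, alpha=0.5):
    """Hypothetical fusion: alpha weights the embedding score, (1 - alpha) the surface score."""
    return alpha * cosine(emb_a, emb_b) + (1 - alpha) * jaccard(text_a, text_b)
```

In practice `emb_a` and `emb_b` would come from a contextual encoder; here they are plain lists of floats so the sketch stays self-contained. Setting `alpha=1.0` reduces the score to the embedding similarity alone, which makes it easy to check how much the surface-form term contributes.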

Related papers