Fugu-MT Paper Translation (Abstract): Scaling Performance and Low-Resource Annotation with Many-Shot In-Context Learning for Named Entity Recognition

Paper overview: Scaling Performance and Low-Resource Annotation with Many-Shot In-Context Learning for Named Entity Recognition

arxiv url: http://arxiv.org/abs/2606.21890v1
Date: Sat, 20 Jun 2026 05:39:59 GMT
Status: Translation complete
Last updated in system: 2026-06-26 21:53:28.440867
Title: Scaling Performance and Low-Resource Annotation with Many-Shot In-Context Learning for Named Entity Recognition
Authors: Qi Zhang, Fangping Lan, Cornelia Caragea, Longin Jan Latecki, Eduard Dragut
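Abstract summary: In-context learning (ICL) with large language models (LLMs) has emerged as a powerful alternative to fine-tuning for Named Entity Recognition (NER). We conduct a comprehensive investigation of many-shot ICL for NER and explore its effectiveness in annotating and refining data for low-resource NER tasks. Our experiments demonstrate that (1) scaling to hundreds of in-context examples enables LLMs to match or even surpass the performance of fully supervised BERT models, and (2) using about one hundred human-labeled examples as demonstrations, many-shot in-context annotation can generate high-quality labeled data.
Reference score (independently computed attention measure): 52.07624601400713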
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In-context learning (ICL) with large language models (LLMs) has emerged as a powerful alternative to fine-tuning for Named Entity Recognition (NER), achieving strong performance with minimal annotation and no additional training. However, prior work has shown that despite their adaptability, LLMs still lag behind fully supervised models such as fine-tuned BERT in structured tasks like NER. While existing studies on ICL for NER have mainly explored few-shot settings, the potential of scaling to hundreds of demonstrations has not been thoroughly investigated. To address this gap, we conduct a comprehensive investigation of many-shot ICL for NER and further explore its effectiveness in annotating and refining data for low-resource NER tasks. Specifically, we evaluate various LLMs across multiple domains using hundreds of ICL examples and then assess the feasibility of using many-shot ICL as a data annotation framework. Our experiments demonstrate that: (1) scaling to hundreds of in-context examples enables LLMs to match or even surpass the performance of fully supervised BERT models; and (2) using about one hundred human-labeled examples as demonstrations, many-shot in-context annotation can generate high-quality labeled data, leading to approximately 10% absolute F1 improvement over existing state-of-the-art approaches when used to fine-tune BERT on low-resource NER.
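The abstract does not specify the prompt format, so the following is only a minimal sketch of how a many-shot NER prompt could be assembled from labeled demonstrations and how a completion could be parsed back into entities. The function names (`build_many_shot_prompt`, `format_entities`, `parse_entities`), the `text [TYPE]` output convention, and the instruction wording are all assumptions made for illustration, not the authors' method. No LLM call is included; the prompt string would be passed to whichever model is being evaluated.

```python
from typing import List, Tuple

# Hypothetical types for this sketch: an entity is (surface text, entity type),
# a demonstration is (sentence, list of gold entities).
Entity = Tuple[str, str]
Demo = Tuple[str, List[Entity]]


def format_entities(entities: List[Entity]) -> str:
    """Render gold entities as 'text [TYPE]; text [TYPE]' (assumed format)."""
    if not entities:
        return "none"
    return "; ".join(f"{text} [{label}]" for text, label in entities)


def build_many_shot_prompt(demos: List[Demo], query: str, labels: List[str]) -> str:
    """Concatenate all demonstrations, then append the unlabeled query sentence."""
    lines = [f"Extract named entities of types: {', '.join(labels)}.", ""]
    for sentence, entities in demos:  # in many-shot ICL this may be hundreds of demos
        lines.append(f"Sentence: {sentence}")
        lines.append(f"Entities: {format_entities(entities)}")
        lines.append("")
    lines.append(f"Sentence: {query}")
    lines.append("Entities:")
    return "\n".join(lines)


def parse_entities(completion: str) -> List[Entity]:
    """Parse a completion in the same 'text [TYPE]; ...' format; skip malformed chunks."""
    completion = completion.strip()
    if not completion or completion == "none":
        return []
    parsed = []
    for chunk in completion.split(";"):
        chunk = chunk.strip()
        if chunk.endswith("]") and " [" in chunk:
            text, label = chunk[:-1].rsplit(" [", 1)
            parsed.append((text.strip(), label.strip()))
    return parsed


demos = [
    ("Marie Curie worked in Paris.", [("Marie Curie", "PER"), ("Paris", "LOC")]),
    ("It rained all day.", []),
]
prompt = build_many_shot_prompt(demos, "Alan Turing lived in London.", ["PER", "LOC"])
```

Under these assumptions, using the same routine as an annotator amounts to running it over unlabeled sentences and keeping the parsed entities as labels for fine-tuning a smaller model such as BERT.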
