Fugu-MT Paper Translation (Summary): Evaluating Biomedical BERT Models for Vocabulary Alignment at Scale in the UMLS Metathesaurus

Paper summary: Evaluating Biomedical BERT Models for Vocabulary Alignment at Scale in the UMLS Metathesaurus

arxiv url: http://arxiv.org/abs/2109.13348v1
Date: Tue, 14 Sep 2021 16:52:16 GMT
Status: Translation complete
Last updated in system: 2021-10-03 11:34:08.732446
Title: Evaluating Biomedical BERT Models for Vocabulary Alignment at Scale in the UMLS Metathesaurus
Authors: Goonmeet Bajaj, Vinh Nguyen, Thilini Wijesiriwardene, Hong Yung Yip, Vishesh Javangula, Srinivasan Parthasarathy, Amit Sheth, Olivier Bodenreider
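Abstract summary: The current UMLS (Unified Medical Language System) Metathesaurus construction process is expensive and error-prone. Recent advances in natural language processing have achieved state-of-the-art (SOTA) performance on downstream tasks. We aim to validate whether approaches using BERT models can outperform existing approaches for predicting synonymy in the UMLS Metathesaurus.
Reference score (independently computed attention score): 8.961270657070942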
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: The current UMLS (Unified Medical Language System) Metathesaurus construction process for integrating over 200 biomedical source vocabularies is expensive and error-prone as it relies on the lexical algorithms and human editors for deciding if the two biomedical terms are synonymous. Recent advances in Natural Language Processing such as Transformer models like BERT and its biomedical variants with contextualized word embeddings have achieved state-of-the-art (SOTA) performance on downstream tasks. We aim to validate if these approaches using the BERT models can actually outperform the existing approaches for predicting synonymy in the UMLS Metathesaurus. In the existing Siamese Networks with LSTM and BioWordVec embeddings, we replace the BioWordVec embeddings with the biomedical BERT embeddings extracted from each BERT model using different ways of extraction. In the Transformer architecture, we evaluate the use of the different biomedical BERT models that have been pre-trained using different datasets and tasks. Given the SOTA performance of these BERT models for other downstream tasks, our experiments yield surprisingly interesting results: (1) in both model architectures, the approaches employing these biomedical BERT-based models do not outperform the existing approaches using Siamese Network with BioWordVec embeddings for the UMLS synonymy prediction task, (2) the original BioBERT large model that has not been pre-trained with the UMLS outperforms the SapBERT models that have been pre-trained with the UMLS, and (3) using the Siamese Networks yields better performance for synonymy prediction when compared to using the biomedical BERT models.
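The Siamese setup described in the abstract passes both terms through one shared encoder and compares the resulting vectors. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes mean pooling in place of the LSTM, a fixed cosine-similarity threshold in place of a trained classifier, and a hypothetical table `TOY_EMBEDDINGS` standing in for BioWordVec or BERT vectors. All names and numbers are invented for illustration.

```python
import math

# Hypothetical toy vectors; real experiments would load BioWordVec or
# extract embeddings from a biomedical BERT model instead.
TOY_EMBEDDINGS = {
    "heart": [0.9, 0.1, 0.0],
    "attack": [0.1, 0.9, 0.2],
    "myocardial": [0.85, 0.15, 0.05],
    "infarction": [0.2, 0.8, 0.3],
    "fracture": [0.0, 0.1, 0.9],
}


def encode(term):
    """Shared encoder: mean-pool the token vectors of a term.

    Assumption: mean pooling replaces the LSTM used in the paper.
    """
    tokens = term.lower().split()
    vectors = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not vectors:
        raise ValueError(f"no known tokens in term: {term!r}")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def predict_synonymy(term_a, term_b, threshold=0.95):
    """Encode both terms with the same encoder and threshold the similarity.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    score = cosine(encode(term_a), encode(term_b))
    return score >= threshold, score


if __name__ == "__main__":
    for pair in [("heart attack", "myocardial infarction"),
                 ("heart attack", "fracture")]:
        is_synonym, score = predict_synonymy(*pair)
        print(pair, is_synonym, round(score, 3))
```

Swapping the embedding source while keeping `encode` and the comparison step fixed mirrors the substitution the abstract describes, where BioWordVec embeddings are replaced with embeddings extracted from each BERT model.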

Related papers

MedicalBERT: enhancing biomedical natural language processing using pretrained BERT-based model [0.0]
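MedicalBERT is a pretrained BERT model trained on a large biomedical dataset. It has a domain-specific vocabulary that improves its understanding of biomedical terminology. MedicalBERT outperforms the general-purpose BERT model by an average of 5.67% across all evaluated tasks.
Paper reference translation (metadata) (2025-07-06T03:38:05Z)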
Multi-level biomedical NER through multi-granularity embeddings and enhanced labeling [3.8599767910528917]
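This paper proposes a hybrid approach that integrates the strengths of multiple models. BERT provides contextualized word embeddings, a pre-trained multi-channel CNN captures character-level information, and a BiLSTM + CRF performs sequence labeling and models dependencies between words in the text. We achieve an F1 score of 90.11 on the benchmark i2b2/2010 dataset.
Paper reference translation (metadata) (2023-12-24T21:45:36Z)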
Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
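We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models. We use two large knowledge graphs (KGs), the biomedical knowledge system UMLS and the novel biochemical ontology OntoChem, with two prominent biomedical pre-trained language models (PLMs), PubMedBERT and BioLinkBERT. We show that the method leads to performance improvements in several instances while keeping computational requirements low.
Paper reference translation (metadata) (2023-12-21T14:26:57Z)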
Improving Biomedical Entity Linking with Retrieval-enhanced Learning [53.24726622142558]
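$k$NN-BioEL gives a BioEL model the ability to reference similar instances from the entire training corpus as clues for prediction. $k$NN-BioEL outperforms state-of-the-art baselines on several datasets.
Paper reference translation (metadata) (2023-12-15T14:04:23Z)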
Adapted Multimodal BERT with Layer-wise Fusion for Sentiment Analysis [84.12658971655253]
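This paper proposes Adapted Multimodal BERT, a BERT-based architecture for multimodal tasks. The adapter adjusts the pre-trained language model for the task at hand, while the fusion layers perform task-specific, layer-wise fusion of audio-visual information with textual BERT representations. We show that this approach leads to more efficient models that outperform their fine-tuned counterparts and are robust to input noise.
Paper reference translation (metadata) (2022-12-01T17:31:42Z)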
BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining [140.61707108174247]
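This paper proposes BioGPT, a domain-specific generative Transformer language model pre-trained on large-scale biomedical literature. It achieves F1 scores of 44.98%, 38.42%, and 40.76% on the BC5CDR, KD-DTI, and DDI relation extraction tasks, respectively, and 78.2% accuracy on PubMedQA.
Paper reference translation (metadata) (2022-10-19T07:17:39Z)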
Fine-Tuning Large Neural Language Models for Biomedical Natural Language Processing [55.52858954615655]
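We conduct a systematic study of fine-tuning stability in biomedical NLP. We show that fine-tuning performance is sensitive to pretraining settings, especially in low-resource domains. We show that these techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications.
Paper reference translation (metadata) (2021-12-15T04:20:35Z)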
A Hybrid Approach to Measure Semantic Relatedness in Biomedical Concepts [0.0]
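We generated concept vectors by encoding concept preferred terms using ELMo, BERT, and Sentence BERT models. We trained all BERT models using Siamese networks on the SNLI and STSb datasets so that the models could learn semantic information. Injecting ontology knowledge into the concept vectors further improves their quality and the relatedness scores.
Paper reference translation (metadata) (2021-01-25T16:01:27Z)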
UmlsBERT: Clinical Domain Knowledge Augmentation of Contextual Embeddings Using the Unified Medical Language System Metathesaurus [73.86656026386038]
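We introduce UmlsBERT, a contextual embedding model that integrates domain knowledge during the pre-training process. By applying these two strategies, UmlsBERT encodes clinical domain knowledge into word embeddings and outperforms existing domain-specific models.
Paper reference translation (metadata) (2020-10-20T15:56:31Z)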
BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition [9.05154470433578]
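Existing BioNER approaches often ignore these issues and directly adopt state-of-the-art (SOTA) models. This paper proposes BioALBERT, an effective domain-specific language model trained on large-scale biomedical corpora.
Paper reference translation (metadata) (2020-09-19T12:58:47Z)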
Pre-training technique to localize medical BERT and enhance biomedical BERT [0.0]
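It is difficult to train a specific BERT model that performs well in domains where no high-quality, large-scale databases are publicly available. This paper proposes a single intervention with one option: simultaneous pre-training after up-sampling, with an amplified vocabulary. Our Japanese medical BERT outperformed conventional baselines and other BERT models on a medical document classification task.
Paper reference translation (metadata) (2020-05-14T18:00:01Z)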
An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining [17.10823632511911]
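We study a multi-task learning (MTL) model with multiple decoders on a variety of biomedical and clinical natural language processing tasks. Experimental results show that the MTL fine-tuned models outperform state-of-the-art Transformer models.
Paper reference translation (metadata) (2020-05-06T13:25:21Z)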

The related papers list is generated automatically from the titles and abstracts of papers on this site.
