Fugu-MT 論文翻訳(概要): On Task Performance and Model Calibration with Supervised and Self-Ensembled In-Context Learning

論文の概要: On Task Performance and Model Calibration with Supervised and Self-Ensembled In-Context Learning

arxiv url: http://arxiv.org/abs/2312.13772v1
Date: Thu, 21 Dec 2023 11:55:10 GMT
ステータス: 翻訳完了
システム内更新日: 2023-12-22 15:04:46.500419
Title: On Task Performance and Model Calibration with Supervised and Self-Ensembled In-Context Learning
Title（参考訳）: 教師付き自己組み立て型インコンテキスト学習によるタスク性能とモデル校正について
Authors: Chengzu Li, Han Zhou, Goran Glava\v{s}, Anna Korhonen, Ivan Vuli\'c
Abstract要約: In-context Learning (ICL) は、近年の大規模言語モデル(LLM)の進歩により、効率的なアプローチとなっている。しかし、両方のパラダイムは、過信の批判的な問題(すなわち、誤校正)に苦しむ傾向にある。
参考スコア（独自算出の注目度）: 71.44986275228747
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Following the standard supervised fine-tuning (SFT) paradigm, in-context learning (ICL) has become an efficient approach propelled by the recent advancements in large language models (LLMs), yielding promising performance across various tasks in few-shot data setups. However, both paradigms are prone to suffer from the critical problem of overconfidence (i.e., miscalibration), especially in such limited data setups. In this work, we deliver an in-depth analysis of the behavior across different choices of learning methods from the perspective of both performance and calibration, as well as their interplay. Through extensive controlled experiments, we find that simultaneous gains for both task performance and calibration are difficult to achieve, and the problem of miscalibration exists across all learning methods in low-resource scenarios.To address this challenging trade-off between performance and calibration, we then investigate the potential of self-ensembling techniques applied at different modeling stages (e.g., variations of in-context examples or variations in prompts or different ensembling strategies). We justify the feasibility of self-ensembling on SFT in addition to ICL, to make the predictions more calibrated and have comparable or even better performance. Our work sheds light on which learning paradigm to choose and how to enhance both task performance and calibration of LLMs.
Abstract（参考訳）: 標準教師付き微調整(SFT)パラダイムに従って、インコンテキスト学習(ICL)は、最近の大規模言語モデル(LLM)の進歩によって推進される効率的なアプローチとなり、数発のデータセットで様々なタスクにわたって有望なパフォーマンスが得られる。しかし、両方のパラダイムは、特にそのような限られたデータ設定において、過信(すなわち誤校正)の致命的な問題に悩まされがちである。本研究では,学習方法の異なる選択に対して,パフォーマンスとキャリブレーションと相互作用の両方の観点から,行動の詳細な分析を行う。 Through extensive controlled experiments, we find that simultaneous gains for both task performance and calibration are difficult to achieve, and the problem of miscalibration exists across all learning methods in low-resource scenarios.To address this challenging trade-off between performance and calibration, we then investigate the potential of self-ensembling techniques applied at different modeling stages (e.g., variations of in-context examples or variations in prompts or different ensembling strategies). ICLに加えて、SFT上での自己理解の可能性も正当化し、予測を校正し、比較や性能の向上を図る。我々の研究は、選択する学習パラダイムと、タスクパフォーマンスとllmのキャリブレーションの両方を強化する方法に光を当てている。

関連論文リスト

Enhancing LLM Robustness to Perturbed Instructions: An Empirical Study [8.827173113748701]
ダウンストリーム性能を著しく低下させるタスク特化命令の文字・単語レベルの編集について検討した。平均的に、自己否定は、代替戦略よりも大幅に高いパフォーマンス向上を達成することが分かっています。
論文参考訳（メタデータ） (2025-04-03T16:17:56Z)
Influences on LLM Calibration: A Study of Response Agreement, Loss Functions, and Prompt Styles [4.477423478591491]
Calib-nは、信頼度推定のための補助モデルをトレーニングする新しいフレームワークである。補助的なモデルベース手法では,数発のプロンプトが最も有効であることが判明した。
論文参考訳（メタデータ） (2025-01-07T18:48:42Z)
What Do Learning Dynamics Reveal About Generalization in LLM Reasoning? [83.83230167222852]
モデルの一般化動作は,事前記憶列車の精度と呼ばれるトレーニング指標によって効果的に特徴づけられることがわかった。モデルの学習行動と一般化を結びつけることで、トレーニング戦略に目標とする改善を導くことができる。
論文参考訳（メタデータ） (2024-11-12T09:52:40Z)
Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate [105.86576388991713]
正規化勾配差(NGDiff)アルゴリズムを導入し、目的間のトレードオフをよりよく制御できるようにする。本研究では,TOFUおよびMUSEデータセットにおける最先端の未学習手法において,NGDiffの優れた性能を実証的に実証し,理論的解析を行った。
論文参考訳（メタデータ） (2024-10-29T14:41:44Z)
Enhancing Training Data Attribution for Large Language Models with Fitting Error Consideration [74.09687562334682]
Debias and Denoise Attribution (DDA) と呼ばれる新しいトレーニングデータ属性法を導入する。提案手法は既存のアプローチよりも優れており,平均91.64%のAUCを実現している。 DDAは、様々なソースとLLaMA2、QWEN2、Mistralのような異なるスケールのモデルに対して、強力な汎用性とスケーラビリティを示す。
論文参考訳（メタデータ） (2024-10-02T07:14:26Z)
Is Difficulty Calibration All We Need? Towards More Practical Membership Inference Attacks [16.064233621959538]
我々は,textbfRe-levertextbfA を直接 textbfRe-levertextbfA を用いて mtextbfItigate the error in textbfDifficulty calibration を提案する。
論文参考訳（メタデータ） (2024-08-31T11:59:42Z)
Uncertainty Aware Learning for Language Model Alignment [97.36361196793929]
異なるタスクシナリオのモデルアライメントを改善するために,不確実性認識学習(UAL)を提案する。トレーニングのラベルの平滑化値を個々のサンプルの不確実性に応じて適応的に設定する。広く使われているベンチマーク実験では、我々のUALは標準教師あり微調整よりも著しく優れています。
論文参考訳（メタデータ） (2024-06-07T11:37:45Z)
Calibrating Large Language Models with Sample Consistency [76.23956851098598]
本稿では,複数サンプルモデル生成系の分布から信頼度を導出する可能性について,一貫性の3つの尺度を用いて検討する。その結果、一貫性に基づくキャリブレーション手法は、既存のポストホック手法よりも優れていることがわかった。種々のLMの特性に合わせて,キャリブレーションに適した整合性指標を選択するための実用的なガイダンスを提供する。
論文参考訳（メタデータ） (2024-02-21T16:15:20Z)
A Study on the Calibration of In-context Learning [27.533223818505682]
In-context Learning (ICL) は静的言語モデルに適切なプロンプトで適応するための一般的な手法である。また,ICL例の増加に伴い,モデルの誤校正が向上し,キャリブレーションの精度が向上することが確認された。再校正手法について検討し,スケーリング結合キャリブレータが一貫した校正誤差を低減できることを見出した。
論文参考訳（メタデータ） (2023-12-07T03:37:39Z)
Latent Alignment with Deep Set EEG Decoders [44.128689862889715]
本稿では,脳波伝達学習大会のベンチマークで優勝した潜在アライメント手法を紹介する。我々は,その定式化を,与えられた被験者の試行セットに適用したディープセットとして提示する。実験の結果,深層学習モデルにおける後段の統計的分布アライメントの実行は,分類精度に有益であることが示唆された。
論文参考訳（メタデータ） (2023-11-29T12:40:45Z)
On the Calibration of Large Language Models and Alignment [63.605099174744865]
信頼性キャリブレーションは、ディープモデルの信頼性を高める重要なツールである。構築プロセス全体を通して、アライメント言語モデルの校正を体系的に検討する。我々の研究は、人気のあるLCMが十分に校正されているか、トレーニングプロセスがモデルの校正にどのように影響するかに光を当てています。
論文参考訳（メタデータ） (2023-11-22T08:57:55Z)
Learning Objective-Specific Active Learning Strategies with Attentive Neural Processes [72.75421975804132]
学びアクティブラーニング(LAL)は、アクティブラーニング戦略自体を学ぶことを提案し、与えられた設定に適応できるようにする。能動学習問題の対称性と独立性を利用した新しい分類法を提案する。私たちのアプローチは、筋電図から学ぶことに基づいており、モデルに標準ではない目的に適応する能力を与えます。
論文参考訳（メタデータ） (2023-09-11T14:16:37Z)
On Calibrating Semantic Segmentation Models: Analyses and An Algorithm [51.85289816613351]
セマンティックセグメンテーションキャリブレーションの問題について検討する。モデルキャパシティ、作物サイズ、マルチスケールテスト、予測精度はキャリブレーションに影響を及ぼす。我々は、単純で統一的で効果的なアプローチ、すなわち選択的スケーリングを提案する。
論文参考訳（メタデータ） (2022-12-22T22:05:16Z)
Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Stochastic Approach [38.76462300149459]
我々は多目的勾配最適化のための多目的補正法(MoCo)を開発した。本手法の特長は,非公正勾配を増大させることなく収束を保証できる点である。
論文参考訳（メタデータ） (2022-10-23T05:54:26Z)

関連論文リストは本サイト内にある論文のタイトル・アブストラクトから自動的に作成しています。

指定された論文の情報です。
本サイトの運営者は本サイト（すべての情報・翻訳含む）の品質を保証せず、本サイト（すべての情報・翻訳含む）を使用して発生したあらゆる結果について一切の責任を負いません。