Fugu-MT Paper Translation (Summary): An Efficient Metric for Data Quality Measurement in Imitation Learning

Paper summary: An Efficient Metric for Data Quality Measurement in Imitation Learning

arxiv url: http://arxiv.org/abs/2605.01544v1
Date: Sat, 02 May 2026 17:16:50 GMT
Status: Translation complete
Last updated in system: 2026-05-05 20:33:49.823977
Title: An Efficient Metric for Data Quality Measurement in Imitation Learning
Authors: Noushad Sojib, Momotaz Begum
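Abstract summary: Fine-tuning pre-trained policies with end-user demonstrations collected in deployment environments is a promising strategy for addressing out-of-distribution (OOD) scenarios. Existing automated approaches for curating demonstration data require policy rollouts in the environment. The authors propose a fast, efficient, and fully automated demonstration ranking metric based on the power spectral density (PSD) of demonstration trajectories.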
Reference score (independently computed attention score): 1.5469452301122175
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Imitation learning (IL) has seen remarkable progress, yet field deployment of IL-powered robots remains hindered by the challenge of out-of-distribution (OOD) scenarios. Fine-tuning pre-trained policies with end-user demonstrations collected in deployment environments is a promising strategy to address this challenge. However, end-user demonstrations are frequently of poor quality, characterized by excessive corrective motions, oscillations, and abrupt adjustments that degrade both learned and fine-tuned policy performance. Existing automated approaches for curating demonstration data require policy rollouts in the environment, making them computationally expensive and impractical for real-world deployment. In this paper, we propose a fast, efficient, and fully automated demonstration ranking metric based on the power spectral density (PSD) of demonstration trajectories. The PSD metric requires no policy learning, environment interaction, or expert labeling, making it well-suited for scalable, in-the-field data curation. Lower PSD values correspond to smoother, higher-quality demonstrations, while higher PSD values indicate erratic, artifact-laden trajectories. We evaluate the proposed metric on two benchmark imitation learning datasets comprising expert and lay-user demonstrations, and through a user study with older adults at a retirement facility, where collected demonstrations are used to fine-tune $\pi_{0.5}$ \cite{intelligence2025pi_} for a daily living task. Results demonstrate that PSD-curated data yields policies with higher task success rates and smoother execution trajectories compared to uncurated baselines and two competitive data-ranking methods.
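
The ranking idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact formula, so everything below is an assumption made for illustration: the score is taken to be the mean periodogram power over the non-DC frequency bins, averaged across trajectory dimensions, and the names `psd_score` and `rank_demonstrations` are hypothetical. It is not the authors' implementation.

```python
import numpy as np

def psd_score(traj, fs=1.0):
    """Hypothetical smoothness score (an assumption, not the paper's formula):
    mean periodogram power over non-DC bins, averaged over dimensions."""
    x = np.asarray(traj, dtype=float)
    if x.ndim == 1:
        x = x[:, None]                    # treat a 1-D signal as shape (T, 1)
    x = x - x.mean(axis=0)                # remove the constant offset per dimension
    n = x.shape[0]
    spec = np.fft.rfft(x, axis=0)         # one-sided FFT along the time axis
    psd = (np.abs(spec) ** 2) / (fs * n)  # simple periodogram estimate
    return float(psd[1:].mean())          # skip the DC bin

def rank_demonstrations(demos, fs=1.0):
    """Return demo indices sorted from lowest to highest score, plus the scores."""
    scores = [psd_score(d, fs) for d in demos]
    order = sorted(range(len(demos)), key=lambda i: scores[i])
    return order, scores

# Toy data: one smooth motion, and the same motion with a fast oscillation added.
t = np.linspace(0.0, 1.0, 200, endpoint=False)
smooth = np.sin(2 * np.pi * t)
jittery = smooth + 0.2 * np.sin(2 * np.pi * 40 * t)

order, scores = rank_demonstrations([jittery, smooth])
print(order, scores)
```

In this toy case the smooth trajectory receives the lower score and is ranked first, which matches the abstract's statement that lower PSD values correspond to smoother demonstrations. A score defined this way also grows with overall motion amplitude, so a practical metric would likely need some normalization; the abstract does not say how the authors handle this.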
