Fugu-MT Paper Translation (Abstract): Phase-Localized Curation Does Not Help: A Negative Result on Per-Phase Metric Selection for Demonstration Filtering

Paper summary: Phase-Localized Curation Does Not Help: A Negative Result on Per-Phase Metric Selection for Demonstration Filtering

arxiv url: http://arxiv.org/abs/2606.15064v1
Date: Sat, 13 Jun 2026 02:45:10 GMT
Status: Translation complete
Last updated in system: 2026-06-16 16:21:32.757639
Title: Phase-Localized Curation Does Not Help: A Negative Result on Per-Phase Metric Selection for Demonstration Filtering
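Authors: Aarav Bedi
Abstract summary: We test the per-phase hypothesis on three contact-rich LIBERO pick-and-place tasks with a controlled early-release structural defect. Across all three tasks and five random seeds per condition, phase-gated curation is never the best curation strategy.
Reference score (independently computed attention score): 0.0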
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Manipulation demonstrations have temporal phase structure, and a natural hypothesis is that demonstration-curation metrics should be applied within phases rather than globally. The idea is to segment each trajectory into phases, score each phase with the metric that is locally most informative, and then aggregate. This follows directly from prior work showing that a single global metric can be the best detector of a defect and yet the worst curator of the resulting policy. We test the per-phase hypothesis on three contact-rich LIBERO pick-and-place tasks with a controlled early-release structural defect, comparing phase-gated curation against the same metrics applied uniformly and against a strong single global metric. Across all three tasks and five random seeds per condition, phase-gated curation is never the best curation strategy, and it is the worst of the three on two of the three tasks (Task 1: 86.0 vs. 92.0 for global; Task 3: 22.7 vs. 48.0 for uniform). We trace the failure to a concrete mechanism. When the defect signal is concentrated in a single phase, rank-aggregating across phases dilutes that signal with uninformative scores from defect-free phases, selecting a worse demonstration subset than simply applying the defect-informative metric everywhere. We further show that the per-phase metric selection does not transfer across tasks, since no phase shares a winning metric between any two tasks, so the selection cannot be reused and must be re-derived per task from a noisy sweep. These results bound a plausible and previously untested method, and they argue that practitioners should prefer identifying a single defect-informative metric over decomposing curation by phase. We release the full pipeline, all metric implementations, and per-seed results.
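The dilution mechanism described in the abstract can be illustrated with a small toy sketch. This is not the authors' released pipeline: the phase names (`approach`, `release`, `place`), the sum-of-ranks aggregation rule, and every score below are assumptions, hand-picked so that the effect is visible on eight demonstrations. Scores are "suspicion" values (higher means more likely defective), and only the `release` phase carries defect signal.

```python
# Toy illustration (hypothetical data, not from the paper): rank-aggregating
# across phases can dilute a defect signal that lives in a single phase.

def ranks(scores):
    """Return the rank of each score (0 = lowest); ties broken by index."""
    order = sorted(range(len(scores)), key=lambda i: (scores[i], i))
    out = [0] * len(scores)
    for rank, idx in enumerate(order):
        out[idx] = rank
    return out

def keep_least_suspicious(suspicion, n_keep):
    """Keep the n_keep demonstrations with the lowest suspicion values."""
    order = sorted(range(len(suspicion)), key=lambda i: (suspicion[i], i))
    return sorted(order[:n_keep])

# Demonstrations 0 and 1 carry the (assumed) early-release defect.
DEFECTIVE = {0, 1}
N_KEEP = 6

# Hand-picked suspicion scores per phase; only `release` is informative.
approach = [0.10, 0.20, 0.90, 0.50, 0.80, 0.30, 0.40, 0.60]
release  = [0.90, 0.80, 0.30, 0.20, 0.35, 0.10, 0.25, 0.15]
place    = [0.20, 0.10, 0.80, 0.40, 0.90, 0.50, 0.30, 0.60]

# Strategy A: apply the single defect-informative metric on its own.
kept_single = keep_least_suspicious(release, N_KEEP)

# Strategy B: phase-gated curation, aggregated here as a sum of per-phase ranks.
per_phase_ranks = [ranks(s) for s in (approach, release, place)]
aggregated = [sum(r[i] for r in per_phase_ranks) for i in range(len(release))]
kept_gated = keep_least_suspicious(aggregated, N_KEEP)

print("single metric keeps:", kept_single)  # [2, 3, 4, 5, 6, 7]
print("phase-gated keeps:  ", kept_gated)   # [0, 1, 3, 5, 6, 7]
```

In this contrived case the single informative metric removes both defective demonstrations, while the rank sum lets the two uninformative phases outvote the informative one, so both defects survive and two clean demonstrations are dropped instead. Real uninformative phases would be noisy rather than adversarial, so the effect in practice would be a matter of degree.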

