Fugu-MT Paper Translation (Abstract): BPL: Bias-adaptive Preference Distillation Learning for Recommender System

Paper abstract: BPL: Bias-adaptive Preference Distillation Learning for Recommender System

arxiv url: http://arxiv.org/abs/2510.16076v1
Date: Fri, 17 Oct 2025 11:09:04 GMT
Status: Translation complete
Last updated in system: 2025-10-25 00:56:38.832644
Title: BPL: Bias-adaptive Preference Distillation Learning for Recommender System
Title (reference translation): BPL: Bias-adaptive Preference Distillation Learning for Recommender Systems
Authors: SeongKu Kang, Jianxun Lian, Dongha Lee, Wonbin Kweon, Sanghwan Jang, Jaehyun Lee, Jindong Wang, Xing Xie, Hwanjo Yu
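Abstract summary: This paper introduces a new learning framework called Bias-adaptive Preference distillation Learning (BPL) that gradually uncovers user preferences. BPL retains accurate preference knowledge aligned with the collected feedback, leading to high performance in the factual test. Through self-distillation with reliability filtering, BPL iteratively refines its knowledge throughout the training process.
Reference score (independently computed attention score): 61.916973366625285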
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recommender systems suffer from biases that cause the collected feedback to reveal user preferences only incompletely. While debiased learning has been extensively studied, existing methods have mostly focused on the specialized (called counterfactual) test environment simulated by random exposure of items, and they significantly degrade accuracy in the typical (called factual) test environment based on actual user-item interactions. In fact, each test environment highlights a different benefit: the counterfactual test emphasizes long-term user satisfaction, while the factual test focuses on predicting subsequent user behavior on the platform. Therefore, it is desirable to have a model that performs well on both tests rather than only one. In this work, we introduce a new learning framework, called Bias-adaptive Preference distillation Learning (BPL), to gradually uncover user preferences with dual distillation strategies. These distillation strategies are designed to drive high performance in both factual and counterfactual test environments. Employing a specialized form of teacher-student distillation from a biased model, BPL retains accurate preference knowledge aligned with the collected feedback, leading to high performance in the factual test. Furthermore, through self-distillation with reliability filtering, BPL iteratively refines its knowledge throughout the training process. This enables the model to produce more accurate predictions across a broader range of user-item combinations, thereby improving performance in the counterfactual test. Comprehensive experiments validate the effectiveness of BPL in both factual and counterfactual tests. Our implementation is accessible via: https://github.com/SeongKu-Kang/BPL.
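Abstract (reference translation): Recommender systems are affected by biases that make the collected feedback an incomplete picture of user preferences. Debiased learning has been widely studied, but most of that work targets a specialized (counterfactual) test environment simulated by randomly exposing items, and it significantly reduces accuracy in the typical (factual) test environment built on actual user-item interactions. Each test environment emphasizes a different benefit: the counterfactual test stresses long-term user satisfaction, while the factual test focuses on predicting subsequent user behavior on the platform. A model that performs well on both tests, rather than only one, is therefore desirable. This work introduces a new learning framework, Bias-adaptive Preference distillation Learning (BPL), which gradually uncovers user preferences through two distillation strategies designed to deliver high performance in both the factual and the counterfactual test. By adopting a specialized form of teacher-student distillation from a biased model, BPL retains accurate preference knowledge aligned with the collected feedback and achieves high performance in the factual test. In addition, through self-distillation with reliability filtering, BPL iteratively refines its knowledge throughout the training process. This lets the model produce more accurate predictions over a wider range of user-item combinations, which improves performance in the counterfactual test. Comprehensive experiments demonstrate the effectiveness of BPL. The implementation is available at https://github.com/SeongKu-Kang/BPL.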
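
The abstract names two distillation signals: a teacher-student term from a biased model, and a self-distillation term restricted by reliability filtering. The following is a minimal toy sketch of how two such terms might be combined into one loss. It is not the authors' implementation (see the linked repository for that). The function names, the soft binary cross-entropy form, the confidence-margin filter, and the `margin` and `self_weight` parameters are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math


def sigmoid(x: float) -> float:
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def soft_bce(target: float, pred: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy of `pred` against a soft target in [0, 1]."""
    pred = min(max(pred, eps), 1.0 - eps)
    return -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))


def reliable_mask(probs, margin):
    """ASSUMED filter: keep predictions at least `margin` away from 0.5."""
    return [abs(p - 0.5) >= margin for p in probs]


def dual_distillation_loss(student_logits, teacher_logits, snapshot_logits,
                           margin=0.3, self_weight=0.5):
    """Hypothetical combination of the two distillation terms.

    teacher_logits:  scores from a biased model (the teacher).
    snapshot_logits: the student's own scores from an earlier training step.
    Returns (loss, mask), where mask marks the snapshot entries kept.
    """
    student = [sigmoid(z) for z in student_logits]
    teacher = [sigmoid(z) for z in teacher_logits]
    snapshot = [sigmoid(z) for z in snapshot_logits]

    # Term 1: teacher-student distillation over all user-item pairs.
    ts_term = sum(soft_bce(t, s) for t, s in zip(teacher, student)) / len(student)

    # Term 2: self-distillation, using only snapshot predictions that pass the filter.
    mask = reliable_mask(snapshot, margin)
    kept = [(q, s) for q, s, keep in zip(snapshot, student, mask) if keep]
    sd_term = sum(soft_bce(q, s) for q, s in kept) / len(kept) if kept else 0.0

    return ts_term + self_weight * sd_term, mask


# Example call with three made-up user-item pairs.
example_loss, example_mask = dual_distillation_loss(
    [2.0, -1.0, 0.1], [3.0, -2.0, 0.0], [2.5, 0.2, -3.0]
)
```

In this sketch the second snapshot prediction lies close to 0.5, so the assumed filter drops it and only the other two pairs contribute to the self-distillation term.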

Related papers