Fugu-MT Paper Translation (Abstract): Not Every Subject Should Stay: Machine Unlearning for Noisy Engagement Recognition

Paper abstract: Not Every Subject Should Stay: Machine Unlearning for Noisy Engagement Recognition

arxiv url: http://arxiv.org/abs/2605.04713v1
Date: Wed, 06 May 2026 10:03:06 GMT
Status: Translation complete
Last updated in system: 2026-05-07 18:41:07.763082
Title: Not Every Subject Should Stay: Machine Unlearning for Noisy Engagement Recognition
Authors: Alexander Vedernikov
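Abstract summary: Engagement recognition datasets are typically subject-indexed and often contain noisy, subjective supervision. This work studies that setting through subject-level machine unlearning as a post-hoc sanitization mechanism for engagement recognition.
Reference score (attention score computed independently by the site): 53.005382593686356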
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Engagement recognition datasets are typically subject-indexed and often contain noisy, subjective supervision, making post-hoc dataset revision a practical problem. Existing noisy-label and data-cleaning methods largely operate at the sample level before or during training, but do not directly address a different question: once a model has already been trained, can the influence of an entire problematic subject be removed without full retraining? We study this setting through subject-level machine unlearning as a post-hoc sanitization mechanism for engagement recognition. Starting from a baseline trained on all subjects, we rank candidate harmful subjects using a model-dependent proxy, apply a lightweight approximate unlearning update, and compare the result against an oracle model retrained from scratch on the retained subjects only. We instantiate this protocol on DAiSEE and EngageNet using Tensor-Convolution and Convolution-Transformer Network (TCCT-Net) as a fixed platform and evaluate three matched model states under the same removal scenario: baseline, unlearned, and oracle. In representative K=3 forget-set settings, the unlearned model recovers 89.3% and 92.5% of the oracle gain on EngageNet and DAiSEE, respectively, at roughly one quarter of retraining cost. Across the tested small-audit regimes, effectiveness is strongest at an intermediate forget-set size, indicating that approximate subject-level unlearning is a useful low-cost correction mechanism, but one whose benefit depends on subject selection quality and removal regime.
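The "oracle gain recovered" figure quoted in the abstract can be read as the share of the oracle's improvement over the baseline that the unlearned model reaches. The abstract does not give the formula, so the sketch below is only one plausible reading: the function name `oracle_gain_recovered` and the three scores are hypothetical and are not taken from the paper.

```python
def oracle_gain_recovered(baseline: float, unlearned: float, oracle: float) -> float:
    """Fraction of the oracle's gain over the baseline achieved by the unlearned model.

    Assumed definition (not confirmed by the abstract):
        (unlearned - baseline) / (oracle - baseline)
    """
    oracle_gain = oracle - baseline
    if oracle_gain == 0:
        # No gain to recover, so the ratio is undefined.
        raise ValueError("oracle and baseline scores are equal; recovery is undefined")
    return (unlearned - baseline) / oracle_gain


# Hypothetical scores for the three matched model states (illustrative only).
baseline_score = 0.60   # trained on all subjects
unlearned_score = 0.63  # after the approximate unlearning update
oracle_score = 0.64     # retrained from scratch on retained subjects only

recovered = oracle_gain_recovered(baseline_score, unlearned_score, oracle_score)
print(f"Recovered share of oracle gain: {recovered:.1%}")
```

Under this reading, a value near 1.0 means the approximate update almost matches full retraining, while a negative value would mean the update made the model worse than the baseline.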

Related papers