Fugu-MT Paper Translation (Abstract): Collaborative Temporal Feature Generation via Critic-Free Reinforcement Learning for Cross-User Sensor-Based Activity Recognition

Paper overview: Collaborative Temporal Feature Generation via Critic-Free Reinforcement Learning for Cross-User Sensor-Based Activity Recognition

arxiv url: http://arxiv.org/abs/2603.16043v1
Date: Tue, 17 Mar 2026 01:03:21 GMT
Status: Translation complete
Last updated in system: 2026-03-18 17:42:07.058211
Title: Collaborative Temporal Feature Generation via Critic-Free Reinforcement Learning for Cross-User Sensor-Based Activity Recognition
Authors: Xiaozhou Ye, Feng Jiang, Zihan Wang, Xiulai Wang, Yutao Zhang, Kevin I-Kai Wang
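Abstract summary: Human Activity Recognition using wearable inertial sensors is foundational to healthcare monitoring, fitness analytics, and context-aware computing. Existing domain generalization approaches either neglect temporal dependencies in sensor streams or depend on impractical target-domain annotations. The authors propose a new paradigm that models generalizable feature extraction as a collaborative sequential generation process governed by reinforcement learning.
Reference score (attention score computed independently by the site): 16.776182784171713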
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Human Activity Recognition using wearable inertial sensors is foundational to healthcare monitoring, fitness analytics, and context-aware computing, yet its deployment is hindered by cross-user variability arising from heterogeneous physiological traits, motor habits, and sensor placements. Existing domain generalization approaches either neglect temporal dependencies in sensor streams or depend on impractical target-domain annotations. We propose a different paradigm: modeling generalizable feature extraction as a collaborative sequential generation process governed by reinforcement learning. Our framework, CTFG (Collaborative Temporal Feature Generation), employs a Transformer-based autoregressive generator that incrementally constructs feature token sequences, each conditioned on prior context and the encoded sensor input. The generator is optimized via Group-Relative Policy Optimization, a critic-free algorithm that evaluates each generated sequence against a cohort of alternatives sampled from the same input, deriving advantages through intra-group normalization rather than learned value estimation. This design eliminates the distribution-dependent bias inherent in critic-based methods and provides self-calibrating optimization signals that remain stable across heterogeneous user distributions. A tri-objective reward comprising class discrimination, cross-user invariance, and temporal fidelity jointly shapes the feature space to separate activities, align user distributions, and preserve fine-grained temporal content. Evaluations on the DSADS and PAMAP2 benchmarks demonstrate state-of-the-art cross-user accuracy (88.53% and 75.22%), substantial reduction in inter-task training variance, accelerated convergence, and robust generalization under varying action-space dimensionalities.
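The abstract describes advantages derived "through intra-group normalization rather than learned value estimation". The following is a minimal sketch of that idea only, not the authors' implementation. It assumes each sequence sampled from one input has already received a scalar reward, and that the three reward terms are combined by a weighted sum. The function names, the `weights` parameter, and the `eps` constant are hypothetical; the abstract does not specify how the terms are combined.

```python
from statistics import mean, pstdev


def combined_reward(discrimination, invariance, fidelity, weights=(1.0, 1.0, 1.0)):
    """Combine the three reward terms named in the abstract.

    Assumption: a weighted sum. The abstract does not state the
    combination rule, so the weights here are purely illustrative.
    """
    w_d, w_i, w_f = weights
    return w_d * discrimination + w_i * invariance + w_f * fidelity


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of sequences sampled from the same input.

    Each advantage is (reward - group mean) / (group std + eps), so no
    learned critic or value network is involved. `eps` (hypothetical)
    avoids division by zero when every reward in the group is identical.
    """
    if not rewards:
        return []
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Rewards for four hypothetical feature sequences generated from one sensor window.
    group = [
        combined_reward(0.9, 0.6, 0.7),
        combined_reward(0.4, 0.5, 0.6),
        combined_reward(0.7, 0.7, 0.8),
        combined_reward(0.2, 0.3, 0.5),
    ]
    for reward, advantage in zip(group, group_relative_advantages(group)):
        print(f"reward={reward:.2f} advantage={advantage:+.3f}")
```

Because each group is centered and scaled by its own statistics, sequences that beat their cohort get positive advantages and the rest get negative ones, regardless of the absolute reward scale for a given user. This is one plausible reading of the "self-calibrating" signal the abstract mentions.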
