Fugu-MT Paper Translation (Abstract): DuoGesture: Neuro-Inspired and Biomechanically Informed Dual-Stream Co-Speech Gesture Generation

Paper overview: DuoGesture: Neuro-Inspired and Biomechanically Informed Dual-Stream Co-Speech Gesture Generation

arxiv url: http://arxiv.org/abs/2605.26236v1
Date: Mon, 25 May 2026 18:03:41 GMT
Status: translation complete
Last updated in system: 2026-05-27 17:51:41.30195
Title: DuoGesture: Neuro-Inspired and Biomechanically Informed Dual-Stream Co-Speech Gesture Generation
Authors: Ferdinand Paar, Lanmiao Liu, Aslı Özyürek, Serge Thill, Esam Ghaleb
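Abstract summary: Existing holistic gesture models mix lexically grounded semantic gestures with frequent prosody-aligned beat gestures. DuoGesture is a neuro-inspired and biomechanically informed dual-stream approach that decomposes co-speech gesture synthesis into a semantic stream and a beat stream.
Reference score (attention score computed independently by this site): 27.296930205552954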
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Co-speech gesture generation requires both semantic expressivity and biomechanically plausible rhythmic motion. Existing holistic gesture models mix lexically grounded semantic gestures with frequent prosody-aligned beat gestures. This limits semantic grounding, speech-motion alignment, and kinematic smoothness. We propose DuoGesture, a neuro-inspired and biomechanically informed dual-stream approach that decomposes co-speech gesture synthesis into coupled semantic and beat streams. The two streams are coordinated by a Semantic Variational Information Bottleneck, a stochastic frame-level gate that learns when semantic gestures should override rhythmic beat motion. The semantic stream is controlled by Motion-Grounded Semantic Conditioning, which replaces purely linguistic word embeddings with motion-language representations to provide motion-aligned semantic priors for long-tailed lexical triggers of gestures. The beat stream is further regularised by an Inertial Beat Prior, an anthropometry-weighted arm-chain module that reduces jitter and improves rhythmic consistency without constraining semantic frames. Objective evaluations and subjective experiments show that DuoGesture outperforms strong holistic baselines, while component ablations confirm the complementary roles of semantic grounding, stochastic stream selection, and biomechanical regularisation.
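The abstract describes a stochastic frame-level gate that decides, per frame, whether semantic motion overrides beat motion. The sketch below is only a rough illustration of that selection step, not the authors' implementation: the function `gate_frames`, the list-of-floats pose format, and the Bernoulli sampling of a hard gate are all assumptions made here, and the variational (information bottleneck) training objective is left out entirely.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gate_frames(semantic, beat, gate_logits, rng):
    """Pick, per frame, either the semantic pose or the beat pose.

    semantic, beat: lists of frames; each frame is a list of floats
        (a hypothetical pose vector, not the paper's representation).
    gate_logits: one float per frame; higher means a semantic gesture
        is more likely to override the beat motion.
    rng: a random.Random instance, so sampling is reproducible.
    """
    if not (len(semantic) == len(beat) == len(gate_logits)):
        raise ValueError("all inputs must have one entry per frame")
    output, gates = [], []
    for sem_pose, beat_pose, logit in zip(semantic, beat, gate_logits):
        # Assumed stochastic hard gate: g_t ~ Bernoulli(sigmoid(logit_t)).
        g = 1 if rng.random() < sigmoid(logit) else 0
        gates.append(g)
        output.append(list(sem_pose if g else beat_pose))
    return output, gates


# Toy usage: only the middle frame has a strongly positive gate logit.
semantic = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
beat = [[0.0, 0.1], [0.0, 0.2], [0.0, 0.3]]
motion, gates = gate_frames(semantic, beat, [-50.0, 50.0, -50.0], random.Random(0))
```

With logits this extreme the sampled gate is effectively deterministic; intermediate logits would make the per-frame choice genuinely random, which is the property the word "stochastic" in the abstract suggests.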
