Fugu-MT paper translation (abstract): MoRight: Motion Control Done Right

Paper summary: MoRight: Motion Control Done Right

arxiv url: http://arxiv.org/abs/2604.07348v1
Date: Wed, 08 Apr 2026 17:59:22 GMT
Status: Translation complete
Last updated in system: 2026-04-09 17:30:51.672782
Title: MoRight: Motion Control Done Right
Title (reference translation): MoRight: Motion Control Done Right
Authors: Shaowei Liu, Xuanchi Ren, Tianchang Shen, Huan Ling, Saurabh Gupta, Shenlong Wang, Sanja Fidler, Jun Gao
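Abstract summary: We introduce MoRight, a unified framework that addresses both limitations. Object motion is specified in a canonical static view and transferred to an arbitrary target camera viewpoint. Experiments on three benchmarks demonstrate state-of-the-art performance in generation quality, motion controllability, and interaction awareness.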
Reference score (independently computed attention score): 71.36903230523589
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generating motion-controlled videos--where user-specified actions drive physically plausible scene dynamics under freely chosen viewpoints--demands two capabilities: (1) disentangled motion control, allowing users to separately control the object motion and adjust camera viewpoint; and (2) motion causality, ensuring that user-driven actions trigger coherent reactions from other objects rather than merely displacing pixels. Existing methods fall short on both fronts: they entangle camera and object motion into a single tracking signal and treat motion as kinematic displacement without modeling causal relationships between object motion. We introduce MoRight, a unified framework that addresses both limitations through disentangled motion modeling. Object motion is specified in a canonical static-view and transferred to an arbitrary target camera viewpoint via temporal cross-view attention, enabling disentangled camera and object control. We further decompose motion into active (user-driven) and passive (consequence) components, training the model to learn motion causality from data. At inference, users can either supply active motion and MoRight predicts consequences (forward reasoning), or specify desired passive outcomes and MoRight recovers plausible driving actions (inverse reasoning), all while freely adjusting the camera viewpoint. Experiments on three benchmarks demonstrate state-of-the-art performance in generation quality, motion controllability, and interaction awareness.
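Abstract (reference translation): Generating motion-controlled videos, in which user-specified actions drive physically plausible scene dynamics under freely chosen viewpoints, requires two capabilities: (1) disentangled motion control, which lets users control object motion and adjust the camera viewpoint separately; and (2) motion causality, which ensures that user-driven actions trigger coherent reactions from other objects rather than merely displacing pixels. Existing methods entangle camera and object motion into a single tracking signal and treat motion as kinematic displacement without modeling causal relationships between object motions. We introduce MoRight, a unified framework that addresses both limitations. Object motion is specified in a canonical static view and transferred to an arbitrary target camera viewpoint via temporal cross-view attention. We further decompose motion into active (user-driven) and passive (consequence) components and train the model to learn motion causality from data. At inference, users can either supply active motion and have MoRight predict the consequences (forward reasoning), or specify desired passive outcomes and have MoRight recover plausible driving actions (inverse reasoning), all while freely adjusting the camera viewpoint. Experiments on three benchmarks demonstrate state-of-the-art performance in generation quality, motion controllability, and interaction awareness.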

Related papers