Fugu-MT Paper Translation (Abstract): MoA-VR: A Mixture-of-Agents System Towards All-in-One Video Restoration

Paper summary: MoA-VR: A Mixture-of-Agents System Towards All-in-One Video Restoration

arxiv url: http://arxiv.org/abs/2510.08508v1
Date: Thu, 09 Oct 2025 17:42:51 GMT
Status: Translation complete
Last updated in system: 2025-10-10 17:54:15.260568
Title: MoA-VR: A Mixture-of-Agents System Towards All-in-One Video Restoration
Authors: Lu Liu, Chunlei Cai, Shaocheng Shen, Jianfeng Liang, Weimin Ouyang, Tianxiao Ye, Jian Mao, Huiyu Duan, Jiangchao Yao, Xiaoyun Zhang, Qiang Hu, Guangtao Zhai
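Abstract summary: Real-world videos often suffer from complex degradations such as noise, compression artifacts, and low-light distortions. The authors propose MoA-VR, which mimics the reasoning and processing procedures of human professionals through three coordinated agents. Specifically, they construct a large-scale, high-resolution video degradation recognition benchmark and build a degradation identifier driven by a vision-language model (VLM).
Reference score (attention score computed by this site): 62.929029990341796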
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Real-world videos often suffer from complex degradations, such as noise, compression artifacts, and low-light distortions, due to diverse acquisition and transmission conditions. Existing restoration methods typically require professional manual selection of specialized models or rely on monolithic architectures that fail to generalize across varying degradations. Inspired by expert experience, we propose MoA-VR, the first Mixture-of-Agents Video Restoration system that mimics the reasoning and processing procedures of human professionals through three coordinated agents: Degradation Identification, Routing and Restoration, and Restoration Quality Assessment. Specifically, we construct a large-scale and high-resolution video degradation recognition benchmark and build a vision-language model (VLM) driven degradation identifier. We further introduce a self-adaptive router powered by large language models (LLMs), which autonomously learns effective restoration strategies by observing tool usage patterns. To assess intermediate and final processed video quality, we construct the Restored Video Quality (Res-VQ) dataset and design a dedicated VLM-based video quality assessment (VQA) model tailored for restoration tasks. Extensive experiments demonstrate that MoA-VR effectively handles diverse and compound degradations, consistently outperforming existing baselines in terms of both objective metrics and perceptual quality. These results highlight the potential of integrating multimodal intelligence and modular reasoning in general-purpose video restoration systems.
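Illustrative sketch (not from the paper): the three-agent loop described in the abstract can be pictured roughly as below. Everything here is a hypothetical stand-in chosen for illustration. A "video" is reduced to a dict of degradation severities, `identify` stands in for the VLM-driven identifier, `assess` for the VLM-based quality model, and the router is a simple greedy rule rather than the LLM-powered, self-adaptive router the authors describe. The names `identify`, `assess`, `make_tool`, `TOOLS`, and `restore` are assumptions, not the authors' API.

```python
from typing import Callable, Dict, List, Tuple

# Toy stand-in: a "video" is a map from degradation name to severity in [0, 1].
# This representation is an assumption made purely for illustration.
Video = Dict[str, float]
Tool = Callable[[Video], Video]


def identify(video: Video, threshold: float = 0.1) -> List[str]:
    """Stand-in for degradation identification: list degradations above a threshold."""
    return sorted(name for name, severity in video.items() if severity > threshold)


def assess(video: Video) -> float:
    """Stand-in for restoration quality assessment: higher means better quality."""
    return 1.0 - sum(video.values()) / max(len(video), 1)


def make_tool(name: str, strength: float) -> Tool:
    """Build a hypothetical restoration tool that reduces one degradation."""
    def tool(video: Video) -> Video:
        out = dict(video)  # copy so the input video is left untouched
        out[name] = out[name] * (1.0 - strength)
        return out
    return tool


# Hypothetical tool pool, one tool per degradation type.
TOOLS: Dict[str, Tool] = {
    "noise": make_tool("noise", 0.9),
    "compression": make_tool("compression", 0.9),
    "low_light": make_tool("low_light", 0.9),
}


def restore(video: Video, tools: Dict[str, Tool], max_steps: int = 5) -> Tuple[Video, List[str]]:
    """Identify, route, restore, and assess in a loop until nothing improves."""
    history: List[str] = []
    for _ in range(max_steps):
        found = identify(video)
        # Greedy routing stand-in: score the result of every applicable tool.
        candidates = [(assess(tools[d](video)), d) for d in found if d in tools]
        if not candidates:
            break
        best_score, best_name = max(candidates)
        if best_score <= assess(video):
            break  # no tool improves the assessed quality, so stop
        video = tools[best_name](video)
        history.append(best_name)
    return video, history


degraded = {"noise": 0.8, "compression": 0.5, "low_light": 0.0}
restored, order = restore(degraded, TOOLS)
print(order, round(assess(degraded), 3), round(assess(restored), 3))
```

In this toy setup the loop handles the more severe degradation first because that choice gives the largest gain in the assessed score; the real system's ordering would depend on its learned routing strategy.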
