Fugu-MT paper translation (abstract): CFRNet: Cycle-Consistent Fixed-Point Training for Real-Time Blind Face Restoration on Consumer Embedded NPUs

Paper overview: CFRNet: Cycle-Consistent Fixed-Point Training for Real-Time Blind Face Restoration on Consumer Embedded NPUs

arxiv url: http://arxiv.org/abs/2606.06850v1
Date: Fri, 05 Jun 2026 02:48:16 GMT
Status: translation complete
System update date: 2026-06-08 14:33:29.53115
Title: CFRNet: Cycle-Consistent Fixed-Point Training for Real-Time Blind Face Restoration on Consumer Embedded NPUs
Authors: Fuchen Li, Xinyang Wang, Yahui Zhang, Yuhan Chen, Jiahong Guo, Zhuohan Qin, Wenbo Ma
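Abstract summary: We present CFRNet, a convolutional restorer for on-device use at $256\times256$. Rather than training the network for one pass and then running it several times by hand, we train it to act as a fixed-point operator. CFRNet reaches the best perceptual score (LPIPS 0.250 at three cycles, 31% lower than at one cycle) and the best PSNR and SSIM at two cycles.
Reference score (independently computed attention score): 5.39140515661076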
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Blind face restoration on consumer devices has to balance image quality against speed and memory. Strong methods such as GFPGAN and CodeFormer give good perceptual quality, but they rely on large pretrained generative priors and on operators such as attention, codebook lookup, and style modulation that are hard to compile and quantize on the small neural processing units (NPUs) used in consumer hardware. Small convolutional restorers run fast enough, but they tend to over-smooth and to leave artifacts around the eyes, nose, and mouth. We present CFRNet, a 2.0M-parameter ResNet-style restorer for on-device use at $256\times256$, the common face-crop size on consumer NPUs. The main idea is Cycle-Consistent Fixed-Point Training (CCFP). Instead of training the network for one pass and then running it several times by hand, we train it to act as a fixed-point operator, so that applying it again to a restored face does not change the face. CCFP uses three training losses, namely progressive multi-cycle supervision, an idempotence loss, and a re-degradation cycle loss, and it adds no cost at inference. To compare fairly under our deployment limits, we retrain all baselines from scratch at the same $256\times256$ resolution. On a 300-image test set, CFRNet reaches the best perceptual score (LPIPS 0.250 at three cycles, which is 31% lower than one cycle) and also the best PSNR and SSIM at two cycles. It runs in about 23 ms per cycle in INT8 on a HiSilicon Hi3402 NPU, while the same baselines cannot be compiled to that chip. The cycle count $k$ acts as a simple quality knob that needs no retraining: PSNR is best at $k=2$ and LPIPS keeps improving up to $k=3$. We further show that the same idea works with a plain CNN that is even easier to deploy, and we run the model in real time on an in-car driver-monitoring board.
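
The abstract names three CCFP training losses and a cycle count $k$ but does not give their formulas. The following is a minimal pure-Python sketch of how such terms could be combined, not the authors' implementation. Everything in it is an assumption made for illustration: the names `mse`, `apply_cycles`, and `ccfp_loss`, the uniform averaging over cycles, the weights `w_idem` and `w_cycle`, the exact form of the re-degradation term, and the use of lists of floats in place of image tensors.

```python
from typing import Callable, Dict, List, Sequence

Signal = List[float]  # toy stand-in for an image tensor (assumption)
Op = Callable[[Signal], Signal]


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length signals."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)


def apply_cycles(f: Op, x: Signal, k: int) -> Signal:
    """Apply the restorer f to x k times; k is the inference-time quality knob."""
    for _ in range(k):
        x = f(x)
    return x


def ccfp_loss(f: Op, degrade: Op, lq: Signal, hq: Signal, k: int = 3,
              w_idem: float = 1.0, w_cycle: float = 1.0) -> Dict[str, float]:
    """Hypothetical combination of the three loss terms named in the abstract."""
    # Multi-cycle supervision: compare the output of every cycle 1..k with the
    # clean target (uniform averaging is an assumption, not the paper's schedule).
    outputs = []
    x = lq
    for _ in range(k):
        x = f(x)
        outputs.append(x)
    multi = sum(mse(y, hq) for y in outputs) / k

    # Idempotence: restoring an already restored face should not change it.
    restored = outputs[0]
    idem = mse(f(restored), restored)

    # Re-degradation cycle (assumed form): degrade the restored face again,
    # restore it, and compare the result with the clean target.
    cycle = mse(f(degrade(restored)), hq)

    total = multi + w_idem * idem + w_cycle * cycle
    return {"multi": multi, "idem": idem, "cycle": cycle, "total": total}


def clamp01(x: Signal) -> Signal:
    """Toy idempotent restorer: clip every value into [0, 1]."""
    return [min(1.0, max(0.0, v)) for v in x]


def halve(x: Signal) -> Signal:
    """Toy degradation: scale every value by 0.5."""
    return [0.5 * v for v in x]


terms = ccfp_loss(clamp01, halve, lq=[1.5, -0.5], hq=[1.0, 0.0], k=3)
```

In this toy setting the clipping operator is already a fixed-point operator, so the idempotence term is zero and only the re-degradation term contributes. With a real network, `f` would be the restorer's forward pass and `apply_cycles(f, x, k)` would correspond to choosing $k$ at inference without retraining.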
