Fugu-MT Paper Translation (Summary): Restore Text First, Enhance Image Later: Two-Stage Scene Text Image Super-Resolution with Glyph Structure Guidance

Paper summary: Restore Text First, Enhance Image Later: Two-Stage Scene Text Image Super-Resolution with Glyph Structure Guidance

arxiv url: http://arxiv.org/abs/2510.21590v1
Date: Fri, 24 Oct 2025 15:59:04 GMT
Status: Translation complete
Last updated in system: 2025-10-28 09:00:15.530781
Title: Restore Text First, Enhance Image Later: Two-Stage Scene Text Image Super-Resolution with Glyph Structure Guidance
Title (reference translation): Restore Text First, Enhance Image Later: Two-Stage Scene Text Image Super-Resolution with Glyph Structure Guidance
Authors: Minxing Luo, Linlong Fan, Wang Qiushi, Ge Wu, Yiyan Luo, Yuhang Yu, Jinwei Chen, Yaxing Wang, Qingnan Fan, Jian Yang
Abstract summary: Generative super-resolution methods perform strongly on natural images but distort text. The paper introduces TIGER (Text-Image Guided supEr-Resolution), which first reconstructs precise text structures and then uses them to guide full-image super-resolution.
Reference score (independently computed attention score): 26.26467179820939
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Current generative super-resolution methods show strong performance on natural images but distort text, creating a fundamental trade-off between image quality and textual readability. To address this, we introduce TIGER (Text-Image Guided supEr-Resolution), a novel two-stage framework that breaks this trade-off through a "text-first, image-later" paradigm. TIGER explicitly decouples glyph restoration from image enhancement: it first reconstructs precise text structures and then uses them to guide subsequent full-image super-resolution. This glyph-to-image guidance ensures both high fidelity and visual consistency. To support comprehensive training and evaluation, we also contribute the UltraZoom-ST (UltraZoom-Scene Text), the first scene text dataset with extreme zoom (×14.29). Extensive experiments show that TIGER achieves state-of-the-art performance, enhancing readability while preserving overall image quality.
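Abstract (reference translation): Current generative super-resolution methods perform strongly on natural images, but they distort text, which creates a fundamental trade-off between image quality and text readability. To address this, we introduce TIGER (Text-Image Guided supEr-Resolution), a new two-stage framework. It explicitly decouples glyph restoration from image enhancement: it first reconstructs precise text structures and then uses them to guide full-image super-resolution. This glyph-to-image guidance ensures both high fidelity and visual consistency. To support comprehensive training and evaluation, we also contribute UltraZoom-ST (UltraZoom-Scene Text), the first scene text dataset with extreme zoom (×14.29). Extensive experiments show that TIGER achieves state-of-the-art performance, improving readability while preserving overall image quality.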

Related papers