Fugu-MT Paper Translation (Abstract): Towards Cognitive Defect Analysis in Active Infrared Thermography with Vision-Text Cues

Paper abstract: Towards Cognitive Defect Analysis in Active Infrared Thermography with Vision-Text Cues

arxiv url: http://arxiv.org/abs/2603.10549v1
Date: Wed, 11 Mar 2026 08:58:15 GMT
Status: Translation complete
Last updated in system: 2026-03-12 16:22:32.859816
Title: Towards Cognitive Defect Analysis in Active Infrared Thermography with Vision-Text Cues
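Title (reference translation): Towards Cognitive Defect Analysis in Active Infrared Thermography Using Visual Information
Authors: Mohammed Salah, Eman Ouda, Giuseppe Dell'Avvocato, Fabrizio Sarasini, Ester D'Accardi, Jorge Dias, Davor Svetinovic, Stefano Sfarra, Yusra Abdulrahman
Abstract summary: This work proposes a novel language-guided framework for cognitive defect analysis in CFRPs using AIRT and vision-language models (VLMs). Unlike conventional learning-based approaches, the proposed framework does not require developing training datasets for extensive training of defect detectors. The proposed system enables generative zero-shot understanding of thermographic patterns and automatic detection of subsurface defects.
Reference score (independently computed attention score): 4.186239052492289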
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Active infrared thermography (AIRT) is currently witnessing a surge of artificial intelligence (AI) methodologies being deployed for automated subsurface defect analysis of high-performance carbon fiber-reinforced polymers (CFRP). Deploying AI-based AIRT methodologies for inspecting CFRPs requires the creation of time-consuming and expensive datasets of CFRP inspection sequences to train neural networks. To address this challenge, this work introduces a novel language-guided framework for cognitive defect analysis in CFRPs using AIRT and vision-language models (VLMs). Unlike conventional learning-based approaches, the proposed framework does not require developing training datasets for extensive training of defect detectors; instead, it relies solely on pretrained multimodal VLM encoders coupled with a lightweight adapter to enable generative zero-shot understanding and localization of subsurface defects. By leveraging pretrained multimodal encoders, the proposed system enables generative zero-shot understanding of thermographic patterns and automatic detection of subsurface defects. Given the domain gap between thermographic data and the natural images used to train VLMs, an AIRT-VLM Adapter is proposed to enhance the visibility of defects while aligning the thermographic domain with the learned representations of VLMs. The proposed framework is validated using three representative VLMs: GroundingDINO, Qwen-VL-Chat, and CogVLM. Validation is performed on 25 CFRP inspection sequences with impacts introduced at different energy levels, reflecting realistic defects encountered in industrial scenarios. Experimental results demonstrate that the AIRT-VLM Adapter achieves signal-to-noise ratio (SNR) gains exceeding 10 dB compared with conventional thermographic dimensionality-reduction methods, while enabling zero-shot defect detection with intersection-over-union values reaching 70%.
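Abstract (reference translation): Active infrared thermography (AIRT) is currently seeing a surge in artificial intelligence (AI) methods for automated subsurface defect analysis of high-performance carbon fiber-reinforced polymers (CFRP). Deploying AI-based AIRT methods to inspect CFRPs requires creating time-consuming and expensive datasets of CFRP inspection sequences in order to train neural networks. To address this challenge, this work proposes a novel language-guided framework for cognitive defect analysis in CFRPs using AIRT and vision-language models (VLMs). Unlike conventional learning-based approaches, the proposed framework does not require developing training datasets for extensive training of defect detectors; it relies only on pretrained multimodal VLM encoders combined with a lightweight adapter, enabling generative zero-shot understanding and localization of subsurface defects. By using pretrained multimodal encoders, it enables zero-shot understanding of thermographic patterns and automatic detection of subsurface defects. Considering the domain gap between thermographic data and the natural images used to train VLMs, an AIRT-VLM Adapter is proposed to enhance the visibility of defects while aligning the thermographic domain with the learned representations of VLMs. The proposed method is validated using three representative VLMs: GroundingDINO, Qwen-VL-Chat, and CogVLM. Validation is performed on 25 CFRP inspection sequences with impacts at different energy levels, reflecting realistic defects encountered in industrial scenarios. Experimental results show that the AIRT-VLM Adapter achieves signal-to-noise ratio (SNR) gains exceeding 10 dB compared with conventional methods and enables zero-shot defect detection with intersection-over-union values reaching 70%.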
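
The abstract reports results in two metrics, SNR gain in dB and intersection-over-union (IoU). As a rough illustration of what those numbers measure, the following is a minimal sketch and not the authors' evaluation code. It assumes boxes given as (x1, y1, x2, y2) corners and an SNR definition often used in thermography (contrast between defect and sound regions divided by the standard deviation of the sound region, on a 20·log10 scale); the paper may define SNR differently. The function names `box_iou` and `snr_db` are hypothetical.

```python
import math
from statistics import mean, pstdev


def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlap rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def snr_db(defect_pixels, sound_pixels):
    """SNR in dB under the ASSUMED definition
    20 * log10(|mean(defect) - mean(sound)| / std(sound))."""
    contrast = abs(mean(defect_pixels) - mean(sound_pixels))
    noise = pstdev(sound_pixels)  # population standard deviation
    if contrast == 0 or noise == 0:
        raise ValueError("SNR is undefined for zero contrast or zero noise")
    return 20.0 * math.log10(contrast / noise)


if __name__ == "__main__":
    # Hypothetical predicted and ground-truth boxes, not data from the paper.
    print(box_iou((0, 0, 10, 10), (5, 0, 15, 10)))
    # Hypothetical pixel intensities for a defect and a sound region.
    print(snr_db([11.0, 11.0], [0.0, 2.0]))
```

Under this definition, an SNR gain is the difference between the dB values obtained from two processed images of the same sequence, so a gain above 10 dB corresponds to more than a threefold increase in the contrast-to-noise amplitude ratio.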

Related papers