Fugu-MT Paper Translation (Abstract): Resilient Multimodal Industrial Surface Defect Detection with Uncertain Sensors Availability

Paper summary: Resilient Multimodal Industrial Surface Defect Detection with Uncertain Sensors Availability

arxiv url: http://arxiv.org/abs/2509.02962v1
Date: Wed, 03 Sep 2025 03:00:24 GMT
Status: Translation complete
System update date: 2025-09-04 21:40:46.395125
Title: Resilient Multimodal Industrial Surface Defect Detection with Uncertain Sensors Availability
Title (reference translation): Resilient Multimodal Surface Defect Detection with Uncertain Sensors
Authors: Shuai Jiang, Yunfeng Ma, Jingyu Zhou, Yuan Bian, Yaonan Wang, Min Liu
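Abstract summary: Multimodal industrial surface defect detection (MISDD) aims to identify and locate defects in industrial products by fusing RGB and 3D modalities. This paper focuses on the modality-missing problem caused by uncertain sensor availability in MISDD.
Reference score (independently computed attention score): 36.47453216758195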
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Multimodal industrial surface defect detection (MISDD) aims to identify and locate defects in industrial products by fusing RGB and 3D modalities. This article focuses on modality-missing problems caused by uncertain sensor availability in MISDD. In this context, the fusion of multiple modalities encounters several difficulties, including learning mode transformation and information vacancy. To this end, we first propose cross-modal prompt learning, which includes: i) a cross-modal consistency prompt that establishes information consistency between the dual visual modalities; ii) a modality-specific prompt that is inserted to adapt to different input patterns; iii) a missing-aware prompt that is attached to compensate for the information vacancy caused by dynamic modality missing. In addition, we propose symmetric contrastive learning, which utilizes the text modality as a bridge for fusing the dual vision modalities. Specifically, a paired antithetical text prompt is designed to generate binary text semantics, and triple-modal contrastive pre-training is offered to accomplish multimodal learning. Experimental results show that our proposed method achieves 73.83% I-AUROC and 93.05% P-AUROC at a total missing rate of 0.7 for the RGB and 3D modalities (exceeding state-of-the-art methods by 3.84% and 5.58%, respectively), and outperforms existing approaches to varying degrees under different missing types and rates. The source code will be available at https://github.com/SvyJ/MISDD-MM.
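Abstract (reference translation): Multimodal industrial surface defect detection (MISDD) aims to identify and locate defects in industrial products by fusing RGB and 3D modalities. This paper focuses on the modality-missing problem in MISDD caused by sensor uncertainty. In this context, fusing multiple modalities runs into several problems, including learning mode transformation and information vacancy. To this end, we first propose cross-modal prompt learning, which comprises: i) a cross-modal consistency prompt that helps establish information consistency between the two visual modalities; ii) a modality-specific prompt inserted to adapt to different input patterns; iii) a missing-aware prompt attached to compensate for the information vacancy caused by dynamic modality missing. In addition, we propose symmetric contrastive learning, which uses the text modality as a bridge for fusing the two visual modalities. Specifically, a paired antithetical text prompt is designed to generate binary text semantics, and triple-modal contrastive pre-training is provided to accomplish multimodal learning. Experiments show that the method achieves 73.83% I-AUROC and 93.05% P-AUROC at a total missing rate of 0.7 for the RGB and 3D modalities, exceeding existing methods by 3.84% and 5.58%, respectively. The source code will be available at https://github.com/SvyJ/MISDD-MM.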
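
The "total missing rate" setting and the missing-aware prompt described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the even split of incomplete samples between RGB-only and 3D-only, the case names, and the helpers `assign_missing_cases` and `select_missing_aware_prompt` are all assumptions made for illustration, and the prompt vectors are placeholders rather than learned parameters.

```python
import random

# Hypothetical availability cases; the paper's actual protocol may differ.
CASES = ("complete", "rgb_only", "3d_only")


def assign_missing_cases(num_samples, total_missing_rate, seed=0):
    """Assign each sample a modality-availability case.

    Assumption: the total missing rate is split as evenly as possible,
    so about half of the incomplete samples keep only RGB and the rest
    keep only 3D. Complete samples keep both modalities.
    """
    if not 0.0 <= total_missing_rate <= 1.0:
        raise ValueError("total_missing_rate must be in [0, 1]")
    num_missing = round(num_samples * total_missing_rate)
    num_rgb_only = num_missing // 2
    num_3d_only = num_missing - num_rgb_only
    cases = (
        ["rgb_only"] * num_rgb_only
        + ["3d_only"] * num_3d_only
        + ["complete"] * (num_samples - num_missing)
    )
    # Shuffle with a fixed seed so the assignment is reproducible.
    random.Random(seed).shuffle(cases)
    return cases


def select_missing_aware_prompt(case, prompts):
    """Look up the prompt vector associated with an availability case."""
    return prompts[case]


# Placeholder prompt table: one small vector per case.
PROMPTS = {case: [float(i)] * 4 for i, case in enumerate(CASES)}

cases = assign_missing_cases(num_samples=10, total_missing_rate=0.7)
counts = {case: cases.count(case) for case in CASES}
```

In a real model the prompt selected per sample would be a learned tensor attached to the encoder input; the lookup here only shows how a per-sample availability case could drive that choice.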

Related papers