Fugu-MT Paper Translation (Abstract): Miller-Index-Based Latent Crystallographic Fracture Plane Reasoning and generation with Vision-Language Models

Paper overview: Miller-Index-Based Latent Crystallographic Fracture Plane Reasoning and generation with Vision-Language Models

arxiv url: http://arxiv.org/abs/2605.20416v2
Date: Sun, 24 May 2026 16:52:15 GMT
Status: Translation complete
Last updated in system: 2026-05-26 16:32:37.766611
Title: Miller-Index-Based Latent Crystallographic Fracture Plane Reasoning and generation with Vision-Language Models
Title (reference translation): Latent crystallographic fracture-plane reasoning based on Miller indices, and generation with vision-language models
Authors: Qinwu Xu, Xiaofu Ma, Yifan Jiang
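Abstract summary: This study examines whether multimodal large language models (MLLMs) can leverage crystallographic plane indices (Miller indices) as a structured latent representation for reasoning about fracture geometry. It shows that MLLMs can reliably perform latent inference in idealized settings and can reject the latent representation when the underlying physics does not support it.
Reference score (independently computed attention measure): 4.650392958517514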
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We study whether multimodal large language models (MLLMs) can leverage crystallographic plane indices (Miller indices) as a structured latent representation for reasoning about fracture geometry. We formulate Miller indices $z = (h,k,l)$ as a latent variable governing idealized planar fracture and evaluate two complementary capabilities: (i) latent inference, where the model maps visual observations to plane hypotheses under physically valid conditions, and (ii) latent applicability assessment, where the model determines whether such a representation is meaningful for a given fracture image. Through extensive experiments spanning synthetic data, controlled 2D--3D geometric pairs, and real-world fracture images across multiple material classes -- including ceramics, glass, metals, and concrete -- we show that MLLMs can reliably perform latent inference in idealized settings and, critically, can reject the latent representation when the underlying physics does not support it. As an exploratory extension, we further examine AI-generated fracture sequences and observe qualitatively plausible brittle-fracture progression behaviors, suggesting that multimodal generative models may encode partial implicit physical priors related to material failure dynamics. These results suggest that MLLMs can act as physics-aware reasoning systems conditioned on structured latent priors, provided that the domain of validity is explicitly modeled.
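Abstract (reference translation): This study examines whether multimodal large language models (MLLMs) can leverage crystallographic plane indices (Miller indices) as a structured latent representation for reasoning about fracture geometry. We formulate Miller indices $z = (h,k,l)$ as a latent variable governing idealized planar fracture and evaluate two complementary capabilities: (i) latent inference, in which the model maps visual observations to plane hypotheses under physically valid conditions, and (ii) latent applicability assessment, in which the model determines whether such a representation is meaningful for a given fracture image. Through experiments on synthetic data, controlled 2D-3D geometric pairs, and real-world fracture images across multiple material classes, including ceramics, glass, metals, and concrete, we show that MLLMs can reliably perform latent inference in idealized settings and can reject the latent representation when the underlying physics does not support it. As an exploratory extension, we examine AI-generated fracture sequences and observe qualitatively plausible brittle-fracture progression behavior, suggesting that multimodal generative models may encode partial implicit physical priors related to material failure dynamics. These results suggest that MLLMs can act as physics-aware reasoning systems, provided that the domain of validity is explicitly modeled.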
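For readers unfamiliar with the notation, the following is a minimal sketch of how a Miller-index triple $z = (h,k,l)$ determines a plane's geometry. It assumes a cubic lattice, where the plane normal is parallel to the direction [h k l]; the abstract does not state which crystal systems the paper considers. The function names and the lattice parameter `a` are hypothetical and are not taken from the paper.

```python
import math


def plane_normal(h, k, l):
    """Unit normal of the (h k l) plane, ASSUMING a cubic lattice."""
    if (h, k, l) == (0, 0, 0):
        raise ValueError("(0, 0, 0) is not a valid Miller index")
    norm = math.sqrt(h * h + k * k + l * l)
    return (h / norm, k / norm, l / norm)


def axis_intercepts(h, k, l, a=1.0):
    """Axis intercepts of the plane for lattice parameter a.

    A zero index means the plane is parallel to that axis,
    which is represented here as an intercept at infinity.
    """
    return tuple(a / i if i != 0 else math.inf for i in (h, k, l))


def interplanar_angle_deg(p, q):
    """Angle in degrees between two planes of a cubic lattice."""
    n1, n2 = plane_normal(*p), plane_normal(*q)
    cos_t = sum(x * y for x, y in zip(n1, n2))
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding error
    return math.degrees(math.acos(cos_t))


if __name__ == "__main__":
    print(axis_intercepts(1, 1, 0))
    print(interplanar_angle_deg((1, 0, 0), (1, 1, 0)))
```

In this picture, a plane hypothesis from the latent-inference task is a single integer triple. Non-cubic lattices would need the full metric tensor rather than the plain dot product used here.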

Related papers