Fugu-MT paper translation (abstract): BRepCLIP: Contrastive Multimodal Pretraining on BRep Primitives for CAD Understanding

Paper abstract: BRepCLIP: Contrastive Multimodal Pretraining on BRep Primitives for CAD Understanding

arxiv url: http://arxiv.org/abs/2606.05515v1
Date: Wed, 03 Jun 2026 23:41:23 GMT
Status: translation complete
Last updated in system: 2026-06-06 06:55:34.620063
Title: BRepCLIP: Contrastive Multimodal Pretraining on BRep Primitives for CAD Understanding
Authors: Muhammad Usama, Didier Stricker, Mohammad Sadil Khan, Muhammad Zeshan Afzal
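Abstract summary: We introduce BRepCLIP, the first framework to align BRep geometry with language and image embeddings. A transformer encoder aggregates face and edge tokens into a global BRep embedding, which is aligned with CLIP's text and image encoders. BRepCLIP generates more discriminative and semantically grounded embeddings than existing point-based alternatives.
Reference score (independently computed attention score): 22.960026011165706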
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Learning representations of CAD models is a largely open problem. While 3D representation learning has flourished around point clouds and meshes, the native format of CAD, the boundary representation (BRep), which encodes exact parametric surfaces, curves, and their topology, has received little attention as a representation learning substrate. We introduce BRepCLIP, the first framework to align BRep geometry with language and image embeddings through contrastive pretraining. We model each CAD object as a sequence of face and edge tokens with separate discrete vocabularies for surface and curve geometry, augmented with spatial and semantic descriptors that capture surface types (e.g., cylindrical, torus, NURBS) and curve primitives (e.g., line, arc, B-spline). A transformer encoder aggregates these tokens into a global BRep embedding, aligned with CLIP's text and image encoders via a joint contrastive objective. BRepCLIP generates more discriminative and semantically grounded embeddings than existing point-based alternatives, improving Top-1 retrieval over OpenShape by 40.4%, 22.0%, and 23.9% on ABC, CADParser, and Automate, respectively, and improving zero-shot classification on FabWave by 15% in Top-1 score. We further demonstrate its utility as a CAD-aware similarity metric for evaluating text- and image-conditioned CAD generation, establishing the importance of structure-aware pretraining for multimodal CAD understanding. The project page is available at https://muhammadusama100.github.io/BrepClip2026/
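The abstract does not spell out the joint contrastive objective, so the following is only a minimal sketch of one common form: a symmetric InfoNCE loss between BRep embeddings and each CLIP modality, plus a Top-1 retrieval check. All function names are hypothetical, and the temperature of 0.07 and the equal weighting of the text and image terms are assumptions, not values taken from the paper.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]


def similarity_matrix(a, b):
    """Cosine similarity between every row of a and every row of b."""
    a = [normalize(v) for v in a]
    b = [normalize(v) for v in b]
    return [[sum(x * y for x, y in zip(u, w)) for w in b] for u in a]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(z - m) for z in row))
        total += log_z - row[i]
    return total / len(logits)


def symmetric_info_nce(a, b, temperature=0.07):
    """InfoNCE averaged over both directions (a -> b and b -> a)."""
    sims = similarity_matrix(a, b)
    logits = [[s / temperature for s in row] for row in sims]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits)
                  + cross_entropy_diagonal(transposed))


def joint_contrastive_loss(brep, text, image, temperature=0.07):
    """Assumed objective: BRep-text term plus BRep-image term, equally weighted."""
    return (symmetric_info_nce(brep, text, temperature)
            + symmetric_info_nce(brep, image, temperature))


def top1_retrieval_accuracy(queries, gallery):
    """Fraction of queries whose nearest gallery item shares their index."""
    sims = similarity_matrix(queries, gallery)
    hits = sum(
        1 for i, row in enumerate(sims)
        if max(range(len(row)), key=row.__getitem__) == i
    )
    return hits / len(sims)


# Toy embeddings standing in for encoder outputs (purely illustrative).
brep = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
matched = [list(v) for v in brep]            # perfectly aligned pairs
mismatched = brep[1:] + brep[:1]             # every pair is wrong

aligned_loss = joint_contrastive_loss(brep, matched, matched)
misaligned_loss = joint_contrastive_loss(brep, mismatched, mismatched)
```

In this toy setup the loss is lower when each BRep embedding is paired with its own text and image embedding than when the pairs are shuffled, which is the behavior contrastive pretraining relies on. The retrieval helper mirrors the Top-1 metric named in the abstract under the assumption that query `i` and gallery item `i` describe the same CAD model.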
