Fugu-MT Paper Translation (Abstract): SpectralEarth-FM: Bringing Hyperspectral Imagery into Multimodal Earth Observation Pretraining

Paper overview: SpectralEarth-FM: Bringing Hyperspectral Imagery into Multimodal Earth Observation Pretraining

arxiv url: http://arxiv.org/abs/2605.21075v1
Date: Wed, 20 May 2026 12:08:16 GMT
Status: Translation complete
Last updated in system: 2026-05-21 19:19:56.655663
Title: SpectralEarth-FM: Bringing Hyperspectral Imagery into Multimodal Earth Observation Pretraining
Title (reference translation): SpectralEarth-FM: Bringing Hyperspectral Imagery into Multimodal Earth Observation Pretraining
Authors: Nassim Ait Ali Braham, Aaron Banze, Conrad M. Albrecht, Julien Mairal, Jocelyn Chanussot, Xiao Xiang Zhu
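Abstract summary: We introduce SpectralEarth-FM, a hierarchical transformer for multisensor EO input with heterogeneous spectral dimensionality. To pretrain SpectralEarth-FM, we curate SpectralEarth-MM, a dataset that co-locates HSI from three spaceborne sensors with other EO sensors. We evaluate SpectralEarth-FM on hyperspectral downstream tasks and standard EO benchmarks following the PANGAEA protocol.
Reference score (independently computed attention score): 39.65346191345367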
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Earth observation (EO) foundation models (FMs) are increasingly trained on multisensor data, spanning multispectral imagery (MSI), synthetic aperture radar (SAR), and derived geospatial layers, but hyperspectral imagery (HSI) remains underrepresented. Conversely, existing hyperspectral FMs are trained on HSI alone, leaving joint pretraining and fusion of HSI with co-located EO sensors unexplored. We introduce SpectralEarth-FM, a hierarchical transformer for multisensor EO input with heterogeneous spectral dimensionality. The architecture combines spectral tokenization for hyperspectral inputs, sensor-specific encoders, a cross-sensor fusion module, and a shared hierarchical encoder, enabling joint processing of HSI and lower-channel observations. To pretrain SpectralEarth-FM, we curate SpectralEarth-MM, a dataset that co-locates HSI from three spaceborne sensors (EnMAP, EMIT, DESIS) with Sentinel-2, Landsat-8/9 optical imagery, Landsat land surface temperature (LST), and Sentinel-1 SAR, over common geographic footprints. It comprises approximately 2M globally distributed locations, 25M georeferenced patches, and over 40TB of data. Pretraining uses a Joint-Embedding Predictive Architecture (JEPA)-style objective that matches representations between global views and single-sensor local views from the same location. We evaluate SpectralEarth-FM on hyperspectral downstream tasks and standard EO benchmarks following the PANGAEA protocol, achieving state-of-the-art results across both evaluation settings.
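Abstract (reference translation): Earth observation (EO) foundation models (FMs) are increasingly trained on multisensor data spanning multispectral imagery (MSI), synthetic aperture radar (SAR), and derived geospatial layers, yet hyperspectral imagery (HSI) remains underrepresented. Conversely, existing hyperspectral FMs are trained on HSI alone, so joint pretraining and fusion of HSI with co-located EO sensors remain unexplored. We introduce SpectralEarth-FM, a hierarchical transformer for multisensor EO input with heterogeneous spectral dimensionality. The architecture combines spectral tokenization for hyperspectral inputs, sensor-specific encoders, a cross-sensor fusion module, and a shared hierarchical encoder, which enables joint processing of HSI and lower-channel observations. To pretrain SpectralEarth-FM, we curate SpectralEarth-MM, a dataset that co-locates HSI from three spaceborne sensors (EnMAP, EMIT, DESIS) with Sentinel-2 and Landsat-8/9 optical imagery, Landsat land surface temperature (LST), and Sentinel-1 SAR over common geographic footprints. It comprises approximately 2M globally distributed locations, 25M georeferenced patches, and over 40TB of data. Pretraining uses a Joint-Embedding Predictive Architecture (JEPA)-style objective that matches representations between global views and single-sensor local views from the same location. We evaluate SpectralEarth-FM on hyperspectral downstream tasks and standard EO benchmarks following the PANGAEA protocol, and obtain state-of-the-art results in both evaluation settings.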
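
The abstract describes the pretraining objective only at a high level: representations of single-sensor local views are matched to those of global views from the same location. The Python sketch below illustrates one common way such a matching term can be written. It is not the authors' implementation. The cosine-based distance, the function names (`cosine_matching_loss`, `jepa_style_loss`), the sensor keys, and the toy two-dimensional embeddings are all assumptions made for illustration, and in a real training framework the global-view target would typically be treated as a constant with no gradient flowing into it.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def l2_normalize(v: Vector, eps: float = 1e-12) -> List[float]:
    """Scale a vector to (approximately) unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def cosine_matching_loss(predicted: Vector, target: Vector) -> float:
    """Return 2 - 2*cos(predicted, target): 0 when aligned, 4 when opposite.

    Assumption: a cosine-based distance; the paper may use another metric.
    """
    p = l2_normalize(predicted)
    t = l2_normalize(target)
    return 2.0 - 2.0 * sum(a * b for a, b in zip(p, t))


def jepa_style_loss(global_target: Vector,
                    local_predictions: Dict[str, Vector]) -> float:
    """Average the matching loss over single-sensor local views.

    global_target: embedding of the multi-sensor global view (hypothetical).
    local_predictions: one predicted embedding per sensor for the same
    location, keyed by an illustrative sensor name.
    """
    losses = [cosine_matching_loss(pred, global_target)
              for pred in local_predictions.values()]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Toy embeddings: the HSI view agrees with the global view,
    # the SAR view is orthogonal to it.
    target = [1.0, 0.0]
    views = {"hsi": [1.0, 0.0], "sar": [0.0, 1.0]}
    print(jepa_style_loss(target, views))
```

In this toy case the aligned view contributes a loss near 0 and the orthogonal view contributes 2, so the average sits near 1. Minimizing such a term would push every single-sensor embedding toward the global-view embedding for the same location.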

Related papers