OpenEarthSensing: Large-Scale Fine-Grained Benchmark for Open-World Remote Sensing
- URL: http://arxiv.org/abs/2502.20668v2
- Date: Wed, 30 Jul 2025 06:04:46 GMT
- Title: OpenEarthSensing: Large-Scale Fine-Grained Benchmark for Open-World Remote Sensing
- Authors: Xiang Xiang, Zhuo Xu, Yao Deng, Qinhao Zhou, Yifan Liang, Ke Chen, Qingfang Zheng, Yaowei Wang, Xilin Chen, Wen Gao
- Abstract summary: We introduce OpenEarthSensing (OES), a large-scale fine-grained benchmark for open-world remote sensing. OES includes 189 scene and object categories, covering the vast majority of potential semantic shifts that may occur in the real world.
- Score: 57.050679160659705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The advancement of remote sensing, including satellite systems, facilitates the continuous acquisition of remote sensing imagery globally, introducing novel challenges for achieving open-world tasks. Deployed models need to continuously adjust to a constant influx of new data, which frequently exhibits diverse shifts from the data encountered during the training phase. To effectively handle the new data, models are required to detect semantic shifts, adapt to covariate shifts, and continuously update their parameters without forgetting learned knowledge, as has been considered in works on a variety of open-world tasks. However, existing studies are typically conducted within a single dataset to simulate realistic conditions, with a lack of large-scale benchmarks capable of evaluating multiple open-world tasks. In this paper, we introduce OpenEarthSensing (OES), a large-scale fine-grained benchmark for open-world remote sensing. OES includes 189 scene and object categories, covering the vast majority of potential semantic shifts that may occur in the real world. Additionally, to provide a more comprehensive testbed for evaluating the generalization performance, OES encompasses five data domains with significant covariate shifts, including two RGB satellite domains, one RGB aerial domain, one multispectral RGB domain, and one infrared domain. We evaluate the baselines and existing methods for diverse tasks on OES, demonstrating that it serves as a meaningful and challenging benchmark for open-world remote sensing. The proposed dataset OES is available at https://haiv-lab.github.io/OES.
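The abstract's requirement that models "detect semantic shifts" can be illustrated with a small sketch. The code below uses maximum softmax probability, a widely used baseline for flagging inputs from unseen categories; it is an illustrative assumption, not a method the listing attributes to the OES paper. The function names, the example logits, and the 0.5 threshold are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def msp_score(logits):
    """Maximum softmax probability: a high value suggests a known category."""
    return max(softmax(logits))


def is_semantic_shift(logits, threshold=0.5):
    """Flag an input as a possible unseen category when confidence is low.

    The threshold is a hypothetical value; in practice it would be tuned
    on held-out data.
    """
    return msp_score(logits) < threshold


# A confident prediction over three known classes is not flagged,
# while a uniform prediction over four classes is.
confident = is_semantic_shift([10.0, 0.0, 0.0])
uncertain = is_semantic_shift([0.0, 0.0, 0.0, 0.0])
```

A score-and-threshold design keeps the detector independent of the classifier: any model that outputs per-class logits could be plugged in without retraining.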
Related papers
- A Dataset for Semantic Segmentation in the Presence of Unknowns [49.795683850385956]
Existing datasets allow evaluation of only knowns or unknowns - but not both. We propose a novel anomaly segmentation dataset, ISSU, that features a diverse set of anomaly inputs from cluttered real-world environments. The dataset is twice as large as existing anomaly segmentation datasets.
arXiv Detail & Related papers (2025-03-28T10:31:01Z) - OpenRSD: Towards Open-prompts for Object Detection in Remote Sensing Images [45.40710102095654]
We propose OpenRSD, a universal open-prompt RS object detection framework.
OpenRSD supports multimodal prompts and integrates multi-task detection heads to balance accuracy and real-time requirements.
Compared to YOLO-World, OpenRSD exhibits an 8.7% higher average precision and achieves an inference speed of 20.8 FPS.
arXiv Detail & Related papers (2025-03-08T10:08:46Z) - EarthView: A Large Scale Remote Sensing Dataset for Self-Supervision [72.84868704100595]
This paper presents a dataset specifically designed for self-supervision on remote sensing data, intended to enhance deep learning applications on Earth monitoring tasks. The dataset spans 15 tera pixels of global remote-sensing data, combining imagery from a diverse range of sources, including NEON, Sentinel, and a novel release of 1m spatial resolution data from Satellogic. Accompanying the dataset is EarthMAE, a tailored Masked Autoencoder developed to tackle the distinct challenges of remote sensing data.
arXiv Detail & Related papers (2025-01-14T13:42:22Z) - EarthDial: Turning Multi-sensory Earth Observations to Interactive Dialogues [46.601134018876955]
EarthDial is a conversational assistant specifically designed for Earth Observation (EO) data. It transforms complex, multi-sensory Earth observations into interactive, natural language dialogues. EarthDial supports multi-spectral, multi-temporal, and multi-resolution imagery.
arXiv Detail & Related papers (2024-12-19T18:57:13Z) - OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection [47.9080685468069]
We introduce OpenAD, the first real-world open-world autonomous driving benchmark for 3D object detection. OpenAD is built on a corner case discovery and annotation pipeline integrating with a multimodal large language model (MLLM).
arXiv Detail & Related papers (2024-11-26T01:50:06Z) - Generalized Few-Shot Semantic Segmentation in Remote Sensing: Challenge and Benchmark [18.636210870172675]
Few-shot semantic segmentation can encourage deep learning models to learn from few labelled examples for novel classes not seen during training.
The generalized few-shot segmentation setting has an additional challenge which encourages models not only to adapt to the novel classes but also to maintain strong performance on the training base classes.
We release the dataset augmenting OpenEarthMap with additional classes labelled for the generalized few-shot evaluation setting.
arXiv Detail & Related papers (2024-09-17T14:20:47Z) - An Open-World, Diverse, Cross-Spatial-Temporal Benchmark for Dynamic Wild Person Re-Identification [58.5877965612088]
Person re-identification (ReID) has made great strides thanks to data-driven deep learning techniques.
The existing benchmark datasets lack diversity, and models trained on these data cannot generalize well to dynamic wild scenarios.
We develop a new Open-World, Diverse, Cross-Spatial-Temporal dataset named OWD with several distinct features.
arXiv Detail & Related papers (2024-03-22T11:21:51Z) - SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection [79.23689506129733]
We establish a new benchmark dataset and an open-source method for large-scale SAR object detection.
Our dataset, SARDet-100K, is a result of intense surveying, collecting, and standardizing 10 existing SAR detection datasets.
To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.
arXiv Detail & Related papers (2024-03-11T09:20:40Z) - SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery [35.550999964460466]
We present SkySense, a generic billion-scale model, pre-trained on a curated multi-modal Remote Sensing dataset with 21.5 million temporal sequences.
To the best of our knowledge, SkySense is the largest multi-modal remote sensing foundation model to date, whose modules can be flexibly combined or used individually to accommodate various tasks.
arXiv Detail & Related papers (2023-12-15T09:57:21Z) - GAMUS: A Geometry-aware Multi-modal Semantic Segmentation Benchmark for Remote Sensing Data [27.63411386396492]
This paper introduces a new benchmark dataset for multi-modal semantic segmentation based on RGB-Height (RGB-H) data.
The proposed benchmark consists of 1) a large-scale dataset including co-registered RGB and nDSM pairs and pixel-wise semantic labels; 2) a comprehensive evaluation and analysis of existing multi-modal fusion strategies for both convolutional and Transformer-based networks on remote sensing data.
arXiv Detail & Related papers (2023-05-24T09:03:18Z) - Neural-Sim: Learning to Generate Training Data with NeRF [31.81496344354997]
We present the first fully differentiable synthetic data pipeline that uses Neural Radiance Fields (NeRFs) in a closed-loop with a target application's loss function.
Our approach generates data on-demand, with no human labor, to maximize accuracy for a target task.
arXiv Detail & Related papers (2022-07-22T22:48:33Z) - Unseen Object Instance Segmentation with Fully Test-time RGB-D Embeddings Adaptation [14.258456366985444]
Recently, a popular solution is leveraging RGB-D features of large-scale synthetic data and applying the model to unseen real-world scenarios.
We re-emphasize the adaptation process across Sim2Real domains in this paper.
We propose a framework to conduct the Fully Test-time RGB-D Embeddings Adaptation (FTEA) based on parameters of the BatchNorm layer.
arXiv Detail & Related papers (2022-04-21T02:35:20Z) - MD-CSDNetwork: Multi-Domain Cross Stitched Network for Deepfake Detection [80.83725644958633]
Current deepfake generation methods leave discriminative artifacts in the frequency spectrum of fake images and videos.
We present a novel approach, termed MD-CSDNetwork, for combining the features in the spatial and frequency domains to mine a shared discriminative representation.
arXiv Detail & Related papers (2021-09-15T14:11:53Z) - Bayesian Embeddings for Few-Shot Open World Recognition [60.39866770427436]
We extend embedding-based few-shot learning algorithms to the open-world recognition setting.
We benchmark our framework on open-world extensions of the common MiniImageNet and TieredImageNet few-shot learning datasets.
arXiv Detail & Related papers (2021-07-29T00:38:47Z)