OpenEarthSensing: Large-Scale Fine-Grained Benchmark for Open-World Remote Sensing
- URL: http://arxiv.org/abs/2502.20668v1
- Date: Fri, 28 Feb 2025 02:49:52 GMT
- Title: OpenEarthSensing: Large-Scale Fine-Grained Benchmark for Open-World Remote Sensing
- Authors: Xiang Xiang, Zhuo Xu, Yao Deng, Qinhao Zhou, Yifan Liang, Ke Chen, Qingfang Zheng, Yaowei Wang, Xilin Chen, Wen Gao
- Abstract summary: We introduce OpenEarthSensing, a large-scale fine-grained benchmark for open-world remote sensing. OpenEarthSensing includes 189 scene and object categories, covering the vast majority of potential semantic shifts that may occur in the real world. We conduct the baseline evaluation of current mainstream open-world tasks and methods on OpenEarthSensing.
- Score: 57.050679160659705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In open-world remote sensing, deployed models must continuously adapt to a steady influx of new data, which often exhibits various shifts compared to what the model encountered during the training phase. To effectively handle the new data, models are required to detect semantic shifts, adapt to covariate shifts, and continuously update themselves. These challenges give rise to a variety of open-world tasks. However, existing open-world remote sensing studies typically train and test within a single dataset to simulate open-world conditions. Currently, there is a lack of large-scale benchmarks capable of evaluating multiple open-world tasks. In this paper, we introduce OpenEarthSensing, a large-scale fine-grained benchmark for open-world remote sensing. OpenEarthSensing includes 189 scene and object categories, covering the vast majority of potential semantic shifts that may occur in the real world. Additionally, OpenEarthSensing encompasses five data domains with significant covariate shifts, including two RGB satellite domains, one RGB aerial domain, one MS RGB domain, and one infrared domain. The various domains provide a more comprehensive testbed for evaluating the generalization performance of open-world models. We conduct the baseline evaluation of current mainstream open-world tasks and methods on OpenEarthSensing, demonstrating that it serves as a challenging benchmark for open-world remote sensing.
Related papers
- OpenRSD: Towards Open-prompts for Object Detection in Remote Sensing Images [45.40710102095654]
We propose OpenRSD, a universal open-prompt RS object detection framework.
OpenRSD supports multimodal prompts and integrates multi-task detection heads to balance accuracy and real-time requirements.
Compared to YOLO-World, OpenRSD exhibits an 8.7% higher average precision and achieves an inference speed of 20.8 FPS.
arXiv Detail & Related papers (2025-03-08T10:08:46Z)
- EarthView: A Large Scale Remote Sensing Dataset for Self-Supervision [72.84868704100595]
This paper presents a dataset specifically designed for self-supervision on remote sensing data, intended to enhance deep learning applications on Earth monitoring tasks.
The dataset spans 15 tera pixels of global remote-sensing data, combining imagery from a diverse range of sources, including NEON, Sentinel, and a novel release of 1m spatial resolution data from Satellogic.
Accompanying the dataset is EarthMAE, a tailored Masked Autoencoder developed to tackle the distinct challenges of remote sensing data.
arXiv Detail & Related papers (2025-01-14T13:42:22Z)
- EarthDial: Turning Multi-sensory Earth Observations to Interactive Dialogues [46.601134018876955]
EarthDial is a conversational assistant specifically designed for Earth Observation (EO) data.
It transforms complex, multi-sensory Earth observations into interactive, natural language dialogues.
EarthDial supports multi-spectral, multi-temporal, and multi-resolution imagery.
arXiv Detail & Related papers (2024-12-19T18:57:13Z)
- OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection [47.9080685468069]
We introduce OpenAD, the first real-world open-world autonomous driving benchmark for 3D object detection.
OpenAD is built on a corner case discovery and annotation pipeline integrating with a multimodal large language model (MLLM).
arXiv Detail & Related papers (2024-11-26T01:50:06Z)
- Generalized Few-Shot Semantic Segmentation in Remote Sensing: Challenge and Benchmark [18.636210870172675]
Few-shot semantic segmentation can encourage deep learning models to learn from few labelled examples for novel classes not seen during the training.
The generalized few-shot segmentation setting has an additional challenge which encourages models not only to adapt to the novel classes but also to maintain strong performance on the training base classes.
We release the dataset augmenting OpenEarthMap with additional classes labelled for the generalized few-shot evaluation setting.
arXiv Detail & Related papers (2024-09-17T14:20:47Z)
- An Open-World, Diverse, Cross-Spatial-Temporal Benchmark for Dynamic Wild Person Re-Identification [58.5877965612088]
Person re-identification (ReID) has made great strides thanks to data-driven deep learning techniques.
The existing benchmark datasets lack diversity, and models trained on these data cannot generalize well to dynamic wild scenarios.
We develop a new Open-World, Diverse, Cross-Spatial-Temporal dataset named OWD with several distinct features.
arXiv Detail & Related papers (2024-03-22T11:21:51Z)
- SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection [79.23689506129733]
We establish a new benchmark dataset and an open-source method for large-scale SAR object detection.
Our dataset, SARDet-100K, is a result of intense surveying, collecting, and standardizing 10 existing SAR detection datasets.
To the best of our knowledge, SARDet-100K is the first COCO-level large-scale multi-class SAR object detection dataset ever created.
arXiv Detail & Related papers (2024-03-11T09:20:40Z)
- SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery [35.550999964460466]
We present SkySense, a generic billion-scale model, pre-trained on a curated multi-modal Remote Sensing dataset with 21.5 million temporal sequences.
To our best knowledge, SkySense is the largest multi-modal remote sensing foundation model to date, whose modules can be flexibly combined or used individually to accommodate various tasks.
arXiv Detail & Related papers (2023-12-15T09:57:21Z)
- Bayesian Embeddings for Few-Shot Open World Recognition [60.39866770427436]
We extend embedding-based few-shot learning algorithms to the open-world recognition setting.
We benchmark our framework on open-world extensions of the common MiniImageNet and TieredImageNet few-shot learning datasets.
arXiv Detail & Related papers (2021-07-29T00:38:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.