A large-scale image-text dataset benchmark for farmland segmentation
- URL: http://arxiv.org/abs/2503.23106v1
- Date: Sat, 29 Mar 2025 14:55:46 GMT
- Title: A large-scale image-text dataset benchmark for farmland segmentation
- Authors: Chao Tao, Dandan Zhong, Weiliang Mu, Zhuofei Du, Haiyang Wu,
- Abstract summary: This article introduces language-based descriptions of farmland and develops the FarmSeg-VL dataset, the first fine-grained image-text dataset designed for farmland segmentation. In terms of the temporal dimension, it covers all four seasons. In terms of the spatial dimension, it covers eight typical agricultural regions across China.
- Score: 2.3412548557474797
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The traditional deep learning paradigm that relies solely on labeled data has limitations in representing the spatial relationships between farmland elements and the surrounding environment. It struggles to effectively model the dynamic temporal evolution and spatial heterogeneity of farmland. Language, as a structured knowledge carrier, can explicitly express the spatiotemporal characteristics of farmland, such as its shape, distribution, and surrounding environmental information. Therefore, a language-driven learning paradigm can effectively alleviate the challenges posed by the spatiotemporal heterogeneity of farmland. However, in the field of remote sensing imagery of farmland, there is currently no comprehensive benchmark dataset to support this research direction. To fill this gap, we introduce language-based descriptions of farmland and develop the FarmSeg-VL dataset, the first fine-grained image-text dataset designed for spatiotemporal farmland segmentation. First, this article proposes a semi-automatic annotation method that can accurately assign a caption to each image, ensuring high data quality and semantic richness while improving the efficiency of dataset construction. Second, FarmSeg-VL exhibits significant spatiotemporal characteristics. In terms of the temporal dimension, it covers all four seasons. In terms of the spatial dimension, it covers eight typical agricultural regions across China. In addition, in terms of captions, FarmSeg-VL covers rich spatiotemporal characteristics of farmland, including its inherent properties, phenological characteristics, spatial distribution, topographic and geomorphic features, and the distribution of surrounding environments. Finally, we present a performance analysis of VLMs and of deep learning models that rely solely on labels, both trained on FarmSeg-VL, demonstrating its potential as a standard benchmark for farmland segmentation.
Related papers
- Concept-Aware LoRA for Domain-Aligned Segmentation Dataset Generation [66.66243874361103]
Dataset generation faces two key challenges: 1) aligning generated samples with the target domain and 2) producing informative samples beyond the training data.
We propose Concept-Aware LoRA, a novel fine-tuning approach that selectively identifies and updates only the weights associated with necessary concepts for domain alignment.
We demonstrate its effectiveness in generating datasets for urban-scene segmentation, outperforming baseline and state-of-the-art methods in in-domain settings.
arXiv Detail & Related papers (2025-03-28T06:23:29Z)
- Weakly Supervised Framework Considering Multi-temporal Information for Large-scale Cropland Mapping with Satellite Imagery [11.157693752084214]
This study presents a weakly supervised framework considering multi-temporal information for large-scale cropland mapping.
We extract high-quality labels according to their consistency among global land cover (GLC) products to construct the supervised learning signal.
The proposed framework has been experimentally validated for strong adaptability across three study areas in large-scale cropland mapping.
arXiv Detail & Related papers (2024-11-27T16:11:52Z)
- Enhancing Ecological Monitoring with Multi-Objective Optimization: A Novel Dataset and Methodology for Segmentation Algorithms [17.802456388479616]
We introduce a unique semantic segmentation dataset of 6,096 high-resolution aerial images capturing indigenous and invasive grass species in Bega Valley, New South Wales, Australia.
This dataset presents a challenging task due to the overlap and distribution of grass species.
The dataset and code will be made publicly available, aiming to drive research in computer vision, machine learning, and ecological studies.
arXiv Detail & Related papers (2024-07-25T18:27:27Z)
- SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models [68.13636352687257]
We introduce Spatial Region GPT (SpatialRGPT) to enhance VLMs' spatial perception and reasoning capabilities.
During inference, when provided with user-specified region proposals, SpatialRGPT can accurately perceive their relative directions and distances.
Our results demonstrate that SpatialRGPT significantly enhances performance in spatial reasoning tasks, both with and without local region prompts.
arXiv Detail & Related papers (2024-06-03T17:59:06Z)
- StrideNET: Swin Transformer for Terrain Recognition with Dynamic Roughness Extraction [0.0]
This paper presents StrideNET, a novel dual-branch architecture designed for terrain recognition and implicit properties estimation.
The implications of this work extend to various applications, including environmental monitoring, land use and land cover (LULC) classification, disaster response, and precision agriculture.
arXiv Detail & Related papers (2024-04-20T04:51:59Z)
- Biological Valuation Map of Flanders: A Sentinel-2 Imagery Analysis [12.025312586542318]
We present a densely labeled ground truth map of Flanders paired with Sentinel-2 satellite imagery.
Our methodology includes a formalized dataset division and sampling method, utilizing the topographic map layout 'Kaartbladversnijdingen,' and a detailed semantic segmentation model training pipeline.
arXiv Detail & Related papers (2024-01-26T22:21:39Z)
- PhenoBench -- A Large Dataset and Benchmarks for Semantic Image Interpretation in the Agricultural Domain [29.395926321984565]
We present an annotated dataset and benchmarks for the semantic interpretation of real agricultural fields.
Our dataset, recorded with a UAV, provides high-quality, pixel-wise annotations of crops and weeds as well as crop leaf instances.
We provide benchmarks for various tasks on a hidden test set comprised of different fields.
arXiv Detail & Related papers (2023-06-07T16:04:08Z)
- Concept Drift and Long-Tailed Distribution in Fine-Grained Visual Categorization: Benchmark and Method [84.68818879525568]
We present a Concept Drift and Long-Tailed Distribution dataset.
The characteristics of instances tend to vary with time and exhibit a long-tailed distribution.
We propose a feature recombination framework to address the learning challenges associated with CDLT.
arXiv Detail & Related papers (2023-06-04T12:42:45Z)
- SIRI: Spatial Relation Induced Network For Spatial Description Resolution [64.38872296406211]
We propose a novel relationship induced (SIRI) network for language-guided localization.
We show that our method is around 24% better than the state-of-the-art method in terms of accuracy, measured by an 80-pixel radius.
Our method also generalizes well on our proposed extended dataset collected using the same settings as Touchdown.
arXiv Detail & Related papers (2020-10-27T14:04:05Z)
- Semi-Supervised Semantic Segmentation in Earth Observation: The MiniFrance Suite, Dataset Analysis and Multi-task Network Study [82.02173199363571]
We introduce a novel large-scale dataset for semi-supervised semantic segmentation in Earth Observation, the MiniFrance suite.
MiniFrance has several unprecedented properties: it is large-scale, containing over 2000 very high resolution aerial images, accounting for more than 200 billion samples (pixels).
We present tools for data representativeness analysis in terms of appearance similarity and a thorough study of MiniFrance data, demonstrating that it is suitable for learning and generalizes well in a semi-supervised setting.
arXiv Detail & Related papers (2020-10-15T15:36:58Z)
- Farmland Parcel Delineation Using Spatio-temporal Convolutional Networks [77.63950365605845]
Farm parcel delineation provides cadastral data that is important in developing and managing climate change policies.
This data can also be useful for the agricultural insurance sector for assessing compensations following damages associated with extreme weather events.
Using satellite imaging can be a scalable and cost-effective way to perform the task of farm parcel delineation.
arXiv Detail & Related papers (2020-04-11T19:49:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.