Segment Any Building
- URL: http://arxiv.org/abs/2310.01164v4
- Date: Thu, 26 Oct 2023 17:08:34 GMT
- Title: Segment Any Building
- Authors: Lei Li
- Abstract summary: This manuscript accentuates the potency of harnessing diversified datasets in tandem with cutting-edge representation learning paradigms for building segmentation in such images.
Our avant-garde joint training regimen underscores the merit of our approach, bearing significant implications in pivotal domains such as urban infrastructural development, disaster mitigation strategies, and ecological surveillance.
The outcomes of this research both fortify the foundations for ensuing scholarly pursuits and presage a horizon replete with innovative applications in the discipline of building segmentation.
- Score: 8.12405696290333
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The task of identifying and segmenting buildings within remote sensing
imagery has perennially stood at the forefront of scholarly investigations.
This manuscript accentuates the potency of harnessing diversified datasets in
tandem with cutting-edge representation learning paradigms for building
segmentation in such images. Through the strategic amalgamation of disparate
datasets, we have not only expanded the informational horizon accessible for
model training but also manifested unparalleled performance metrics across
multiple datasets. Our avant-garde joint training regimen underscores the merit
of our approach, bearing significant implications in pivotal domains such as
urban infrastructural development, disaster mitigation strategies, and
ecological surveillance. Our methodology, predicated upon the fusion of
datasets and gleaning insights from pre-trained models, carves a new benchmark
in the annals of building segmentation endeavors. The outcomes of this research
both fortify the foundations for ensuing scholarly pursuits and presage a
horizon replete with innovative applications in the discipline of building
segmentation.
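The abstract's core recipe, merging disparate building-segmentation datasets so a single training loop can draw samples from all of them, can be sketched as below. This is a minimal, hypothetical illustration of the dataset-merging step only; the class and the toy (image, mask) records are not from the paper, and a real pipeline would feed the merged dataset to a pre-trained segmentation model.

```python
# Hypothetical sketch: merge several building-segmentation datasets so one
# joint-training loop samples from all of them. Each dataset is assumed to
# yield (image, mask) pairs; the toy records below are placeholders.

class ConcatDataset:
    """Expose several indexable datasets as one contiguous dataset."""

    def __init__(self, datasets):
        self.datasets = list(datasets)
        self.offsets = []          # global start index of each source dataset
        total = 0
        for d in self.datasets:
            self.offsets.append(total)
            total += len(d)
        self.total = total

    def __len__(self):
        return self.total

    def __getitem__(self, idx):
        # Map a global index back to the source dataset that owns it.
        for d, off in zip(self.datasets, self.offsets):
            if idx < off + len(d):
                return d[idx - off]
        raise IndexError(idx)

# Toy stand-ins for two aerial-imagery datasets.
ds_a = [("a_img%d" % i, "a_mask%d" % i) for i in range(3)]
ds_b = [("b_img%d" % i, "b_mask%d" % i) for i in range(5)]

merged = ConcatDataset([ds_a, ds_b])
```

In practice this is what `torch.utils.data.ConcatDataset` provides; the joint training then fine-tunes a pre-trained backbone on batches drawn from the merged pool.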
Related papers
- Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z)
- Topological Perspectives on Optimal Multimodal Embedding Spaces [0.0]
This paper delves into a comparative analysis between CLIP and its recent counterpart, CLOOB.
Our approach encompasses a comprehensive examination of the modality gap drivers, the clustering structures existing across both high and low dimensions, and the pivotal role that dimension collapse plays in shaping their respective embedding spaces.
arXiv Detail & Related papers (2024-05-29T08:28:23Z)
- Explore In-Context Segmentation via Latent Diffusion Models [132.26274147026854]
A latent diffusion model (LDM) can serve as an effective minimalist framework for in-context segmentation.
We build a new and fair in-context segmentation benchmark that includes both image and video datasets.
arXiv Detail & Related papers (2024-03-14T17:52:31Z)
- Fine-grained building roof instance segmentation based on domain adapted pretraining and composite dual-backbone [13.09940764764909]
We propose a framework to fulfill semantic interpretation of individual buildings with high-resolution optical satellite imagery.
Specifically, the domain-adapted pretraining strategy and composite dual-backbone greatly facilitate discriminative feature learning.
Experimental results show that our approach ranked first in the 2023 IEEE GRSS Data Fusion Contest.
arXiv Detail & Related papers (2023-08-10T05:54:57Z)
- Uncovering the Inner Workings of STEGO for Safe Unsupervised Semantic Segmentation [68.8204255655161]
Self-supervised pre-training strategies have recently shown impressive results for training general-purpose feature extraction backbones in computer vision.
The DINO self-distillation technique has interesting emerging properties, such as unsupervised clustering in the latent space and semantic correspondences of the produced features without using explicit human-annotated labels.
The STEGO method for unsupervised semantic segmentation contrastively distills feature correspondences of a DINO-pre-trained Vision Transformer and recently set a new state of the art.
arXiv Detail & Related papers (2023-04-14T15:30:26Z)
- Few Shot Semantic Segmentation: a review of methodologies, benchmarks, and open challenges [5.0243930429558885]
Few-Shot Semantic Segmentation is a novel task in computer vision, which aims at designing models capable of segmenting new semantic classes with only a few examples.
This paper presents a comprehensive survey of Few-Shot Semantic Segmentation, tracing its evolution and exploring various model designs.
arXiv Detail & Related papers (2023-04-12T13:07:37Z)
- Towards Geospatial Foundation Models via Continual Pretraining [22.825065739563296]
We propose a novel paradigm for building highly effective foundation models with minimal resource cost and carbon impact.
We first construct a compact yet diverse dataset from multiple sources to promote feature diversity, which we term GeoPile.
Then, we investigate the potential of continual pretraining from large-scale ImageNet-22k models and propose a multi-objective continual pretraining paradigm.
arXiv Detail & Related papers (2023-02-09T07:39:02Z)
- FloorLevel-Net: Recognizing Floor-Level Lines with Height-Attention-Guided Multi-task Learning [49.30194762653723]
This work tackles the problem of locating floor-level lines in street-view images, using a supervised deep learning approach.
We first compile a new dataset and develop a new data augmentation scheme to synthesize training samples.
Next, we design FloorLevel-Net, a multi-task learning network that associates explicit features of building facades and implicit floor-level lines.
arXiv Detail & Related papers (2021-07-06T08:17:59Z)
- Self-supervised Segmentation via Background Inpainting [96.10971980098196]
We introduce a self-supervised detection and segmentation approach that can work with single images captured by a potentially moving camera.
We introduce a self-supervised loss function used to train a proposal-based segmentation network.
We apply our method to human detection and segmentation in images that visually depart from those of standard benchmarks and outperform existing self-supervised methods.
arXiv Detail & Related papers (2020-11-11T08:34:40Z)
- Bidirectional Graph Reasoning Network for Panoptic Segmentation [126.06251745669107]
We introduce a Bidirectional Graph Reasoning Network (BGRNet) to mine the intra-modular and intermodular relations within and between foreground things and background stuff classes.
BGRNet first constructs image-specific graphs in both instance and semantic segmentation branches that enable flexible reasoning at the proposal level and class level.
arXiv Detail & Related papers (2020-04-14T02:32:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.