Polygonizer: An auto-regressive building delineator
- URL: http://arxiv.org/abs/2304.04048v1
- Date: Sat, 8 Apr 2023 15:36:48 GMT
- Title: Polygonizer: An auto-regressive building delineator
- Authors: Maxim Khomiakov, Michael Riis Andersen, Jes Frellsen
- Abstract summary: We present an Image-to-Sequence model that allows for direct shape inference and is ready for vector-based workflows out of the box.
We demonstrate the model's performance in various ways, including perturbations to the image input that correspond to variations or artifacts commonly encountered in remote sensing applications.
- Score: 12.693238093510072
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In geospatial planning, it is often essential to represent objects in a
vectorized format, as this format easily translates to downstream tasks such as
web development, graphics, or design. While these problems are frequently
addressed using semantic segmentation, which requires additional
post-processing to vectorize objects in a non-trivial way, we present an
Image-to-Sequence model that allows for direct shape inference and is ready for
vector-based workflows out of the box. We demonstrate the model's performance
in various ways, including perturbations to the image input that correspond to
variations or artifacts commonly encountered in remote sensing applications.
Our model outperforms prior works when using ground truth bounding boxes (one
object per image), achieving the lowest maximum tangent angle error.
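As a rough illustration of what an image-to-sequence formulation implies for the output side, the sketch below serializes a polygon into a flat sequence of discrete tokens and decodes it back. This is an assumption-laden example, not the authors' implementation: the function names, the uniform coordinate binning, the `n_bins` vocabulary size, and the end-of-sequence token id are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_to_tokens(vertices: List[Point], image_size: int, n_bins: int) -> List[int]:
    """Quantize vertices into a flat token sequence x0, y0, x1, y1, ..., EOS.

    Coordinates are assumed to be pixel positions in [0, image_size].
    The EOS id (n_bins) is a hypothetical convention for this sketch.
    """
    eos = n_bins
    tokens: List[int] = []
    for x, y in vertices:
        for c in (x, y):
            b = int(c / image_size * n_bins)
            # Clamp so that a coordinate on the far edge falls in the last bin.
            tokens.append(min(max(b, 0), n_bins - 1))
    tokens.append(eos)
    return tokens


def tokens_to_polygon(tokens: List[int], image_size: int, n_bins: int) -> List[Point]:
    """Decode tokens back to vertices, using the centre of each bin."""
    eos = n_bins
    coords: List[float] = []
    for t in tokens:
        if t == eos:
            break
        coords.append((t + 0.5) * image_size / n_bins)
    # Pair consecutive values as (x, y); a trailing unpaired value is dropped.
    return list(zip(coords[0::2], coords[1::2]))


square = [(0.0, 0.0), (224.0, 0.0), (224.0, 224.0), (0.0, 224.0)]
tokens = polygon_to_tokens(square, image_size=224, n_bins=224)
decoded = tokens_to_polygon(tokens, image_size=224, n_bins=224)
```

An auto-regressive decoder in this style would emit one such token at a time until it produces the end-of-sequence token, so the decoded vertex list can be used directly as a vector shape; the round trip is lossy only up to the bin width.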
Related papers
- Segmentation-guided Layer-wise Image Vectorization with Gradient Fills [6.037332707968933]
We propose a segmentation-guided vectorization framework to convert images into concise vector graphics with gradient fills.
With the guidance of an embedded gradient-aware segmentation, our approach progressively appends gradient-filled Bézier paths to the output.
arXiv Detail & Related papers (2024-08-28T12:08:25Z)
- FUSE-ing Language Models: Zero-Shot Adapter Discovery for Prompt Optimization Across Tokenizers [55.2480439325792]
We propose FUSE, an approach to approximating an adapter layer that maps from one model's textual embedding space to another, even across different tokenizers.
We show the efficacy of our approach via multi-objective optimization over vision-language and causal language models for image captioning and sentiment-based image captioning.
arXiv Detail & Related papers (2024-08-09T02:16:37Z)
- SHIC: Shape-Image Correspondences with no Keypoint Supervision [106.99157362200867]
Canonical surface mapping generalizes keypoint detection by assigning each pixel of an object to a corresponding point in a 3D template.
The concept was popularised by DensePose for the analysis of humans, and authors have since attempted to apply it to more categories.
We introduce SHIC, a method to learn canonical maps without manual supervision which achieves better results than supervised methods for most categories.
arXiv Detail & Related papers (2024-07-26T17:58:59Z)
- SuperSVG: Superpixel-based Scalable Vector Graphics Synthesis [66.44553285020066]
SuperSVG is a superpixel-based vectorization model that achieves fast and high-precision image vectorization.
We propose a two-stage self-training framework, where a coarse-stage model is employed to reconstruct the main structure and a refinement-stage model is used for enriching the details.
Experiments demonstrate the superior performance of our method in terms of reconstruction accuracy and inference time compared to state-of-the-art approaches.
arXiv Detail & Related papers (2024-06-14T07:43:23Z)
- DeTra: A Unified Model for Object Detection and Trajectory Forecasting [68.85128937305697]
Our approach formulates the union of the two tasks as a trajectory refinement problem.
To tackle this unified task, we design a refinement transformer that infers the presence, pose, and multi-modal future behaviors of objects.
In our experiments, we observe that our model outperforms the state-of-the-art on Argoverse 2 Sensor and Open dataset.
arXiv Detail & Related papers (2024-06-06T18:12:04Z)
- Hierarchical Vector Quantized Transformer for Multi-class Unsupervised Anomaly Detection [24.11900895337062]
Unsupervised image Anomaly Detection (UAD) aims to learn robust and discriminative representations of normal samples.
This paper focuses on building a unified framework for multiple classes.
arXiv Detail & Related papers (2023-10-22T08:20:33Z)
- A Generalist Framework for Panoptic Segmentation of Images and Videos [61.61453194912186]
We formulate panoptic segmentation as a discrete data generation problem, without relying on inductive bias of the task.
A diffusion model is proposed to model panoptic masks, with a simple architecture and generic loss function.
Our method is capable of modeling video (in a streaming setting) and thereby learns to track object instances automatically.
arXiv Detail & Related papers (2022-10-12T16:18:25Z)
- BoundarySqueeze: Image Segmentation as Boundary Squeezing [104.43159799559464]
We propose a novel method for fine-grained high-quality image segmentation of both objects and scenes.
Inspired by dilation and erosion from morphological image processing, we treat pixel-level segmentation problems as squeezing the object boundary.
Our method yields large gains on COCO and Cityscapes for both instance and semantic segmentation, and outperforms the previous state-of-the-art PointRend in both accuracy and speed under the same setting.
arXiv Detail & Related papers (2021-05-25T04:58:51Z)