Auto-Panoptic: Cooperative Multi-Component Architecture Search for
Panoptic Segmentation
- URL: http://arxiv.org/abs/2010.16119v1
- Date: Fri, 30 Oct 2020 08:34:35 GMT
- Title: Auto-Panoptic: Cooperative Multi-Component Architecture Search for
Panoptic Segmentation
- Authors: Yangxin Wu, Gengwei Zhang, Hang Xu, Xiaodan Liang, Liang Lin
- Abstract summary: We propose an efficient framework to simultaneously search for all main components including backbone, segmentation branches, and feature fusion module.
Our searched architecture, Auto-Panoptic, achieves new state-of-the-art results on the challenging COCO and ADE20K benchmarks.
- Score: 144.50154657257605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Panoptic segmentation has become a popular test-bed for
state-of-the-art holistic scene understanding methods, since it requires
simultaneously segmenting both foreground things and background stuff.
State-of-the-art panoptic segmentation networks exhibit high structural
complexity across their components, i.e. backbone, proposal-based
foreground branch, segmentation-based background branch, and feature fusion
module across branches, and designing them relies heavily on expert knowledge
and tedious trials. In this work, we propose an efficient, cooperative and highly automated
framework to simultaneously search for all main components including backbone,
segmentation branches, and feature fusion module in a unified panoptic
segmentation pipeline based on the prevailing one-shot Network Architecture
Search (NAS) paradigm. Notably, we extend the common single-task NAS to the
multi-component scenario by taking advantage of the newly proposed
intra-modular search space and problem-oriented inter-modular search space,
which help us obtain an optimal network architecture that not only performs
well in both instance segmentation and semantic segmentation tasks but is also
aware of the reciprocal relations between foreground things and background
stuff classes. To relieve the vast computation burden incurred by applying NAS
to complicated network architectures, we present a novel path-priority greedy
search policy to find a robust, transferable architecture with significantly
reduced search overhead. Our searched architecture, Auto-Panoptic,
achieves new state-of-the-art results on the challenging COCO and ADE20K
benchmarks. Moreover, extensive experiments are conducted to demonstrate the
effectiveness of the path-priority policy and the transferability of Auto-Panoptic
across different datasets. Code and models are available at:
https://github.com/Jacobew/AutoPanoptic.
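As a rough illustration of the idea described in the abstract (one candidate is chosen per component from a shared search space, and a greedy loop keeps the best paths found so far), a minimal Python sketch could look like the following. Everything in it is an assumption made for illustration: the component names, the candidate operations, the `toy_score` stand-in for validating a sub-network with shared weights, and the greedy loop itself. It is not the paper's path-priority policy and does not reflect the code in the linked repository.

```python
import random

# Hypothetical multi-component search space; all names are illustrative only.
SEARCH_SPACE = {
    "backbone": ["block_k3", "block_k5", "block_k7"],
    "foreground_branch": ["conv3x3", "conv5x5", "dilated3x3"],
    "background_branch": ["conv3x3", "conv5x5", "dilated3x3"],
    "fusion": ["none", "fg_to_bg", "bg_to_fg", "bidirectional"],
}


def sample_path(rng):
    """Uniformly pick one candidate per component (single-path sampling)."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}


def toy_score(path):
    """Toy stand-in for evaluating a sub-network with shared supernet weights.

    Assumption: a real search would run validation here. This toy version
    simply rewards candidates that appear later in each option list.
    """
    return sum(SEARCH_SPACE[name].index(choice) for name, choice in path.items())


def greedy_search(rng, rounds=5, samples_per_round=8, keep=3):
    """Generic greedy loop: sample paths, keep the top `keep`, repeat."""
    pool = []
    for _ in range(rounds):
        candidates = pool + [sample_path(rng) for _ in range(samples_per_round)]
        candidates.sort(key=toy_score, reverse=True)
        pool = candidates[:keep]
    return pool[0]


if __name__ == "__main__":
    best = greedy_search(random.Random(0))
    print(best, toy_score(best))
```

Because the pool of kept paths is carried over between rounds, the best score in this sketch never decreases as more rounds are run. In a real one-shot NAS setting the scoring step dominates the cost, which is why reducing the number of evaluated paths matters.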
Related papers
- NAS-based Recursive Stage Partial Network (RSPNet) for Light-Weight
Semantic Segmentation [16.019616787091202]
Current NAS-based semantic segmentation methods focus on accuracy improvements rather than light-weight design.
We propose a two-stage framework to design our NAS-based RSPNet model for light-weight semantic segmentation.
The proposed architecture is efficient, simple, and effective: both the macro- and micro-structure searches can be completed in five days of computation.
arXiv Detail & Related papers (2022-10-03T03:25:29Z)
- Pruning-as-Search: Efficient Neural Architecture Search via Channel
Pruning and Structural Reparameterization [50.50023451369742]
Pruning-as-Search (PaS) is an end-to-end channel pruning method that searches for the desired sub-network automatically and efficiently.
Our proposed architecture outperforms prior art by around 1.0% top-1 accuracy on the ImageNet-1000 classification task.
arXiv Detail & Related papers (2022-06-02T17:58:54Z)
- A Unified Transformer Framework for Group-based Segmentation:
Co-Segmentation, Co-Saliency Detection and Video Salient Object Detection [59.21990697929617]
Humans tend to mine objects by learning from a group of images or several frames of video since we live in a dynamic world.
Previous approaches design separate networks for these similar tasks, and the networks are difficult to transfer from one task to another.
We introduce a unified framework to tackle these issues, termed UFO (Unified Framework for Co-Object Segmentation).
arXiv Detail & Related papers (2022-03-09T13:35:19Z)
- Boundary-Aware Segmentation Network for Mobile and Web Applications [60.815545591314915]
Boundary-Aware Network (BASNet) is integrated with a predict-refine architecture and a hybrid loss for highly accurate image segmentation.
BASNet runs at over 70 fps on a single GPU which benefits many potential real applications.
Based on BASNet, we further developed two (close to) commercial applications: AR COPY & PASTE, in which BASNet is combined with augmented reality for "copying" and "pasting" real-world objects, and OBJECT CUT, which is a web-based tool for automatic object background removal.
arXiv Detail & Related papers (2021-01-12T19:20:26Z)
- AutoPose: Searching Multi-Scale Branch Aggregation for Pose Estimation [96.29533512606078]
We present AutoPose, a novel neural architecture search (NAS) framework.
It is capable of automatically discovering multiple parallel branches of cross-scale connections towards accurate and high-resolution 2D human pose estimation.
arXiv Detail & Related papers (2020-08-16T22:27:43Z)
- A novel Region of Interest Extraction Layer for Instance Segmentation [3.5493798890908104]
This paper is motivated by the need to overcome the limitations of existing RoI extractors.
The proposed layer (called Generic RoI Extractor - GRoIE) introduces non-local building blocks and attention mechanisms to boost the performance.
GRoIE can be integrated seamlessly with every two-stage architecture for both object detection and instance segmentation tasks.
arXiv Detail & Related papers (2020-04-28T17:07:32Z)
- EfficientPS: Efficient Panoptic Segmentation [13.23676270963484]
We introduce the Efficient Panoptic (EfficientPS) architecture that efficiently encodes and fuses semantically rich multi-scale features.
We incorporate a semantic head that aggregates fine and contextual features coherently and a new variant of Mask R-CNN as the instance head.
We also introduce the KITTI panoptic segmentation dataset that contains panoptic annotations for the popular and challenging KITTI benchmark.
arXiv Detail & Related papers (2020-04-05T20:15:59Z)
- DCNAS: Densely Connected Neural Architecture Search for Semantic Image
Segmentation [44.46852065566759]
We propose a Densely Connected NAS (DCNAS) framework, which directly searches for the optimal network structures for multi-scale representations of visual information.
Specifically, by connecting cells with each other using learnable weights, we introduce a densely connected search space to cover an abundance of mainstream network designs.
We demonstrate that the architecture obtained from our DCNAS algorithm achieves state-of-the-art performance on public semantic image segmentation benchmarks.
arXiv Detail & Related papers (2020-03-26T13:21:33Z)
- AutoSTR: Efficient Backbone Search for Scene Text Recognition [80.7290173000068]
Scene text recognition (STR) is very challenging due to the diversity of text instances and the complexity of scenes.
We propose automated STR (AutoSTR) to search data-dependent backbones to boost text recognition performance.
Experiments demonstrate that, by searching data-dependent backbones, AutoSTR can outperform the state-of-the-art approaches on standard benchmarks.
arXiv Detail & Related papers (2020-03-14T06:51:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.