Boundary-Aware Segmentation Network for Mobile and Web Applications
- URL: http://arxiv.org/abs/2101.04704v1
- Date: Tue, 12 Jan 2021 19:20:26 GMT
- Title: Boundary-Aware Segmentation Network for Mobile and Web Applications
- Authors: Xuebin Qin and Deng-Ping Fan and Chenyang Huang and Cyril Diagne and
Zichen Zhang and Adrià Cabeza Sant'Anna and Albert Suàrez and Martin
Jagersand and Ling Shao
- Abstract summary: The Boundary-Aware Segmentation Network (BASNet) combines a predict-refine architecture with a hybrid loss for highly accurate image segmentation.
BASNet runs at over 70 fps on a single GPU, which benefits many potential real applications.
Based on BASNet, we further developed two (close to) commercial applications: AR COPY & PASTE, in which BASNet is integrated with augmented reality for "COPYING" and "PASTING" real-world objects, and OBJECT CUT, which is a web-based tool for automatic object background removal.
- Score: 60.815545591314915
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Although deep models have greatly improved the accuracy and robustness of
image segmentation, obtaining segmentation results with highly accurate
boundaries and fine structures is still a challenging problem. In this paper,
we propose a simple yet powerful Boundary-Aware Segmentation Network (BASNet),
which comprises a predict-refine architecture and a hybrid loss, for highly
accurate image segmentation. The predict-refine architecture consists of a
densely supervised encoder-decoder network and a residual refinement module,
which are respectively used to predict and refine a segmentation probability
map. The hybrid loss is a combination of the binary cross entropy, structural
similarity and intersection-over-union losses, which guide the network to learn
three-level (i.e., pixel-, patch- and map-level) hierarchy representations. We
evaluate BASNet on two reverse tasks, salient object segmentation and
camouflaged object segmentation, showing that it achieves very competitive
performance with sharp segmentation boundaries. Importantly, BASNet runs at
over 70 fps on a single GPU, which benefits many potential real applications.
Based on BASNet, we further developed two (close to) commercial applications:
AR COPY & PASTE, in which BASNet is integrated with augmented reality for
"COPYING" and "PASTING" real-world objects, and OBJECT CUT, which is a
web-based tool for automatic object background removal. Both applications have
already drawn a huge amount of attention and have important real-world impacts.
The code and two applications will be publicly available at:
https://github.com/NathanUA/BASNet.
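The hybrid loss described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the three terms are summed with equal weights, computes SSIM over non-overlapping square patches (the hypothetical `patch` parameter) rather than a sliding window, and uses the conventional SSIM constants for values in [0, 1]. All function names are illustrative, and maps are plain nested lists of probabilities.

```python
import math

EPS = 1e-7  # small constant to avoid log(0) and division by zero


def bce_loss(pred, target):
    """Pixel-level term: mean binary cross entropy over all pixels."""
    total, count = 0.0, 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            p = min(max(p, EPS), 1.0 - EPS)  # clamp the probability
            total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
            count += 1
    return total / count


def ssim_loss(pred, target, patch=2, c1=0.01 ** 2, c2=0.03 ** 2):
    """Patch-level term: 1 - mean SSIM over non-overlapping patches.

    Assumption: uniform, non-overlapping patches stand in for the
    windowed SSIM a real implementation would likely use.
    """
    h, w = len(pred), len(pred[0])
    scores = []
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            cells = [(a, b) for a in range(i, i + patch)
                     for b in range(j, j + patch)]
            x = [pred[a][b] for a, b in cells]
            y = [target[a][b] for a, b in cells]
            n = len(x)
            mx, my = sum(x) / n, sum(y) / n
            vx = sum((v - mx) ** 2 for v in x) / n
            vy = sum((v - my) ** 2 for v in y) / n
            cov = sum((u - mx) * (v - my) for u, v in zip(x, y)) / n
            num = (2 * mx * my + c1) * (2 * cov + c2)
            den = (mx * mx + my * my + c1) * (vx + vy + c2)
            scores.append(num / den)
    return 1.0 - sum(scores) / len(scores)


def iou_loss(pred, target):
    """Map-level term: 1 - soft intersection-over-union of the whole map."""
    inter = sum(p * t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    total = sum(p + t for pr, tr in zip(pred, target) for p, t in zip(pr, tr))
    union = total - inter
    return 1.0 - (inter + EPS) / (union + EPS)


def hybrid_loss(pred, target):
    """Equal-weight sum of the three terms (the weighting is an assumption)."""
    return bce_loss(pred, target) + ssim_loss(pred, target) + iou_loss(pred, target)
```

Calling `hybrid_loss(pred, mask)` on a predicted probability map and a binary ground-truth mask of the same size returns a single scalar; a prediction identical to the mask gives a value close to zero, and the value grows as pixels, local patch structure, or overall overlap diverge.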
Related papers
- DDU-Net: A Domain Decomposition-based CNN for High-Resolution Image Segmentation on Multiple GPUs [46.873264197900916]
A domain decomposition-based U-Net architecture is introduced, which partitions input images into non-overlapping patches.
A communication network is added to facilitate inter-patch information exchange to enhance the understanding of spatial context.
Results show that the approach achieves a $2-3\,\%$ higher intersection over union (IoU) score compared to the same network without inter-patch communication.
arXiv Detail & Related papers (2024-07-31T01:07:21Z)
- Simple and Efficient Architectures for Semantic Segmentation [50.1563637917129]
We show that a simple encoder-decoder architecture with a ResNet-like backbone and a small multi-scale head performs on par with or better than complex semantic segmentation architectures such as HRNet, FANet and DDRNet.
We present a family of such simple architectures for desktop as well as mobile targets, which match or exceed the performance of complex models on the Cityscapes dataset.
arXiv Detail & Related papers (2022-06-16T15:08:34Z)
- Collaborative Attention Memory Network for Video Object Segmentation [3.8520227078236013]
We propose a Collaborative Attention Memory Network with an enhanced segmentation head.
We also propose an ensemble network to combine the STM network with all of these refined CFBI networks.
Finally, we evaluate our approach on the 2021 YouTube-VOS challenge, where we obtain 6th place with an overall score of 83.5%.
arXiv Detail & Related papers (2022-05-17T03:40:11Z)
- A Unified Architecture of Semantic Segmentation and Hierarchical Generative Adversarial Networks for Expression Manipulation [52.911307452212256]
We develop a unified architecture of semantic segmentation and hierarchical GANs.
A unique advantage of our framework is that, on the forward pass, the semantic segmentation network conditions the generative model.
We evaluate our method on two challenging facial expression translation benchmarks, AffectNet and RaFD, and a semantic segmentation benchmark, CelebAMask-HQ.
arXiv Detail & Related papers (2021-12-08T22:06:31Z)
- A Novel Adaptive Deep Network for Building Footprint Segmentation [0.0]
We propose a novel network based on the Pix2Pix methodology to solve the problem of inaccurate boundaries obtained by converting satellite images into maps.
Our framework includes two generators where the first generator extracts localization features in order to merge them with the boundary features extracted from the second generator to segment all detailed building edges.
Different strategies are implemented to enhance the quality of the proposed network's results, and the proposed network outperforms state-of-the-art networks in segmentation accuracy by a large margin on all evaluation metrics.
arXiv Detail & Related papers (2021-02-27T18:13:48Z)
- A Holistically-Guided Decoder for Deep Representation Learning with Applications to Semantic Segmentation and Object Detection [74.88284082187462]
One common strategy is to adopt dilated convolutions in the backbone networks to extract high-resolution feature maps.
We propose a novel holistically-guided decoder to obtain high-resolution, semantic-rich feature maps.
arXiv Detail & Related papers (2020-12-18T10:51:49Z)
- CARAFE++: Unified Content-Aware ReAssembly of FEatures [132.49582482421246]
We propose unified Content-Aware ReAssembly of FEatures (CARAFE++), a universal, lightweight and highly effective operator to fulfill this goal.
CARAFE++ generates adaptive kernels on-the-fly to enable instance-specific content-aware handling.
It shows consistent and substantial gains across all the tasks with negligible computational overhead.
arXiv Detail & Related papers (2020-12-07T07:34:57Z)
- Auto-Panoptic: Cooperative Multi-Component Architecture Search for Panoptic Segmentation [144.50154657257605]
We propose an efficient framework to simultaneously search for all main components including backbone, segmentation branches, and feature fusion module.
Our searched architecture, namely Auto-Panoptic, achieves the new state-of-the-art on the challenging COCO and ADE20K benchmarks.
arXiv Detail & Related papers (2020-10-30T08:34:35Z)
- Multi-scale Attention U-Net (MsAUNet): A Modified U-Net Architecture for Scene Segmentation [1.713291434132985]
We propose a novel multi-scale attention network for scene segmentation by using contextual information from an image.
This network can map local features to their global counterparts with improved accuracy and emphasize discriminative image regions.
We have evaluated our model on two standard datasets named PascalVOC2012 and ADE20k.
arXiv Detail & Related papers (2020-09-15T08:03:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.