INSTA-YOLO: Real-Time Instance Segmentation
- URL: http://arxiv.org/abs/2102.06777v1
- Date: Fri, 12 Feb 2021 21:17:29 GMT
- Title: INSTA-YOLO: Real-Time Instance Segmentation
- Authors: Eslam Mohamed, Abdelrahman Shaker, Hazem Rashed, Ahmad El-Sallab,
Mayada Hadhoud
- Abstract summary: We propose Insta-YOLO, a novel one-stage end-to-end deep learning model for real-time instance segmentation.
Instead of pixel-wise prediction, our model predicts instances as object contours represented by 2D points in Cartesian space.
We evaluate our model on three datasets, namely, Carvana, Cityscapes, and Airbus.
- Score: 2.9769485817170387
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Instance segmentation has recently gained huge attention in various computer
vision applications. It aims at providing different IDs to different objects of
the scene, even if they belong to the same class. Instance segmentation is
usually performed as a two-stage pipeline. First, an object is detected, then
semantic segmentation within the detected box area is performed, which involves
costly up-sampling. In this paper, we propose Insta-YOLO, a novel one-stage
end-to-end deep learning model for real-time instance segmentation. Instead of
pixel-wise prediction, our model predicts instances as object contours
represented by 2D points in Cartesian space. We evaluate our model on three
datasets, namely, Carvana, Cityscapes, and Airbus. We compare our results to the
state-of-the-art models for instance segmentation. The results show our model
achieves competitive accuracy in terms of mAP at twice the speed on a GTX-1080
GPU.
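The contour representation described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `point_in_polygon` and `contour_to_mask`, the pixel-centre convention, and the toy square contour are all assumptions made for illustration. It shows only how a predicted list of 2D Cartesian points could be turned back into a binary instance mask by polygon filling, without any learned up-sampling step.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates; hypothetical convention


def point_in_polygon(x: float, y: float, contour: List[Point]) -> bool:
    """Even-odd ray casting: cast a ray to the right of (x, y) and count edge crossings."""
    inside = False
    n = len(contour)
    for i in range(n):
        x1, y1 = contour[i]
        x2, y2 = contour[(i + 1) % n]  # wrap around to close the contour
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def contour_to_mask(contour: List[Point], height: int, width: int) -> List[List[int]]:
    """Rasterize a closed contour into a binary mask by testing each pixel centre."""
    return [
        [1 if point_in_polygon(col + 0.5, row + 0.5, contour) else 0 for col in range(width)]
        for row in range(height)
    ]


# Toy example: a square "instance" given by four contour points (assumed values).
square = [(1.0, 1.0), (4.0, 1.0), (4.0, 4.0), (1.0, 4.0)]
mask = contour_to_mask(square, height=6, width=6)
for row in mask:
    print("".join(str(v) for v in row))
```

In practice a library polygon-fill routine would likely replace the per-pixel loop. The pure-Python version is used here only to keep the sketch free of dependencies.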
Related papers
- Moving Object Segmentation: All You Need Is SAM (and Flow) [82.78026782967959]
We investigate two models for combining SAM with optical flow that harness the segmentation power of SAM with the ability of flow to discover and group moving objects.
In the first model, we adapt SAM to take optical flow, rather than RGB, as an input. In the second, SAM takes RGB as an input, and flow is used as a segmentation prompt.
These surprisingly simple methods, without any further modifications, outperform all previous approaches by a considerable margin in both single and multi-object benchmarks.
arXiv Detail & Related papers (2024-04-18T17:59:53Z)
- PARSAC: Accelerating Robust Multi-Model Fitting with Parallel Sample Consensus [26.366299016589256]
We present a real-time method for robust estimation of multiple instances of geometric models from noisy data.
A neural network segments the input data into clusters representing potential model instances.
We demonstrate state-of-the-art performance on these as well as multiple established datasets, with inference times as small as five milliseconds per image.
arXiv Detail & Related papers (2024-01-26T14:54:56Z)
- Segmenting Moving Objects via an Object-Centric Layered Representation [100.26138772664811]
We introduce an object-centric segmentation model with a depth-ordered layer representation.
We introduce a scalable pipeline for generating synthetic training data with multiple objects.
We evaluate the model on standard video segmentation benchmarks.
arXiv Detail & Related papers (2022-07-05T17:59:43Z)
- Sparse Instance Activation for Real-Time Instance Segmentation [72.23597664935684]
We propose a conceptually novel, efficient, and fully convolutional framework for real-time instance segmentation.
SparseInst has extremely fast inference speed and achieves 40 FPS and 37.9 AP on the COCO benchmark.
arXiv Detail & Related papers (2022-03-24T03:15:39Z)
- Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation [95.74244714914052]
Multiple object tracking and segmentation requires detecting, tracking, and segmenting objects belonging to a set of given classes.
We propose the Prototypical Cross-Attention Network (PCAN), capable of leveraging rich temporal information online.
PCAN outperforms current video instance tracking and segmentation competition winners on the YouTube-VIS and BDD100K datasets.
arXiv Detail & Related papers (2021-06-22T17:57:24Z)
- Learning to Associate Every Segment for Video Panoptic Segmentation [123.03617367709303]
We learn coarse segment-level matching and fine pixel-level matching together.
We show that our per-frame computation model can achieve new state-of-the-art results on Cityscapes-VPS and VIPER datasets.
arXiv Detail & Related papers (2021-06-17T13:06:24Z)
- Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency [114.02182755620784]
We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion and depth in a monocular camera setup without supervision.
Our framework is shown to outperform the state-of-the-art depth and motion estimation methods.
arXiv Detail & Related papers (2021-02-04T14:26:42Z)
- Vec2Instance: Parameterization for Deep Instance Segmentation [0.17205106391379021]
We describe a new deep convolutional neural network architecture called Vec2Instance for instance segmentation.
Vec2Instance provides a framework for parametrization of instances, allowing convolutional neural networks to efficiently estimate the complex shapes of instances around their centroids.
The total pixel-wise accuracy of our approach is 89%, near the accuracy of the state-of-the-art Mask RCNN (91%).
arXiv Detail & Related papers (2020-10-06T13:51:02Z)
- Monocular Instance Motion Segmentation for Autonomous Driving: KITTI InstanceMotSeg Dataset and Multi-task Baseline [5.000331633798637]
Moving object segmentation is a crucial task for autonomous vehicles as it can be used to segment objects in a class agnostic manner.
Although pixel-wise motion segmentation has been studied in autonomous driving literature, it has been rarely addressed at the instance level.
We create a new InstanceMotSeg dataset comprising 12.9K samples, improving upon our KITTIMoSeg dataset.
arXiv Detail & Related papers (2020-08-16T21:47:09Z)
- OccuSeg: Occupancy-aware 3D Instance Segmentation [39.71517989569514]
"3D occupancy size" is the number of voxels occupied by each instance.
"OccuSeg" is an occupancy-aware 3D instance segmentation scheme.
"State-of-the-art performance" on 3 real-world datasets.
arXiv Detail & Related papers (2020-03-14T02:48:55Z)