PMODE: Prototypical Mask based Object Dimension Estimation
- URL: http://arxiv.org/abs/2212.13281v1
- Date: Mon, 26 Dec 2022 19:24:25 GMT
- Title: PMODE: Prototypical Mask based Object Dimension Estimation
- Authors: Thariq Khalid, Mohammed Yahya Hakami, Riad Souissi
- Abstract summary: We propose a method to estimate the dimensions of a quadrilateral object of interest in videos using a monocular camera.
We trained the system with three different random cameras, achieving 22% MAPE on the test dataset for dimension estimation.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Can a neural network estimate an object's dimension in the wild? In this
paper, we propose a method and deep learning architecture to estimate the
dimensions of a quadrilateral object of interest in videos using a monocular
camera. The proposed technique does not use camera calibration or handcrafted
geometric features; instead, features are learned with the help of coefficients
of a segmentation neural network during the training process. A real-time
instance segmentation-based Deep Neural Network with a ResNet50 backbone is
employed to produce the object's prototype mask, which provides a region of
interest from which to regress its dimensions. The instance segmentation network is
trained to look at only the nearest object of interest. The regression is
performed using an MLP head which looks only at the mask coefficients of the
bounding box detector head and the prototype segmentation mask. We trained the
system with three different random cameras, achieving 22% MAPE on the test
dataset for the dimension estimation.
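The abstract reports its result as MAPE (mean absolute percentage error). As a minimal sketch of how that metric is computed, assuming the usual definition (the mean of |true - predicted| / |true|, expressed in percent); the function name `mape` and the sample dimensions below are hypothetical and are not taken from the paper:

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, returned in percent."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    # Average the relative errors, then scale to a percentage.
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Hypothetical ground-truth and predicted object dimensions (e.g. in cm).
true_dims = [100.0, 50.0, 200.0]
pred_dims = [80.0, 60.0, 150.0]
score = mape(true_dims, pred_dims)
print(f"MAPE: {score:.2f}%")
```

With these made-up values the relative errors are 20%, 20%, and 25%, so the score comes out a little under 22%. Because each error is divided by the true value, the metric is scale-free, which makes it a natural choice when objects of very different sizes are measured together.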
Related papers
- LAC-Net: Linear-Fusion Attention-Guided Convolutional Network for Accurate Robotic Grasping Under the Occlusion [79.22197702626542]
This paper introduces a framework that explores amodal segmentation for robotic grasping in cluttered scenes.
We propose a Linear-fusion Attention-guided Convolutional Network (LAC-Net).
The results on different datasets show that our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-08-06T14:50:48Z)
- Intelligent Debris Mass Estimation Model for Autonomous Underwater Vehicle [0.0]
Marine debris poses a significant threat to the survival of marine wildlife, often leading to entanglement and starvation.
Instance segmentation is an advanced form of object detection that identifies objects and precisely locates and separates them.
AUVs use image segmentation to analyze images captured by their cameras to navigate underwater environments.
arXiv Detail & Related papers (2023-09-19T13:47:31Z)
- HAISTA-NET: Human Assisted Instance Segmentation Through Attention [3.073046540587735]
We propose a novel approach to enable more precise predictions and generate higher-quality segmentation masks.
Our human-assisted segmentation model, HAISTA-NET, augments the existing Strong Mask R-CNN network to incorporate human-specified partial boundaries.
We show that HAISTA-NET outperforms state-of-the-art methods such as Mask R-CNN, Strong Mask R-CNN, and Mask2Former.
arXiv Detail & Related papers (2023-05-04T18:39:14Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- Neural Volumetric Object Selection [126.04480613166194]
We introduce an approach for selecting objects in neural volumetric 3D representations, such as multi-plane images (MPI) and neural radiance fields (NeRF).
Our approach takes a set of foreground and background 2D user scribbles in one view and automatically estimates a 3D segmentation of the desired object, which can be rendered into novel views.
arXiv Detail & Related papers (2022-05-30T08:55:20Z)
- A singular Riemannian geometry approach to Deep Neural Networks II. Reconstruction of 1-D equivalence classes [78.120734120667]
We build the preimage of a point in the output manifold in the input space.
For simplicity, we focus on the case of neural network maps from n-dimensional real spaces to (n - 1)-dimensional real spaces.
arXiv Detail & Related papers (2021-12-17T11:47:45Z)
- DONet: Learning Category-Level 6D Object Pose and Size Estimation from Depth Observation [53.55300278592281]
We propose a method of Category-level 6D Object Pose and Size Estimation (COPSE) from a single depth image.
Our framework makes inferences based on the rich geometric information of the object in the depth channel alone.
Our framework competes with state-of-the-art approaches that require labeled real-world images.
arXiv Detail & Related papers (2021-06-27T10:41:50Z)
- Contour Primitive of Interest Extraction Network Based on One-Shot Learning for Object-Agnostic Vision Measurement [37.552192926136065]
We propose the contour primitive of interest extraction network (CPieNet) based on the one-shot learning framework.
For the novel CPI extraction task, we built the Object Contour Primitives dataset using online public images, and the Robotic Object Contour Measurement dataset using a camera mounted on a robot.
arXiv Detail & Related papers (2020-10-07T11:00:30Z)
- Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training [3.8073142980733]
We propose a novel framework for monocular 3D object detection using only RGB images, called KM3D-Net.
We design a fully convolutional model to predict object keypoints, dimension, and orientation, and then combine these estimations with perspective geometry constraints to compute position attribute.
arXiv Detail & Related papers (2020-09-02T00:51:51Z)
- PointINS: Point-based Instance Segmentation [117.38579097923052]
Mask representation in instance segmentation with Point-of-Interest (PoI) features is challenging because learning a high-dimensional mask feature for each instance requires a heavy computing burden.
We propose an instance-aware convolution, which decomposes this mask representation learning task into two tractable modules.
Along with instance-aware convolution, we propose PointINS, a simple and practical instance segmentation approach.
arXiv Detail & Related papers (2020-03-13T08:24:58Z)
- SDOD: Real-time Segmenting and Detecting 3D Object by Depth [5.97602869680438]
This paper proposes a real-time framework that segments and detects 3D objects by depth.
We discretize the objects' depth into depth categories and transform the instance segmentation task into a pixel-level classification task.
Experiments on the challenging KITTI dataset show that our approach is about 1.8 times faster than LklNet at segmentation and 3D detection.
arXiv Detail & Related papers (2020-01-26T09:06:18Z)