Non-planar Object Detection and Identification by Features Matching and Triangulation Growth
- URL: http://arxiv.org/abs/2506.13769v1
- Date: Mon, 19 May 2025 06:20:07 GMT
- Title: Non-planar Object Detection and Identification by Features Matching and Triangulation Growth
- Authors: Filippo Leveni
- Abstract summary: We propose a feature-based approach for detecting and identifying distorted occurrences of a given template in a scene image. We consider the Delaunay triangulation of template features as a useful tool to guide this approach. Our solution allows the identification of the object in situations where geometric models (e.g. homography) do not hold.
- Score: 1.450405446885067
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Object detection and identification is a fundamental topic in computer vision; it plays a crucial role in many applications such as object tracking, industrial robot control, and image retrieval. We propose a feature-based approach for detecting and identifying distorted occurrences of a given template in a scene image by incrementally grouping feature matches between the image and the template. For this purpose, we use the Delaunay triangulation of template features to guide this iterative approach. The triangulation is treated as a graph and, starting from a single triangle, neighboring nodes are considered and the corresponding features are identified; the matches related to them are then evaluated to determine whether they should be added to the group. This evaluation is based on local consistency criteria derived from geometric and photometric properties of local features. Our solution allows the identification of the object in situations where geometric models (e.g. homography) do not hold, thus enabling the detection of objects whose template is non-planar, or planar but distorted in the image. We show that our approach performs as well as or better than homography-based RANSAC in scenarios in which distortion is nearly absent, while when the deformation becomes significant our method shows better description performance.
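The growth step described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: it assumes the triangulation of the template features is already available as a list of vertex-index triples (for instance from `scipy.spatial.Delaunay(points).simplices`), and it replaces the paper's geometric and photometric criteria with a single hypothetical test that compares edge-length ratios against the scale of the seed triangle. The names `build_graph`, `locally_consistent`, and `grow_from_seed`, as well as the tolerance value, are illustrative assumptions.

```python
from collections import deque
from math import dist


def build_graph(triangles):
    """Turn triangles (triples of feature indices) into an adjacency map."""
    graph = {}
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            graph.setdefault(u, set()).add(v)
            graph.setdefault(v, set()).add(u)
    return graph


def locally_consistent(u, v, template, scene, scale, tol):
    """Hypothetical criterion (an assumption, not the paper's): the edge u-v
    must be stretched by roughly the same factor as the seed triangle."""
    ratio = dist(scene[u], scene[v]) / dist(template[u], template[v])
    return abs(ratio - scale) <= tol * scale


def grow_from_seed(triangles, seed, template, scene, tol=0.25):
    """Group matches by growing outward from a seed triangle.

    template: feature index -> (x, y) in the template
    scene:    feature index -> (x, y) of its candidate match in the scene
    """
    graph = build_graph(triangles)
    a, b, c = seed
    edges = ((a, b), (b, c), (c, a))
    # Reference scale: mean edge-length ratio over the seed triangle.
    scale = sum(dist(scene[u], scene[v]) / dist(template[u], template[v])
                for u, v in edges) / 3
    accepted = set(seed)
    queue = deque(seed)
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v in accepted or v not in scene:
                continue  # already grouped, or the feature has no match
            if locally_consistent(u, v, template, scene, scale, tol):
                accepted.add(v)
                queue.append(v)
    return accepted


# Toy data: features 0-3 are matched at about twice the template scale with a
# mild distortion on feature 3; feature 4 has a wrong match far away.
template = {0: (0, 0), 1: (1, 0), 2: (0, 1), 3: (1, 1), 4: (2, 0.5)}
scene = {0: (0, 0), 1: (2, 0), 2: (0, 2), 3: (2.2, 2.1), 4: (10, 10)}
triangles = [(0, 1, 2), (1, 3, 2), (1, 4, 3)]

group = grow_from_seed(triangles, (0, 1, 2), template, scene)
print(sorted(group))  # [0, 1, 2, 3]
```

Because each candidate is tested only against an already accepted neighbor, no global transformation such as a homography is ever fitted. The fixed reference scale is a simplification made for brevity; a criterion that adapts to the local neighborhood would tolerate stronger deformation.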
Related papers
- Mismatched: Evaluating the Limits of Image Matching Approaches and Benchmarks [9.388897214344572]
Three-dimensional (3D) reconstruction from two-dimensional images is an active research field in computer vision.
Traditionally, parametric techniques have been employed for this task.
Recent advancements have seen a shift towards learning-based methods.
arXiv Detail & Related papers (2024-08-29T11:16:34Z)
- ZeroReg: Zero-Shot Point Cloud Registration with Foundation Models [77.84408427496025]
State-of-the-art 3D point cloud registration methods rely on labeled 3D datasets for training.
We introduce ZeroReg, a zero-shot registration approach that utilizes 2D foundation models to predict 3D correspondences.
arXiv Detail & Related papers (2023-12-05T11:33:16Z)
- ODSmoothGrad: Generating Saliency Maps for Object Detectors [0.0]
We present ODSmoothGrad, a tool for generating saliency maps for the classification and the bounding box parameters in object detectors.
Given the noisiness of saliency maps, we also apply the SmoothGrad algorithm to visually enhance the pixels of interest.
arXiv Detail & Related papers (2023-04-15T18:21:56Z)
- Adaptive Graph Convolution Module for Salient Object Detection [7.278033100480174]
We propose an adaptive graph convolution module (AGCM) to deal with complex scenes.
Prototype features are extracted from the input image using a learnable region generation layer.
The proposed AGCM dramatically improves the SOD performance both quantitatively and qualitatively.
arXiv Detail & Related papers (2023-03-17T07:07:17Z)
- DisPositioNet: Disentangled Pose and Identity in Semantic Image Manipulation [83.51882381294357]
DisPositioNet is a model that learns a disentangled representation for each object for the task of image manipulation using scene graphs.
Our framework enables the disentanglement of the variational latent embeddings as well as the feature representation in the graph.
arXiv Detail & Related papers (2022-11-10T11:47:37Z)
- LEAD: Self-Supervised Landmark Estimation by Aligning Distributions of Feature Similarity [49.84167231111667]
Existing works in self-supervised landmark detection are based on learning dense (pixel-level) feature representations from an image.
We introduce an approach to enhance the learning of dense equivariant representations in a self-supervised fashion.
We show that having such a prior in the feature extractor helps in landmark detection, even with a drastically limited number of annotations.
arXiv Detail & Related papers (2022-04-06T17:48:18Z)
- Fusing Local Similarities for Retrieval-based 3D Orientation Estimation of Unseen Objects [70.49392581592089]
We tackle the task of estimating the 3D orientation of previously-unseen objects from monocular images.
We follow a retrieval-based strategy and prevent the network from learning object-specific features.
Our experiments on the LineMOD, LineMOD-Occluded, and T-LESS datasets show that our method yields a significantly better generalization to unseen objects than previous works.
arXiv Detail & Related papers (2022-03-16T08:53:00Z)
- Leveraging Unsupervised Image Registration for Discovery of Landmark Shape Descriptor [5.40076482533193]
This paper proposes a self-supervised deep learning approach for discovering landmarks from images that can directly be used as a shape descriptor for subsequent analysis.
We use landmark-driven image registration as the primary task to force the neural network to discover landmarks that register the images well.
The proposed method circumvents segmentation and preprocessing and directly produces a usable shape descriptor using just 2D or 3D images.
arXiv Detail & Related papers (2021-11-13T01:02:10Z)
- Unsupervised Domain Adaption of Object Detectors: A Survey [87.08473838767235]
Recent advances in deep learning have led to the development of accurate and efficient models for various computer vision applications.
Learning highly accurate models relies on the availability of datasets with a large number of annotated images.
Due to this, model performance drops drastically when evaluated on label-scarce datasets having visually distinct images.
arXiv Detail & Related papers (2021-05-27T23:34:06Z)
- Localization and Mapping using Instance-specific Mesh Models [12.235379548921061]
This paper focuses on building semantic maps, containing object poses and shapes, using a monocular camera.
Our contribution is an instance-specific mesh model of object shape that can be optimized online based on semantic information extracted from camera images.
arXiv Detail & Related papers (2021-03-08T00:24:23Z)
- Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of object with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed if it is evaluated solely on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.